CN116401680A - Industrial control vulnerability detection method and system based on gradient lifting decision tree algorithm
- Publication number
- CN116401680A (application CN202310677008.0A)
- Authority
- CN
- China
- Prior art keywords
- vulnerability detection
- detection model
- industrial control
- decision tree
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/57—Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
- G06F21/577—Assessing vulnerabilities and evaluating computer system security
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/02—Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]
Abstract
The application discloses an industrial control vulnerability detection method and system based on a gradient lifting decision tree algorithm. The method comprises the following steps: acquiring data packets in an industrial control system, parsing the data packets, extracting data features, and dividing the data features into a training set and a test set; training a vulnerability detection model with the training set based on the gradient lifting decision tree algorithm; evaluating the trained vulnerability detection model with the test set; selecting and calculating performance evaluation indexes of the vulnerability detection model; evaluating the vulnerability detection model according to the calculated values of the performance evaluation indexes; and detecting vulnerabilities of the industrial control system with a vulnerability detection model that meets the evaluation standard. The vulnerability detection model is trained repeatedly until it meets the expected standard, which yields high detection accuracy.
Description
Technical Field
The application relates to the field of computer network security, in particular to an industrial control vulnerability detection method and system based on a gradient lifting decision tree algorithm.
Background
An industrial control system transmits data to the industrial Internet in real time through devices such as sensors and instruments, enabling remote control and monitoring of industrial equipment and production processes. In recent years, with the rapid development of the Internet of Things and growing market demand, industrial control systems have become increasingly intelligent, automated, and efficient. As these systems spread, however, their network security has drawn growing attention from enterprises. A network security failure in an industrial control system can cause unpredictable losses to an enterprise's production activities and to the life and property of its workers. Enterprises therefore commonly use industrial control vulnerability detection systems to monitor network security vulnerabilities, so that vulnerability information is obtained in time and corresponding risk-avoidance measures can be taken, improving the security and production efficiency of the industrial control system and keeping industrial production running in an orderly way.
An industrial control vulnerability detection system is a tool dedicated to scanning for and detecting security vulnerabilities in industrial control systems. It helps enterprises discover vulnerabilities and potential safety hazards in time and improves the security protection capability of the industrial control system. Built on targeted vulnerability scanning technology, it integrates functions such as automatic identification of the industrial control system, construction of a vulnerability database, and generation of risk assessment and security detection reports. Its core function is the rapid, automatic detection and analysis of security vulnerabilities and hidden dangers in the industrial control system.
At present, with the continuous development of information technology and the spread of Internet applications, the number of vulnerabilities in the industrial Internet keeps increasing, their types are becoming more diverse, and their complexity is growing, so traditional industrial control vulnerability detection systems can hardly meet the security requirements of modern industrial control systems. Traditional industrial control vulnerability detection systems mainly have the following problems: 1) Low detection accuracy: as modern industrial control systems grow more complex, the accuracy of vulnerability detection keeps falling, and missed detections and false detections occur frequently, which greatly affects the efficiency of industrial production; 2) High resource usage: when a conventional vulnerability detection system faces a new detection scenario, its computational complexity is high and it occupies a large amount of memory, which degrades the performance and stability of the whole industrial control system; 3) High demands on professional skill: traditional detection systems cannot cope with increasingly complex vulnerability detection scenarios, so a large number of professional security personnel must make manual judgments, and these personnel need a very high level of skill; 4) Poor system compatibility: existing industrial control systems use a wide variety of systems and protocols, and traditional vulnerability detection systems are not well compatible with them.
Disclosure of Invention
(I) Purpose of the application
Based on the above, in order to improve the accuracy of vulnerability detection and to solve the problem that a vulnerability detection system becomes incompatible when the industrial control system changes, the application discloses the following technical solution.
(II) Technical solution
The application discloses an industrial control vulnerability detection method based on a gradient lifting decision tree algorithm, which comprises the following steps:
S1, acquiring a data packet in an industrial control system, analyzing the data packet, extracting data features, and dividing the data features into a training set and a testing set;
S11, carrying out combined dynamic and static analysis on the data features, and forming the data features into a digital vector;
S12, dividing the data features in the form of digital vectors to obtain a training set and a testing set;
S2, training a vulnerability detection model by using the training set based on a gradient lifting decision tree algorithm;
S3, evaluating the trained vulnerability detection model according to the test set;
S31, selecting and calculating performance evaluation indexes of the vulnerability detection model;
S32, evaluating the vulnerability detection model according to the calculated value of the performance evaluation index;
S4, achieving vulnerability detection of the industrial control system through a vulnerability detection model meeting the evaluation standard.
In one possible implementation, the data features include static features and dynamic features. The static features include the number of annotations, the number of variables, the number of functions, the number of operators, the instruction sequence, and the control flow graph. The dynamic features include API calls, function calls, input/output, resource utilization, and the memory map.
In one possible implementation, the numerical vector construction formula includes: [formula not reproduced in the source text].
In one possible implementation, the training process of the vulnerability detection model includes:
S21, initializing the classifier: the initial classifier is set to the average of the actual values of all samples in the training set:
$$f_0(x) = \frac{1}{n}\sum_{i=1}^{n} y_i$$
where $f_0(x)$ denotes the current classifier; $x_i$ denotes the data features of the i-th sample in the training set; n is the number of samples in the current training set; and $y_i$ denotes the actual value of the data features of the i-th sample;
S22, calculating residuals: for each sample, the residual is the difference between the target variable and the predicted value:
$$r_{mi} = y_i - f_{m-1}(x_i)$$
where $r_{mi}$ denotes the residual of sample i on the m-th decision tree and $f_{m-1}(x_i)$ denotes the predicted value of the current decision tree;
S23, building the tree model: a regression tree is fitted to the residuals $r_{mi}$ to obtain the regression tree $h_m(x)$;
S24, increasing model complexity: the new regression tree is added to the current decision tree to obtain the updated decision tree, $f_m(x) = f_{m-1}(x) + h_m(x)$;
S25, repeating S22-S24: iteration stops once the desired fitting effect is reached, giving the final lifting tree.
In one possible implementation manner, the performance evaluation index and the evaluation method include:
the accuracy rate P (precision) is the proportion of samples judged to be benign code that are actually benign code, $P = \frac{TP}{TP + FP}$; the higher it is, the higher the success rate of vulnerability detection; TP denotes true benign code and FP denotes false benign code;
the recall rate R is the proportion of all actually benign samples that the vulnerability detection model correctly predicts as benign code, $R = \frac{TP}{TP + FN}$; a high recall rate means the classifier misses few positive samples; FN denotes false malicious code;
the F1 score is the harmonic mean of the accuracy rate and the recall rate, $F1 = \frac{2PR}{P + R}$; the higher the F1 value, the better the classifier performs.
As a second aspect of the present application, the present application further discloses an adaptive industrial control vulnerability detection system based on a gradient lifting decision tree algorithm, including:
an acquisition module, configured to acquire data packets in an industrial control system, parse the data packets, extract data features, and divide the data features into a training set and a test set;
a training module, configured to train the vulnerability detection model with the training set based on the gradient lifting decision tree algorithm;
an evaluation module, configured to evaluate the vulnerability detection model with the test set;
a detection module, configured to perform vulnerability detection on the industrial control system with a vulnerability detection model that meets the evaluation standard.
In one possible implementation, the acquiring module includes:
an analysis submodule, configured to perform combined dynamic and static analysis of the data features and form the data features into digital vectors;
a division submodule, configured to divide the data features in digital-vector form into a training set and a test set.
In one possible implementation, the evaluation module includes:
a selection submodule, configured to select and calculate the performance evaluation indexes of the vulnerability detection model;
an evaluation submodule, configured to evaluate the vulnerability detection model according to the calculated values of the performance evaluation indexes.
As a third aspect of the application, the application also discloses a storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the method of any of the above.
As a fourth aspect of the present application, the present application also discloses an electronic device, including: one or more processors and a memory, wherein the memory is configured to store one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of the preceding claims.
(III) Beneficial effects
In the industrial control vulnerability detection method and system based on the gradient lifting decision tree algorithm, the vulnerability detection model is trained with the gradient lifting decision tree algorithm, and the data features are described by multidimensional digital vectors, so that whether the industrial control system has a vulnerability is analyzed in as much detail as possible, which improves detection accuracy and reduces the probability of false alarms. The detection performance of the vulnerability detection model is evaluated with performance evaluation indexes, and the overall performance of the current model and the direction of the next optimization step are determined from those indexes, which further improves the accuracy of the model's vulnerability detection.
Drawings
The embodiments described below with reference to the drawings are exemplary and intended for the purpose of illustrating and explaining the present application and are not to be construed as limiting the scope of protection of the present application.
Fig. 1 is a schematic flow chart of an industrial control vulnerability detection method based on a gradient lifting decision tree algorithm disclosed in the present application.
FIG. 2 is a block diagram of an industrial control vulnerability detection system based on a gradient lifting decision tree algorithm disclosed in the present application.
Detailed Description
In order to make the purposes, technical solutions and advantages of the implementation of the present application more clear, the technical solutions in the embodiments of the present application will be described in more detail below with reference to the accompanying drawings in the embodiments of the present application.
An embodiment of the industrial control vulnerability detection method based on the gradient lifting decision tree algorithm disclosed in the application is described in detail below with reference to fig. 1. The method disclosed by the embodiment mainly comprises the following steps.
S1, acquiring a data packet in an industrial control system, analyzing the data packet, extracting data characteristics, and dividing the data characteristics into a training set and a testing set;
The industrial control system is composed of different hardware, software, operating systems, network protocols, and network environments, and their interaction generates a large number of data packets. The data packets are captured in the industrial control system and divided, by manual judgment, into benign data packets without vulnerabilities and malicious data packets with vulnerabilities;
S11, carrying out combined dynamic and static analysis on the data features, and forming the data features into a digital vector;
wherein the data features include static features and dynamic features. The static features include the number of annotations, the number of variables, the number of functions, the number of operators, the instruction sequence, and the control flow graph. The dynamic features include API calls, function calls, input/output, resource utilization, and the memory map.
The numerical vector formation formula includes: [formula not reproduced in the source text].
The data features in each data packet are all described numerically, and each data packet is represented by an 11-dimensional digital vector.
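As an illustration of step S11, the following minimal Python sketch packs the six static and five dynamic features into one 11-dimensional vector. The class name `PacketFeatures`, the field names, and the sample values are hypothetical. The sketch also assumes that the instruction sequence and the control flow graph have already been encoded as single numbers, since the encoding is not given in the text.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class PacketFeatures:
    """Hypothetical container for the 11 features of one data packet."""
    # Static features (6)
    num_annotations: float
    num_variables: float
    num_functions: float
    num_operators: float
    instruction_sequence: float   # assumption: pre-encoded as one number
    control_flow_graph: float     # assumption: pre-encoded as one number
    # Dynamic features (5)
    api_calls: float
    function_calls: float
    input_output: float
    resource_utilization: float
    memory_map: float

    def to_vector(self) -> list:
        """Return the features as a list of floats in declaration order."""
        return [float(value) for value in astuple(self)]


# Made-up values, for illustration only.
example = PacketFeatures(12, 40, 7, 95, 0.31, 0.58, 18, 23, 4, 0.72, 0.15)
vector = example.to_vector()
print(len(vector))
```

Keeping the field order fixed matters here, because every packet must map each feature to the same vector position before training.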
S12, dividing data features in the form of digital vectors to obtain a training set and a testing set;
From the benign data packets and from the malicious data packets, 4/5 of the digital vectors are selected as the training set, and the remaining 1/5 are used as the test set.
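The per-class 4/5 to 1/5 division of step S12 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_by_class`, the label encoding (1 for benign, 0 for malicious), the shuffling, and the fixed seed are all assumptions added for the example.

```python
import random

BENIGN, MALICIOUS = 1, 0      # assumed label encoding
TRAIN_PARTS, TOTAL_PARTS = 4, 5


def split_by_class(benign, malicious, seed=0):
    """Put 4/5 of each class into the training set and the rest into the test set."""
    rng = random.Random(seed)
    train, test = [], []
    for label, vectors in ((BENIGN, benign), (MALICIOUS, malicious)):
        shuffled = list(vectors)
        rng.shuffle(shuffled)                       # assumption: random selection
        cut = len(shuffled) * TRAIN_PARTS // TOTAL_PARTS
        train += [(v, label) for v in shuffled[:cut]]
        test += [(v, label) for v in shuffled[cut:]]
    return train, test


# Toy data: 10 benign and 5 malicious 11-dimensional vectors.
benign_vectors = [[float(i)] * 11 for i in range(10)]
malicious_vectors = [[float(-i)] * 11 for i in range(1, 6)]
train_set, test_set = split_by_class(benign_vectors, malicious_vectors)
print(len(train_set), len(test_set))
```

Splitting each class separately keeps the benign-to-malicious ratio the same in both sets, which is what the step describes.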
S2, training a vulnerability detection model by using the training set based on a gradient lifting decision tree algorithm;
The specific training steps are as follows:
S21, initializing the classifier: the initial classifier is set to the average of the actual values of all samples in the training set:
$$f_0(x) = \frac{1}{n}\sum_{i=1}^{n} y_i$$
where $f_0(x)$ denotes the current classifier; $x_i$ denotes the data features of the i-th sample in the training set; n is the number of samples in the current training set; and $y_i$ denotes the actual value of the data features of the i-th sample;
S22, calculating residuals: for each sample, the residual is the difference between the target variable and the predicted value:
$$r_{mi} = y_i - f_{m-1}(x_i)$$
where $r_{mi}$ denotes the residual of sample i on the m-th decision tree and $f_{m-1}(x_i)$ denotes the predicted value of the current decision tree;
S23, building the tree model: a regression tree is fitted to the residuals $r_{mi}$ to obtain the regression tree $h_m(x)$;
S24, increasing model complexity: the new regression tree is added to the current decision tree to obtain the updated decision tree, $f_m(x) = f_{m-1}(x) + h_m(x)$;
S25, repeating S22-S24: iteration stops once the desired fitting effect is reached, giving the final lifting tree;
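Steps S21 to S25 follow the usual shape of a gradient lifting decision tree, more commonly called a gradient boosting decision tree (GBDT). The sketch below is a deliberately small pure-Python version under stated assumptions: squared-error loss, depth-1 regression trees, no learning rate, and a stop rule based on mean squared residual. The names `fit_stump`, `train_gbdt`, `max_trees`, and `tolerance` are hypothetical, and the toy data uses two features instead of eleven.

```python
def fit_stump(X, residuals):
    """Fit a one-split regression tree to the residuals by least squares."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    if best is None:  # no usable split: predict the mean residual everywhere
        mean = sum(residuals) / len(residuals)
        return lambda row: mean
    _, j, threshold, left_mean, right_mean = best
    return lambda row: left_mean if row[j] <= threshold else right_mean


def train_gbdt(X, y, max_trees=50, tolerance=1e-6):
    """Return a prediction function built by additive residual fitting."""
    f0 = sum(y) / len(y)                                # S21: mean of targets
    trees = []
    predictions = [f0] * len(y)
    for _ in range(max_trees):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]   # S22
        if sum(r * r for r in residuals) / len(y) < tolerance:    # S25: stop rule
            break
        tree = fit_stump(X, residuals)                  # S23: fit the residuals
        trees.append(tree)                              # S24: add tree to model
        predictions = [p + tree(row) for p, row in zip(predictions, X)]
    return lambda row: f0 + sum(tree(row) for tree in trees)


# Toy data: label 1 = benign, 0 = malicious (assumed encoding).
X_train = [[0.1, 5.0], [0.2, 4.0], [0.8, 5.0], [0.9, 4.0]]
y_train = [1, 1, 0, 0]
model = train_gbdt(X_train, y_train)
print(model([0.15, 4.5]), model([0.85, 4.5]))
```

A score at or above 0.5 can then be read as benign. A production system would more likely rely on an established library implementation than on hand-written trees.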
S3, evaluating a trained vulnerability detection model according to the test set;
wherein the evaluating step comprises:
S31, selecting and calculating performance evaluation indexes of the vulnerability detection model;
the performance evaluation index of the test comprises an accuracy rate, a recall rate and an F1 score.
S32, evaluating the vulnerability detection model according to the calculated value of the performance evaluation index;
the accuracy rate P (precision) is the proportion of samples judged to be benign code that are actually benign code, $P = \frac{TP}{TP + FP}$; the higher it is, the higher the success rate of vulnerability detection; TP denotes true benign code and FP denotes false benign code;
the recall rate R is the proportion of all actually benign samples that the vulnerability detection model correctly predicts as benign code, $R = \frac{TP}{TP + FN}$; a high recall rate means the classifier misses few positive samples; FN denotes false malicious code;
the F1 score is the harmonic mean of the accuracy rate and the recall rate, $F1 = \frac{2PR}{P + R}$; the higher the F1 value, the better the classifier performs.
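The three indexes can be computed from the TP, FP, and FN counts, with benign code treated as the positive class as in the text. The sketch below assumes the standard definitions P = TP / (TP + FP), R = TP / (TP + FN), and F1 = 2PR / (P + R). The function name `evaluate`, the label encoding, and the toy predictions are hypothetical.

```python
def evaluate(y_true, y_pred, positive=1):
    """Return (P, R, F1) with `positive` as the benign class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true benign
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false benign
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false malicious
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy labels: 1 = benign, 0 = malicious (assumed encoding).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
p, r, f1 = evaluate(y_true, y_pred)
print(p, r, f1)
```

In this toy case TP = 3, FP = 1, and FN = 1, so all three indexes equal 0.75.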
S4, achieving vulnerability detection of the industrial control system through a vulnerability detection model meeting evaluation standards.
When the test result shows an accuracy rate higher than 99%, a recall rate lower than 0.05%, and an F1 score above 0.98, the vulnerability detection model meets the evaluation standard and can be deployed; data features from the industrial control system are then input into the vulnerability detection model to perform vulnerability detection.
For the vulnerability detection model which does not meet the evaluation standard, the data in the training set needs to be expanded, and the training process of S2 is continued.
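The train, evaluate, and retrain cycle described above can be sketched as a small control loop. Everything here is a hypothetical illustration: `train_until_accepted`, its callable parameters, and the `max_rounds` limit are not part of the source, and the acceptance rule is passed in as a function so that no particular thresholds are hard-coded.

```python
def train_until_accepted(train_set, test_set, train_fn, evaluate_fn,
                         accept_fn, expand_fn, max_rounds=5):
    """Retrain on an expanded training set until the model is accepted.

    Returns (model, scores, rounds_used); model is None if no round passed.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    scores = None
    for round_index in range(max_rounds):
        model = train_fn(train_set)                 # S2: train
        scores = evaluate_fn(model, test_set)       # S3: evaluate on test set
        if accept_fn(scores):                       # S4: standard met, deploy
            return model, scores, round_index + 1
        train_set = expand_fn(train_set)            # expand data, train again
    return None, scores, max_rounds


# Stand-in callables, for illustration only: the "model" is just the
# training-set size and its score grows as the set is expanded.
result = train_until_accepted(
    train_set=[0],
    test_set=[],
    train_fn=lambda data: len(data),
    evaluate_fn=lambda model, test: model / 10,
    accept_fn=lambda score: score >= 0.5,
    expand_fn=lambda data: data + [0, 0],
)
print(result)
```

Bounding the number of rounds is a design choice of this sketch. It avoids an endless loop when expanding the training set never produces an acceptable model.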
An embodiment of the industrial control vulnerability detection system based on the gradient lifting decision tree algorithm disclosed in the present application is described in detail below with reference to fig. 2. The system disclosed in this embodiment includes:
an acquisition module, configured to acquire data packets in an industrial control system, parse the data packets, extract data features, and divide the data features into a training set and a test set;
wherein the acquisition module includes:
an analysis submodule, configured to perform combined dynamic and static analysis of the data features and form the data features into digital vectors;
a division submodule, configured to divide the data features in digital-vector form into a training set and a test set;
a training module, configured to train the vulnerability detection model with the training set based on the gradient lifting decision tree algorithm;
an evaluation module, configured to evaluate the vulnerability detection model with the test set;
wherein the evaluation module includes:
a selection submodule, configured to select and calculate the performance evaluation indexes of the vulnerability detection model;
an evaluation submodule, configured to evaluate the vulnerability detection model according to the calculated values of the performance evaluation indexes;
a detection module, configured to perform vulnerability detection on the industrial control system with a vulnerability detection model that meets the evaluation standard.
In summary, the invention uses combined dynamic and static data analysis to construct an 11-dimensional digital vector that describes the data features, which helps analyze in as much detail as possible whether a vulnerability occurs in the industrial control system, improves detection accuracy, and reduces the probability of false alarms. The invention evaluates the vulnerability detection performance of the model with three indexes: accuracy rate, recall rate, and F1 score. In general, when the accuracy rate of a classifier is high its recall rate tends to be low, and vice versa. To balance the two, the F1 score is introduced for a comprehensive evaluation, which further improves the accuracy of the model's vulnerability detection.
The invention uses the gradient lifting decision tree algorithm for vulnerability detection in industrial control systems. The algorithm offers high precision and strong robustness, and it detects vulnerabilities better than traditional vulnerability detection systems. Vulnerability detection requires the characteristics of a vulnerability to be interpreted and analyzed comprehensively so that the vulnerability can be located quickly. Because the gradient lifting decision tree algorithm is highly interpretable and its output is easy to understand, it has a clear advantage in the vulnerability detection field.
The invention can train the vulnerability detection model matched with the industrial control system formed by different hardware, software, operating systems, network protocols and network environments, can quickly train the model when the industrial control system is changed, and can update the model at any time to adapt to the new system condition, thereby having extremely strong system compatibility and expansibility which are not possessed by the traditional industrial control vulnerability detection system.
It should be noted that the above modules may be executed on a computer terminal.
The self-adaptive industrial control vulnerability detection system based on the gradient lifting decision tree algorithm comprises a processor and a memory, wherein the modules and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize corresponding functions.
The processor includes a kernel, and the kernel fetches the corresponding program unit from the memory. One or more kernels may be provided. The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or nonvolatile memory such as read-only memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip.
The embodiment of the invention provides a computer readable storage medium, wherein a program is stored on the computer readable storage medium, and the program is executed by a processor to realize the self-adaptive industrial control vulnerability detection method based on the gradient lifting decision tree algorithm.
The embodiment of the invention provides an electronic device comprising a processor, a memory, and a program stored in the memory and executable on the processor, wherein the processor implements the self-adaptive industrial control vulnerability detection method based on the gradient lifting decision tree algorithm when it executes the program. The device herein may be a server, a PC, a tablet (PAD), a mobile phone, and so on.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random Access Memory (RAM) and/or nonvolatile memory, etc., such as Read Only Memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
In the description of this application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises an element.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.
Claims (10)
1. An industrial control vulnerability detection method based on a gradient boosting (gradient lifting) decision tree algorithm, characterized by comprising the following steps:
S1, acquiring data packets in an industrial control system, parsing the data packets, extracting data features, and dividing the data features into a training set and a test set;
S11, performing combined dynamic and static analysis on the data features, and forming the data features into numeric vectors;
S12, dividing the data features in numeric vector form to obtain the training set and the test set;
S2, training a vulnerability detection model with the training set based on the gradient boosting decision tree algorithm;
S3, evaluating the trained vulnerability detection model according to the test set;
S31, selecting and calculating performance evaluation indexes of the vulnerability detection model;
S32, evaluating the vulnerability detection model according to the calculated values of the performance evaluation indexes;
S4, performing vulnerability detection on the industrial control system with a vulnerability detection model that meets the evaluation standard.
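Step S12 can be illustrated with a minimal Python sketch, assuming the data features have already been encoded as numeric vectors with one label per sample. The function name `split_features`, the `test_ratio` and `seed` parameters, and the default 75/25 split are illustrative assumptions; the claims do not specify a split ratio or a shuffling scheme.

```python
import random


def split_features(vectors, labels, test_ratio=0.25, seed=0):
    """Shuffle samples reproducibly and split them into training and test sets.

    `vectors` is a list of numeric feature vectors and `labels` the matching
    list of class labels. Both the ratio and the seed are assumed values.
    """
    if len(vectors) != len(labels):
        raise ValueError("vectors and labels must have the same length")
    indices = list(range(len(vectors)))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split repeatable
    n_test = max(1, int(round(len(indices) * test_ratio)))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [(vectors[i], labels[i]) for i in train_idx]
    test = [(vectors[i], labels[i]) for i in test_idx]
    return train, test
```

With eight samples and the default ratio, the sketch returns six training pairs and two test pairs, and every sample lands in exactly one of the two sets.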
2. The method of claim 1, wherein the data features include static features and dynamic features; the static features include the number of annotations, the number of variables, the number of functions, the number of operators, the instruction sequence, and the control flow graph; the dynamic features include API calls, function calls, input/output, resource utilization, and memory map.
4. The method of claim 3, wherein the training process of the vulnerability detection model comprises:
S21, initializing the classifier: setting the initial classifier to the average value of all sample data features in the training set;
wherein, in the formula for the initial classifier, the terms denote: the current classifier; the data features of the i-th sample in the training set, n being the number of samples in the current training set; and the actual value of the data features of the i-th sample;
S22, calculating residuals: calculating the residual of each sample;
wherein the terms denote the residual of sample i on the m-th decision tree and the predicted value of the current decision tree;
S23, building the tree model: fitting a regression tree to the residuals to obtain a regression tree;
S24, increasing model complexity: adding the current decision tree to the regression tree model to obtain an updated decision tree;
S25, repeating S22 to S24: stopping the iteration once the desired fitting effect is achieved, to obtain the final boosted tree.
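The loop in S21 to S25 follows the usual shape of gradient boosting with a squared-error loss, and it can be sketched in plain Python. Everything below is an illustration under stated assumptions rather than the applicant's implementation: the base learner is a one-split regression tree (a stump), the initial model is the mean of the sample labels, a fixed number of rounds `n_trees` stands in for "the fitting effect is achieved", and the `learning_rate` shrinkage factor is not mentioned in the claims. All function names are hypothetical.

```python
from statistics import mean


def fit_stump(X, residuals):
    """Fit a one-split regression tree that minimizes squared error.

    Returns (feature_index, threshold, left_value, right_value).
    """
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for t in values[:-1]:  # skipping the largest value keeps both sides non-empty
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lv, rv = mean(left), mean(right)
            sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    if best is None:  # every feature is constant: fall back to a constant leaf
        m = mean(residuals)
        return (0, float("inf"), m, m)
    return best[1:]


def predict_stump(stump, row):
    j, t, lv, rv = stump
    return lv if row[j] <= t else rv


def train_gbdt(X, y, n_trees=20, learning_rate=0.5):
    f0 = mean(y)                      # S21: initial model is the mean label (assumed)
    preds = [f0] * len(X)
    trees = []
    for _ in range(n_trees):          # S25: repeat S22 to S24
        residuals = [yi - pi for yi, pi in zip(y, preds)]   # S22: per-sample residual
        stump = fit_stump(X, residuals)                      # S23: fit a tree to the residuals
        trees.append(stump)
        preds = [pi + learning_rate * predict_stump(stump, row)  # S24: add the tree
                 for pi, row in zip(preds, X)]
    return f0, trees, learning_rate


def predict_gbdt(model, row):
    f0, trees, learning_rate = model
    return f0 + learning_rate * sum(predict_stump(s, row) for s in trees)
```

A call such as `train_gbdt([[0, 1], [1, 1], [2, 0], [3, 0]], [0, 0, 1, 1])` returns a model whose predictions move toward the labels as rounds accumulate. A production system would more likely rely on an established library and threshold the output to obtain a class label.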
5. The method of claim 1, wherein the performance evaluation indexes and the evaluation method comprise:
precision P: P = TP / (TP + FP),
wherein the precision P measures the code that is correctly classified among the samples; the higher the precision, the higher the success rate of vulnerability detection; TP denotes true benign code, and FP denotes false benign code;
recall R: R = TP / (TP + FN),
wherein the recall R is the proportion of samples correctly predicted as benign code by the vulnerability detection model among all samples that are actually benign code; a high recall means the classifier misses few positive samples; FN denotes false malicious code;
F1 score: F1 = 2PR / (P + R).
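The three indexes of claim 5 can be computed from predicted and actual labels as in the sketch below. It assumes the standard definitions P = TP / (TP + FP), R = TP / (TP + FN), and F1 = 2PR / (P + R), which match the TP, FP, and FN terms named in the claim. The function name, the `positive` parameter, and the convention of returning 0.0 when a denominator is zero are illustrative assumptions.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the class labelled `positive`.

    In the claim's terms the positive class is benign code; which integer
    label encodes it is an assumption of this sketch.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0   # guard against empty denominators
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

For step S32, a model could then be accepted when these values reach thresholds chosen by the operator; the claims leave the evaluation standard itself open.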
6. An adaptive industrial control vulnerability detection system based on a gradient boosting (gradient lifting) decision tree algorithm, characterized by comprising:
an acquisition module, configured to acquire data packets in an industrial control system, parse the data packets, extract data features, and divide the data features into a training set and a test set;
a training module, configured to train a vulnerability detection model with the training set based on the gradient boosting decision tree algorithm;
an evaluation module, configured to evaluate the vulnerability detection model according to the test set;
a detection module, configured to perform vulnerability detection on the industrial control system with a vulnerability detection model that meets the evaluation standard.
7. The system of claim 6, wherein the acquisition module comprises:
an analysis submodule, configured to perform combined dynamic and static analysis on the data features and to form the data features into numeric vectors;
a division submodule, configured to divide the data features in numeric vector form to obtain the training set and the test set.
8. The system of claim 6, wherein the evaluation module comprises:
a selection submodule, configured to select and calculate performance evaluation indexes of the vulnerability detection model;
an evaluation submodule, configured to evaluate the vulnerability detection model according to the calculated values of the performance evaluation indexes.
9. A storage medium having stored thereon a plurality of instructions adapted to be loaded by a processor and to carry out the method of any one of claims 1 to 5.
10. An electronic device, comprising: one or more processors and a memory, wherein the memory is configured to store one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310677008.0A CN116401680A (en) | 2023-06-08 | 2023-06-08 | Industrial control vulnerability detection method and system based on gradient lifting decision tree algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310677008.0A CN116401680A (en) | 2023-06-08 | 2023-06-08 | Industrial control vulnerability detection method and system based on gradient lifting decision tree algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116401680A (en) | 2023-07-07 |
Family
ID=87008049
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310677008.0A Pending CN116401680A (en) | 2023-06-08 | 2023-06-08 | Industrial control vulnerability detection method and system based on gradient lifting decision tree algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116401680A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112288191A (en) * | 2020-11-19 | 2021-01-29 | 国家海洋信息中心 | Ocean buoy service life prediction method based on multi-class machine learning method |
CN113256066A (en) * | 2021-04-23 | 2021-08-13 | 新疆大学 | PCA-XGboost-IRF-based job shop real-time scheduling method |
CN116050605A (en) * | 2022-12-30 | 2023-05-02 | 国网陕西省电力有限公司经济技术研究院 | Power load prediction method based on neural network and random forest method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20230707 |