CN116401680A - Industrial control vulnerability detection method and system based on gradient lifting decision tree algorithm - Google Patents


Info

Publication number
CN116401680A
Authority
CN
China
Prior art keywords
vulnerability detection
detection model
industrial control
decision tree
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310677008.0A
Other languages
Chinese (zh)
Inventor
原树生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Wangteng Technology Co ltd
Original Assignee
Beijing Wangteng Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wangteng Technology Co ltd filed Critical Beijing Wangteng Technology Co ltd
Priority to CN202310677008.0A priority Critical patent/CN116401680A/en
Publication of CN116401680A publication Critical patent/CN116401680A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The application discloses an industrial control vulnerability detection method and system based on a gradient lifting decision tree algorithm. The method comprises the following steps: acquiring data packets in an industrial control system, analyzing the data packets, extracting data features, and dividing the data features into a training set and a test set; training a vulnerability detection model with the training set based on the gradient lifting decision tree algorithm; evaluating the trained vulnerability detection model on the test set; selecting and calculating performance evaluation indexes of the vulnerability detection model; evaluating the vulnerability detection model according to the calculated values of the performance evaluation indexes; and detecting vulnerabilities of the industrial control system with a vulnerability detection model that meets the evaluation standard. The vulnerability detection model is trained repeatedly until it meets the expected standard, which yields high detection accuracy.

Description

Industrial control vulnerability detection method and system based on gradient lifting decision tree algorithm
Technical Field
The application relates to the field of computer network security, in particular to an industrial control vulnerability detection method and system based on a gradient lifting decision tree algorithm.
Background
An industrial control system transmits data to the industrial Internet in real time through devices such as sensors and instruments, enabling remote control and monitoring of industrial equipment and production processes. In recent years, with the rapid development of the Internet of Things and growing market demand, industrial control systems have become increasingly intelligent, automated, and efficient. As industrial control systems become more widespread, however, their network security is receiving growing attention from enterprises. A network security failure in an industrial control system can cause unpredictable losses to an enterprise's production activities and to the lives and property of its workers. Enterprises therefore commonly use industrial control vulnerability detection systems to monitor network security vulnerabilities, so that vulnerability information is obtained in time and corresponding risk-avoidance measures are taken, improving the safety and production efficiency of the industrial control system and keeping industrial production running in an orderly manner.
An industrial control vulnerability detection system is a tool dedicated to scanning industrial control systems for security vulnerabilities. It helps enterprises discover vulnerabilities and potential safety hazards in the industrial control system in time and improves the system's security protection capability. Built on targeted vulnerability scanning technology, it integrates functions such as automatic identification of the industrial control system, construction of a vulnerability database, and generation of risk assessment reports and security detection reports. Its core function is the rapid, automatic detection and analysis of security vulnerabilities and hidden dangers in the industrial control system.
At present, with the continuous development of information technology and the spread of Internet applications, the number of vulnerabilities in the industrial Internet keeps growing, their types are diversifying, and their complexity is increasing, so traditional industrial control vulnerability detection systems struggle to meet the security requirements of modern industrial control systems. Traditional systems mainly have the following problems: 1) Low vulnerability detection accuracy: as modern industrial control systems grow more complex, detection accuracy falls, and missed detections and false detections occur frequently, which greatly affects the efficiency of industrial production; 2) High resource usage: when facing new vulnerability detection scenarios, traditional vulnerability detection systems have high computational complexity and occupy large amounts of memory, which degrades the performance and stability of the whole industrial control system; 3) High demands on professional expertise: traditional systems cannot cope with increasingly complex vulnerability detection scenarios, so many professional security personnel with a very high level of skill must be assigned to make manual judgments; 4) Poor system compatibility: existing industrial control systems use a wide variety of systems and protocols, with which traditional vulnerability detection systems are not well compatible.
Disclosure of Invention
(I) Purpose of the application
In view of the above, in order to improve the accuracy of vulnerability detection and to solve the problem that a vulnerability detection system becomes incompatible when the industrial control system changes, the application discloses the following technical scheme.
(II) Technical scheme
The application discloses an industrial control vulnerability detection method based on a gradient lifting decision tree algorithm, which comprises the following steps:
S1, acquiring a data packet in an industrial control system, analyzing the data packet, extracting data features, and dividing the data features into a training set and a test set;
S11, performing combined dynamic and static analysis on the data features, and forming the data features into a digital vector;
S12, dividing the data features in digital vector form to obtain the training set and the test set;
S2, training a vulnerability detection model with the training set based on a gradient lifting decision tree algorithm;
S3, evaluating the trained vulnerability detection model on the test set;
S31, selecting and calculating performance evaluation indexes of the vulnerability detection model;
S32, evaluating the vulnerability detection model according to the calculated values of the performance evaluation indexes;
S4, performing vulnerability detection on the industrial control system with a vulnerability detection model that meets the evaluation standard.
In one possible implementation, the data features include static features and dynamic features. The static features include the number of annotations, the number of variables, the number of functions, the number of operators, the instruction sequence, and the control flow graph (symbols: Figures SMS_5 to SMS_10). The dynamic features include API calls, function calls, input/output, resource utilization, and the memory map (symbols: Figures SMS_11 and SMS_1 to SMS_4).
In one possible implementation, the digital vector construction formula is:
[Formula: Figure SMS_12]
In one possible implementation, the training process of the vulnerability detection model includes:
S21, initializing a classifier: setting the initial classifier to the average value of all sample data features in the training set;
[Formula: Figure SMS_13]
wherein [Figure SMS_14] represents the current classifier; [Figure SMS_15] represents the data features of the i-th sample in the training set, n being the number of samples in the current training set; and [Figure SMS_16] represents the actual value of the data features of the i-th sample;
S22, calculating residuals: calculating the residual of each sample;
[Formula: Figure SMS_17]
wherein [Figure SMS_18] represents the residual of sample i on the m-th decision tree, and [Figure SMS_19] represents the predicted value of the current decision tree;
S23, building a tree model: fitting the residuals [Figure SMS_20] to learn a regression tree [Figure SMS_21];
S24, increasing model complexity: adding the current decision tree into the regression tree to obtain an updated decision tree;
[Formula: Figure SMS_22]
S25, repeating S22 to S24: stopping the iteration once the desired fitting effect is reached, to obtain the final lifting tree;
[Formula: Figure SMS_23]
wherein the lifting tree [Figure SMS_24] is a weighted sum of the first M trees.
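Steps S21 to S25 follow the pattern commonly known as gradient boosting with a squared-error loss. Because the formulas survive only as image references, the block below is a hedged reconstruction of that standard form and not a transcription of the publication. All symbols are assumptions: x_i for the feature vector of sample i, y_i for its actual value, F_m for the model after m trees, h_m for the m-th regression tree, r_{m,i} for the residual, and ν for the weight applied to each tree.

```latex
\begin{align*}
F_0(x) &= \frac{1}{n}\sum_{i=1}^{n} y_i && \text{(S21)}\\
r_{m,i} &= y_i - F_{m-1}(x_i), \quad i = 1,\dots,n && \text{(S22)}\\
h_m &= \arg\min_{h}\sum_{i=1}^{n}\bigl(r_{m,i} - h(x_i)\bigr)^2 && \text{(S23)}\\
F_m(x) &= F_{m-1}(x) + \nu\, h_m(x) && \text{(S24)}\\
F_M(x) &= F_0(x) + \nu \sum_{m=1}^{M} h_m(x) && \text{(S25)}
\end{align*}
```

Under these assumptions the final model is the initial average plus a weighted sum of the M fitted trees, which matches the description of the lifting tree as a weighted sum of the first M trees.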
In one possible implementation, the performance evaluation indexes and the evaluation method include:
accuracy rate (precision) P:
$P = \frac{TP}{TP + FP}$
The accuracy rate P is the proportion of correctly judged code among all samples classified as benign; the higher it is, the higher the success rate of vulnerability detection. TP denotes true benign code and FP denotes false benign code;
recall rate R:
$R = \frac{TP}{TP + FN}$
The recall rate R is the proportion of samples correctly predicted as benign code by the vulnerability detection model among all samples that are actually benign code; a high recall rate means the classifier misses few positive samples. FN denotes false malicious code;
F1 score:
$F_1 = \frac{2PR}{P + R}$
The F1 score is the harmonic mean of the accuracy rate and the recall rate; the higher the F1 value, the better the classifier performs.
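A short worked example may help make the three indexes concrete. The counts below are hypothetical and chosen only for illustration (TP = 980, FP = 10, FN = 20); they are not results from the application. The formulas assume that P is computed from TP and FP, and R from TP and FN, as the definitions of those terms suggest.

```latex
\begin{align*}
P   &= \frac{TP}{TP + FP} = \frac{980}{980 + 10} \approx 0.9899\\
R   &= \frac{TP}{TP + FN} = \frac{980}{980 + 20} = 0.98\\
F_1 &= \frac{2PR}{P + R} = \frac{2\,TP}{2\,TP + FP + FN} = \frac{1960}{1990} \approx 0.9849
\end{align*}
```

The F1 value lies between P and R and sits closer to the smaller of the two, which is why it is useful as a single balanced index.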
As a second aspect of the present application, the present application further discloses an adaptive industrial control vulnerability detection system based on a gradient lifting decision tree algorithm, including:
an acquisition module: used for acquiring a data packet in an industrial control system, analyzing the data packet, extracting data features, and dividing the data features into a training set and a test set;
a training module: used for training a vulnerability detection model with the training set based on a gradient lifting decision tree algorithm;
an evaluation module: used for evaluating the vulnerability detection model on the test set;
a detection module: used for performing vulnerability detection on the industrial control system with a vulnerability detection model that meets the evaluation standard.
In one possible implementation, the acquisition module includes:
an analysis submodule: used for performing combined dynamic and static analysis on the data features and forming the data features into digital vectors;
a division submodule: used for dividing the data features in digital vector form to obtain the training set and the test set.
In one possible implementation, the evaluation module includes:
a selection submodule: used for selecting and calculating the performance evaluation indexes of the vulnerability detection model;
an evaluation submodule: used for evaluating the vulnerability detection model according to the calculated values of the performance evaluation indexes.
As a third aspect of the application, the application also discloses a storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the method of any of the above.
As a fourth aspect of the present application, the present application also discloses an electronic device, including: one or more processors and a memory, wherein the memory is configured to store one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of the preceding claims.
(III) Beneficial effects
With the industrial control vulnerability detection method and system based on the gradient lifting decision tree algorithm disclosed herein, the vulnerability detection model is trained based on the gradient lifting decision tree algorithm, and the data features are described by multidimensional digital vectors so that the presence of a vulnerability can be analyzed in as much detail as possible, which improves detection accuracy and reduces the false alarm probability. The detection performance of the model is evaluated with the performance evaluation indexes, and the overall performance of the current model and the direction of the next optimization step are determined from those index values, further improving the accuracy of vulnerability detection.
Drawings
The embodiments described below with reference to the drawings are exemplary and intended for the purpose of illustrating and explaining the present application and are not to be construed as limiting the scope of protection of the present application.
Fig. 1 is a schematic flow chart of an industrial control vulnerability detection method based on a gradient lifting decision tree algorithm disclosed in the present application.
FIG. 2 is a block diagram of an industrial control vulnerability detection system based on a gradient lifting decision tree algorithm disclosed in the present application.
Detailed Description
In order to make the purposes, technical solutions and advantages of the implementation of the present application more clear, the technical solutions in the embodiments of the present application will be described in more detail below with reference to the accompanying drawings in the embodiments of the present application.
An embodiment of the industrial control vulnerability detection method based on the gradient lifting decision tree algorithm disclosed in the application is described in detail below with reference to fig. 1. The method disclosed by the embodiment mainly comprises the following steps.
S1, acquiring a data packet in an industrial control system, analyzing the data packet, extracting data characteristics, and dividing the data characteristics into a training set and a testing set;
The industrial control system is composed of different hardware, software, operating systems, network protocols, and network environments, and their interaction generates a large number of data packets. The data packets are captured in the industrial control system and manually classified into benign data packets, which contain no vulnerability, and malicious data packets, which contain a vulnerability;
S11, performing combined dynamic and static analysis on the data features, and forming the data features into a digital vector;
wherein the data features include static features and dynamic features. The static features include the number of annotations, the number of variables, the number of functions, the number of operators, the instruction sequence, and the control flow graph (symbols: Figures SMS_28 and SMS_30 to SMS_34). The dynamic features include API calls, function calls, input/output, resource utilization, and the memory map (symbols: Figures SMS_35, SMS_29, and SMS_36 to SMS_38).
The digital vector construction formula is:
[Formula: Figure SMS_39]
The data features in the data packets are all described by digital vectors, and each data packet is represented by an 11-dimensional digital vector.
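The 11-dimensional vector described above (six static features followed by five dynamic features) could be represented as in the minimal sketch below. The class name, field names, and the choice to encode the instruction sequence, control flow graph, and memory map as single numbers are all assumptions made for illustration; the application does not state how those features are reduced to numbers.

```python
from dataclasses import astuple, dataclass
from typing import List


@dataclass(frozen=True)
class PacketFeatures:
    """Hypothetical container for the 11 features of one data packet."""
    # Static features (six dimensions)
    annotation_count: float
    variable_count: float
    function_count: float
    operator_count: float
    instruction_sequence: float  # assumed to be pre-encoded as one number
    control_flow_graph: float    # assumed to be pre-encoded as one number
    # Dynamic features (five dimensions)
    api_calls: float
    function_calls: float
    input_output: float
    resource_utilization: float
    memory_map: float            # assumed to be pre-encoded as one number

    def to_vector(self) -> List[float]:
        """Return the features as a flat list in declaration order."""
        return [float(v) for v in astuple(self)]


# Illustrative values only, not data from the application.
example = PacketFeatures(3, 12, 4, 27, 0.41, 0.18, 9, 15, 2, 0.63, 0.07)
vector = example.to_vector()
print(len(vector))
```

Keeping the field order fixed means every packet yields a vector with the same layout, which the later training step relies on.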
S12, dividing data features in the form of digital vectors to obtain a training set and a testing set;
4/5 of the digital vectors are selected from the benign data packets and from the malicious data packets respectively to form the training set, and the remaining 1/5 form the test set.
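The per-class 4/5 and 1/5 division can be sketched as follows. The function names, the random shuffling, and the label convention (1 for benign, 0 for malicious, matching the use of benign code as the positive class in the evaluation indexes) are assumptions; the application does not say how the 4/5 portion is chosen.

```python
import random
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Sample = Tuple[Vector, int]


def split_four_fifths(vectors: Sequence[Vector], seed: int) -> Tuple[List[Vector], List[Vector]]:
    """Shuffle one class and return (4/5 for training, remaining 1/5 for testing)."""
    shuffled = list(vectors)
    random.Random(seed).shuffle(shuffled)
    cut = (len(shuffled) * 4) // 5
    return shuffled[:cut], shuffled[cut:]


def build_sets(benign: Sequence[Vector], malicious: Sequence[Vector],
               seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Split each class separately so both sets keep the same class ratio."""
    b_train, b_test = split_four_fifths(benign, seed)
    m_train, m_test = split_four_fifths(malicious, seed + 1)
    train = [(v, 1) for v in b_train] + [(v, 0) for v in m_train]
    test = [(v, 1) for v in b_test] + [(v, 0) for v in m_test]
    return train, test


# Toy 11-dimensional vectors, for illustration only.
benign = [[float(i)] * 11 for i in range(10)]
malicious = [[-float(i)] * 11 for i in range(1, 6)]
train_set, test_set = build_sets(benign, malicious)
print(len(train_set), len(test_set))
```

Splitting each class on its own, instead of splitting the pooled data, keeps the proportion of benign to malicious packets the same in both sets.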
S2, training a vulnerability detection model by using the training set based on a gradient lifting decision tree algorithm;
The specific training steps are as follows:
S21, initializing a classifier: setting the initial classifier to the average value of all sample data features in the training set;
[Formula: Figure SMS_40]
wherein [Figure SMS_41] represents the current classifier; [Figure SMS_42] represents the data features of the i-th sample in the training set, n being the number of samples in the current training set; and [Figure SMS_43] represents the actual value of the data features of the i-th sample;
S22, calculating residuals: calculating the residual of each sample, namely the difference between the predicted value and the target variable;
[Formula: Figure SMS_44]
wherein [Figure SMS_45] represents the residual of sample i on the m-th decision tree, and [Figure SMS_46] represents the predicted value of the current decision tree;
S23, building a tree model: fitting the residuals [Figure SMS_47] to learn a regression tree [Figure SMS_48];
S24, increasing model complexity: adding the current decision tree into the regression tree to obtain an updated decision tree;
[Formula: Figure SMS_49]
S25, repeating S22 to S24: stopping the iteration once the desired fitting effect is reached, to obtain the final lifting tree;
[Formula: Figure SMS_50]
wherein the lifting tree [Figure SMS_51] is a weighted sum of the first M trees.
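The loop in S21 to S25 can be sketched in plain Python as below. This is a minimal illustration of gradient boosting with a squared-error loss, not the implementation of the application: the one-split regression tree (a stump), the fixed number of trees, the learning rate of 0.5, the 0.5 decision threshold, and the two-dimensional toy data are all assumptions.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Tree = Callable[[Vector], float]


def fit_stump(xs: List[Vector], residuals: List[float]) -> Tree:
    """S23: fit a one-split regression tree to the residuals by least squares."""
    mean_all = sum(residuals) / len(residuals)
    # (squared error, feature index, threshold, left value, right value)
    best = (float("inf"), 0, float("inf"), mean_all, mean_all)
    for j in range(len(xs[0])):
        values = sorted({x[j] for x in xs})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for x, r in zip(xs, residuals) if x[j] <= t]
            right = [r for x, r in zip(xs, residuals) if x[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if sse < best[0]:
                best = (sse, j, t, lm, rm)
    _, j, t, lm, rm = best
    return lambda x: lm if x[j] <= t else rm


def train_gbdt(xs: List[Vector], ys: List[float],
               n_trees: int = 20, learning_rate: float = 0.5) -> Tree:
    f0 = sum(ys) / len(ys)                                   # S21: initial average
    trees: List[Tree] = []
    preds = [f0] * len(xs)
    for _ in range(n_trees):                                 # S25: repeat S22 to S24
        residuals = [y - p for y, p in zip(ys, preds)]       # S22: residuals
        tree = fit_stump(xs, residuals)                      # S23: fit a tree
        trees.append(tree)
        preds = [p + learning_rate * tree(x)                 # S24: update the model
                 for p, x in zip(preds, xs)]
    return lambda x: f0 + learning_rate * sum(t(x) for t in trees)


def classify(model: Tree, x: Vector) -> int:
    """Assumed decision rule: scores of at least 0.5 count as benign (label 1)."""
    return 1 if model(x) >= 0.5 else 0


# Toy data: label 1 is benign, label 0 is malicious.
xs = [(0.1, 1.0), (0.2, 0.9), (0.8, 0.2), (0.9, 0.1)]
ys = [1.0, 1.0, 0.0, 0.0]
model = train_gbdt(xs, ys)
print(classify(model, (0.15, 0.95)), classify(model, (0.85, 0.15)))
```

A production system would more likely use deeper trees and a stopping rule based on validation error, but the stump keeps every step of the loop visible.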
S3, evaluating a trained vulnerability detection model according to the test set;
wherein the evaluating step comprises:
S31, selecting and calculating performance evaluation indexes of the vulnerability detection model;
The performance evaluation indexes used in the test comprise the accuracy rate (precision), the recall rate, and the F1 score.
S32, evaluating the vulnerability detection model according to the calculated values of the performance evaluation indexes;
wherein the accuracy rate (precision) P:
$P = \frac{TP}{TP + FP}$
The accuracy rate P is the proportion of correctly judged code among all samples classified as benign; the higher it is, the higher the success rate of vulnerability detection. TP denotes true benign code and FP denotes false benign code;
recall rate R:
$R = \frac{TP}{TP + FN}$
The recall rate R is the proportion of samples correctly predicted as benign code by the vulnerability detection model among all samples that are actually benign code; a high recall rate means the classifier misses few positive samples. FN denotes false malicious code;
F1 score:
$F_1 = \frac{2PR}{P + R}$
The F1 score is the harmonic mean of the accuracy rate and the recall rate; the higher the F1 value, the better the classifier performs.
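The three indexes can be computed from true and predicted labels as in the sketch below. The function name and the label convention (1 for benign as the positive class, 0 for malicious) are assumptions, and the formulas assume P is computed from TP and FP and R from TP and FN, as the definitions above suggest. The labels in the example are made up for illustration.

```python
from typing import Sequence, Tuple


def evaluate(y_true: Sequence[int], y_pred: Sequence[int],
             positive: int = 1) -> Tuple[float, float, float]:
    """Return (P, R, F1) with the benign class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true benign
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false benign
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false malicious
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative labels: 3 true benign, 1 false benign, 1 false malicious.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0]
p, r, f1 = evaluate(y_true, y_pred)
print(p, r, f1)
```

The guards against zero denominators return 0.0 when the model predicts no benign samples or the test set contains none, which is one common convention among several.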
S4, achieving vulnerability detection of the industrial control system through a vulnerability detection model meeting evaluation standards.
When the test results show an accuracy rate higher than 99%, a recall rate lower than 0.05%, and an F1 score above 0.98, the vulnerability detection model meets the evaluation standard and can be deployed; the data features of applications in the industrial control system are then input into the vulnerability detection model to perform vulnerability detection.
For the vulnerability detection model which does not meet the evaluation standard, the data in the training set needs to be expanded, and the training process of S2 is continued.
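The acceptance check and the retraining loop can be sketched as follows. One interpretation is needed and is an assumption: taken literally, a recall rate below 0.05% cannot coexist with an F1 score above 0.98, because the harmonic mean satisfies F1 ≤ 2R. The sketch therefore treats the 0.05% figure as a bound on the miss rate, 1 − R. The function names, the callable parameters, and the limit on retraining rounds are likewise hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Metrics = Tuple[float, float, float]  # (P, R, F1)


def meets_standard(precision: float, recall: float, f1: float,
                   min_precision: float = 0.99,
                   max_miss_rate: float = 0.0005,  # assumed reading of "0.05%"
                   min_f1: float = 0.98) -> bool:
    """Return True when all three thresholds are satisfied."""
    miss_rate = 1.0 - recall
    return precision > min_precision and miss_rate < max_miss_rate and f1 > min_f1


def train_until_accepted(train: Callable[[List], object],
                         evaluate_model: Callable[[object], Metrics],
                         expand: Callable[[List], List],
                         data: List,
                         max_rounds: int = 5) -> Tuple[Optional[object], int]:
    """Repeat training, expanding the training data after each failed check."""
    for round_no in range(1, max_rounds + 1):
        model = train(data)
        if meets_standard(*evaluate_model(model)):
            return model, round_no
        data = expand(data)  # expand the training set, then train again
    return None, max_rounds


# Stub callables, for illustration only: the "model" is just the data size.
accepted, rounds = train_until_accepted(
    train=lambda d: len(d),
    evaluate_model=lambda m: (0.995, 0.9998, 0.997) if m >= 3 else (0.9, 0.9, 0.9),
    expand=lambda d: d + [0],
    data=[0],
)
print(accepted, rounds)
```

Capping the number of rounds avoids an endless loop when extra data does not improve the model; the application itself does not specify such a limit.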
An embodiment of the industrial control vulnerability detection system based on the gradient lifting decision tree algorithm disclosed in the present application is described in detail below with reference to fig. 2. The system disclosed in this embodiment includes:
an acquisition module: used for acquiring a data packet in an industrial control system, analyzing the data packet, extracting data features, and dividing the data features into a training set and a test set;
wherein the acquisition module includes:
an analysis submodule: used for performing combined dynamic and static analysis on the data features and forming the data features into digital vectors;
a division submodule: used for dividing the data features in digital vector form to obtain the training set and the test set.
a training module: used for training a vulnerability detection model with the training set based on a gradient lifting decision tree algorithm;
an evaluation module: used for evaluating the vulnerability detection model on the test set;
wherein the evaluation module includes:
a selection submodule: used for selecting and calculating the performance evaluation indexes of the vulnerability detection model;
an evaluation submodule: used for evaluating the vulnerability detection model according to the calculated values of the performance evaluation indexes.
a detection module: used for performing vulnerability detection on the industrial control system with a vulnerability detection model that meets the evaluation standard.
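One way to picture how the four modules fit together is the sketch below, in which each module is passed in as a callable. The class name, method name, type aliases, and stub modules are hypothetical and exist only to show the order of the hand-offs; they are not the structure of the disclosed system.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]
Sample = Tuple[Vector, int]
Detector = Callable[[Vector], int]


@dataclass
class DetectionSystem:
    """Hypothetical wiring of the acquisition, training, and evaluation modules."""
    acquire: Callable[[], Tuple[List[Sample], List[Sample]]]  # acquisition module
    train: Callable[[List[Sample]], Detector]                 # training module
    evaluate: Callable[[Detector, List[Sample]], bool]        # evaluation module

    def build_detector(self) -> Optional[Detector]:
        """Return a model for the detection module, or None if it fails evaluation."""
        train_set, test_set = self.acquire()
        model = self.train(train_set)
        if not self.evaluate(model, test_set):
            return None
        return model


# Stub modules with one-dimensional toy vectors, for illustration only.
system = DetectionSystem(
    acquire=lambda: ([((0.1,), 1), ((0.9,), 0)], [((0.2,), 1), ((0.8,), 0)]),
    train=lambda train_set: (lambda x: 1 if x[0] < 0.5 else 0),
    evaluate=lambda model, test_set: all(model(x) == y for x, y in test_set),
)
detector = system.build_detector()
print(detector((0.3,)) if detector else "model rejected")
```

Passing the modules in as callables keeps each one replaceable, which fits the stated aim of retraining the model when the industrial control system changes.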
In summary, the invention uses a combined dynamic and static data analysis method to construct an 11-dimensional digital vector describing the data features, which helps analyze in as much detail as possible whether a vulnerability occurs in the industrial control system, improving detection accuracy and reducing the false alarm probability. The invention evaluates the vulnerability detection performance of the model with three indexes: accuracy rate, recall rate, and F1 score. In general, when the accuracy rate of a classifier is high, its recall rate tends to be low, and vice versa. To balance the two, the F1 score is introduced for comprehensive evaluation, which can further improve the accuracy of vulnerability detection.
The invention applies the gradient lifting decision tree algorithm to vulnerability detection in industrial control systems. The algorithm is highly accurate and robust, and detects vulnerabilities better than traditional vulnerability detection systems. Vulnerability detection requires the characteristics of a vulnerability to be fully interpreted and analyzed so that the vulnerability can be located quickly; because the gradient lifting decision tree algorithm is highly interpretable and produces output that is easy to understand, it has a clear advantage in this field.
The invention can train a vulnerability detection model matched to an industrial control system composed of different hardware, software, operating systems, network protocols, and network environments. When the industrial control system changes, the model can be retrained quickly and updated at any time to suit the new system conditions, giving a degree of system compatibility and extensibility that traditional industrial control vulnerability detection systems lack.
It should be noted that the above modules may be executed on a computer terminal.
The self-adaptive industrial control vulnerability detection system based on the gradient lifting decision tree algorithm comprises a processor and a memory, wherein the modules and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize corresponding functions.
The processor includes a kernel, and the kernel fetches the corresponding program unit from the memory. The kernel may be provided with one or more memories, which may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM), including at least one memory chip.
The embodiment of the invention provides a computer readable storage medium, wherein a program is stored on the computer readable storage medium, and the program is executed by a processor to realize the self-adaptive industrial control vulnerability detection method based on the gradient lifting decision tree algorithm.
The embodiment of the invention provides electronic equipment, which comprises a processor, a memory and a program stored in the memory and capable of running on the processor, wherein the self-adaptive industrial control vulnerability detection method based on a gradient lifting decision tree algorithm is realized when the processor executes the program. The device herein may be a server, a PC (computer), a PAD (portable computer), a mobile phone, etc.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
In the description of this application, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (10)

1. An industrial control vulnerability detection method based on a gradient boosting decision tree (GBDT) algorithm, characterized by comprising the following steps:
S1, acquiring data packets in an industrial control system, parsing the data packets, extracting data features, and dividing the data features into a training set and a test set;
S11, performing combined dynamic and static analysis on the data features, and forming the data features into a numerical vector;
S12, dividing the data features in numerical-vector form to obtain the training set and the test set;
S2, training a vulnerability detection model with the training set based on the gradient boosting decision tree algorithm;
S3, evaluating the trained vulnerability detection model on the test set;
S31, selecting and calculating performance evaluation indicators of the vulnerability detection model;
S32, evaluating the vulnerability detection model according to the calculated values of the performance evaluation indicators;
S4, performing vulnerability detection on the industrial control system using a vulnerability detection model that meets the evaluation criteria.
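Step S12 of claim 1 can be illustrated with a minimal sketch. The claim does not state a split ratio or a shuffling strategy, so the function name `split_dataset`, the `test_ratio` of 0.2, and the fixed `seed` below are assumptions made only for illustration.

```python
import random


def split_dataset(vectors, labels, test_ratio=0.2, seed=0):
    """Shuffle the samples and split them into a training set and a test set.

    test_ratio and seed are illustrative assumptions; the claim only says
    that the numerical vectors are divided into two sets.
    """
    if len(vectors) != len(labels):
        raise ValueError("vectors and labels must have the same length")
    indices = list(range(len(vectors)))
    random.Random(seed).shuffle(indices)  # deterministic shuffle for repeatability
    n_test = int(round(len(indices) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [(vectors[i], labels[i]) for i in train_idx]
    test = [(vectors[i], labels[i]) for i in test_idx]
    return train, test


# Ten toy samples, each already in numerical-vector form (step S11).
vectors = [[float(i), float(i % 3)] for i in range(10)]
labels = [i % 2 for i in range(10)]
train_set, test_set = split_dataset(vectors, labels)
print(len(train_set), len(test_set))
```

Keeping the shuffle seeded makes the evaluation in step S3 repeatable across runs, which helps when comparing models against a fixed evaluation criterion.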
2. The method of claim 1, wherein the data features include static features and dynamic features; the static features include the number of annotations (Figure QLYQS_2), the number of variables (Figure QLYQS_6), the number of functions (Figure QLYQS_7), the number of operators (Figure QLYQS_8), the instruction sequence (Figure QLYQS_9), and the control flow graph (Figure QLYQS_10); the dynamic features include API calls (Figure QLYQS_11), function calls (Figure QLYQS_1), input/output (Figure QLYQS_3), resource utilization (Figure QLYQS_4), and the memory map (Figure QLYQS_5).
3. The method of claim 2, wherein the formula for forming the numerical vector comprises:
Figure QLYQS_12
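The vector formula of claim 3 survives only as an image placeholder, so the following is a sketch under stated assumptions rather than the claimed formula. It concatenates the six static and five dynamic features of claim 2 in a fixed order. The dictionary keys, the helper names `to_number` and `build_vector`, and the choice to reduce non-scalar features (instruction sequence, control flow graph, API calls, memory map) to their size are all hypothetical.

```python
# Hypothetical key names for the features listed in claim 2.
STATIC_KEYS = ["annotations", "variables", "functions", "operators",
               "instruction_sequence", "control_flow_graph"]
DYNAMIC_KEYS = ["api_calls", "function_calls", "input_output",
                "resource_utilization", "memory_map"]


def to_number(value):
    """Map one raw feature to a float.

    Assumption: collections (lists, dicts) are reduced to their size;
    a real system would likely use a richer encoding.
    """
    if isinstance(value, (int, float)):
        return float(value)
    return float(len(value))


def build_vector(static, dynamic):
    """Concatenate static and dynamic features into one numerical vector."""
    return ([to_number(static[k]) for k in STATIC_KEYS]
            + [to_number(dynamic[k]) for k in DYNAMIC_KEYS])


static = {
    "annotations": 4, "variables": 12, "functions": 3, "operators": 27,
    "instruction_sequence": ["LD", "AND", "ST"],
    "control_flow_graph": {"n0": ["n1"], "n1": []},
}
dynamic = {
    "api_calls": ["read", "write"], "function_calls": 5, "input_output": 2,
    "resource_utilization": 0.35, "memory_map": [0x1000, 0x2000, 0x3000],
}
vector = build_vector(static, dynamic)
print(vector)
```

A fixed key order matters here: every sample must place the same feature at the same index, otherwise the decision trees trained in step S2 would split on inconsistent columns.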
4. The method of claim 3, wherein the training process of the vulnerability detection model comprises:
S21, initializing the classifier: setting the initial classifier to the average value of the data features of all samples in the training set (Figure QLYQS_13), wherein Figure QLYQS_14 represents the current classifier; Figure QLYQS_15 represents the data features of the i-th sample in the training set, n being the number of samples in the current training set; and Figure QLYQS_16 represents the actual value of the data features of the i-th sample;
S22, calculating residuals: calculating the residual of each sample (Figure QLYQS_17), wherein Figure QLYQS_18 represents the residual of sample i on the m-th decision tree, and Figure QLYQS_19 represents the predicted value of the current decision tree;
S23, building the tree model: fitting the residuals (Figure QLYQS_20) by learning a regression tree, to obtain a regression tree (Figure QLYQS_21);
S24, increasing model complexity: combining the current decision tree with the regression tree to obtain an updated decision tree (Figure QLYQS_22);
S25, repeating S22-S24: stopping the iteration once the fitting effect is achieved, to obtain the final boosted tree (Figure QLYQS_23), wherein the boosted tree (Figure QLYQS_24) is a weighted sum of the first M trees.
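Steps S21 to S25 of claim 4 follow the usual shape of gradient boosting, and a minimal sketch may make the loop concrete. The claimed formulas are only available as image placeholders, so everything specific here is an assumption: squared-error loss, depth-1 regression trees (stumps), a `learning_rate` as the tree weight, an initial value equal to the mean of the 0/1 labels, and a mean-squared-residual stopping tolerance `tol`.

```python
def fit_stump(X, residuals):
    """S23: fit a depth-1 regression tree to the residuals (least squares)."""
    mean_all = sum(residuals) / len(residuals)
    # (error, feature index, threshold, left value, right value)
    best = (float("inf"), 0, float("inf"), mean_all, mean_all)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue  # not a real split
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if err < best[0]:
                best = (err, j, t, lm, rm)
    _, feature, threshold, left_value, right_value = best
    return lambda row: left_value if row[feature] <= threshold else right_value


def train_gbdt(X, y, n_trees=20, learning_rate=0.5, tol=1e-6):
    """Return a predict(row) function built by gradient boosting."""
    f0 = sum(y) / len(y)                      # S21: initial model = mean target
    trees = []
    predictions = [f0] * len(y)
    for _ in range(n_trees):
        # S22: residual of each sample under the current model
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        # S25: stop once the fit is good enough (assumed criterion)
        if sum(r * r for r in residuals) / len(y) < tol:
            break
        tree = fit_stump(X, residuals)        # S23: regression tree on residuals
        trees.append(tree)                    # S24: add the tree to the model
        predictions = [p + learning_rate * tree(row)
                       for p, row in zip(predictions, X)]

    def predict(row):
        # Final boosted model: initial value plus the weighted sum of all trees
        return f0 + learning_rate * sum(tree(row) for tree in trees)

    return predict


# Toy data: one feature, label 1 marks the positive class.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
predict = train_gbdt(X, y)
print(predict([0.0]) < 0.5, predict([3.0]) >= 0.5)
```

Thresholding the score at 0.5 turns the regression output into a class decision; a production classifier would more typically boost on a logistic loss, which this sketch deliberately leaves out to stay close to the residual-fitting wording of the claim.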
5. The method of claim 1, wherein the performance evaluation indicators and the evaluation method comprise:
accuracy P (Figure QLYQS_25), wherein the accuracy P refers to the proportion of code samples that are classified correctly among all samples; the higher the accuracy, the higher the success rate of vulnerability detection; TP represents true benign codes, and FP represents false benign codes;
recall R (Figure QLYQS_26), wherein the recall R refers to the proportion of samples correctly predicted as benign code by the vulnerability detection model among all samples that are actually benign code; a high recall means the classifier misses fewer positive samples; FN represents false malicious codes;
F1 score (Figure QLYQS_27), wherein the F1 score is the harmonic mean of the accuracy and the recall; the higher the F1 value, the better the classifier performance.
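The three indicators of claim 5 can be computed from the TP, FP, and FN counts the claim names. The formulas in the claim are image placeholders, so the sketch below assumes the standard definitions: P = TP / (TP + FP), which is usually called precision, R = TP / (TP + FN), and F1 = 2PR / (P + R). The function name `evaluate` and the convention of returning 0.0 for an empty denominator are also assumptions.

```python
def evaluate(y_true, y_pred, positive=1):
    """Compute P, R, and F1 for the positive class from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators (assumed convention: report 0.0).
    p_score = tp / (tp + fp) if tp + fp else 0.0
    r_score = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * p_score * r_score / (p_score + r_score)
          if p_score + r_score else 0.0)
    return p_score, r_score, f1


# Label 1 is the positive class (benign code in the claim's terminology).
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1]
p_score, r_score, f1 = evaluate(y_true, y_pred)
print(p_score, r_score, f1)
```

In this toy case there are three true positives, one false positive, and one false negative, so all three indicators come out to 0.75. A model would then be accepted for step S4 only if these values clear whatever thresholds the operator sets as the evaluation criteria.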
6. An adaptive industrial control vulnerability detection system based on a gradient boosting decision tree (GBDT) algorithm, characterized by comprising:
an acquisition module, configured to acquire data packets in an industrial control system, parse the data packets, extract data features, and divide the data features into a training set and a test set;
a training module, configured to train a vulnerability detection model with the training set based on the gradient boosting decision tree algorithm;
an evaluation module, configured to evaluate the vulnerability detection model on the test set;
a detection module, configured to perform vulnerability detection on the industrial control system using a vulnerability detection model that meets the evaluation criteria.
7. The system of claim 6, wherein the acquisition module comprises:
an analysis submodule, configured to perform combined dynamic and static analysis on the data features and form the data features into a numerical vector;
a division submodule, configured to divide the data features in numerical-vector form to obtain the training set and the test set.
8. The system of claim 6, wherein the evaluation module comprises:
a selection submodule, configured to select and calculate the performance evaluation indicators of the vulnerability detection model;
an evaluation submodule, configured to evaluate the vulnerability detection model according to the calculated values of the performance evaluation indicators.
9. A storage medium having stored thereon a plurality of instructions adapted to be loaded by a processor and to carry out the method of any one of claims 1 to 5.
10. An electronic device, comprising: one or more processors and a memory, wherein the memory is configured to store one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-5.
CN202310677008.0A 2023-06-08 2023-06-08 Industrial control vulnerability detection method and system based on gradient lifting decision tree algorithm Pending CN116401680A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310677008.0A CN116401680A (en) 2023-06-08 2023-06-08 Industrial control vulnerability detection method and system based on gradient lifting decision tree algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310677008.0A CN116401680A (en) 2023-06-08 2023-06-08 Industrial control vulnerability detection method and system based on gradient lifting decision tree algorithm

Publications (1)

Publication Number Publication Date
CN116401680A true CN116401680A (en) 2023-07-07

Family

ID=87008049

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310677008.0A Pending CN116401680A (en) 2023-06-08 2023-06-08 Industrial control vulnerability detection method and system based on gradient lifting decision tree algorithm

Country Status (1)

Country Link
CN (1) CN116401680A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112288191A (en) * 2020-11-19 2021-01-29 国家海洋信息中心 Ocean buoy service life prediction method based on multi-class machine learning method
CN113256066A (en) * 2021-04-23 2021-08-13 新疆大学 PCA-XGboost-IRF-based job shop real-time scheduling method
CN116050605A (en) * 2022-12-30 2023-05-02 国网陕西省电力有限公司经济技术研究院 Power load prediction method based on neural network and random forest method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20230707)