CN111159011A - Instruction vulnerability prediction method and system based on deep random forest - Google Patents
- Publication number
- CN111159011A CN111159011A CN201911248246.XA CN201911248246A CN111159011A CN 111159011 A CN111159011 A CN 111159011A CN 201911248246 A CN201911248246 A CN 201911248246A CN 111159011 A CN111159011 A CN 111159011A
- Authority
- CN
- China
- Prior art keywords
- instruction
- vulnerability
- forest
- sample
- data set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3604—Software analysis for verifying properties of programs
- G06F11/3608—Software analysis for verifying properties of programs using formal methods, e.g. model checking, abstract interpretation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The invention discloses a method and a system for predicting instruction vulnerability based on a deep random forest. The method comprises the following steps: extracting, for each program instruction, instruction feature information related to instruction vulnerability, and generating an instruction feature vector characterizing that vulnerability; performing fault injection on a training program to obtain the vulnerability value of each program instruction; combining the instruction feature vectors and instruction vulnerability values to generate an instruction vulnerability sample data set; performing sliding sampling on the sample data set through a sliding window to generate an expanded sample data set; constructing and training a deep-random-forest-based instruction vulnerability prediction model; and extracting the instruction feature vectors of a target program to be predicted and, in combination with the prediction model, predicting the instruction vulnerability of that program. The system implements the method. The method achieves high prediction accuracy with a small sample set and little manual parameter tuning, and can be effectively applied to predicting instruction vulnerability after a program is affected by transient faults.
Description
Technical Field
The invention belongs to the field of software reinforcement and software reliability, and particularly relates to a method and a system for predicting instruction vulnerability based on deep random forest.
Background
With the rapid development of semiconductor manufacturing processes, the feature size of computer chips keeps shrinking, so their sensitivity to space radiation is greatly increased. In a space radiation environment, the single event upset effect produced by high-energy particle irradiation or electromagnetic interference on integrated circuit chips fabricated at advanced process nodes is one of the main causes of computer system failure. The single event upset (SEU) effect refers to the phenomenon in which a stored bit is disturbed and its logic state flips; such an upset is generally referred to as a soft error.
Soft errors generally fall into three categories: (1) the soft error has no effect on the normal operation of the program; (2) the soft error causes the program to crash or hang; (3) the soft error causes an implicit error, i.e., the program appears to run normally but its result is wrong; this last kind is generally referred to as SDC (Silent Data Corruption). The first category does not affect program operation, and the second is serious but easy to detect. Compared with these two categories, SDC causes more serious program problems because of its implicit propagation.
For software SDC errors, the conventional redundant-instruction error detection method duplicates all instructions in a program, which incurs a huge performance overhead; current research on redundancy techniques therefore focuses on selecting the fragile instructions in a program for partial redundancy, so as to reduce this overhead. Existing selection methods fall into three categories: (1) fault-injection-based selection; (2) program-analysis-based selection; (3) machine-learning-based selection. Fault-injection-based selection injects faults into program instructions one by one, then screens out highly vulnerable instructions for redundancy reinforcement by observing the injection results. Program-analysis-based selection determines instruction vulnerability through program analysis; for example, the article "Error-flow model: Modeling and analysis of software propagating hardware faults" establishes an error propagation model that computes instruction SDC vulnerability by analyzing the propagation paths of transient faults in a program. Machine-learning-based selection combines the advantages of the former two, avoiding the complex propagation calculation while reducing fault injection overhead. In recent years, the prediction of fragile program instructions by methods such as support vector machines and neural networks has been studied, but such methods require large prediction data sets and complicated manual parameter tuning to achieve high accuracy.
Disclosure of Invention
The invention aims to provide a method and a system for predicting instruction vulnerability that reduce the data set size and parameter tuning complexity required for model training, and that can be applied to large-scale programs or complex environments.
The technical solution for realizing the purpose of the invention is as follows: an instruction vulnerability prediction method based on a deep random forest comprises the following steps:
Step 1, performing static analysis on a training program, extracting, for each program instruction, instruction feature information related to instruction vulnerability, and generating the instruction feature vector V_features characterizing the instruction vulnerability of that program instruction;
Step 2, performing fault injection on the training program to obtain the vulnerability value P_SDC(I_i) of each program instruction;
Step 3, combining the instruction feature vectors V_features and instruction vulnerability values P_SDC(I_i) to generate an instruction vulnerability sample data set D, wherein each sample S in the data set comprises the instruction feature vector V_features of a program instruction and its vulnerability value P_SDC(I_i);
Step 4, performing sliding sampling on the instruction vulnerability sample data set D through a sliding window model to obtain instruction-sequence expansion features of the sample data and generate an expanded sample data set;
Step 5, constructing and training a deep-random-forest-based instruction vulnerability prediction model on the expanded sample data set;
Step 6, extracting the instruction feature vectors of a target program to be predicted according to the process of step 1, and predicting the instruction vulnerability of that program in combination with the prediction model obtained in step 5.
Further, the instruction feature vector V_features characterizing instruction vulnerability in step 1 is the following 7-tuple:
V_features = <V_tran_bran, V_comp, V_addr, V_mask, V_loop, V_arith, V_block>
where V_tran_bran denotes branch- and transfer-related instruction features, comprising the branch feature f_is_branch, the function-call feature f_is_call, and the return-instruction feature f_is_return; V_comp denotes comparison-instruction features, comprising the integer-compare feature f_is_int_cmp and the floating-point-compare feature f_is_float_cmp; V_addr denotes address-related features, comprising the used-in-address feature f_is_used_in_addr, the destination-operand width feature f_dest_op_width, and the store-instruction feature f_is_used_store; V_mask denotes fault-masking features, comprising the logical-AND feature f_is_and, the logical-OR feature f_is_or, and the logical-shift feature f_is_sh; V_loop denotes loop-related features, comprising the in-loop position feature f_is_loop and the loop-depth feature f_loop_d; V_arith denotes arithmetic-operation features, comprising the add/subtract feature f_is_add/sub and the multiply/divide feature f_is_mul/div; V_block denotes basic-block features, comprising the basic-block length f_bb_length, the number of instructions remaining to execute in the block f_bb_remain_ins_num, the number of predecessor blocks f_pred_bb_num, and the number of successor blocks f_suc_bb_num.
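The 19 features of the 7-tuple above can be laid out as a flat record. The sketch below (an illustrative assumption, not part of the patent) shows them as a Python dict with placeholder default values; how each value is extracted from real intermediate code is left abstract:

```python
def make_feature_vector():
    """One instruction's V_features as a flat dict (names follow the patent)."""
    return {
        # V_tran_bran: branch/transfer-related features
        "f_is_branch": 0, "f_is_call": 0, "f_is_return": 0,
        # V_comp: comparison-instruction features
        "f_is_int_cmp": 0, "f_is_float_cmp": 0,
        # V_addr: address-related features
        "f_is_used_in_addr": 0, "f_dest_op_width": 32, "f_is_used_store": 0,
        # V_mask: fault-masking features
        "f_is_and": 0, "f_is_or": 0, "f_is_sh": 0,
        # V_loop: loop-related features
        "f_is_loop": 0, "f_loop_d": 0,
        # V_arith: arithmetic-operation features
        "f_is_add/sub": 0, "f_is_mul/div": 0,
        # V_block: basic-block features
        "f_bb_length": 0, "f_bb_remain_ins_num": 0,
        "f_pred_bb_num": 0, "f_suc_bb_num": 0,
    }
```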
Further, in step 2, the vulnerability value P_SDC(I_i) of each program instruction is computed as:
P_SDC(I_i) = (1/w) × Σ_{j=1}^{w} (M_j / F_j)
where I_i denotes the i-th program instruction, P_SDC(I_i) its SDC vulnerability value, w the bit width of the instruction's destination register, M_j the number of SDC failures observed after fault injection at the j-th bit of I_i, and F_j the total number of fault injections performed at the j-th bit of I_i.
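The per-bit averaging above can be sketched directly; the function name and list-based interface are illustrative assumptions:

```python
def sdc_vulnerability(sdc_counts, injection_counts):
    """P_SDC(I_i) for one instruction.

    sdc_counts[j] = M_j (SDC failures at bit j),
    injection_counts[j] = F_j (injections at bit j),
    len(injection_counts) = w (destination register bit width).
    """
    w = len(injection_counts)
    return sum(m / f for m, f in zip(sdc_counts, injection_counts)) / w

# e.g. a 4-bit destination register with 10 injections per bit:
# sdc_vulnerability([2, 0, 5, 1], [10, 10, 10, 10]) -> (0.2+0.0+0.5+0.1)/4 = 0.2
```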
Further, in step 4, performing sliding sampling on the instruction vulnerability sample data set D through the sliding window model, obtaining the instruction-sequence expansion features of the sample data, and generating the expanded sample data set specifically comprises:
Initialization: let m = 2, with m ∈ N*, 2 ≤ m ≤ p, p ∈ N*, the value of p being set by the user;
Step 4-1, constructing the sliding window model:
W_m = m × n
where W_m is the width of the sliding window and n, the number of features of each sample S in the instruction vulnerability sample data set D, is also the sliding step of the window;
Step 4-2, splicing M samples of the instruction vulnerability sample data set D into a new sample E_i, whose number of features is M × n;
Step 4-3, performing sliding sampling on E_i with the sliding window model to obtain M+1−m samples of size W_m;
Step 4-4, training two random forest regression models on the M+1−m samples of size W_m to obtain 2(M+1−m) regression values as expansion features; the label used for a sample during training is the label of a sample chosen at random among the M spliced samples, that label being the vulnerability value P_SDC(I_i);
Step 4-5, incrementing m by 1 and judging whether m is larger than p; if not, returning to step 4-1; otherwise, outputting the (p−1)×(2M−p)-dimensional expansion features accumulated over the whole loop;
Step 4-6, splicing the original n-dimensional features of each sample S with the (p−1)×(2M−p)-dimensional expansion features to obtain an expanded sample with (p−1)×(2M−p)+n features for S, thereby generating the expanded sample data set.
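Steps 4-1 to 4-6 above can be sketched as follows. This is a simplified rendering under several assumptions (one expanded sample per group of M consecutive originals, the last sample's label used for training, small illustrative forest sizes), with scikit-learn's `RandomForestRegressor` standing in for the patent's random forest regression models:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def sliding_expand(D, labels, M=10, p=5):
    """Sliding-window feature expansion: D is a (num_samples, n) matrix,
    labels holds the per-sample P_SDC values.  Returns expanded samples
    with (p-1)*(2*M - p) + n features each."""
    num, n = D.shape
    out = []
    for start in range(0, num - M + 1, M):
        E = D[start:start + M].reshape(-1)      # step 4-2: splice M samples -> M*n features
        label = labels[start + M - 1]           # assumed: label of the last spliced sample
        feats = []
        for m in range(2, p + 1):               # steps 4-1 / 4-5: window sizes m = 2..p
            Wm = m * n                          # window width W_m = m*n, stride n
            windows = np.stack([E[j * n: j * n + Wm]
                                for j in range(M + 1 - m)])   # step 4-3
            y = np.full(len(windows), label)
            for seed in (0, 1):                 # step 4-4: two random forest regressors
                rf = RandomForestRegressor(n_estimators=10, random_state=seed)
                rf.fit(windows, y)
                feats.extend(rf.predict(windows))   # M+1-m regression values each
        out.append(np.concatenate([D[start + M - 1], feats]))  # step 4-6
    return np.array(out)
```

For p = 5 and M = 10 the inner loop contributes 18 + 16 + 14 + 12 = 60 expansion features, matching (p−1)×(2M−p).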
Further, step 5, constructing and training the deep-random-forest-based instruction vulnerability prediction model on the expanded sample data set, specifically comprises:
Step 5-1, constructing the first layer of the cascade regression forest, which comprises N random forests, and taking the expanded sample data set obtained in step 4 as the initial input vector of the deep random forest regression, thereby producing an output vector comprising N enhanced features;
Step 5-2, constructing the next layer of the cascade regression forest: splicing the output vector v_enhanced of the previous layer with the input vector v_input into [v_input, v_enhanced] as the input of this layer, then evaluating the accuracy of the whole cascade forest up to this layer by cross validation, i.e. computing the mean square error between the regression results of all random forests on this layer and the true values;
Step 5-3, judging whether the accuracy obtained in step 5-2 improves on that of the previous layer of the cascade; if so, returning to step 5-2; otherwise, judging that the accuracy has reached its threshold, adding no further layers, and ending the construction and training process, yielding the deep-random-forest-based instruction vulnerability prediction model, whose prediction is the mean of the regressions of all random forests in the last layer of the cascade.
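Steps 5-1 to 5-3 resemble a gcForest-style cascade. The sketch below is a simplified rendering under several assumptions (N = 4 forests per layer, 3-fold cross validation, out-of-fold predictions used as enhanced features), not the patent's exact procedure:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_predict

def fit_cascade(X, y, max_layers=10):
    """Grow cascade layers while cross-validated MSE keeps improving (step 5-3)."""
    layers, v_input, best_mse = [], X, np.inf
    for _ in range(max_layers):
        # one layer: N = 4 forests (two ordinary, two extremely randomized)
        forests = ([RandomForestRegressor(n_estimators=50, random_state=i) for i in range(2)]
                   + [ExtraTreesRegressor(n_estimators=50, random_state=i) for i in range(2)])
        # out-of-fold predictions serve as the layer's enhanced features
        v_enhanced = np.column_stack(
            [cross_val_predict(f, v_input, y, cv=3) for f in forests])
        mse = mean_squared_error(y, v_enhanced.mean(axis=1))
        if mse >= best_mse:            # accuracy stopped improving: stop growing
            break
        best_mse = mse
        for f in forests:
            f.fit(v_input, y)
        layers.append(forests)
        v_input = np.column_stack([X, v_enhanced])   # step 5-2: [v_input, v_enhanced]
    return layers

def cascade_predict(layers, X):
    """Prediction = mean of the last layer's forest regressions (step 5-3)."""
    v = X
    for forests in layers:
        preds = np.column_stack([f.predict(v) for f in forests])
        v = np.column_stack([X, preds])
    return preds.mean(axis=1)
```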
A deep-random-forest-based instruction vulnerability prediction system, the system comprising:
a first feature extraction module for performing static analysis on a training program, extracting, for each program instruction, instruction feature information related to instruction vulnerability, and generating the instruction feature vector V_features characterizing the instruction vulnerability of that program instruction;
a second feature extraction module for performing fault injection on the training program to obtain the vulnerability value P_SDC(I_i) of each program instruction;
a first sample data set construction module for combining the instruction feature vectors V_features and instruction vulnerability values P_SDC(I_i) to generate an instruction vulnerability sample data set D, wherein each sample S in the data set comprises the instruction feature vector V_features of a program instruction and its vulnerability value P_SDC(I_i);
a second sample data set construction module for performing sliding sampling on the instruction vulnerability sample data set D through a sliding window model to obtain instruction-sequence expansion features of the sample data and generate an expanded sample data set;
a prediction model construction module for constructing and training a deep-random-forest-based instruction vulnerability prediction model on the expanded sample data set;
and a prediction module for extracting the instruction feature vectors of a target program to be predicted according to the working process of the first feature extraction module and predicting the instruction vulnerability of that program in combination with the prediction model.
Further, the second sample data set construction module comprises, executed in sequence:
a parameter initialization unit for initializing m = 2, with m ∈ N*, 2 ≤ m ≤ p, p ∈ N*;
a sliding window model construction unit for constructing the sliding window model W_m = m × n, where W_m is the width of the sliding window and n, the number of features of each sample S in the instruction vulnerability sample data set D, is also the sliding step of the window;
a sample splicing unit for splicing M samples of the instruction vulnerability sample data set D into a new sample E_i with M × n features;
a sliding sampling unit for performing sliding sampling on E_i with the sliding window model to obtain M+1−m samples of size W_m;
a training unit for training two random forest regression models on the M+1−m samples of size W_m to obtain 2(M+1−m) regression values as expansion features, the label used for a sample during training being the label of a sample chosen at random among the M spliced samples, that label being the vulnerability value P_SDC(I_i);
a first judging unit for incrementing m by 1 and judging whether m is larger than p, returning to the sliding window model construction unit if not, and otherwise outputting the (p−1)×(2M−p)-dimensional expansion features accumulated over the whole loop;
and an expanded sample data set construction unit for splicing the original n-dimensional features of each sample S with the (p−1)×(2M−p)-dimensional expansion features to obtain an expanded sample with (p−1)×(2M−p)+n features for S, thereby generating the expanded sample data set.
Further, the prediction model construction module comprises, executed in sequence:
a first cascade regression forest construction unit for constructing the first layer of the cascade regression forest, which comprises N random forests, and taking the expanded sample data set obtained by the second sample data set construction module as the initial input vector of the deep random forest regression, thereby producing an output vector comprising N enhanced features;
a second cascade regression forest construction unit for constructing the next layer of the cascade regression forest by splicing the output vector v_enhanced of the previous layer with the input vector v_input into [v_input, v_enhanced] as the input of this layer, and then evaluating the accuracy of the whole cascade forest up to this layer by cross validation, i.e. computing the mean square error between the regression results of all random forests on this layer and the true values;
and a second judging unit for judging whether the accuracy obtained by the second cascade regression forest construction unit improves on that of the previous layer, returning to the second cascade regression forest construction unit if so, and otherwise judging that the accuracy has reached its threshold, adding no further layers, and ending the construction and training process, yielding the deep-random-forest-based instruction vulnerability prediction model, whose prediction is the mean of the regressions of all random forests in the last layer of the cascade.
Compared with the prior art, the invention has the following remarkable advantages: 1) the deep random forest model achieves high prediction accuracy on small-scale samples, so the prediction model requires only a small amount of training data collection and has low complexity; 2) the deep random forest model automatically adjusts its cascade depth according to the training accuracy, reducing the difficulty of parameter tuning while keeping prediction accuracy high; 3) sequence features between instruction samples are extracted by sliding-window scanning, so that the feature space reflects instruction SDC vulnerability more accurately, improving prediction accuracy.
The present invention is described in further detail below with reference to the attached drawing figures.
Drawings
FIG. 1 is a flowchart of an instruction vulnerability prediction method based on a deep random forest according to the present invention.
FIG. 2 is a comparison graph of accuracy of prediction results in an embodiment of the present invention.
FIG. 3 is a comparison graph of the mean square error of the prediction results of the present invention and of other prediction methods in an embodiment of the present invention.
Detailed Description
With reference to fig. 1, the present invention provides a method for predicting instruction vulnerability based on a deep random forest, which includes the following steps:
Step 1, performing static analysis on a training program, extracting, for each program instruction, instruction feature information related to instruction vulnerability, and generating the instruction feature vector V_features characterizing the instruction vulnerability of that program instruction, where V_features is the following 7-tuple:
V_features = <V_tran_bran, V_comp, V_addr, V_mask, V_loop, V_arith, V_block>
where V_tran_bran denotes branch- and transfer-related instruction features, comprising the branch feature f_is_branch, the function-call feature f_is_call, and the return-instruction feature f_is_return; V_comp denotes comparison-instruction features, comprising the integer-compare feature f_is_int_cmp and the floating-point-compare feature f_is_float_cmp; V_addr denotes address-related features, comprising the used-in-address feature f_is_used_in_addr, the destination-operand width feature f_dest_op_width, and the store-instruction feature f_is_used_store; V_mask denotes fault-masking features, comprising the logical-AND feature f_is_and, the logical-OR feature f_is_or, and the logical-shift feature f_is_sh; V_loop denotes loop-related features, comprising the in-loop position feature f_is_loop and the loop-depth feature f_loop_d; V_arith denotes arithmetic-operation features, comprising the add/subtract feature f_is_add/sub and the multiply/divide feature f_is_mul/div; V_block denotes basic-block features, comprising the basic-block length f_bb_length, the number of instructions remaining to execute in the block f_bb_remain_ins_num, the number of predecessor blocks f_pred_bb_num, and the number of successor blocks f_suc_bb_num.
Step 2, performing fault injection on the training program to obtain the vulnerability value P_SDC(I_i) of each program instruction, computed as:
P_SDC(I_i) = (1/w) × Σ_{j=1}^{w} (M_j / F_j)
where I_i denotes the i-th program instruction, P_SDC(I_i) its SDC vulnerability value, w the bit width of the instruction's destination register, M_j the number of SDC failures observed after fault injection at the j-th bit of I_i, and F_j the total number of fault injections performed at the j-th bit of I_i.
Step 3, combining the instruction feature vectors V_features and instruction vulnerability values P_SDC(I_i) to generate an instruction vulnerability sample data set D, wherein each sample S in the data set comprises the instruction feature vector V_features of a program instruction and its vulnerability value P_SDC(I_i).
Step 4, performing sliding sampling on the instruction vulnerability sample data set D through the sliding window model to obtain the instruction-sequence expansion features of the sample data and generate an expanded sample data set, specifically comprising:
Initialization: let m = 2, with m ∈ N*, 2 ≤ m ≤ p, p ∈ N*, the value of p being set by the user;
Step 4-1, constructing the sliding window model:
W_m = m × n
where W_m is the width of the sliding window and n, the number of features of each sample S in the instruction vulnerability sample data set D, is also the sliding step of the window;
Step 4-2, splicing M samples of the instruction vulnerability sample data set D into a new sample E_i, whose number of features is M × n;
Step 4-3, performing sliding sampling on E_i with the sliding window model to obtain M+1−m samples of size W_m;
Step 4-4, training two random forest regression models on the M+1−m samples of size W_m to obtain 2(M+1−m) regression values as expansion features; the label used for a sample during training is the label of a sample chosen at random among the M spliced samples, that label being the vulnerability value P_SDC(I_i);
Step 4-5, incrementing m by 1 and judging whether m is larger than p; if not, returning to step 4-1; otherwise, outputting the (p−1)×(2M−p)-dimensional expansion features accumulated over the whole loop;
Step 4-6, splicing the original n-dimensional features of each sample S with the (p−1)×(2M−p)-dimensional expansion features to obtain an expanded sample with (p−1)×(2M−p)+n features for S, thereby generating the expanded sample data set.
As a further preference, take p = 5 and M = 10; step 4 then specifically comprises:
Initialization: let m = 2, with m ∈ N*, 2 ≤ m ≤ 5;
Step 4-1, constructing the sliding window model:
W_m = m × n
where W_m is the width of the sliding window and n, the number of features of each sample S in the instruction vulnerability sample data set D, is also the sliding step of the window;
Step 4-2, splicing 10 samples of the instruction vulnerability sample data set D into a new sample E_i, whose number of features is 10 × n;
Step 4-3, performing sliding sampling on E_i with the sliding window model to obtain 11−m samples of size W_m;
Step 4-4, training two random forest regression models on the 11−m samples of size W_m to obtain 2(11−m) regression values as expansion features; the label used for a sample during training is the label of the 10th sample, that label being the vulnerability value P_SDC(I_i);
Step 4-5, incrementing m by 1 and judging whether m is larger than 5; if not, returning to step 4-1; otherwise, outputting the 60-dimensional expansion features accumulated over the whole loop;
Step 4-6, splicing the original n-dimensional features of each sample S with the 60-dimensional expansion features to obtain an expanded sample with 60+n features for S, thereby generating the expanded sample data set.
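The 60-dimensional figure quoted for p = 5 and M = 10 follows from summing the 2(11−m) regression values over m = 2..5, which matches the closed form (p−1)×(2M−p):

```python
# Verify the dimension arithmetic of the p = 5, M = 10 embodiment.
M, p = 10, 5
per_m = [2 * (M + 1 - m) for m in range(2, p + 1)]  # 2(M+1-m) values per window size m
assert per_m == [18, 16, 14, 12]
assert sum(per_m) == (p - 1) * (2 * M - p) == 60
```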
Step 5, constructing and training the deep-random-forest-based instruction vulnerability prediction model on the expanded sample data set, specifically comprising:
Step 5-1, constructing the first layer of the cascade regression forest, which comprises N random forests, and taking the expanded sample data set obtained in step 4 as the initial input vector of the deep random forest regression, thereby producing an output vector comprising N enhanced features;
Step 5-2, constructing the next layer of the cascade regression forest: splicing the output vector v_enhanced of the previous layer with the input vector v_input into [v_input, v_enhanced] as the input of this layer, then evaluating the accuracy of the whole cascade forest up to this layer by cross validation, i.e. computing the mean square error between the regression results of all random forests on this layer and the true values;
Step 5-3, judging whether the accuracy obtained in step 5-2 improves on that of the previous layer of the cascade; if so, returning to step 5-2; otherwise, judging that the accuracy has reached its threshold, adding no further layers, and ending the construction and training process, yielding the deep-random-forest-based instruction vulnerability prediction model, whose prediction is the mean of the regressions of all random forests in the last layer of the cascade.
As a further preference, the cascade regression forest layer in step 5-1 comprises N = 4 random forests, namely two random forest regression models f_normal and two extremely randomized forest regression models f_extremely.
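Under the common reading that f_normal maps to ordinary random forests and f_extremely to extremely randomized (extra-trees) forests, one layer's N = 4 forests might be instantiated as below; the hyperparameters are illustrative assumptions, not taken from the patent:

```python
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor

def make_layer_forests(n_estimators=100):
    """One cascade layer: two f_normal and two f_extremely regressors."""
    f_normal = [RandomForestRegressor(n_estimators=n_estimators, random_state=i)
                for i in range(2)]
    f_extremely = [ExtraTreesRegressor(n_estimators=n_estimators, random_state=i)
                   for i in range(2)]
    return f_normal + f_extremely
```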
Step 6, extracting the instruction feature vectors of the target program to be predicted according to the process of step 1, and predicting the instruction vulnerability of that program in combination with the prediction model obtained in step 5.
The invention provides an instruction vulnerability prediction system based on a deep random forest, which comprises:
a first feature extraction module for performing static analysis on the training program, extracting, for each program instruction, instruction feature information related to instruction vulnerability, and generating the instruction feature vector V_features characterizing the instruction vulnerability of that program instruction;
a second feature extraction module for performing fault injection on the training program to obtain the vulnerability value P_SDC(I_i) of each program instruction;
a first sample data set construction module for combining the instruction feature vectors V_features and instruction vulnerability values P_SDC(I_i) to generate an instruction vulnerability sample data set D, wherein each sample S in the data set comprises the instruction feature vector V_features of a program instruction and its vulnerability value P_SDC(I_i).
The second sample data set construction module is used for performing sliding sampling on the instruction vulnerability sample data set D through a sliding window model, obtaining instruction-sequence expansion features of the sample data, and generating an expansion sample data set. The module specifically comprises:
a parameter initialization unit for initializing m = 2, where m ∈ N*, 2 ≤ m ≤ p, p ∈ N*, and the value of p is user-defined;
a sliding window model construction unit, configured to construct the sliding window model as follows:
W_m = m × n
where W_m is the width of the sliding window, n is the number of features of each sample S in the instruction vulnerability sample data set D, and n is also the sliding step length of the sliding window;
a sample splicing unit for splicing M samples in the instruction vulnerability sample data set D into a new sample E_i, the feature number of E_i being M × n;
a sliding sampling unit for performing sliding sampling on E_i with the sliding window model to obtain M + 1 − m samples of size W_m;
a training unit for training two random forest regression models on the M + 1 − m samples of size W_m to obtain 2(M + 1 − m) regression values as expansion features, wherein the label value used during training is the label value of a sample randomly selected from the M samples, that label value being the vulnerability value P_SDC(I_i);
a first judging unit for increasing m by 1 and judging whether m is larger than p; if not, execution returns to the sliding window model construction unit; otherwise, the (p − 1) × (2M − p)-dimensional expansion features obtained over the whole loop are output;
an expansion sample data set construction unit for splicing the original n-dimensional features of each sample S with the (p − 1) × (2M − p)-dimensional expansion features to obtain an expansion sample with (p − 1) × (2M − p) + n dimensional features corresponding to the sample S, thereby generating the expansion sample data set.
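As a concrete illustration, the sliding-window expansion performed by this module can be sketched in Python as follows. This is a minimal sketch assuming NumPy and scikit-learn are available; the function name `expand_window_features` and the forest hyperparameters are our own choices, not specified by the patent.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def expand_window_features(X, y, M=10, p=5, seed=0):
    """Sketch of the sliding-window expansion: splice M samples into one
    long vector E_i, slide a window of width W_m = m*n with step n for
    m = 2..p, and use two random forest regressors per window width to
    produce (p-1)*(2M-p) regression values as expansion features."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    E = X[:M].reshape(-1)                      # new sample E_i with M*n features
    expansion = []
    for m in range(2, p + 1):
        Wm = m * n                             # window width
        # sliding with step n yields M+1-m windows of size W_m
        windows = np.stack([E[i * n:i * n + Wm] for i in range(M + 1 - m)])
        # label: vulnerability value of a randomly chosen sample among the M
        labels = np.full(len(windows), y[rng.integers(0, M)])
        for rs in (0, 1):                      # two random forest regressors
            rf = RandomForestRegressor(n_estimators=10, random_state=rs)
            rf.fit(windows, labels)
            expansion.extend(rf.predict(windows))  # M+1-m values per forest
    return np.array(expansion)                 # (p-1)*(2M-p) expansion features

# toy data: 12 samples with n = 3 features -> 4 * (20 - 5) = 60 expansion values
rng = np.random.default_rng(1)
X = rng.normal(size=(12, 3))
y = rng.random(12)
ext = expand_window_features(X, y)
extended_sample = np.concatenate([X[0], ext])  # n + 60 = 63-dim expansion sample
```

With the embodiment's values (n = 21, M = 10, p = 5) the same loop yields the 60-dimensional expansion and 81-dimensional expansion samples described below.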
The prediction model construction module is used for constructing and training a deep-random-forest-based instruction vulnerability prediction model on the expansion sample data set. The module comprises the following units, executed in order:
a first cascade regression forest construction unit for constructing the first layer of the cascade regression forest, which comprises N random forests, and taking the expansion sample data set obtained by the second sample data set construction module as the initial input vector of the deep random forest regression, thereby outputting an output vector comprising N enhanced features;
a second cascade regression forest construction unit for constructing the next layer of the cascade regression forest, splicing the output vector v_enhanced of the previous layer and the input vector v_input into [v_input, v_enhanced] as the input of this layer, and then evaluating the accuracy of the whole cascade forest up to this layer by cross validation, i.e., calculating the mean square error between the regression results of all random forests on this layer and the true values;
a second judging unit for judging whether the accuracy obtained by the second cascade regression forest construction unit improves on that of the previous layer; if so, execution returns to the second cascade regression forest construction unit; otherwise, the accuracy is judged to have reached its threshold, the number of layers of the deep random forest is no longer increased, and the construction and training process ends, yielding the deep-random-forest-based instruction vulnerability prediction model, whose prediction result is the average of the regression outputs of all random forests in the last layer of the cascade regression forest.
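The layer-growing loop described by these units can be sketched as follows. This is an illustrative sketch with scikit-learn, using gcForest-style out-of-fold predictions as the enhanced features, N = 4 forests per layer (two random forests and two extra-trees forests, as in claim 7), and the MSE-based stopping rule; all names and hyperparameters are our own assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_predict

def train_cascade(X, y, max_layers=5):
    """Grow cascade layers; stop when cross-validated MSE stops improving."""
    layers, best_mse, X_in = [], np.inf, X
    for _ in range(max_layers):
        forests = [RandomForestRegressor(n_estimators=20, random_state=k)
                   for k in (0, 1)]
        forests += [ExtraTreesRegressor(n_estimators=20, random_state=k)
                    for k in (0, 1)]
        # out-of-fold predictions serve as the N enhanced features
        enhanced = np.column_stack(
            [cross_val_predict(f, X_in, y, cv=3) for f in forests])
        mse = mean_squared_error(y, enhanced.mean(axis=1))
        if mse >= best_mse:                    # accuracy no longer improves
            break
        best_mse = mse
        for f in forests:                      # refit on the full layer input
            f.fit(X_in, y)
        layers.append(forests)
        X_in = np.column_stack([X, enhanced])  # [v_input, v_enhanced]
    return layers

def predict_cascade(layers, X):
    X_in, out = X, None
    for forests in layers:
        out = np.column_stack([f.predict(X_in) for f in forests])
        X_in = np.column_stack([X, out])
    return out.mean(axis=1)                    # average of the last layer

# toy usage: one noisy linear target
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
y = X[:, 0] + 0.1 * rng.normal(size=60)
layers = train_cascade(X, y)
pred = predict_cascade(layers, X)
```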
The prediction module is used for extracting the instruction feature vector of the target program to be predicted according to the working process of the first feature extraction module, and realizing instruction vulnerability prediction for the target program by combining it with the instruction vulnerability prediction model.
The present invention will be described in further detail with reference to examples.
Examples
Experimental environment configuration: Intel i7-8750H CPU, 16 GB memory, Ubuntu Linux 16.04 operating system. A portion of the test programs in the MiBench benchmark suite is randomly selected as the training set; instruction features of the source programs are extracted with an analysis pass based on the LLVM compiler to generate the instruction feature vectors x, and faults are injected into the training programs one by one with LLFI (an LLVM-based fault injection tool) to obtain the instruction SDC vulnerability values y. About 4300 samples are collected in total, with feature dimension n = 21.
Starting from the 10th sample, sliding sampling is performed sample by sample with the sliding window; two random forest regressors generate 60-dimensional expansion features, yielding expansion samples with 81-dimensional features and thus the expansion sample data set. The expansion sample data set is then used as the input vector of the deep random forest. Each layer uses four random forest regression models, identical in pairs, to generate a 4-dimensional enhancement vector, which is spliced with the initial 21 features into a 25-dimensional vector serving as the input of the next layer. After model training, accuracy is evaluated on the test set.
Isqrt (square root), FFT (Fourier transform), Dijkstra (shortest path planning), Bitstring (bit/string conversion), Qsort (quick sort), and Rad2deg (radian conversion) are selected from the MiBench suite as test programs. After feature extraction with LLVM, the trained prediction model performs SDC vulnerability prediction on each instruction of the test programs; the average vulnerability over all instructions of each program is computed and compared with the average predictions of other models, with the results shown in Figure 2. As the figure shows, the prediction of the present invention is closer to the true value on every test program, where Baseline denotes the instruction vulnerability reference value obtained by fault injection. Figure 3 compares the mean square error of the prediction results with other prediction methods; the method of the present invention achieves the smallest error on all test programs.
In conclusion, the method offers high prediction accuracy, low requirements on the sample set, and little manual tuning, and can be effectively applied to predicting the vulnerability of instructions to transient faults.
Claims (10)
1. An instruction vulnerability prediction method based on a deep random forest is characterized by comprising the following steps:
step 1, performing static analysis on the training program, extracting the instruction feature information related to the vulnerability of each program instruction, and generating the instruction feature vector V_features characterizing the instruction vulnerability of the corresponding program instruction;
step 2, performing fault injection on the training program to obtain the vulnerability value P_SDC(I_i) of each program instruction;
step 3, combining the instruction feature vectors V_features and the instruction vulnerability values P_SDC(I_i) to generate an instruction vulnerability sample data set D, wherein each sample S in the data set comprises the instruction feature vector V_features and the vulnerability value P_SDC(I_i) corresponding to a certain program instruction;
Step 4, sliding sampling is carried out on the instruction vulnerability sample data set D through a sliding window model, instruction sequence expansion characteristics of sample data are obtained, and an expansion sample data set is generated;
step 5, constructing and training an instruction vulnerability prediction model based on the deep random forest based on the extended sample data set;
and step 6, extracting the instruction feature vector of the target program to be predicted according to the process of step 1, and combining it with the instruction vulnerability prediction model obtained in step 5 to realize instruction vulnerability prediction for the target program.
2. The instruction vulnerability prediction method based on a deep random forest according to claim 1, wherein the instruction feature vector V_features characterizing the instruction vulnerability in step 1 is the following 7-tuple:
V_features = 〈V_tran_bran, V_comp, V_addr, V_mask, V_loop, V_arith, V_block〉
where V_tran_bran denotes branch- and transfer-related instruction features, including the branch-related feature f_is_branch, the function-call-related feature f_is_call, and the return instruction feature f_is_return; V_comp denotes compare-instruction-related features, including the integer compare instruction feature f_is_int_cmp and the floating-point compare instruction feature f_is_float_cmp; V_addr denotes address-instruction-related features, including the address-reference feature f_is_used_in_addr, the destination operand width feature f_dest_op_width, and the store instruction feature f_is_used_store; V_mask denotes fault-masking-related features, including the logical AND instruction feature f_is_and, the logical OR instruction feature f_is_or, and the logical shift instruction feature f_is_sh; V_loop denotes loop-related instruction features, including the loop position instruction feature f_is_loop and the loop depth feature f_loop_d; V_arith denotes arithmetic-operation-related features, including the addition/subtraction instruction feature f_is_add/sub and the multiplication/division instruction feature f_is_mul/div; and V_block denotes basic-block-related features, including the basic block length feature f_bb_length, the remaining-instructions-in-basic-block feature f_bb_remain_ins_num, the predecessor basic block count feature f_pred_bb_num, and the successor basic block count feature f_suc_bb_num.
3. The instruction vulnerability prediction method based on a deep random forest according to claim 1, wherein the vulnerability value P_SDC(I_i) of each program instruction in step 2 is obtained by the formula:
P_SDC(I_i) = (1/w) × Σ_{j=1..w} (M_j / F_j)
where I_i denotes the i-th program instruction, P_SDC(I_i) denotes the SDC vulnerability value of program instruction I_i, w denotes the bit width of the instruction's destination register, M_j denotes the number of SDC failures after fault injection at the j-th bit of I_i, and F_j denotes the total number of fault injections performed at the j-th bit of I_i.
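For illustration, the per-bit averaging in this formula can be computed as follows (a sketch; in practice the counts M_j and F_j come from a fault-injection campaign, e.g. with LLFI, and the function name is our own):

```python
def sdc_vulnerability(sdc_counts, injection_counts):
    """P_SDC(I_i) = (1/w) * sum over j of M_j / F_j, where w is the bit
    width of the destination register, M_j is the number of injections at
    bit j that produced an SDC, and F_j is the total injections at bit j."""
    w = len(sdc_counts)
    return sum(m / f for m, f in zip(sdc_counts, injection_counts)) / w

# a hypothetical 4-bit destination register with 10 injections per bit
p_sdc = sdc_vulnerability([2, 0, 5, 3], [10, 10, 10, 10])  # -> 0.25
```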
4. The method according to claim 1, wherein the step 4 is to perform sliding sampling on the instruction vulnerability sample data set D through a sliding window model to obtain an instruction sequence extension characteristic of sample data and generate an extension sample data set, and specifically includes:
let the initial value of m be 2, where m ∈ N*, 2 ≤ m ≤ p, p ∈ N*, and the value of p is user-defined;
step 4-1, constructing the sliding window model as follows:
W_m = m × n
where W_m is the width of the sliding window, n is the number of features of each sample S in the instruction vulnerability sample data set D, and n is also the sliding step length of the sliding window;
step 4-2, splicing M samples in the instruction vulnerability sample data set D into a new sample E_i, the feature number of E_i being M × n;
step 4-3, performing sliding sampling on E_i with the sliding window model to obtain M + 1 − m samples of size W_m;
step 4-4, training two random forest regression models on the M + 1 − m samples of size W_m to obtain 2(M + 1 − m) regression values as expansion features, wherein the label value used during training is the label value of a sample randomly selected from the M samples, that label value being the vulnerability value P_SDC(I_i);
step 4-5, increasing m by 1 and judging whether m is larger than p; if not, returning to step 4-1; otherwise, outputting the (p − 1) × (2M − p)-dimensional expansion features obtained over the whole loop;
and step 4-6, splicing the original n-dimensional features of each sample S with the (p − 1) × (2M − p)-dimensional expansion features to obtain an expansion sample with (p − 1) × (2M − p) + n dimensional features corresponding to sample S, thereby generating the expansion sample data set.
5. The method of claim 1, wherein p is 5 and M is 10.
6. The method for predicting the instruction vulnerability based on the deep random forest according to claim 1, wherein the step 5 of constructing and training the instruction vulnerability prediction model based on the deep random forest based on the extended sample data set specifically comprises:
step 5-1, constructing the first layer of the cascade regression forest, which comprises N random forests, and taking the expansion sample data set obtained in step 4 as the initial input vector of the deep random forest regression, thereby outputting an output vector comprising N enhanced features;
step 5-2, constructing the next layer of the cascade regression forest, splicing the output vector v_enhanced of the previous layer and the input vector v_input into [v_input, v_enhanced] as the input of this layer, and then evaluating the accuracy of the whole cascade forest up to this layer by cross validation, i.e., calculating the mean square error between the regression results of all random forests on this layer and the true values;
and step 5-3, judging whether the accuracy obtained in step 5-2 improves on that of the previous layer of the cascade regression forest; if so, returning to step 5-2; otherwise, judging that the accuracy has reached its threshold, no longer increasing the number of layers of the deep random forest, and ending the construction and training process to obtain the deep-random-forest-based instruction vulnerability prediction model, whose prediction result is the average of the regression outputs of all random forests in the last layer of the cascade regression forest.
7. The instruction vulnerability prediction method based on a deep random forest according to claim 6, wherein the first-layer cascade regression forest in step 5-1 comprises N random forests, specifically: the cascade regression forest of each layer comprises N = 4 random forests, namely two random forest regression models f_normal and two extreme random forest (extremely randomized trees) regression models f_extreme.
8. A system for instruction vulnerability prediction based on a deep random forest, the system comprising:
the first feature extraction module, used for performing static analysis on the training program, extracting the instruction feature information related to the vulnerability of each program instruction, and generating the instruction feature vector V_features characterizing the instruction vulnerability of the corresponding program instruction;
the second feature extraction module, used for performing fault injection on the training program to obtain the vulnerability value P_SDC(I_i) of each program instruction;
the first sample data set construction module, used for combining the instruction feature vectors V_features and the instruction vulnerability values P_SDC(I_i) to generate an instruction vulnerability sample data set D, wherein each sample S in the data set comprises the instruction feature vector V_features and the vulnerability value P_SDC(I_i) corresponding to a certain program instruction;
The second sample data set construction module is used for performing sliding sampling on the instruction vulnerability sample data set D through a sliding window model, obtaining the instruction sequence expansion characteristic of the sample data and generating an expansion sample data set;
the prediction model construction module is used for constructing and training an instruction vulnerability prediction model based on the deep random forest based on the extended sample data set;
and the prediction module is used for extracting the instruction feature vector of the target program to be predicted according to the working process of the first feature extraction module and realizing the instruction vulnerability prediction of the target program to be predicted by combining the instruction vulnerability prediction model.
9. The system of claim 8, wherein the second sample data set construction module comprises, executed in order:
a parameter initialization unit for initializing m = 2, where m ∈ N*, 2 ≤ m ≤ p, p ∈ N*, and the value of p is user-defined;
a sliding window model construction unit, configured to construct the sliding window model as follows:
W_m = m × n
where W_m is the width of the sliding window, n is the number of features of each sample S in the instruction vulnerability sample data set D, and n is also the sliding step length of the sliding window;
a sample splicing unit for splicing M samples in the instruction vulnerability sample data set D into a new sample E_i, the feature number of E_i being M × n;
a sliding sampling unit for performing sliding sampling on E_i with the sliding window model to obtain M + 1 − m samples of size W_m;
a training unit for training two random forest regression models on the M + 1 − m samples of size W_m to obtain 2(M + 1 − m) regression values as expansion features, wherein the label value used during training is the label value of a sample randomly selected from the M samples, that label value being the vulnerability value P_SDC(I_i);
a first judging unit for increasing m by 1 and judging whether m is larger than p; if not, execution returns to the sliding window model construction unit; otherwise, the (p − 1) × (2M − p)-dimensional expansion features obtained over the whole loop are output;
and an expansion sample data set construction unit for splicing the original n-dimensional features of each sample S with the (p − 1) × (2M − p)-dimensional expansion features to obtain an expansion sample with (p − 1) × (2M − p) + n dimensional features corresponding to the sample S, thereby generating the expansion sample data set.
10. The system of claim 8, wherein the prediction model building module comprises, performed in sequence:
a first cascade regression forest construction unit for constructing the first layer of the cascade regression forest, which comprises N random forests, and taking the expansion sample data set obtained by the second sample data set construction module as the initial input vector of the deep random forest regression, thereby outputting an output vector comprising N enhanced features;
a second cascade regression forest construction unit for constructing the next layer of the cascade regression forest, splicing the output vector v_enhanced of the previous layer and the input vector v_input into [v_input, v_enhanced] as the input of this layer, and then evaluating the accuracy of the whole cascade forest up to this layer by cross validation, i.e., calculating the mean square error between the regression results of all random forests on this layer and the true values;
and a second judging unit for judging whether the accuracy obtained by the second cascade regression forest construction unit improves on that of the previous layer; if so, execution returns to the second cascade regression forest construction unit; otherwise, the accuracy is judged to have reached its threshold, the number of layers of the deep random forest is no longer increased, and the construction and training process ends, yielding the deep-random-forest-based instruction vulnerability prediction model, whose prediction result is the average of the regression outputs of all random forests in the last layer of the cascade regression forest.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911248246.XA CN111159011B (en) | 2019-12-09 | 2019-12-09 | Instruction vulnerability prediction method and system based on deep random forest |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911248246.XA CN111159011B (en) | 2019-12-09 | 2019-12-09 | Instruction vulnerability prediction method and system based on deep random forest |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111159011A true CN111159011A (en) | 2020-05-15 |
CN111159011B CN111159011B (en) | 2022-05-20 |
Family
ID=70555803
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911248246.XA Active CN111159011B (en) | 2019-12-09 | 2019-12-09 | Instruction vulnerability prediction method and system based on deep random forest |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111159011B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113610154A (en) * | 2021-08-06 | 2021-11-05 | 吉林大学 | GPGPU program SDC error detection method and device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170124497A1 (en) * | 2015-10-28 | 2017-05-04 | Fractal Industries, Inc. | System for automated capture and analysis of business information for reliable business venture outcome prediction |
CN108334903A (en) * | 2018-02-06 | 2018-07-27 | 南京航空航天大学 | A kind of instruction SDC fragility prediction techniques based on support vector regression |
CN108491317A (en) * | 2018-02-06 | 2018-09-04 | 南京航空航天大学 | A kind of SDC error-detecting methods of vulnerability analysis based on instruction |
CN109063775A (en) * | 2018-08-03 | 2018-12-21 | 南京航空航天大学 | Instruction SDC fragility prediction technique based on shot and long term memory network |
US20190258807A1 (en) * | 2017-09-26 | 2019-08-22 | Mcs2, Llc | Automated adjusting of devices |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170124497A1 (en) * | 2015-10-28 | 2017-05-04 | Fractal Industries, Inc. | System for automated capture and analysis of business information for reliable business venture outcome prediction |
US20190258807A1 (en) * | 2017-09-26 | 2019-08-22 | Mcs2, Llc | Automated adjusting of devices |
CN108334903A (en) * | 2018-02-06 | 2018-07-27 | 南京航空航天大学 | A kind of instruction SDC fragility prediction techniques based on support vector regression |
CN108491317A (en) * | 2018-02-06 | 2018-09-04 | 南京航空航天大学 | A kind of SDC error-detecting methods of vulnerability analysis based on instruction |
CN109063775A (en) * | 2018-08-03 | 2018-12-21 | 南京航空航天大学 | Instruction SDC fragility prediction technique based on shot and long term memory network |
Non-Patent Citations (1)
Title |
---|
Zhang Qianwen et al.: "Instruction SDC vulnerability analysis method based on machine learning", Journal of Chinese Computer Systems *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113610154A (en) * | 2021-08-06 | 2021-11-05 | 吉林大学 | GPGPU program SDC error detection method and device |
CN113610154B (en) * | 2021-08-06 | 2023-12-29 | 吉林大学 | GPGPU program SDC error detection method and device |
Also Published As
Publication number | Publication date |
---|---|
CN111159011B (en) | 2022-05-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6473884B1 (en) | Method and system for equivalence-checking combinatorial circuits using interative binary-decision-diagram sweeping and structural satisfiability analysis | |
Gong et al. | Automatic detection of infeasible paths in software testing | |
US7216318B1 (en) | Method and system for false path analysis | |
US10936474B2 (en) | Software test program generation | |
US8230382B2 (en) | Model based simulation of electronic discharge and optimization methodology for design checking | |
US11734480B2 (en) | Performance modeling and analysis of microprocessors using dependency graphs | |
US20190243930A1 (en) | Methods and Apparatus for Transforming the Function of an Integrated Circuit | |
US11409916B2 (en) | Methods and apparatus for removing functional bugs and hardware trojans for integrated circuits implemented by field programmable gate array (FPGA) | |
JP4750665B2 (en) | Timing analysis method and apparatus | |
CN111159011B (en) | Instruction vulnerability prediction method and system based on deep random forest | |
Rejimon et al. | An accurate probabilistic model for error detection | |
US6792581B2 (en) | Method and apparatus for cut-point frontier selection and for counter-example generation in formal equivalence verification | |
Ritter et al. | Formal verification of designs with complex control by symbolic simulation | |
JP5625297B2 (en) | Delay test apparatus, delay test method, and delay test program | |
US6760894B1 (en) | Method and mechanism for performing improved timing analysis on virtual component blocks | |
Ganai et al. | Completeness in SMT-based BMC for software programs | |
JP2001052043A (en) | Error diagnosis method and error site proving method for combinational verification | |
CN112162932B (en) | Symbol execution optimization method and device based on linear programming prediction | |
US10852354B1 (en) | System and method for accelerating real X detection in gate-level logic simulation | |
Liu et al. | Tbem: Testing-based gpu-memory consumption estimation for deep learning | |
CN113901479A (en) | Security assessment framework and method for transient execution attack dynamic attack link | |
Chockler et al. | Efficient automatic STE refinement using responsibility | |
US8527922B1 (en) | Method and system for optimal counterexample-guided proof-based abstraction | |
Wang et al. | Fast and accurate statistical static timing analysis | |
Oyeniran et al. | High-Level Fault Diagnosis in RISC Processors with Implementation-Independent Functional Test |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||