CN115878498A - Key byte extraction method for predicting program behavior based on machine learning - Google Patents
Key byte extraction method for predicting program behavior based on machine learning
- Publication number: CN115878498A (application CN202310195368.7A)
- Authority: CN (China)
- Prior art keywords: program, neural network, target program, input, test set
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention discloses a method for extracting key bytes by predicting program behavior with machine learning, which comprises the following steps: running the input seed file on a target program and generating a large test set X through fuzz-testing mutation algorithms; performing code instrumentation on the target program to obtain a program PI, and running the test set X on the program PI to obtain a test set Y; taking the test set X-Y data pairs as training data and training a neural network model until the trained model can predict the behavior of the target program; constructing a saliency map based on the trained neural network model and extracting the key bytes from the saliency map, wherein a key byte is an input byte that affects the behavior of the target program. The invention effectively saves the time overhead and performance overhead of program behavior tracking.
Description
Technical Field
The invention relates to the technical field of computer information security, in particular to a method for extracting key bytes by predicting program behavior with machine learning.
Background
Key bytes are the input bytes that affect the behavior of a target program: given a set of inputs, the program behavior is observed and the bytes of the input that influence that behavior are deduced, which is key byte extraction. Extracted key bytes are widely used in fields such as detecting leakage of private system data, vulnerability detection, and guided fuzz testing. Taint analysis is the technique generally used to track program data flow, observe program behavior, and extract key bytes; it detects security problems by marking sensitive data in the system and tracking the propagation of the marked data through the program. However, as the program scale grows, the time cost of taint analysis multiplies rapidly, because the information flow from every taint source to every taint sink in the analyzed program must be tracked.
At present, most program behavior tracking tools are built on taint analysis frameworks such as Valgrind, Pin, and QEMU. James Newsome published TaintCheck, developed on top of Valgrind, which detects buffer overflow vulnerabilities but ignores control-flow tracking. Wang Jiang proposed a QEMU-based offline dynamic taint analysis method for binary programs: it extracts the execution trace of a binary by modifying QEMU's decode-and-execute mechanism, marks program inputs using hooking, builds a vulnerability model, and, while virtually replaying the program, completes offline trace analysis and vulnerability detection according to the propagation and security-check policies generated by the vulnerability model. However, all of these methods consume excessive time.
Machine learning is a current research focus, and there is strong interest in introducing machine learning methods into different fields to improve the prior art. TaintInduce proposes learning the information propagation rules of a specific platform from the inputs and outputs of instructions. TaintInduce learns propagation rules from templates and uses an algorithm to reduce the task to the precondition of learning only different input sets and propagation labels. By learning propagation rules with machine learning, TaintInduce improves the accuracy of individual propagation rules, but because of its propagation-based design it still suffers from high false positives and high overhead when tracking program behavior.
Disclosure of Invention
In view of this, the present invention provides a method for extracting a key byte based on machine learning to predict program behavior, so as to solve the above technical problem.
The invention discloses a method for extracting key bytes by predicting program behavior with machine learning, which comprises the following steps:
Step 1: running the input seed file on a target program, generating a large test set X through fuzz-testing mutation algorithms, performing code instrumentation on the target program to obtain a program PI, and running the test set X on the program PI to obtain a test set Y;
Step 2: taking the test set X-Y data pairs as training data, and training the neural network model until the trained neural network model can predict the behavior of the target program; wherein the test set X is the input data of the neural network, the test set Y is the label data of the neural network, and the test set X-Y data pairs consist of the test set X and the test set Y;
Step 3: constructing a saliency map based on the trained neural network model, and extracting the key bytes from the saliency map; wherein a key byte is an input byte that affects the behavior of the target program.
Further, the step 1 comprises:
Step 11: the fuzz test takes the provided seed file as input, performs a large number of mutation operations on the target program, and checks whether relevant results occur after running; wherein the relevant results include a crash of the target program and discovery of a new execution path;
Step 12: performing basic-block-level code instrumentation on the target program to obtain the program PI, and running the test set X on the program PI to obtain the execution paths of the target program, namely the test set Y.
Further, in step 11, to ensure that the length of each test case in the test set X is unchanged, the three length-preserving fuzz-testing mutation algorithms bitflip, arithmetic, and interest are selected to generate the large test set X.
Further, the step 12 includes:
Step 121: defining an instrumentation function IFunc, wherein the function IFunc is inserted before each basic block, and when the basic block is reached during execution of the target program, the function IFunc is called first to output the number of the basic block and the function it belongs to;
step 122: acquiring a target program;
step 123: initializing the number value num of the basic block, wherein the number value starts from 1;
step 124: traversing a function F of the target program;
step 125: traversing each basic block in the function F;
step 126: calling the inserted function IFunc;
step 127: adding 1 to the number value num of the basic block;
step 128: judging whether the traversal of the function F is finished, if not, executing the step 125, and if so, executing the step 129;
step 129: and executing the test set X by using the program PI, wherein the program PI outputs the position, the number and the file name of the executed basic block to obtain an execution path of the target program, namely the test set Y.
Further, in the step 2:
the neural network model is used for learning the data flow propagation process of different inputs in the target program by observing a large number of test set X-Y data pairs in the execution track of the target program and simulating the processing logic of the target program; the neural network model takes program input as model input and predicts an execution path of a target program.
Further, given a test set X-Y data pair consisting of an input $x_i$ and the corresponding target program execution path $y_i$, the output of the neural network model is $\hat{y}_i$:

$$h = \mathrm{ReLU}(W_k x_i + b_k), \qquad \hat{y}_i = f(x_i) = \sigma(W \cdot h)$$

where $x_i$ represents the $i$-th data item in test set X, $y_i$ represents the $i$-th data item in test set Y, $h$ is the output vector of the hidden layer of the neural network, $\mathrm{ReLU}$ is the ReLU function, $W_k$ and $b_k$ are the trainable parameters of each layer, $k$ is the layer index, $f(x_i)$ denotes the neural network model evaluated on test data $x_i$, $W$ is the trainable weight parameter of the neural network model, and $\sigma$ is the sigmoid function.
Further, the neural network model comprises an input layer, a hidden layer and an output layer; the input layer is connected with the output layer through the hidden layer.
Further, the step 3 comprises:
step 31: calculating partial derivatives of the execution path predicted by the trained neural network model relative to the test set X;
step 32: constructing a saliency map based on the partial derivatives;
step 33: key bytes are extracted from the saliency map.
Further, the step 31 specifically includes:
Let $F(x)$ denote the value of the output variable computed by the trained neural network model for an input $x$. The partial derivative of $F(x)$ with respect to the input $x$ is defined as follows:

$$\frac{\partial F(x)}{\partial x} = \left[\frac{\partial F_i(x)}{\partial x_n}\right]_{i,n}$$

where $X$ represents the input data of the neural network, i.e., test set X, $x_n$ represents the $n$-th byte of the input $x$, and the partial derivatives $\partial F_i(x) / \partial x_n$ form the Jacobian matrix of the neural network function; each element of the matrix represents the gradient of an output $F_i(x)$ with respect to one byte of the input $x$.
Further, the step 32 specifically includes constructing the saliency map:

$$S(x)[n] = \sum_i \frac{\partial F_i(x)}{\partial x_n}$$

where $S(x)[n]$ is the sum, over all prediction outputs $F_i(x)$, of the derivatives with respect to the $n$-th byte of the input $x$; it represents the influence of the $n$-th byte on the behavior of the currently executed target program, and the larger the value of the derivative sum, the larger the influence;
the step 33 specifically includes extracting the key bytes:

$$K = \arg \operatorname{top\_k}\big(S(x)\big)$$

where $K$ is the set of important bytes in the input that affect the execution path of the target program, i.e., the key bytes; $\operatorname{top\_k}$ represents the function that selects the $k$ largest elements from a vector, and $\arg$ represents the function that returns the indices of the selected elements.
Due to the adoption of the technical scheme, the invention has the following advantages:
the invention can predict the behavior of the program by means of machine learning model simulation and different expressions of the learning program, and realizes the light-weight and accurate end-to-end information flow tracking. Compared with the traditional method for tracking the program by using the taint analysis tool, the method effectively saves the time overhead and the performance overhead of program behavior tracking. The model obtained by training by the method is used for guiding subsequent work such as fuzzy test, vulnerability mining and the like, so that the working efficiency can be greatly improved, and the analysis time can be saved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments described in the embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings.
FIG. 1 is a flowchart illustrating a method for extracting key bytes based on machine learning prediction program behavior according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the instrumentation logic according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the instrumentation flow according to an embodiment of the present invention;
FIG. 4 is a diagram of a neural network model architecture according to an embodiment of the present invention;
FIG. 5 is a key byte diagram according to an embodiment of the present invention.
Detailed Description
The present invention will be further described with reference to the accompanying drawings and examples, it being understood that the examples described are only some of the examples and are not intended to limit the invention to the embodiments described herein. All other embodiments available to those of ordinary skill in the art are intended to be within the scope of the embodiments of the present invention.
The technical problems to be solved by the invention are as follows:
(1) Accuracy problem of key byte extraction
When key byte extraction is carried out using taint analysis, variables that have no data or control dependency on program behavior are often marked as tainted, producing false positives. For example, if b is marked as tainted, then under rule-based propagation s and t will also be marked as tainted for s = a + b and t = s - b; however, b cannot actually affect t, so too many input bytes are identified as key bytes, resulting in false positives. Conversely, variables that do have data or control dependencies on program behavior may fail to be marked as tainted, producing false negatives. For example, given "if b > 1 then a = 5", if b is marked as tainted and the value of b is greater than 1, a will not be marked as tainted because there is no direct data flow from b to a, whereas in fact the value of a depends on b; the input bytes that should be key bytes are therefore ignored, resulting in false negatives. Both of these conditions affect the accuracy of key byte extraction.
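As a purely illustrative sketch (not part of the patent text; the variable names simply follow the examples above), the two failure modes can be written out in Python:

```python
def over_taint_example(a: int, b: int) -> int:
    # Rule-based propagation marks s and t as tainted because both expressions
    # mention the tainted variable b, yet t = (a + b) - b == a, so b cannot
    # actually influence t: an over-taint that yields a false positive.
    s = a + b
    t = s - b
    return t  # always equals a, independent of the tainted input b

def under_taint_example(b: int) -> int:
    # a receives a constant inside a branch controlled by b, so there is no
    # direct data flow from b to a; data-flow-only taint rules leave a
    # untainted even though its value depends on b: a false negative.
    a = 0
    if b > 1:
        a = 5
    return a
```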
(2) System resource and time overhead is excessive
When taint analysis is used to extract key bytes, code that collects information is inserted into the target program without disturbing its original logic, in order to obtain information about the running program, and a shadow memory is added alongside the original data to record the taint state of registers and memory. This approach obtains precise information about program execution through instrumentation and offers high analysis accuracy, but the frequent instrumentation operations and the shadow memory design occupy a large amount of system resources and increase the time overhead, and the overhead grows exponentially as the program scale expands.
Referring to FIG. 1, the present invention provides an embodiment of a method for extracting key bytes by predicting program behavior with machine learning. Whereas taint analysis tracks the propagation of marked data through a program at a large cost in time and resources, this embodiment uses a neural network model: the neural network learns the program's behavior under different inputs and can predict that behavior, and gradient analysis is used to compute the influence of taint sources on taint sinks in the program, thereby realizing lightweight end-to-end information flow tracking.
The overall framework is organized around the model and can be divided into three steps.
Step 1: running the input seed file on a target program, generating a large test set X through fuzz-testing mutation algorithms, performing code instrumentation on the target program to obtain a program PI, and running the test set X on the program PI to obtain a test set Y.
the fuzzy test takes the provided seed file as input, carries out a large amount of variation operations, checks whether the running results cause the crash of the target program, discovers a new execution path and the like. The mutation operation of the fuzz test generally comprises the following 6 types:
TABLE 1 Fuzz-testing mutation operations

| Serial number | Name | Description | Length change |
| --- | --- | --- | --- |
| 1 | bitflip | Flip bits (1 to 0, 0 to 1) | No change |
| 2 | arithmetic | Integer add/subtract arithmetic operations | No change |
| 3 | interest | Replace content in the original file with special values | No change |
| 4 | dictionary | Replace/insert automatically generated or user-provided tokens into the original file | Changes |
| 5 | havoc | Apply a large number of mutations to the original file | Changes |
| 6 | splice | Splice two files to obtain a new file | Changes |
To ensure that the length of each test case remains unchanged, the three length-preserving mutation algorithms bitflip, arithmetic, and interest are selected to generate the large test set X. Code instrumentation is then performed on the target program, the test set X is run on the instrumented program to obtain the execution paths of the program, and the test set Y is collected.
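As an illustration only (the patent gives no code; the function names and byte-level details below are assumptions), the three length-preserving mutation strategies could be sketched in Python as follows:

```python
import random

INTERESTING_8 = [0, 1, 16, 32, 64, 100, 127, 128, 255]  # assumed "interesting" byte values

def bitflip(data: bytes) -> bytes:
    """Flip a single bit; the output length equals the input length."""
    buf = bytearray(data)
    pos = random.randrange(len(buf) * 8)
    buf[pos // 8] ^= 1 << (pos % 8)
    return bytes(buf)

def arithmetic(data: bytes, max_delta: int = 35) -> bytes:
    """Add or subtract a small integer at a random byte, modulo 256."""
    buf = bytearray(data)
    pos = random.randrange(len(buf))
    delta = random.randint(1, max_delta) * random.choice([1, -1])
    buf[pos] = (buf[pos] + delta) % 256
    return bytes(buf)

def interest(data: bytes) -> bytes:
    """Overwrite a random byte with a special ("interesting") value."""
    buf = bytearray(data)
    pos = random.randrange(len(buf))
    buf[pos] = random.choice(INTERESTING_8)
    return bytes(buf)

def generate_test_set(seed: bytes, n: int) -> list:
    """Generate n length-preserving mutants of the seed (test set X)."""
    mutators = [bitflip, arithmetic, interest]
    return [random.choice(mutators)(seed) for _ in range(n)]
```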
As shown in FIG. 2, code instrumentation inserts logic code that performs specific functions before and after the instructions to be observed or processed in the assembly code of the target program. Code instrumentation can typically be performed at three granularities: instruction level, basic block level, and function level. This patent performs basic-block-level code instrumentation on the target program.
A basic block is a sequence of program statements with a single entry and a single exit; a function is generally divided into several basic blocks by jump-related instructions such as "CMP", and once the first instruction of a basic block is executed, the remaining instructions of the block are also executed. Compared with instrumentation at instruction-level granularity, instrumenting basic blocks saves time and reduces program size; compared with instrumentation at function-level granularity, instrumenting basic blocks improves accuracy. The process of instrumenting the basic blocks is shown in FIG. 3 and comprises the following steps:
1) Define an instrumentation function IFunc, which is inserted before each basic block; when the basic block is reached during program execution, IFunc is called first to output the number of the basic block and the function it belongs to;
2) Acquiring a target program;
3) Initializing the number value num of the basic block, wherein the number value starts from 1;
4) Function F of traversing target program
5) Traversing each basic block in the function F;
6) Calling the inserted function IFunc;
7) The number value num of the basic block is added with 1;
8) After all functions have been traversed, output the number of basic blocks corresponding to each function of the program.
The test set X is then executed with the instrumented program, which outputs the location and information of each executed basic block; a test set Y containing program execution path information is thus obtained, and the X-Y data pairs are provided for model training.
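A minimal Python sketch of this basic-block instrumentation, written over a hypothetical in-memory program representation (the Program, Function, and BasicBlock classes and the trace format are assumptions for illustration, not the patent's actual binary-level instrumentation):

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

@dataclass
class BasicBlock:
    body: Callable[[], None]   # the original code of the block
    num: int = 0               # block number assigned during instrumentation

@dataclass
class Function:
    name: str
    blocks: List[BasicBlock] = field(default_factory=list)

@dataclass
class Program:
    filename: str
    functions: List[Function] = field(default_factory=list)

def instrument(prog: Program, trace: List[Tuple[str, str, int]]) -> Program:
    """Basic-block-level instrumentation mirroring steps 1)-8) above: number
    every basic block starting from 1 and wrap it so that the IFunc callback
    runs before the block body, recording (file name, function, block number)."""
    num = 1
    for func in prog.functions:
        for block in func.blocks:
            block.num = num
            original = block.body

            def instrumented(b=block, f=func, original=original):
                # IFunc: report the executed basic block, then run the original code
                trace.append((prog.filename, f.name, b.num))
                original()

            block.body = instrumented
            num += 1
    return prog
```

Executing the test cases of X on a program instrumented in this way yields, for every input, the sequence of executed basic blocks, from which the corresponding entry of test set Y is built.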
Step 2: taking the test set X-Y data pairs as training data, and training the neural network model until the trained neural network model can predict the behavior of the target program; wherein the test set X is the input data of the neural network, the test set Y is the label data of the neural network, and the test set X-Y data pairs consist of the test set X and the test set Y.
the test set X is the input of the target program, the test set Y is the execution path of the target program, the input of the target program is usually user input, files or user privacy character strings, and in order to facilitate the understanding and the recognition of the model, the method converts the byte sequence into a bounded numerical value vector with the range of [0,255 ]. The method processes the information, normalizes the execution path variable by binary data, indicates that the basic block is executed by 1, indicates that the basic block is not executed by 0, and uniformly normalizes the test set Y into 01 character strings with the same length so as to ensure the rapid convergence of the model.
The method uses a neural network as the training model; the model consists of three fully connected layers, namely an input layer, a hidden layer, and an output layer. The hidden layer has 4096 hidden units with ReLU as the activation function, and the output layer predicts the variables using sigmoid as the activation function.
The model learns the propagation process of the information flow by observing a large number of X-Y pairs from the program execution traces. The detailed architecture of the model, which takes the program input as the model input and predicts the program execution path, is shown in FIG. 4. Given a specific input $x_i$ of the program and the corresponding program execution path $y_i$, the execution path predicted by the model is $\hat{y}_i$, with the formula:

$$h = \mathrm{ReLU}(W_k x_i + b_k), \qquad \hat{y}_i = f(x_i) = \sigma(W \cdot h)$$

where $x_i$ represents the $i$-th data item in test set X, $y_i$ represents the $i$-th data item in test set Y, $h$ is the output vector of the hidden layer of the neural network, $\mathrm{ReLU}$ is the ReLU function, $W_k$ and $b_k$ are the trainable parameters of each layer, $k$ is the layer index, $f(x_i)$ denotes the neural network model evaluated on test data $x_i$, $W$ is the trainable weight parameter of the neural network model, and $\sigma$ is the sigmoid function.
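A minimal PyTorch sketch of this architecture, assuming an input of length m and B basic blocks (the class name, the training loop, and all hyperparameters other than the 4096 hidden units are assumptions):

```python
import torch
import torch.nn as nn

class PathPredictor(nn.Module):
    """Fully connected model: program input bytes -> predicted execution path."""
    def __init__(self, input_len: int, num_blocks: int, hidden: int = 4096):
        super().__init__()
        self.hidden_layer = nn.Linear(input_len, hidden)
        self.output_layer = nn.Linear(hidden, num_blocks)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.hidden_layer(x))        # h = ReLU(W_k x + b_k)
        return torch.sigmoid(self.output_layer(h))  # y_hat = sigmoid(W * h)

def train(model: nn.Module, loader, epochs: int = 10, lr: float = 1e-3) -> None:
    """Train on the test set X-Y data pairs with per-block binary cross-entropy."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.BCELoss()
    for _ in range(epochs):
        for x, y in loader:  # x: [batch, m] bytes in [0,255]; y: [batch, B] in {0,1}
            opt.zero_grad()
            loss = loss_fn(model(x / 255.0), y)
            loss.backward()
            opt.step()
```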
After training the neural network model, the method analyzes information flow in the target program by constructing a saliency map, which is detailed in step 3.
Step 3: constructing a saliency map based on the trained neural network model, and extracting the key bytes from the saliency map; wherein a key byte is an input byte that affects the behavior of the target program.
A key byte is an input byte that affects the behavior of the target program. As shown in FIG. 5, the red portion is a key byte diagram of a pdf file; assuming the input x of the target program has length m, the key bytes are the portion of those m bytes that affects the program execution path. This patent uses gradient analysis and saliency maps to compute the key bytes in the taint data. The saliency map is a gradient-based attribution method; compared with other gradient-based methods, it focuses on the sensitivity of the neural network output to each input feature, i.e., how the output changes with respect to a tiny change of the input. The saliency map method is chosen here because the goal is to infer from the neural network which bytes of the input affect the execution path of the target program, i.e., which produce the greatest sensitivity in the network output.
To extract the key bytes, the partial derivatives of the execution path predicted by the trained neural network model with respect to test set X are computed first. Let $F(x)$ denote the value of the output variable computed for an input $x$ during execution of the target program. The partial derivative of $F(x)$ with respect to the input $x$ is defined as follows:

$$\frac{\partial F(x)}{\partial x} = \left[\frac{\partial F_i(x)}{\partial x_n}\right]_{i,n}$$

where $X$ represents the input data of the neural network, i.e., test set X, $x_n$ represents the $n$-th byte of the input $x$, and the partial derivatives form the Jacobian matrix of the neural network function; each element of the matrix represents the gradient of an output $F_i(x)$ with respect to one byte of the input $x$. Based on these partial derivatives of the neural network model, the saliency map $S(x)$, which is a vector, is then constructed as follows:

$$S(x)[n] = \sum_i \frac{\partial F_i(x)}{\partial x_n}$$

where $S(x)[n]$ is the sum, over all prediction outputs $F_i(x)$ of the neural network model, of the derivatives with respect to the $n$-th byte of the input; it represents the influence of the $n$-th byte on the behavior of the currently executed program, and the larger the value, the larger the influence. The flow of program execution information can be analyzed using the saliency map. After the saliency map is generated, the key bytes are finally extracted by the formula below, where $\operatorname{top\_k}$ denotes the function that selects the $k$ largest elements of a vector and $\arg$ denotes the function that returns the indices of the selected elements:

$$K = \arg \operatorname{top\_k}\big(S(x)\big)$$

where $K$ is the set of important bytes in the input that affect the execution path of the target program, i.e., the key bytes.
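A sketch of this gradient computation for the PyTorch model sketched above (the helper names are illustrative; the 5% default for k follows the discussion below):

```python
import torch

def saliency_map(model: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
    """S(x)[n] = sum over all outputs F_i of dF_i(x)/dx_n, i.e. the column sums
    of the Jacobian. x must be preprocessed the same way as during training."""
    jac = torch.autograd.functional.jacobian(model, x)  # shape [num_blocks, m]
    return jac.sum(dim=0)                               # shape [m]

def key_bytes(model: torch.nn.Module, x: torch.Tensor, ratio: float = 0.05) -> torch.Tensor:
    """K = arg top_k(S(x)); by default k is about 5% of the input length."""
    s = saliency_map(model, x)
    k = max(1, int(ratio * x.numel()))
    return torch.topk(s, k).indices
```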
Most of the behavior of parser programs is determined by bytes at specified input positions, namely fixed positions in the file format header, rather than by the file contents. After analyzing several file formats, the total number of key bytes of file-parsing programs is found to be between 250 and 500, accounting for about 5% of the total input bytes. In practice, a threshold of 5% can be selected to determine the number of key bytes to compute, and this value can be adjusted according to the actual situation.
Finally, it should be noted that: although the present invention has been described in detail with reference to the above embodiments, it should be understood by those skilled in the art that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.
Claims (10)
1. A method for extracting key bytes by predicting program behavior with machine learning, characterized by comprising the following steps:
Step 1: running the input seed file on a target program, generating a large test set X through fuzz-testing mutation algorithms, performing code instrumentation on the target program to obtain a program PI, and running the test set X on the program PI to obtain a test set Y;
Step 2: taking the test set X-Y data pairs as training data, and training the neural network model until the trained neural network model can predict the behavior of the target program; wherein the test set X is the input data of the neural network, the test set Y is the label data of the neural network, and the test set X-Y data pairs consist of the test set X and the test set Y;
Step 3: constructing a saliency map based on the trained neural network model, and extracting the key bytes from the saliency map; wherein a key byte is an input byte that affects the behavior of the target program.
2. The method of claim 1, wherein step 1 comprises:
Step 11: the fuzz test takes the provided seed file as input, performs a large number of mutation operations on the target program, and checks whether relevant results occur after running; wherein the relevant results include a crash of the target program and discovery of a new execution path;
step 12: and performing basic block-level code instrumentation on the target program to obtain a program PI, and running a test set X on the program PI to obtain an execution path of the target program, namely a test set Y.
3. The method according to claim 2, wherein in step 11, to ensure that the length of each test case in the test set X is unchanged, the three length-preserving fuzz-testing mutation algorithms bitflip, arithmetic, and interest are selected to generate the large test set X.
4. The method of claim 2, wherein step 12 comprises:
Step 121: defining an instrumentation function IFunc, wherein the function IFunc is inserted before each basic block, and when the basic block is reached during execution of the target program, the function IFunc is called first to output the number of the basic block and the function it belongs to;
step 122: acquiring a target program;
step 123: initializing the number value num of the basic block, wherein the number value starts from 1;
step 124: traversing a function F of the target program;
step 125: traversing each basic block in the function F;
step 126: calling the inserted function IFunc;
step 127: the number value num of the basic block is added with 1;
step 128: judging whether the traversal of the function F is finished, if not, executing the step 125, and if so, executing the step 129;
step 129: and executing the test set X by using the program PI, and outputting the position, the number and the file name of the executed basic block by using the program PI to obtain an execution path of the target program, namely the test set Y.
5. The method according to claim 1, characterized in that in step 2:
the neural network model is used for learning the data flow propagation process of different inputs in the target program by observing a large number of test set X-Y data pairs from the execution traces of the target program and simulating the processing logic of the target program; the neural network model takes the program input as the model input and predicts the execution path of the target program.
6. The method according to claim 5, characterized in that, given a test set X-Y data pair consisting of an input $x_i$ and the corresponding target program execution path $y_i$, the output of the neural network model is $\hat{y}_i$:

$$h = \mathrm{ReLU}(W_k x_i + b_k), \qquad \hat{y}_i = f(x_i) = \sigma(W \cdot h)$$

where $x_i$ represents the $i$-th data item in test set X, $y_i$ represents the $i$-th data item in test set Y, $h$ is the output vector of the hidden layer of the neural network, $\mathrm{ReLU}$ is the ReLU function, $W_k$ and $b_k$ are the trainable parameters of each layer, $k$ is the layer index, $f(x_i)$ denotes the neural network model evaluated on test data $x_i$, $W$ is the trainable weight parameter of the neural network model, and $\sigma$ is the sigmoid function.
7. The method of any one of claims 1-6, wherein the neural network model comprises an input layer, a hidden layer, and an output layer; the input layer is connected with the output layer through the hidden layer.
8. The method of claim 6, wherein step 3 comprises:
step 31: calculating partial derivatives of the execution path predicted by the trained neural network model relative to the test set X;
step 32: constructing a saliency map based on the partial derivatives;
step 33: key bytes are extracted from the saliency map.
9. The method according to claim 8, wherein said step 31 is specifically:
letting $F(x)$ denote the value of the output variable computed by the trained neural network model for an input $x$; the partial derivative of $F(x)$ with respect to the input $x$ is defined as follows:

$$\frac{\partial F(x)}{\partial x} = \left[\frac{\partial F_i(x)}{\partial x_n}\right]_{i,n}$$

where $x_n$ represents the $n$-th byte of the input $x$, and the partial derivatives form the Jacobian matrix of the neural network function.
10. The method according to claim 9, wherein the step 32 specifically comprises constructing the saliency map:

$$S(x)[n] = \sum_i \frac{\partial F_i(x)}{\partial x_n}$$

where $S(x)[n]$ is the sum, over all prediction outputs $F_i(x)$, of the derivatives with respect to the $n$-th byte of the input $x$; it represents the influence of the $n$-th byte on the behavior of the currently executed target program, and the larger the value, the larger the influence;
the step 33 specifically comprises extracting the key bytes:

$$K = \arg \operatorname{top\_k}\big(S(x)\big)$$

where $K$ is the set of important bytes in the input that affect the execution path of the target program, i.e., the key bytes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310195368.7A | 2023-03-03 | 2023-03-03 | Key byte extraction method for predicting program behavior based on machine learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115878498A | 2023-03-31 |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103440201A (en) * | 2013-09-05 | 2013-12-11 | Beijing University of Posts and Telecommunications | Dynamic taint analysis device and application thereof to document format reverse analysis |
CN112463638A (en) * | 2020-12-11 | 2021-03-09 | Tsinghua Shenzhen International Graduate School | Fuzz testing method based on a neural network and computer-readable storage medium |
Non-Patent Citations (1)
Title |
---|
DONGDONG SHE et al.: "Neutaint: Efficient Dynamic Taint Analysis with Neural Networks", 2020 IEEE Symposium on Security and Privacy (SP) *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116775127A (en) * | 2023-05-25 | 2023-09-19 | Harbin Institute of Technology | Static symbolic execution instrumentation method based on the RetroWrite framework |
CN116775127B (en) * | 2023-05-25 | 2024-05-28 | Harbin Institute of Technology | Static symbolic execution instrumentation method based on the RetroWrite framework |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112733137B (en) | Binary code similarity analysis method for vulnerability detection | |
CN111125716B (en) | Method and device for detecting Ethernet intelligent contract vulnerability | |
US7340475B2 (en) | Evaluating dynamic expressions in a modeling application | |
CN109977682A (en) | A kind of block chain intelligence contract leak detection method and device based on deep learning | |
CN105808438B (en) | A kind of Reuse of Test Cases method based on function call path | |
CN114297654A (en) | Intelligent contract vulnerability detection method and system for source code hierarchy | |
CN110162972B (en) | UAF vulnerability detection method based on statement joint coding deep neural network | |
CN110096439A (en) | A kind of method for generating test case towards solidity language | |
CN112364352A (en) | Interpretable software vulnerability detection and recommendation method and system | |
CN115455382A (en) | Semantic comparison method and device for binary function codes | |
CN117454387A (en) | Vulnerability code detection method based on multidimensional feature extraction | |
CN115878498A (en) | Key byte extraction method for predicting program behavior based on machine learning | |
CN113127933A (en) | Intelligent contract Pompe fraudster detection method and system based on graph matching network | |
CN116150757A (en) | Intelligent contract unknown vulnerability detection method based on CNN-LSTM multi-classification model | |
CN110955892B (en) | Hardware Trojan horse detection method based on machine learning and circuit behavior level characteristics | |
CN116340952A (en) | Intelligent contract vulnerability detection method based on operation code program dependency graph | |
CN115033895A (en) | Binary program supply chain safety detection method and device | |
CN117573142B (en) | JAVA code anti-obfuscator based on simulation execution | |
CN116702157B (en) | Intelligent contract vulnerability detection method based on neural network | |
CN110162472A (en) | A kind of method for generating test case based on fuzzing test | |
CN117725592A (en) | Intelligent contract vulnerability detection method based on directed graph annotation network | |
CN113886832A (en) | Intelligent contract vulnerability detection method, system, computer equipment and storage medium | |
CN117591913A (en) | Statement level software defect prediction method based on improved R-transducer | |
CN116467720A (en) | Intelligent contract vulnerability detection method based on graph neural network and electronic equipment | |
CN116578336A (en) | Software clone detection method based on plagiarism-detector countermeasure |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20230331 |