CN115617410B - Drive interface identification method, device, equipment and storage medium - Google Patents

Drive interface identification method, device, equipment and storage medium

Info

Publication number
CN115617410B
Authority
CN
China
Prior art keywords
calling
program
function
calling program
stain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211356535.3A
Other languages
Chinese (zh)
Other versions
CN115617410A (en)
Inventor
张超
殷婷婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN202211356535.3A
Publication of CN115617410A
Application granted
Publication of CN115617410B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/4401 Bootstrapping
    • G06F9/4411 Configuring for operating with peripheral devices; Loading of device drivers

Abstract

The application provides a driver interface identification method, device, equipment and storage medium. The method comprises: acquiring a user-mode calling program of a driver; extracting, from the user-mode calling program, the code of the calling program, external function information of the calling program, and the parameter list of the calling program's driver interface calling function; performing taint analysis on the code of the calling program according to the external function information and the parameter list, to obtain driver interface call parameters carrying taint labels; and parsing the taint labels carried by those call parameters to obtain the format of the driver interface. The method improves the accuracy of driver interface identification.

Description

Drive interface identification method, device, equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular to a driver interface identification method, device, equipment and storage medium.
Background
An operating system is a computer program that manages the software and hardware resources of a computer. It improves computing efficiency and device security and is an indispensable part of information-age infrastructure. The kernel and the drivers running in kernel mode are basic components of an operating system; they directly manage and schedule the computer's hardware resources and support upper-layer software. Kernel programs run at a higher privilege level than other applications, so kernel vulnerabilities often cause serious harm. Vulnerability discovery in kernels and drivers has therefore long been an important area of cyberspace security. Before automated vulnerability discovery techniques can be applied to the kernel and drivers, the driver interface format must first be identified and extracted. Because interfaces are numerous and continuously updated, this identification needs to be automated. Existing automated driver interface identification tools can generally handle only open-source drivers. Closed-source drivers are now widely used and affect many aspects of social life, and their security urgently needs to be analyzed, so an automated interface analysis tool for closed-source drivers is urgently needed.
In the prior art, closed-source driver interfaces are analyzed by decompiling the driver. However, compilation causes the program to lose part of the interface's format and semantic information, so the prior art tries to recover this information by analyzing external resources such as driver interaction flows and header files.
However, these external resources are often difficult to obtain, which can leave the driver interface information incomplete and, in turn, make driver interface identification inaccurate.
Disclosure of Invention
The application provides a driver interface identification method, device, equipment and storage medium, which are used to solve the problem of inaccurate driver interface identification.
In a first aspect, the present application provides a driver interface identification method, including:
acquiring a user-mode calling program of a driver;
extracting, from the user-mode calling program, the code of the calling program, external function information of the calling program, and the parameter list of the calling program's driver interface calling function;
performing taint analysis on the code of the calling program according to the external function information of the calling program and the parameter list of its driver interface calling function, to obtain driver interface call parameters carrying taint labels;
and parsing the taint labels carried by the driver interface call parameters to obtain the format of the driver interface.
In a second aspect, the present application provides a driver interface identification device, comprising:
an acquisition module, used for acquiring a user-mode calling program of a driver;
a computing module, used for extracting, from the user-mode calling program, the code of the calling program, external function information of the calling program, and the parameter list of the calling program's driver interface calling function;
the computing module being further used for performing taint analysis on the code of the calling program according to the external function information of the calling program and the parameter list of its driver interface calling function, to obtain driver interface call parameters carrying taint labels;
and a determining module, used for parsing the taint labels carried by the driver interface call parameters to obtain the format of the driver interface.
In a third aspect, the present application provides driver interface identification equipment, comprising:
a memory, a processor, and a communication interface;
the memory is used for storing executable instructions of the processor;
wherein the processor is configured to perform the driver interface identification method of the first aspect by executing the executable instructions.
In a fourth aspect, the present application provides a readable storage medium storing a computer program,
wherein the computer program, when executed by a processor, implements the driver interface identification method of the first aspect.
With the driver interface identification method, device, equipment and storage medium provided by the application, taint analysis is performed on the code of the calling program according to the external function information of the calling program and the parameter list of its driver interface calling function, and the format of the driver interface is obtained from the result, which improves the accuracy of driver interface identification.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic flow chart of a driver interface identification method according to an embodiment of the present application;
Fig. 2 is a schematic flow chart of performing taint analysis on the code of the calling program, according to the external function information of the calling program and the parameter list of its driver interface calling function, to obtain driver interface call parameters carrying taint labels, according to an embodiment of the present application;
Fig. 3 is a schematic flow chart of marking taint sources with taint labels according to an embodiment of the present application;
Fig. 4 is an example of the memory initialization state in the recommended taint label marking process under the ARM architecture according to an embodiment of the present application;
Fig. 5 is a diagram of the return value of the general function model and the memory block it points to according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a driver interface identification device according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of driver interface identification equipment according to an embodiment of the present application.
Specific embodiments of the present application have been shown by way of the above drawings and will be described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but rather to illustrate the inventive concepts to those skilled in the art by reference to the specific embodiments.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.
In the prior art, driver interfaces are analyzed by decompiling the driver. However, compilation causes the program to lose part of the interface's format and semantic information, so the prior art tries to recover this information by analyzing external resources such as driver interaction flows and header files.
In the driver interface identification method provided by the application, taint analysis is performed on the code of the calling program according to the external function information of the calling program and the parameter list of its driver interface calling function, and the format of the driver interface is obtained from the result. Taint analysis makes driver interface format identification more practical, and because the user-mode calling program of a driver contains richer driver interface information than the driver itself, the identification is more accurate.
The following describes the technical scheme of the present application and how the technical scheme of the present application solves the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a driver interface identification method according to a first embodiment of the present application.
As shown in Fig. 1, the driver interface identification method of this embodiment may include the following steps:
Step S101, acquiring a user-mode calling program of a driver.
Specifically, the driver is a device driver, a special program that enables a computer program and a device to communicate with each other. User-mode calling programs (calling programs for short) exist widely in various operating systems. An operating system generally provides calling programs for a driver so that users do not interact with the driver directly, since direct interaction could affect system stability. Compared with the driver itself, the user-mode calling programs of a driver often contain richer interface information, and they call the driver interface in a standard way, so the format of the driver interface can be identified by analyzing the features of the driver interface parameters that the calling programs construct. Such features include, for example, the type and semantic information of the driver interface call parameters. The format of a driver interface is the set of features of all parameters the interface requires.
Step S102, extracting, from the user-mode calling program, the code of the calling program, external function information of the calling program, and the parameter list of the calling program's driver interface calling function.
Specifically, some key information may be extracted from the user-mode calling program to aid identification of the driver interface format. The key information may include, but is not limited to, the external function information of the calling program and the parameter list of the calling program's driver interface calling function. The external function information may include the external function symbols and the address at which the external function corresponding to each symbol is located.
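By way of non-limiting illustration, the key information of step S102 could be held in a container such as the following minimal sketch. The class names `ExternalFunction` and `CallerInfo`, the field names, and the sample symbol and address are hypothetical and do not come from the application.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class ExternalFunction:
    symbol: str   # external function symbol, e.g. "malloc"
    address: int  # address at which the external function is located


@dataclass
class CallerInfo:
    code: bytes  # code of the calling program
    external_functions: List[ExternalFunction] = field(default_factory=list)
    # Parameter list of the driver interface calling function.
    call_parameters: List[str] = field(default_factory=list)

    def symbol_at(self, address: int) -> Optional[str]:
        """Return the external function symbol located at `address`, if any."""
        for function in self.external_functions:
            if function.address == address:
                return function.symbol
        return None


# Hypothetical sample values, for illustration only.
info = CallerInfo(
    code=b"\x00\x00\x00\x00",
    external_functions=[ExternalFunction("malloc", 0x1000)],
    call_parameters=["this", "arg1"],
)
```

Looking up a call target by address, as `symbol_at` does, is one way the later analysis could tell that a call goes to a known external function such as malloc().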
Step S103, performing taint analysis on the code of the calling program according to the external function information of the calling program and the parameter list of its driver interface calling function, to obtain driver interface call parameters carrying taint labels.
Taint analysis is a program analysis technique that analyzes the relationships between program data by attaching taint labels to data of interest and observing how that data flows through the program.
The taint analysis process includes: determining the taint sources of the analysis; marking the taint sources with taint labels; and propagating the taint labels to the taint sink. In this embodiment, taint analysis is performed on the code of the calling program according to the external function information of the calling program and the parameter list of its driver interface calling function. A taint source is generally a variable or parameter that effectively reflects driver interface information. Taint sources are marked with taint labels, and as the analysis proceeds the labels propagate to the taint sink, namely the driver interface call parameters, yielding driver interface call parameters that carry taint labels. The taint sink is the target object of the taint analysis.
Step S104, parsing the taint labels carried by the driver interface call parameters to obtain the format of the driver interface.
The format of the driver interface is the set of features of all parameters the interface requires. Because the calling program calls the driver interface in a standard way, the format can be identified by analyzing the features of the calling function's parameters. Among these, the parameters most important for identifying the driver interface format are those passed from the calling program to the driver interface, i.e. the driver interface call parameters. The type and semantic information of the driver interface call parameters can therefore be identified by parsing the taint labels they carry, from which the format of the driver interface is obtained. Note that a parameter passed in a driver call may be a pointer to a complex structure; in that case the scheme parses the memory pointed to by the pointer field by field and records the type and value information of each field. For example, a parameter created by the memory allocation function malloc() is of pointer type; a parameter allocated on the stack (a local variable) is of constant or array type; and a parameter that comes from the return value of another interface call can be regarded as a resource-type variable passed between interfaces. An example program for taint analysis is as follows:
    caller_function(this, string XX, …) {
        input_0 = cfarrayCreate(XX, …);
        input_1 = kext_index1();
        input_2 = global_var;
        driver_input(input_0, input_1, input_2);
    }
Here XX denotes a taint label. First, caller_function(this, string XX, …) represents adding a taint label to a taint source. Next, input_0 = cfarrayCreate(XX, …); input_1 = kext_index1(); input_2 = global_var; represent the propagation of taint as the program executes. Finally, driver_input(input_0, input_1, input_2); is the driver interface call, and the parameter features required by the driver interface driver_input can be obtained by parsing the taint labels carried by the parameters input_0, input_1 and input_2.
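By way of non-limiting illustration, the flow of labels in the example above can be sketched as follows. The statement encoding, the `propagate` helper and the label names are hypothetical. In particular, treating kext_index1() as the return value of another driver interface and global_var as a global-variable source is an assumption made for the sketch, not something the application states.

```python
def propagate(statements, initial_labels):
    """Copy taint labels from operands to destinations, statement by statement."""
    labels = {name: set(tags) for name, tags in initial_labels.items()}
    for destination, operands in statements:
        merged = set()
        for operand in operands:
            merged |= labels.get(operand, set())
        labels[destination] = merged
    return labels


# Taint sources and the labels attached to them (assumed label names).
initial_labels = {
    "XX": {"function_parameter"},
    "kext_index1()": {"driver_interface_return"},
    "global_var": {"global_variable"},
}

# Each statement is (destination, operands), mirroring the three assignments.
statements = [
    ("input_0", ["XX"]),             # input_0 = cfarrayCreate(XX, ...)
    ("input_1", ["kext_index1()"]),  # input_1 = kext_index1()
    ("input_2", ["global_var"]),     # input_2 = global_var
]

labels = propagate(statements, initial_labels)

# The taint sink: the arguments of driver_input(input_0, input_1, input_2).
sink_arguments = ["input_0", "input_1", "input_2"]
report = {name: sorted(labels[name]) for name in sink_arguments}
```

Reading `report` then gives, for each argument of the driver interface call, the taint source it derives from, which is the information the format analysis parses.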
In the driver interface identification method of this embodiment, the user-mode calling program of the driver is first acquired, taint analysis is then performed on the code of the calling program according to the external function information and the parameter list of the driver interface calling function, and the result is parsed to obtain the format of the driver interface. This embodiment thus identifies the format of the driver interface by performing taint analysis on the driver's user-mode calling program. Taint analysis makes driver interface format identification more practical, and because the user-mode calling program of a driver contains richer driver interface information than the driver itself, the driver interface can be identified more accurately.
Fig. 2 is a schematic flow chart, according to a second embodiment of the present application, of performing taint analysis on the code of the calling program according to the external function information of the calling program and the parameter list of its driver interface calling function, to obtain driver interface call parameters carrying taint labels. This embodiment describes the specific flow of the taint analysis on the basis of the embodiment of Fig. 1.
As shown in Fig. 2, performing taint analysis on the code of the calling program according to the external function information of the calling program and the parameter list of its driver interface calling function, to obtain driver interface call parameters carrying taint labels, may include the following steps:
Step S201, determining the taint sources of the taint analysis;
Specifically, a taint source is a variable or parameter that effectively reflects driver interface information. A set of taint sources may be defined for the calling program code extracted from the user-mode calling program in step S102, so that driver interface call parameters can be marked accurately. The taint sources in the application fall into two categories: type-related taint sources and value-related taint sources.
Type-related taint sources include: return values of special-type variable creation functions, parameter variables of the driver calling function, and function stack pointers. A typical special-type variable creation function is malloc(); the variables it creates are all of pointer type. The calling function's parameters may be provided by exported function symbols or by a reverse-engineering tool, and together they make up the parameter list of the calling function. The function stack is the memory region allocated to a function in a program; it can store local variables, and the function stack pointer points to it.
Value-related taint sources are mainly resource-type variables, which include the return value variables of driver interfaces and global variables or structure member variables. A driver interface return value variable is one passed between different driver interface calls, where interface A takes the return value of interface B as input; for example, close() takes as its argument a valid file descriptor created by open(). Because such variables have different legal values in different contexts, identifying them requires setting driver interface return values as taint sources. In addition, for ease of development, resource-type variables are often stored in global variables or structure member variables, so these are also treated as taint sources for resource-type variables.
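By way of non-limiting illustration, the two categories of taint sources described above can be written down as a small enumeration. The names `SourceKind`, `TYPE_RELATED` and `category` are hypothetical and are used only for this sketch.

```python
from enum import Enum


class SourceKind(Enum):
    CREATION_RETURN = "return value of a special-type variable creation function"
    CALL_PARAMETER = "parameter variable of the driver calling function"
    STACK_POINTER = "function stack pointer"
    INTERFACE_RETURN = "return value variable of a driver interface"
    GLOBAL_OR_MEMBER = "global variable or structure member variable"


# Type-related sources reveal the type of a call parameter; the remaining
# sources are value-related and identify resource-type variables.
TYPE_RELATED = {
    SourceKind.CREATION_RETURN,
    SourceKind.CALL_PARAMETER,
    SourceKind.STACK_POINTER,
}


def category(kind: SourceKind) -> str:
    """Classify a taint source as type-related or value-related."""
    return "type-related" if kind in TYPE_RELATED else "value-related"
```

Under this sketch, the return value of malloc() would be a `CREATION_RETURN` source, while the file descriptor returned by open() and later passed to close() would be an `INTERFACE_RETURN` source.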
Step S202, marking the taint sources with taint labels;
Specifically, this embodiment identifies the format of a closed-source driver interface by analyzing the driver's user-mode calling program, that is, the calling program. Because analyzing the whole calling program incurs a large time overhead, the user-mode calling program may be preprocessed before the driver interface format analysis starts.
In particular, the program code fragments related to the driver interface calling function, i.e. the code fragments of the calling program, may be extracted from the code of the calling program. The format of the driver interface can then be identified by analyzing these fragments, without having to analyze the code of the entire calling program.
To extract the program code fragments related to the driver interface calling function from the code of the driver's user-mode calling program, the function that calls the driver interface, i.e. the driver interface calling function, may first be located using a general-purpose disassembly tool such as IDA (Interactive Disassembler). Taking the function entry as the starting point and the driver interface call instruction as the end point, one or more paths are found in the control flow graph of the driver interface calling function, and the basic blocks contained in each path are spliced in order to obtain a code fragment of the calling program. A path is an alternating sequence of nodes (basic blocks) and edges (jump relationships between basic blocks) in the control flow graph. A basic block is a maximal sequence of statements that are executed sequentially, with exactly one entry statement and one exit statement.
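By way of non-limiting illustration, the path search and splicing described above can be sketched with a depth-first search over a control flow graph given as an adjacency mapping. The block names, the instruction strings and the helpers `find_paths` and `splice` are hypothetical. A real implementation would obtain the graph from a disassembler rather than from a hand-written dictionary.

```python
def find_paths(cfg, entry, target):
    """Enumerate paths from `entry` to `target` that visit no block twice."""
    paths = []
    stack = [(entry, [entry])]
    while stack:
        block, path = stack.pop()
        if block == target:
            paths.append(path)
            continue
        for successor in cfg.get(block, []):
            if successor not in path:  # skip loops back into the current path
                stack.append((successor, path + [successor]))
    return paths


def splice(path, block_code):
    """Concatenate the instructions of the basic blocks along one path."""
    return [instruction for block in path for instruction in block_code[block]]


# Hypothetical control flow graph of a driver interface calling function.
cfg = {
    "entry": ["check", "fast"],
    "check": ["call"],
    "fast": ["call"],
    "call": ["exit"],
}
block_code = {
    "entry": ["push"],
    "check": ["cmp"],
    "fast": ["mov"],
    "call": ["bl driver_call"],
    "exit": ["ret"],
}

paths = sorted(find_paths(cfg, "entry", "call"))
fragments = [splice(path, block_code) for path in paths]
```

Each entry of `fragments` is one candidate code fragment ending at the driver interface call; blocks after the call, such as `exit` here, are never included.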
The basic blocks contained in different program paths may differ, so the extracted calling program code fragments may also differ. The driver interface format analysis results obtained from different code fragments can be checked against one another.
Second, the function that calls the current driver interface calling function, i.e. the caller of the calling function, may be located. The corresponding path is extracted from the caller's control flow graph, and the basic blocks contained in that path are spliced in order to obtain a new calling program code fragment. The original fragment and the new fragment can then be spliced together for inter-procedural analysis, which completes the calling program code fragment.
If analyzing the code fragment obtained in the first step already yields complete and sufficient driver interface format information, there is no need to go on to extract the code fragment of the calling function's caller. Otherwise, if complete driver interface format information cannot be obtained from that fragment, the code fragment of the calling function's caller is further extracted to complete the calling program code fragment and extract richer interface information.
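By way of non-limiting illustration, the decision to extend a fragment with its caller's fragment can be sketched as a loop. The function names, the `caller_of` mapping, the completeness check and the depth limit `max_depth` are all hypothetical; the application does not specify how completeness is tested or how far up the call chain the extraction goes.

```python
def collect_fragment(function, caller_of, extract, is_complete, max_depth=3):
    """Prepend caller fragments until the format information is complete."""
    fragment = list(extract(function))
    depth = 0
    while (not is_complete(fragment)
           and function in caller_of
           and depth < max_depth):
        function = caller_of[function]
        # The caller's code runs before the callee's, so it is spliced in front.
        fragment = list(extract(function)) + fragment
        depth += 1
    return fragment


# Hypothetical per-function fragments, each ending at the relevant call.
function_fragments = {
    "send_request": ["bl driver_call"],
    "build_request": ["bl malloc", "bl send_request"],
    "main": ["bl build_request"],
}
caller_of = {"send_request": "build_request", "build_request": "main"}


def has_allocation(fragment):
    # Assumed completeness criterion: the allocation of the argument is visible.
    return "bl malloc" in fragment


result = collect_fragment(
    "send_request", caller_of, function_fragments.__getitem__, has_allocation
)
```

In this sample the first fragment does not show where the argument was created, so the caller's fragment is added once and the loop stops before reaching `main`.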
Specifically, marking a taint source during taint analysis means attaching a taint label to it. A taint label may include, but is not limited to: the taint source type, the linked_level marker, the position of a function parameter variable in the parameter list, and information about the special-type variable creation function. Users can adjust these as needed. Parsing the taint label makes it possible to distinguish the information of each taint source accurately, and the label is convenient to store and propagate. Common shadow stacks and metadata can be used to hold taint labels, but they incur high time and memory overhead.
Optionally, in this embodiment, the taint sources are marked with taint labels according to the external function information of the calling program and the parameter list of its driver interface calling function.
Marking taint sources with taint labels includes, but is not limited to, marking the parameter-passing registers and function stack pointer registers that hold taint sources, and the memory those registers point to. Taking the management of parameter registers and memory in 32-bit storage units as an example, the taint label can be stored in the high-order bits of each storage unit, with different bits marking different parts of the label. For example, the highest four bits may identify the type of taint source to which the variable belongs, and, if the variable comes from a special-type variable creation function, the value of the next four bits may mark which function created it.
Meanwhile, to avoid losing the taint label when a pointer is dereferenced, this embodiment assigns the same label to a pointer-type variable and to the memory it points to, and uses the linked_level marker, stored in other bits of the storage unit, to record the number of dereferences. For example, an element obtained from an array carrying taint label A still carries label A, and the memory that element points to also carries label A. To distinguish a pointer from the memory value it points to, linked_level records how many dereferences are needed to access the variable: for instance, the linked_level of the return value of malloc() can be marked as 0, and the linked_level of the memory that return value points to can be marked as 1. Taint labels are copied between variables as the simulated execution of the program proceeds.
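By way of non-limiting illustration, the relationship between a label and its linked_level can be sketched as follows. The `Tag` class and the `dereference` helper are hypothetical. The sketch keeps the label and the level as separate fields instead of packing them into the bits of a storage unit.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Tag:
    label: str             # the taint label, shared by pointer and pointee
    linked_level: int = 0  # number of dereferences needed to reach the value


def dereference(tag: Tag) -> Tag:
    """Keep the same label and record one more dereference."""
    return replace(tag, linked_level=tag.linked_level + 1)


pointer = Tag("A")              # e.g. the return value of malloc()
pointee = dereference(pointer)  # the memory that return value points to
```

Because the label survives the dereference, a value read through the pointer can still be traced back to the same taint source, while `linked_level` tells the pointer and the pointed-to value apart.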
In one possible taint label layout, the highest four bits of the storage unit mark the type of taint source to which the variable belongs, and the next four bits mark linked_level. If the variable comes from a special-type variable creation function, the value of the following eight bits marks which creation function produced it; if the variable comes from a function parameter, the following four bits mark the position of the parameter variable in the function's parameter list, so that labels correspond one-to-one with the parameter list. Users can adjust the layout as needed. The taint labels marking the taint source type may be: 0xF for the return value of a special-type variable creation function, 0xB for a function parameter variable, 0xA for a global variable or structure member variable, 0xA for a driver interface return value, and 0xC for a function stack pointer.
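By way of non-limiting illustration, the layout described above can be packed into and unpacked from a 32-bit storage unit as in the following sketch. The helper names `encode` and `decode`, the dictionary keys and the `detail` argument are hypothetical. Placing the eight-bit creation function field at bits 23 to 16 and the four-bit parameter position at bits 23 to 20 is one reading of the description, and the remaining low-order bits are simply left at zero here.

```python
# Source-type labels as listed in the description.
SOURCE_TYPE = {
    "creation_return": 0xF,
    "function_parameter": 0xB,
    "global_or_member": 0xA,
    "interface_return": 0xA,
    "stack_pointer": 0xC,
}


def encode(source, linked_level=0, detail=0):
    """Pack a taint label into the high-order bits of a 32-bit storage unit."""
    word = (SOURCE_TYPE[source] << 28) | ((linked_level & 0xF) << 24)
    if source == "creation_return":
        word |= (detail & 0xFF) << 16  # which creation function (assumed id)
    elif source == "function_parameter":
        word |= (detail & 0xF) << 20   # position in the parameter list
    return word


def decode(word):
    """Return (source type label, linked_level, detail) for a storage unit."""
    kind = (word >> 28) & 0xF
    linked_level = (word >> 24) & 0xF
    if kind == 0xF:
        detail = (word >> 16) & 0xFF
    elif kind == 0xB:
        detail = (word >> 20) & 0xF
    else:
        detail = None
    return kind, linked_level, detail


parameter_word = encode("function_parameter", linked_level=1, detail=2)
creation_word = encode("creation_return", linked_level=0, detail=3)
```

Keeping the whole label inside the storage unit avoids a separate shadow structure, at the cost of leaving fewer bits for the value itself.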
Step S203, propagating the taint labels to the taint sink to obtain driver interface call parameters carrying taint labels.
Specifically, while taint analysis is performed on the code of the calling program, the taint labels marked on the taint sources are propagated to the taint sink, namely the driver interface call parameters, so that driver interface call parameters carrying taint labels are obtained.
In this process of performing taint analysis on the code of the calling program, according to the external function information of the calling program and the parameter list of its driver interface calling function, to obtain driver interface call parameters carrying taint labels, the program code fragments related to the driver interface calling function are extracted first, and the format of the driver interface is identified by analyzing those fragments, which improves the efficiency of driver interface identification. In addition, the lightweight taint marking and propagation scheme chosen in this embodiment makes the embodiment more practical and more widely applicable.
Fig. 3 is a schematic flow chart of marking taint sources with taint labels according to a third embodiment of the present application. This embodiment describes in detail the specific flow of marking taint sources with taint labels on the basis of the embodiment of Fig. 2.
As shown in Fig. 3, marking taint sources with taint labels in this embodiment may include the following steps:
Step S301, marking the taint sources with taint labels before the simulated execution starts.
The taint analysis scheme provided in this embodiment is based on simulated execution of the calling program; to save time and cost, it may, for example, be implemented by simulating only the code fragments of the calling program. The user may select a simulation execution tool, such as Triton or Unicorn, as the simulation execution engine for the code fragments of the calling program.
Specifically, before the simulated execution starts, taint labels are added, i.e. assigned, to the taint sources held in the parameter registers, the function stack pointer register, and the memory that both point to. Before the simulated execution starts, the simulated memory state of the execution environment is initialized; the initialization mimics the real conditions under which the calling program's code fragment executes, and the code fragment related to the call is set as the target of the simulated execution. Fig. 4 is an example of the memory initialization state in the recommended taint label marking process under the ARM architecture. The first column represents adding taint labels to the taint sources in the parameter registers and the function stack pointer register: the first row of the first column represents a structure pointer, which in this embodiment is the this pointer of a C++ program; the second to eighth rows of the first column represent calling function parameter variables; and the ninth and tenth rows represent function stack pointers. The second column, the third column, and so on represent adding taint labels to structure member variables and to the taint sources in the memory pointed to by the parameter registers and the function stack pointer register.
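As a rough illustration of this initialization, the sketch below builds an initial taint state without any emulator. The AArch64 register names (x0 to x7 for the this pointer and parameters, x29 and sp for the stack pointers), the label layout, and the decision to give pointed-to memory the same label as its pointer are all assumptions of the sketch, not details fixed by the embodiment.

```python
TYPE_PARAM = 0xB          # function parameter variable
TYPE_STACK_POINTER = 0xC  # function stack pointer


def make_label(source_type, detail=0):
    """Hypothetical 16-bit label: type in the top 4 bits, detail in the low 8."""
    return (source_type << 12) | (detail & 0xFF)


def initial_taint_state(num_arg_regs=8, pointee_words=2):
    """Return (reg_taint, mem_taint) as they might look before emulation starts.

    reg_taint maps a register name to its label; mem_taint maps
    (register, byte offset) to the label of the memory that register points to.
    """
    reg_taint = {}
    mem_taint = {}
    # x0 stands in for the structure (this) pointer, x1..x7 for parameters.
    for index in range(num_arg_regs):
        reg = f"x{index}"
        reg_taint[reg] = make_label(TYPE_PARAM, index)
        for word in range(pointee_words):
            mem_taint[(reg, word * 8)] = make_label(TYPE_PARAM, index)
    # Two stack-related registers, mirroring the two stack pointer rows.
    for reg in ("x29", "sp"):
        reg_taint[reg] = make_label(TYPE_STACK_POINTER)
        for word in range(pointee_words):
            mem_taint[(reg, word * 8)] = make_label(TYPE_STACK_POINTER)
    return reg_taint, mem_taint


regs, mem = initial_taint_state()
print({name: hex(label) for name, label in regs.items()})
```

In a real setup these labels would live in the shadow state of the chosen simulation engine rather than in plain dictionaries.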
In this embodiment, the taint sources marked before the simulated execution starts mainly include: the calling function parameter variables among the type-related taint sources, the function stack pointers, and the global variables or structure member variables among the value-related taint sources. When a taint label is marked on a calling function parameter variable, the label records not only that the variable is a function parameter variable but also its position in the parameter list of the driver interface calling function of the calling program extracted in step S102, so that labels correspond one-to-one with the parameter list and the variable type can be identified.
Step S302: perform simulated execution of the code of the user-mode calling program and mark the taint sources with taint labels.
Specifically, after the taint sources have been marked with taint labels before the simulated execution in step S301, the code of the user-mode calling program can be executed in simulation, and during the simulated execution the taint sources in the parameter registers, the function stack pointer register, and the memory that both point to can be marked with taint labels. Meanwhile, the taint labels spread into the taint pool as the simulated execution proceeds, yielding the driver interface call parameters carrying taint labels.
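How labels spread into the taint pool as execution proceeds can be shown with a toy interpreter. The three-operation instruction set (mov, const, driver_call), the register names, and the function name propagate are hypothetical simplifications; a real engine such as Triton or Unicorn would work on actual machine instructions.

```python
def propagate(trace, reg_taint, arg_regs=("x0", "x1", "x2")):
    """Follow taint labels through a simplified instruction trace.

    Returns the labels found in the argument registers when the driver
    interface call is reached, or None if the trace never reaches it.
    """
    taint = dict(reg_taint)  # work on a copy of the initial state
    for op, *operands in trace:
        if op == "mov":
            dst, src = operands
            if src in taint:
                taint[dst] = taint[src]   # the label travels with the value
            else:
                taint.pop(dst, None)      # overwritten by untainted data
        elif op == "const":
            (dst,) = operands
            taint.pop(dst, None)          # a constant carries no label
        elif op == "driver_call":
            # The sink: stop and report what each argument carries.
            return [taint.get(reg) for reg in arg_regs]
    return None


trace = [
    ("mov", "x2", "x1"),   # third call argument comes from caller parameter 1
    ("const", "x1"),       # second call argument is a constant
    ("mov", "x0", "x5"),   # first call argument comes from caller parameter 5
    ("driver_call",),
]
print(propagate(trace, {"x1": 0xB001, "x5": 0xB005}))
```

The list returned at the sink is what the later parsing step would read in order to recover the interface format.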
First, whether an external function call occurs during the simulated execution is analyzed. Specifically, it is determined whether a function call occurs during the simulated execution of the calling program's code fragment. If no function call occurs, no taint source needs to be marked; if a function call occurs, the type of the external function is identified according to the external function information. In this embodiment, the types of external functions include one or more of the following: basic functions that operate on program memory, special type variable creation functions, and driver interface calling functions.
Second, simulated execution is performed according to the type of the external function. In particular, different types of external functions may be simulated differently. For example: if the external function is a basic function that operates on program memory, it can be simulated with the basic function model, and no taint label is added; if the external function is a special type variable creation function, it can be simulated with the general function model, and a taint label is added to the return value that the function produces in the simulated execution; if the external function is a driver interface calling function, a taint label is added directly to its return value; if the external function is any other function, no processing is done and no taint label is added.
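The four cases above amount to a dispatch on the external function type. A minimal sketch follows; the enum ExtKind, the function handle_external_call, the model names, and the 16-bit label layout are hypothetical names and assumptions introduced for illustration.

```python
from enum import Enum, auto


class ExtKind(Enum):
    BASIC_MEMORY = auto()      # basic function operating on program memory
    SPECIAL_CREATOR = auto()   # special type variable creation function
    DRIVER_CALL = auto()       # driver interface calling function
    OTHER = auto()             # anything else


def handle_external_call(kind, creator_id=0):
    """Return (model_to_run, label_for_return_value) for one external call.

    Either element may be None, meaning "no model" or "no taint label".
    """
    if kind is ExtKind.BASIC_MEMORY:
        return "basic_function_model", None
    if kind is ExtKind.SPECIAL_CREATOR:
        # 0xF marks a return value of a special type variable creation function.
        return "general_function_model", (0xF << 12) | (creator_id & 0xFF)
    if kind is ExtKind.DRIVER_CALL:
        # 0xA marks a driver interface return value; no model is run.
        return None, 0xA << 12
    return None, None


for kind in ExtKind:
    print(kind.name, handle_external_call(kind, creator_id=7))
```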
During the simulated execution of the calling program's code, when the external function is a basic function that operates on program memory or a special type variable creation function, an external function model is used to simulate it, because how these functions are simulated may affect taint propagation.
Specifically, the external function models provided in this embodiment are mainly a general function model and a basic function model. The general function model is used to simulate the behavior of special type variable creation functions. For such functions the main concern is the parameters and the return value, so the function model returns a pointer carrying a taint label that marks the variable type returned by the current function, and makes the pointer point to a memory block that stores the values, i.e. the taint labels, of the parameters the program passed to the current function. With this design, the present application can distinguish variables of nested types; for example, when the program performs newArray(str1, str2) and newArray(int1, int2), different taint marks are left. Fig. 5 shows the return value of the general function model and the layout of the memory block it points to: the first row of the table may represent the parameter values, i.e. the taint labels; the second row may represent the memory block size and the number of parameters; and the third row may represent memory carrying the label of the current function.
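The general function model can be sketched as follows. The SimHeap allocator, the block layout (argument labels, then the argument count, then the creator identifier), and the concrete label values are assumptions made for illustration and only loosely follow the layout of Fig. 5.

```python
TYPE_CREATOR_RETURN = 0xF  # return value of a special type variable creation function


class SimHeap:
    """Tiny bump allocator standing in for the emulator's memory."""

    def __init__(self, base=0x10000000):
        self._next = base
        self.blocks = {}

    def alloc(self, words):
        addr = self._next
        self.blocks[addr] = list(words)
        self._next += 8 * max(1, len(words))
        return addr


def general_function_model(heap, creator_id, arg_labels):
    """Model one creation call; return (pointer, label of the pointer).

    The block behind the pointer records the labels of the arguments, so two
    calls to the same creator with differently typed arguments stay distinct.
    """
    block = list(arg_labels) + [len(arg_labels), creator_id]
    pointer = heap.alloc(block)
    pointer_label = (TYPE_CREATOR_RETURN << 12) | (creator_id & 0xFF)
    return pointer, pointer_label


heap = SimHeap()
STR_LABEL, INT_LABEL = 0xF001, 0xF002   # hypothetical labels for str and int values
NEW_ARRAY = 7                           # hypothetical id for newArray
p_str, l_str = general_function_model(heap, NEW_ARRAY, [STR_LABEL, STR_LABEL])
p_int, l_int = general_function_model(heap, NEW_ARRAY, [INT_LABEL, INT_LABEL])
print(hex(l_str), heap.blocks[p_str], heap.blocks[p_int])
```

Both pointers carry the same label for newArray, but the blocks they point to differ, which is what allows the nested element type to be recovered.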
Specifically, the basic function model provided in this embodiment is used to simulate basic functions that modify program memory, i.e. basic functions that operate on program memory, for example the memcpy() function. Because the number of such functions is limited, each basic function can be modeled one by one in the conventional manner.
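As an example of such a one-by-one model, the sketch below models memcpy() so that it copies taint labels along with the data. The byte-granular dictionaries memory and shadow and the function name model_memcpy are hypothetical; they stand in for whatever memory and shadow state the chosen simulation engine provides.

```python
def model_memcpy(memory, shadow, dst, src, n):
    """Copy n bytes and their taint labels from src to dst; return dst.

    memory maps address -> byte value, shadow maps address -> taint label.
    Overlapping ranges are not handled, as with the real memcpy().
    """
    for offset in range(n):
        memory[dst + offset] = memory.get(src + offset, 0)
        if src + offset in shadow:
            shadow[dst + offset] = shadow[src + offset]
        else:
            # The destination byte is overwritten with untainted data.
            shadow.pop(dst + offset, None)
    return dst


memory = {0x1000: 0x41, 0x1001: 0x42}
shadow = {0x1000: 0xB001, 0x2001: 0xC000}   # 0x2001 holds a stale label
model_memcpy(memory, shadow, 0x2000, 0x1000, 2)
print(memory[0x2000], memory[0x2001], shadow)
```

Clearing the label of a destination byte whose source is untainted matters as much as copying labels; otherwise stale labels would survive the overwrite.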
Specifically, when the simulated execution reaches the driver interface call instruction described in step S202, the simulated execution terminates and the marking of taint sources with taint labels is complete. Meanwhile, the taint labels have spread into the taint pool as the simulated execution proceeded. At this point, the data stored in the parameter registers, the function stack pointer register, and the memory they point to are the driver interface parameters carrying taint labels. By parsing the taint labels carried in the driver interface call parameters stored in the parameter registers, the function stack memory, and the memory they point to, the driver interface analysis can be completed and the format of the driver interface identified.
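The final parsing step can be pictured as decoding each label found at the call site. In the sketch below, the 16-bit label layout, the function name describe_call_arguments, and the wording of the descriptions are assumptions; the type codes are the ones listed in the taint label example of this description.

```python
TYPE_NAMES = {
    0xF: "return value of a special type variable creation function",
    0xB: "function parameter of the caller",
    0xA: "global variable, structure member, or driver interface return value",
    0xC: "function stack pointer",
}


def describe_call_arguments(arg_labels):
    """Turn the labels found at the driver call into (position, description)."""
    described = []
    for position, label in enumerate(arg_labels):
        if label is None:
            described.append((position, "untainted (constant or unknown)"))
            continue
        source_type = (label >> 12) & 0xF   # top four bits: source type
        detail = label & 0xFF               # low eight bits: assumed detail field
        name = TYPE_NAMES.get(source_type, "unrecognized source")
        described.append((position, f"{name}, detail={detail}"))
    return described


for position, text in describe_call_arguments([0xB002, None, 0xC000]):
    print(position, text)
```

A description of this kind per argument is the raw material from which an interface format can then be assembled.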
The process of marking taint sources with taint labels provided in this embodiment is based on simulated execution of the code of the user-mode calling program, and marks the taint labels used in the taint analysis both before and during the simulated execution, so that the driver interface call parameters finally obtained carry more complete interface information, which improves the accuracy of driver interface identification.
After the format of the driver interface has been identified, a fuzzing tool that supports interface format definitions, for example Syzkaller, may be used to carry out fuzz testing, and an interface description file in the format required by the fuzzer is generated from the interface analysis result. Specifically, fuzz testing is currently one of the most effective automated techniques for discovering driver vulnerabilities. The driver interface identified by the present application can guide the fuzzer to generate legal test cases; the fuzzer feeds a large number of automatically generated test cases to the test target and attempts to trigger program vulnerabilities. The quality of the test cases directly determines the effectiveness of fuzz testing. For a test target with a complex interface, such as a system driver, test cases must conform to a certain interface format in order to pass format checks and reach deep program code. Therefore, before driver fuzz testing starts, the driver interface identification method can first be used to identify and extract the driver interface format and generate a description file that helps the fuzzer generate high-quality test cases; this kind of fuzz testing is called interface-sensitive fuzz testing. Because the number of interfaces is huge and they are continuously updated, the ability to identify driver interfaces automatically is needed. The driver interface identification method provided by the present application can automatically identify closed-source driver interfaces, and solves the problem that closed-source driver interfaces are identified inaccurately or cannot be identified at all because of their complex formats and extensive parameter checks.
Fig. 6 is a schematic structural diagram of a driving interface recognition device according to a fourth embodiment of the present application.
As shown in Fig. 6, the driver interface identification device 60 provided in this embodiment may include: an acquisition module 61, a calculation module 62, and a determination module 63.
The acquisition module 61 is configured to obtain the user-mode calling program of the driver.
The calculation module 62 is configured to extract, from the user-mode calling program, the code of the calling program, the external function information of the calling program, and the parameter list of the driver interface calling function of the calling program.
The calculation module 62 is further configured to perform taint analysis on the code of the calling program according to the address where the external function of the calling program is located and the parameter list of the driver interface calling function of the calling program, so as to obtain the driver interface call parameters carrying taint labels.
The determination module 63 is configured to parse the taint labels carried by the driver interface call parameters to obtain the format of the driver interface.
The device provided in this embodiment may be used to implement the technical solutions of the first to fourth embodiments of the foregoing method; its implementation principle and technical effects are similar and are not repeated here.
Fig. 7 is a schematic structural diagram of a driving interface recognition apparatus according to a fifth embodiment of the present application.
As shown in fig. 7, the driving interface recognition apparatus 70 provided in the present embodiment may include:
memory 71, processor 72, communication interface 73;
memory 71 is used to store executable instructions of processor 72;
wherein the processor is configured to perform the drive interface identification method of any one of embodiments one to four via execution of executable instructions.
An embodiment of the present application also provides a readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the drive interface identification method of any one of the above method embodiments one to four.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (12)

1. A drive interface identification method, comprising:
acquiring a user state calling program of a driver;
extracting codes of the calling program, external function information of the calling program and a parameter list of a calling function of a driving interface of the calling program from the user state calling program;
according to the external function information of the calling program and a parameter list of a calling function of a driving interface of the calling program, carrying out taint analysis on codes of the calling program to obtain driving interface calling parameters carrying taint labels;
analyzing the taint tag carried by the drive interface calling parameter carrying the taint tag, identifying the type and semantic information of the drive interface calling parameter from the taint tag, and analyzing to obtain the format of the drive interface.
2. The method of claim 1, wherein after extracting the code of the calling program from the user state calling program, further comprising:
program code segments associated with the driver interface calling function are extracted from the code of the calling program.
3. The method according to claim 2, wherein the performing the stain analysis on the code of the calling program according to the external function information of the calling program and the parameter list of the calling function of the driving interface of the calling program to obtain the driving interface calling parameter carrying the stain tag includes:
determining a stain source for the stain analysis;
marking the stain source with the stain label;
and transmitting the taint label to a taint pool to obtain a driving interface calling parameter carrying the taint label.
4. The method of claim 3, wherein the marking the stain source with the stain label comprises:
and marking the dirty label on a parameter transmission register, a function stack pointer register, and a memory space pointed by the parameter transmission register and the function stack pointer register, which are used for storing the dirty point source.
5. The method of claim 4, wherein the marking the dirty label for the parameter transfer register, the function stack pointer register, and the memory space pointed to by the parameter transfer register and the function stack pointer register of the stored dirty point source comprises:
and performing simulation execution on the codes of the user mode calling program, and marking the stain label on the stain source.
6. The method of claim 5, wherein the simulating the code of the user mode calling program further comprises, prior to marking the stain source with the stain tag:
marking the stain source with the stain tag prior to the simulation being performed.
7. The method of claim 6, wherein propagating the tainted label into a tainted pool results in a drive interface call parameter carrying a tainted label, comprising:
and the taint label is diffused into the taint pool along with the simulation execution, so that the driving interface calling parameters carrying the taint label are obtained.
8. The method of claim 5, wherein the simulating the code of the user mode calling program comprises:
analyzing whether an external function call occurs in the simulation executing process;
identifying the type of the external function according to the external function information;
and performing simulation execution according to the type of the external function.
9. The method of claim 8, wherein the type of external function comprises one or more of: basic functions of an operation program memory, special type variable creation functions and driving interface calling functions.
10. A drive interface recognition apparatus, comprising:
the acquisition module is used for acquiring a user state calling program of the driver;
the computing module is used for extracting codes of the calling program, external function information of the calling program and a parameter list of a calling function of a driving interface of the calling program from the user state calling program;
the computing module is also used for carrying out stain analysis on codes of the calling program according to the external function information of the calling program and a parameter list of a calling function of a driving interface of the calling program to obtain driving interface calling parameters carrying a stain label;
the determining module is used for analyzing the taint tag carried by the drive interface calling parameter carrying the taint tag, identifying the type and semantic information of the drive interface calling parameter from the taint tag, and analyzing to obtain the format of the drive interface.
11. A drive interface identification device, comprising:
a memory, a processor, a communication interface;
the memory is used for storing executable instructions of the processor;
wherein the processor is configured to perform the drive interface identification method of any one of claims 1 to 9 via execution of the executable instructions.
12. A readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the drive interface identification method of any of claims 1 to 9.
CN202211356535.3A 2022-11-01 2022-11-01 Drive interface identification method, device, equipment and storage medium Active CN115617410B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211356535.3A CN115617410B (en) 2022-11-01 2022-11-01 Drive interface identification method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115617410A CN115617410A (en) 2023-01-17
CN115617410B true CN115617410B (en) 2023-09-19

Family

ID=84877336

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211356535.3A Active CN115617410B (en) 2022-11-01 2022-11-01 Drive interface identification method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115617410B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116414722B (en) * 2023-06-07 2023-10-20 清华大学 Fuzzy test processing method and device, fuzzy test system and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521543A (en) * 2011-12-23 2012-06-27 中国人民解放军国防科学技术大学 Method for information semantic analysis based on dynamic taint analysis
CN103440201A (en) * 2013-09-05 2013-12-11 北京邮电大学 Dynamic taint analysis device and application thereof to document format reverse analysis
CN103995782A (en) * 2014-06-17 2014-08-20 电子科技大学 Taint analyzing method based on taint invariable set
CN109583200A (en) * 2017-09-28 2019-04-05 中国科学院软件研究所 A kind of program exception analysis method based on dynamic tainting
CN112925524A (en) * 2021-03-05 2021-06-08 清华大学 Method and device for detecting unsafe direct memory access in driver

Also Published As

Publication number Publication date
CN115617410A (en) 2023-01-17

Similar Documents

Publication Publication Date Title
CN108614960B (en) JavaScript virtualization protection method based on front-end byte code technology
US9824214B2 (en) High performance software vulnerabilities detection system and methods
US20160371494A1 (en) Software Vulnerabilities Detection System and Methods
Filaretti et al. An executable formal semantics of PHP
US20060253739A1 (en) Method and apparatus for performing unit testing of software modules with use of directed automated random testing
US9274930B2 (en) Debugging system using static analysis
US10599852B2 (en) High performance software vulnerabilities detection system and methods
CN110941552B (en) Memory analysis method and device based on dynamic taint analysis
EP0523232B1 (en) Method for testing and debugging computer programs
CN115617410B (en) Drive interface identification method, device, equipment and storage medium
CN104407968B (en) A kind of method that the code command longest run time is calculated by static analysis
CN112131120B (en) Source code defect detection method and device
CN115022026A (en) Block chain intelligent contract threat detection device and method
Singh et al. Parallel chopped symbolic execution
Ren et al. A dynamic taint analysis framework based on entity equipment
He et al. Eunomia: enabling user-specified fine-grained search in symbolically executing WebAssembly binaries
Casinghino et al. Using binary analysis frameworks: The case for BAP and angr
US11740875B2 (en) Type inference in dynamic languages
Stallenberg et al. Guess what: Test case generation for Javascript with unsupervised probabilistic type inference
CN110221973A (en) Targeting formula parallel symbol towards c program defects detection executes method
CN115629762A (en) JSON data processing method and device, electronic equipment and storage medium
WO2021104027A1 (en) Code performance testing method, apparatus and device, and storage medium
Scherer et al. I/o interaction analysis of binary code
Tai Automated test sequence generation using sequencing constraints for concurrent programs
KR100916301B1 (en) Device and Method for Executing Kernel API Interactively

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant