CN112487438A - Heap object Use-After-Free vulnerability detection method based on identifier consistency - Google Patents

Info

Publication number
CN112487438A
Authority
CN
China
Prior art keywords
program
pointer
heap
heap object
identifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011453648.6A
Other languages
Chinese (zh)
Other versions
CN112487438B (en)
Inventor
宋巍
桂滨法
熊海龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology
Priority to CN202011453648.6A
Publication of CN112487438A
Application granted
Publication of CN112487438B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57: Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577: Assessing vulnerabilities and evaluating computer system security

Abstract

The invention discloses a heap object Use-After-Free vulnerability detection method based on identifier consistency, which takes the source code of a C/C++ program as input and outputs the detected heap object Use-After-Free vulnerabilities. The method first performs static analysis on the input program to find the code locations of heap object allocation, pointer propagation and pointer dereferencing. It then instruments the program at these locations so that each allocated heap object and all pointers to that object are assigned the same unique identifier. Finally, it executes the instrumented program; during execution, the instrumentation code compares the identifier of a pointer with the identifier of the object the pointer actually points to, and thereby determines whether a vulnerability exists. The method is effective and efficient, and can detect heap object Use-After-Free vulnerabilities with low performance overhead and low memory overhead.

Description

Heap object Use-After-Free vulnerability detection method based on identifier consistency
Technical Field
The invention belongs to the field of program analysis and testing, and in particular relates to a heap object Use-After-Free vulnerability detection method based on identifier consistency.
Background
Low-level languages such as C and C++ give developers direct control over heap memory: heap objects can be allocated and released flexibly. Owing to this flexibility and efficiency, a large number of widely used system programs, such as browsers, databases and servers, are developed in C and C++. However, as programs grow in scale and become more modular, it is difficult for developers to always allocate and release heap objects correctly. Programs written in C and C++ are therefore prone to temporal memory errors such as Use-After-Free vulnerabilities, which have become a significant source of insecurity in modern software systems. One study reported that the number of registered Use-After-Free vulnerabilities rose year by year from 2009 to 2019, and that more than 80% of them were rated high risk or critical. Developers should therefore pay attention to heap object Use-After-Free vulnerabilities in C/C++ programs.
Compared with other memory errors, heap object Use-After-Free vulnerabilities in C/C++ programs are difficult to detect manually or by static analysis, for the following reasons. First, pointers can be aliased, and it is difficult to infer all pointer aliases when they are spread across many data structures. Second, it is challenging to determine which memory object a pointer is supposed to point to. Last but not least, the path explosion caused by growing program size makes inter-procedural analysis very difficult.
Although many research works have proposed dynamic detection methods for heap object Use-After-Free vulnerabilities in C/C++ programs, most of them are based on object location. They use shadow memory to record the allocated/released status of heap objects, but do not distinguish between different heap objects that are allocated at the same heap address one after another as the program executes. As a result, they can only detect a Use-After-Free vulnerability that occurs while the freed heap memory has not yet been reallocated. Once the heap memory that held a released heap object is reallocated to store another heap object, they cannot detect a heap object Use-After-Free vulnerability that occurs on the reallocated memory. Prediction-based dynamic detection methods, in turn, are only suitable for detecting concurrent Use-After-Free vulnerabilities in multi-threaded programs; they cannot detect Use-After-Free vulnerabilities in sequential programs or within a single thread of a multi-threaded program. A dynamic method that can effectively detect heap object Use-After-Free vulnerabilities in C/C++ programs without being affected by memory reuse is therefore still lacking.
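The limitation of location-based detection can be illustrated with a small model. The following is a minimal sketch, not the patented implementation: the fixed-slot "heap", the `Ptr` structure and all function names are hypothetical, chosen so that memory reuse is deterministic and no real freed memory is touched. It contrasts a status check (is this location currently allocated?) with an identifier check (does the pointer's identifier match the identifier of the object now stored there?).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (all names hypothetical): the "heap" is a handful of slots,
 * so a freed slot is reused by the very next allocation. */
enum { SLOT_COUNT = 4 };

typedef struct {
    bool allocated;      /* what a location-based shadow state records */
    unsigned long id;    /* what an identifier-based scheme records; 0 = invalid */
} Slot;

typedef struct {
    int slot;            /* the "address" the pointer refers to */
    unsigned long id;    /* identifier inherited from the object at allocation */
} Ptr;

static Slot heap_slots[SLOT_COUNT];
static unsigned long next_slot_id = 1;

/* First-fit allocation: hands out the lowest free slot with a fresh identifier. */
static Ptr toy_alloc(void) {
    for (int i = 0; i < SLOT_COUNT; i++) {
        if (!heap_slots[i].allocated) {
            heap_slots[i].allocated = true;
            heap_slots[i].id = next_slot_id++;
            return (Ptr){ i, heap_slots[i].id };
        }
    }
    return (Ptr){ -1, 0 };   /* out of slots */
}

/* Release: marks the slot free and invalidates the object's identifier. */
static void toy_free(Ptr p) {
    if (p.slot < 0) return;
    heap_slots[p.slot].allocated = false;
    heap_slots[p.slot].id = 0;
}

/* Location-based check: only asks whether the location is allocated. */
static bool location_check_passes(Ptr p) {
    return p.slot >= 0 && heap_slots[p.slot].allocated;
}

/* Identifier-based check: the pointer's identifier must match the
 * identifier of whatever object currently occupies the location. */
static bool identifier_check_passes(Ptr p) {
    return p.slot >= 0 && p.id != 0 && heap_slots[p.slot].id == p.id;
}
```

In this model, if a slot is freed and then handed out again, a stale pointer to the first object still passes the location check, because the slot is allocated again, but fails the identifier check, because the new occupant carries a different identifier.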
Disclosure of Invention
The invention aims to provide a heap object Use-After-Free vulnerability detection method based on identifier consistency, which detects heap object Use-After-Free vulnerabilities in C/C++ programs efficiently and effectively.
The technical solution that achieves this aim is as follows: a heap object Use-After-Free vulnerability detection method based on identifier consistency, which takes the source code of a C/C++ program as input and outputs the detected heap object Use-After-Free vulnerabilities, comprising the following steps:
Step 1, converting the source code of the input C/C++ program into LLVM IR files by using LLVM and Clang, and finding, based on the LLVM IR files, the code locations of heap object allocation, pointer propagation and pointer dereferencing in the input C/C++ program;
Step 2, performing code instrumentation on the input C/C++ program so that each allocated heap object and all pointers pointing to that heap object are assigned the same unique identifier and a memory check is inserted for each pointer dereference, thereby obtaining the instrumented C/C++ program;
Step 3, running the instrumented C/C++ program; during execution, the instrumentation code compares the object identifier carried by a pointer with the identifier of the object the pointer currently points to, so as to detect whether a heap object Use-After-Free vulnerability occurs.
Compared with the prior art, the invention has the following notable advantages: (1) the method is not affected by memory reuse during program execution and can effectively detect heap object Use-After-Free vulnerabilities as the program runs; (2) the method improves the accuracy of detecting heap object Use-After-Free vulnerabilities while keeping runtime overhead and memory overhead low.
Drawings
FIG. 1 is a flow chart of a heap object Use-After-Free vulnerability detection method based on identifier consistency.
FIG. 2 is an exemplary diagram of source code for a C/C++ program with a heap object Use-After-Free vulnerability.
FIG. 3 is an exemplary diagram of code in LLVM IR form obtained by compiling the source code using LLVM and Clang.
FIG. 4 is an exemplary diagram of code in LLVM IR form after instrumentation is complete.
FIG. 5 is an exemplary diagram of diagnostic information given when a heap object Use-After-Free vulnerability is detected.
Detailed Description
The invention provides an efficient heap object Use-After-Free vulnerability detection method based on identifier consistency, which takes the source code of a C/C++ program as input and outputs the detected Use-After-Free vulnerabilities. The overall flow is shown in FIG. 1. The detection method is implemented as follows:
Step 1, converting the source code of the input C/C++ program into LLVM IR files by using LLVM and Clang, and finding, based on the LLVM IR files, the code locations of heap object allocation, pointer propagation and pointer dereferencing in the input C/C++ program, with the following specific steps:
Step 1-1, compiling the C/C++ program source code with LLVM and Clang to convert it into LLVM IR files, the resulting LLVM IR files being the intermediate representation of the C/C++ program source code;
Step 1-2, traversing all LLVM IR files, searching for all heap object allocation, pointer propagation and pointer dereferencing statements, and recording the code location of each statement.
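Step 1-2 can be pictured with a deliberately simplified scanner. The sketch below is an assumption-laden illustration, not the tool described here: it matches textual LLVM IR lines by substring, whereas a real implementation would presumably walk instructions through LLVM's own API. It assumes the typed-pointer IR syntax (for example `i8*`), treats `malloc`/`free` calls as the only allocation and release functions, treats `bitcast` and `getelementptr` as pointer propagation, and treats `load` and `store` as pointer dereferences. The names `SiteKind`, `Site`, `classify_ir_line` and `find_sites` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Kinds of code locations the analysis is interested in (hypothetical names). */
typedef enum {
    SITE_NONE, SITE_ALLOC, SITE_FREE, SITE_PROPAGATE, SITE_DEREF
} SiteKind;

typedef struct {
    size_t line;     /* 1-based line number within the IR text */
    SiteKind kind;
} Site;

/* Classifies one line of textual LLVM IR by crude substring matching. */
static SiteKind classify_ir_line(const char *line) {
    if (strstr(line, "call") && strstr(line, "@malloc(")) return SITE_ALLOC;
    if (strstr(line, "call") && strstr(line, "@free("))   return SITE_FREE;
    if (strstr(line, " = load ") || strstr(line, "store ")) return SITE_DEREF;
    if (strstr(line, " = bitcast ") || strstr(line, " = getelementptr "))
        return SITE_PROPAGATE;
    return SITE_NONE;
}

/* Scans all lines and records the position and kind of each relevant
 * statement; returns how many sites were stored in `out`. */
static size_t find_sites(const char *const *lines, size_t n,
                         Site *out, size_t cap) {
    size_t found = 0;
    for (size_t i = 0; i < n && found < cap; i++) {
        SiteKind kind = classify_ir_line(lines[i]);
        if (kind != SITE_NONE) {
            out[found].line = i + 1;
            out[found].kind = kind;
            found++;
        }
    }
    return found;
}
```

The recorded `Site` entries play the role of the "code locations" that the instrumentation in Step 2 consumes.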
Step 2, performing code instrumentation on the input C/C++ program so that each allocated heap object and all pointers pointing to that heap object are assigned the same unique identifier and a memory check is inserted for each pointer dereference, thereby obtaining the instrumented C/C++ program, with the following specific steps:
Step 2-1, using the code instrumentation facility provided by LLVM to instrument the input C/C++ program, the specific instrumentation rules being:
(1) at the code locations of all heap object allocations, replacing the heap object allocation function with a custom heap object allocation function, so that a unique identifier is assigned to each newly allocated heap object when it is allocated, and the unique identifier of the heap object is invalidated when the heap object is released;
(2) at the code locations of all pointer propagations, inserting code that propagates the identifier of the heap object to all pointers pointing to it, so that every pointer pointing to the heap object has the same unique identifier as the heap object;
(3) at the code locations of all pointer dereferences, inserting a memory check that determines whether a heap object Use-After-Free vulnerability occurs when the pointer dereferences the heap object;
Step 2-2, optimizing the instrumented LLVM IR files by using the code optimization facility provided by LLVM, the specific optimization rules being:
(1) when a pointer dereferences a stack object or a global object, the inserted dereference memory check is eliminated, since such an access cannot involve a heap object Use-After-Free vulnerability;
(2) when a pointer dereferences the same heap object multiple times inside a loop, a single dereference memory check is inserted before the loop and the checks inside the loop are eliminated;
(3) when sequentially executed code contains several pointer dereferences that access the same heap object, only one dereference memory check is kept and the remaining checks are eliminated;
Step 2-3, generating the instrumented C/C++ program from the resulting LLVM IR files.
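The three instrumentation rules and the loop optimization can be sketched as follows. This is a minimal model under stated assumptions, not the method's actual runtime: the text does not say where pointer identifiers are stored, so the sketch simply carries the identifier next to the address in a hypothetical `TaggedPtr` structure, and it uses a static object pool instead of the real heap so that reuse is deterministic. The names `uaf_alloc`, `uaf_free`, `uaf_propagate`, `uaf_check` and `sum_checked_once` are illustrative.

```c
#include <assert.h>
#include <stddef.h>

enum { POOL_OBJECTS = 8, OBJECT_INTS = 4 };

typedef struct {
    unsigned long id;            /* 0 means "no live object here" */
    int data[OBJECT_INTS];
} PoolObject;

typedef struct {
    PoolObject *obj;             /* where the pointer points */
    unsigned long id;            /* identifier the pointer inherited */
} TaggedPtr;

static PoolObject pool[POOL_OBJECTS];
static unsigned long id_counter = 1;
static unsigned long checks_executed = 0;

/* Rule (1), allocation side: the replacement allocation function gives
 * every new object a fresh identifier and hands it to the pointer. */
static TaggedPtr uaf_alloc(void) {
    for (size_t i = 0; i < POOL_OBJECTS; i++) {
        if (pool[i].id == 0) {
            pool[i].id = id_counter++;
            return (TaggedPtr){ &pool[i], pool[i].id };
        }
    }
    return (TaggedPtr){ NULL, 0 };
}

/* Rule (1), release side: the object's identifier is invalidated.
 * A stale pointer (mismatching identifier) releases nothing. */
static void uaf_free(TaggedPtr p) {
    if (p.obj != NULL && p.obj->id == p.id) p.obj->id = 0;
}

/* Rule (2): pointer propagation copies the identifier with the address. */
static TaggedPtr uaf_propagate(TaggedPtr src) { return src; }

/* Rule (3): the dereference check compares the pointer's identifier with
 * the identifier of the object currently stored at that location. */
static int uaf_check(TaggedPtr p) {
    checks_executed++;
    return p.obj != NULL && p.id != 0 && p.obj->id == p.id;
}

/* Optimization rule (2): all accesses in the loop hit the same object, so
 * one check before the loop replaces a check per iteration. This is only
 * sound because nothing inside the loop releases the object. */
static int sum_checked_once(TaggedPtr p, int *ok) {
    *ok = uaf_check(p);
    if (!*ok) return 0;
    int sum = 0;
    for (int i = 0; i < OBJECT_INTS; i++) sum += p.obj->data[i];
    return sum;
}
```

With the check hoisted, summing the four elements costs one check instead of four, while a stale alias is still rejected after the object's slot has been handed to a new object.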
Step 3, running the instrumented C/C++ program; during execution, the instrumentation code compares the object identifier carried by a pointer with the identifier of the object the pointer currently points to, so as to detect whether a heap object Use-After-Free vulnerability occurs, with the following specific steps:
Step 3-1, running the instrumented C/C++ program with a command line tool;
Step 3-2, using test cases from the test suite shipped with the C/C++ program, or test cases generated by a fuzzing tool, as input to the instrumented C/C++ program;
Step 3-3, whenever the program dereferences a pointer, executing the instrumented dereference memory check, which determines whether a heap object Use-After-Free vulnerability occurs by comparing the identifier of the heap object associated with the pointer with the identifier of the heap object the pointer currently points to;
Step 3-4, if the two identifiers match, no Use-After-Free vulnerability has occurred and the program continues to execute; if they do not match, the program crashes and outputs the corresponding Use-After-Free diagnostic information, namely the source code locations at which the heap object was allocated, released and accessed.
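The diagnostic step can be sketched as below. This is an illustrative assumption about how such a report might be produced, not the actual output format: the `SrcLoc` and `ObjectRecord` types, the `check_access` function and the wording of the report are all hypothetical. The sketch writes the report into a buffer and returns a status so that it can be exercised safely; per the description above, the real instrumented program would crash at this point.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A source position, as instrumentation might record it (hypothetical). */
typedef struct {
    const char *file;
    int line;
} SrcLoc;

/* Per-object bookkeeping: identifier plus where the object was
 * allocated and released. */
typedef struct {
    unsigned long id;
    SrcLoc allocated_at;
    SrcLoc freed_at;
} ObjectRecord;

/* Compares the pointer's identifier with the identifier of the object
 * currently at the target location. Returns 1 on a match (execution may
 * continue). On a mismatch, returns 0 and fills `buf` with the three
 * source locations named in the diagnostic information. */
static int check_access(unsigned long ptr_id, unsigned long current_id,
                        const ObjectRecord *rec, SrcLoc access,
                        char *buf, size_t cap) {
    if (cap > 0) buf[0] = '\0';
    if (ptr_id != 0 && ptr_id == current_id) return 1;
    snprintf(buf, cap,
             "Use-After-Free detected on heap object #%lu\n"
             "  allocated at %s:%d\n"
             "  released at %s:%d\n"
             "  accessed at %s:%d\n",
             rec->id,
             rec->allocated_at.file, rec->allocated_at.line,
             rec->freed_at.file, rec->freed_at.line,
             access.file, access.line);
    return 0;
}
```

A caller that receives 0 would print the buffer and terminate, for example with `abort()`, which corresponds to the crash described in Step 3-4.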
The present invention will be described in detail with reference to the following examples and drawings.
Examples
The invention relates to a heap object Use-After-Free vulnerability detection method based on identifier consistency. To detect a heap object Use-After-Free vulnerability, the method first performs static analysis on the input C/C++ program to locate the code related to heap object allocation, pointer propagation and pointer dereferencing. It then instruments the C/C++ program at the located code positions, so that each allocated heap object and all pointers pointing to that object are assigned the same unique identifier and a memory check is inserted for each pointer dereference. Finally, it executes the instrumented program; during execution, the instrumentation code compares the object identifier carried by a pointer with the identifier of the object the pointer currently points to, so as to detect whether the program has a heap object Use-After-Free vulnerability.
In this example, the method includes:
Step 1, for the source code of the input C/C++ program, obtaining the locations of the code related to heap object allocation, pointer propagation and pointer dereferencing in the program source code, with the following specific steps:
Step 1-1, converting the source code of the C/C++ program into LLVM IR files by using LLVM and Clang, code examples in source code form and in LLVM IR form being shown in FIG. 2 and FIG. 3 respectively;
Step 1-2, traversing all LLVM IR files and searching for all statements related to heap object allocation, pointer propagation and pointer dereferencing; when the search is finished, the code locations of all such statements in the program have been obtained;
Step 2, performing code instrumentation on the input C/C++ program at the code locations of all heap object allocation, pointer propagation and pointer dereferencing statements, with the following specific steps:
Step 2-1, applying the formulated instrumentation rules to the heap object allocation, pointer propagation and pointer dereferencing statements respectively, so that each allocated heap object and all pointers pointing to that object are assigned the same unique identifier and a memory check is inserted for each pointer dereference, as shown in FIG. 4;
Step 2-2, optimizing the instrumented LLVM IR files according to the formulated optimization rules;
Step 2-3, generating the instrumented C/C++ program from the resulting LLVM IR files;
Step 3, running the instrumented C/C++ program; during execution, the instrumentation code compares the object identifier carried by a pointer with the identifier of the object the pointer currently points to, so as to detect whether a heap object Use-After-Free vulnerability occurs, with the following specific steps:
Step 3-1, running the instrumented C/C++ program with a command line tool, for example ./a.out, where ./ is the path of the program to be run and a.out is the name of the program to be run;
Step 3-2, optionally using test cases from the test suite shipped with the C/C++ program, or test cases generated by a fuzzing tool, as input to the instrumented C/C++ program;
Step 3-3, whenever the program dereferences a pointer, executing the instrumented dereference memory check, which determines whether a heap object Use-After-Free vulnerability occurs by comparing the identifier of the heap object associated with the pointer with the identifier of the heap object the pointer currently points to;
Step 3-4, if the two identifiers match, no Use-After-Free vulnerability has occurred and the program continues to execute; if they do not match, the program crashes and outputs the corresponding Use-After-Free diagnostic information, namely the source code locations at which the heap object was allocated, released and accessed. As shown in FIG. 5, line 1 indicates that a heap object Use-After-Free vulnerability has been detected; lines 2-7 show the source code location at which the heap object was allocated; lines 8-13 show the source code location at which the heap object was released; lines 14-19 show the source code location at which the heap object was accessed.

Claims (5)

1. A heap object Use-After-Free vulnerability detection method based on identifier consistency, which takes the source code of a C/C++ program as input and outputs the detected Use-After-Free vulnerabilities, characterized by comprising the following specific steps:
Step 1, converting the source code of the input C/C++ program into LLVM IR files by using LLVM and Clang, and finding, based on the LLVM IR files, the code locations of heap object allocation, pointer propagation and pointer dereferencing in the input C/C++ program;
Step 2, performing code instrumentation on the input C/C++ program, assigning the same unique identifier to each allocated heap object and all pointers pointing to that heap object, and inserting a memory check for each pointer dereference, to obtain the instrumented C/C++ program;
Step 3, running the instrumented C/C++ program, and executing the instrumentation code during program execution to compare whether the object identifier of a pointer matches the identifier of the object the pointer currently points to, so as to detect whether a heap object Use-After-Free vulnerability occurs.
2. The heap object Use-After-Free vulnerability detection method based on identifier consistency according to claim 1, wherein in step 1, LLVM and Clang are used to convert the source code of the input C/C++ program into LLVM IR files, and the code locations of heap object allocation, pointer propagation and pointer dereferencing in the input C/C++ program are found based on the LLVM IR files, with the following specific steps:
Step 1-1, compiling the C/C++ program source code with LLVM and Clang to convert it into LLVM IR files, the resulting LLVM IR files being the intermediate representation of the C/C++ program source code;
Step 1-2, traversing all LLVM IR files, searching for all heap object allocation, pointer propagation and pointer dereferencing statements, and recording the code location of each statement.
3. The heap object Use-After-Free vulnerability detection method based on identifier consistency according to claim 1, wherein in step 2 the input C/C++ program is instrumented so that each allocated heap object and all pointers pointing to that heap object are assigned the same unique identifier and a memory check is inserted for each pointer dereference, to obtain the instrumented C/C++ program, with the following specific steps:
Step 2-1, using the code instrumentation facility provided by LLVM to instrument the input C/C++ program;
Step 2-2, optimizing the instrumented LLVM IR files by using the code optimization facility provided by LLVM, the specific optimization rules being:
(1) when a pointer dereferences a stack object or a global object, eliminating the inserted dereference memory check;
(2) when a pointer dereferences the same heap object multiple times inside a loop, inserting a single dereference memory check before the loop and eliminating the checks inside the loop;
(3) when sequentially executed code contains several pointer dereferences that access the same heap object, keeping only one dereference memory check and eliminating the remaining checks;
Step 2-3, generating the instrumented C/C++ program from the resulting LLVM IR files.
4. The heap object Use-After-Free vulnerability detection method based on identifier consistency according to claim 3, wherein in step 2-1 the specific instrumentation rules include:
(1) at the code locations of all heap object allocations, replacing the called heap object allocation function with a custom heap object allocation function, so that a unique identifier is assigned to each newly allocated heap object when it is allocated, and the unique identifier of the heap object is invalidated when the heap object is released;
(2) at the code locations of all pointer propagations, inserting code that propagates the identifier of the heap object to all pointers pointing to it, so that every pointer pointing to the heap object has the same unique identifier as the heap object;
(3) at the code locations of all pointer dereferences, inserting a memory check that determines whether a heap object Use-After-Free vulnerability occurs when the pointer dereferences the heap object.
5. The heap object Use-After-Free vulnerability detection method based on identifier consistency according to claim 1, wherein in step 3 the instrumented C/C++ program is run and the instrumentation code is executed during program execution to compare whether the object identifier of a pointer matches the identifier of the object the pointer currently points to, so as to detect whether a heap object Use-After-Free vulnerability occurs, with the following specific steps:
Step 3-1, running the instrumented C/C++ program with a command line tool;
Step 3-2, using test cases from the test suite shipped with the C/C++ program, or test cases generated by a fuzzing tool, as input to the instrumented C/C++ program;
Step 3-3, whenever the program dereferences a pointer, executing the instrumented dereference memory check, which determines whether a heap object Use-After-Free vulnerability occurs by comparing the identifier of the heap object associated with the pointer with the identifier of the heap object the pointer currently points to;
Step 3-4, if the two identifiers match, no Use-After-Free vulnerability has occurred and the program continues to execute; if they do not match, the program crashes and outputs the corresponding Use-After-Free diagnostic information, namely the source code locations at which the heap object was allocated, released and accessed.
CN202011453648.6A 2020-12-12 2020-12-12 Heap object Use-After-Free vulnerability detection method based on identifier consistency Active CN112487438B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011453648.6A CN112487438B (en) 2020-12-12 2020-12-12 Heap object Use-After-Free vulnerability detection method based on identifier consistency

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011453648.6A CN112487438B (en) 2020-12-12 2020-12-12 Heap object Use-After-Free vulnerability detection method based on identifier consistency

Publications (2)

Publication Number Publication Date
CN112487438A (en) 2021-03-12
CN112487438B (en) 2022-11-04

Family

ID=74916225

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011453648.6A Active CN112487438B (en) 2020-12-12 2020-12-12 Heap object Use-After-Free vulnerability detection method based on identifier consistency

Country Status (1)

Country Link
CN (1) CN112487438B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105808369A (en) * 2016-03-29 2016-07-27 北京系统工程研究所 Memory leak detection method based on symbolic execution
CN109711167A (en) * 2018-12-21 2019-05-03 华中科技大学 A kind of UAF loophole defence method based on multilevel-pointer
CN110187988A (en) * 2019-06-06 2019-08-30 中国科学技术大学 Static function calling figure construction method suitable for Virtual Function and function pointer
CN111143199A (en) * 2019-12-11 2020-05-12 烽火通信科技股份有限公司 Method for detecting DPDK application program memory out-of-range access in cloud platform
CN111859388A (en) * 2020-06-30 2020-10-30 广州大学 Multi-level mixed vulnerability automatic mining method

Also Published As

Publication number Publication date
CN112487438B (en) 2022-11-04

Similar Documents

Publication Publication Date Title
US5590329A (en) Method and apparatus for detecting memory access errors
JP5333232B2 (en) Program debugging method, program conversion method, program debugging device using the same, program conversion device, and debugging program
US8555255B2 (en) Method of tracing object allocation site in program, as well as computer system and computer program therefor
US8458681B1 (en) Method and system for optimizing the object code of a program
EP1591895B1 (en) Method and system for detecting potential race conditions in multithreaded programs
CN106325970A (en) Compiling method and compiling system
US20100287536A1 (en) Profiling application performance according to data structure
US8677322B2 (en) Debugging in a multiple address space environment
US20080072007A1 (en) Method and computer programming product for detecting memory leaks
WO2010014981A2 (en) Method and apparatus for detection and optimization of presumably parallel program regions
US8056061B2 (en) Data processing device and method using predesignated register
WO2009055914A1 (en) Static analysis defect detection in the presence of virtual function calls
US20040128661A1 (en) Automatic data locality optimization for non-type-safe languages
Hück et al. Compiler-aided type tracking for correctness checking of MPI applications
US20120005460A1 (en) Instruction execution apparatus, instruction execution method, and instruction execution program
CN112487438B (en) Heap object Use-After-Free vulnerability detection method based on identifier consistency
US20080307174A1 (en) Dual Use Memory Management Library
US8756580B2 (en) Instance-based field affinity optimization
US9189297B2 (en) Managing shared memory
US20130152053A1 (en) Computer memory access monitoring and error checking
JP5199975B2 (en) Memory management method, memory management program, and information processing apparatus
Vitovská Instrumentation of LLVM IR
Luecke et al. The importance of run-time error detection
US11106522B1 (en) Process memory resurrection: running code in-process after death
Lang et al. Design and implementation of an escape analysis in the context of safety-critical embedded systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant