CN112487438B - Heap object Use-After-Free vulnerability detection method based on identifier consistency - Google Patents

Publication number
CN112487438B
Authority
CN
China
Prior art keywords
pointer
program
heap
heap object
identifier
Legal status
Active
Application number
CN202011453648.6A
Other languages
Chinese (zh)
Other versions
CN112487438A (en
Inventor
宋巍
桂滨法
熊海龙
Current Assignee
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57: Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577: Assessing vulnerabilities and evaluating computer system security

Abstract

The invention discloses a heap object Use-After-Free vulnerability detection method based on identifier consistency, which takes the source code of a C/C++ program as input and outputs the detected heap object Use-After-Free vulnerabilities. The method first performs static analysis on the input program to find the code locations of heap object allocation, pointer propagation, and pointer dereferencing. It then instruments the program at these locations so that each allocated heap object and all pointers to that object are assigned the same unique identifier. Finally, it executes the instrumented program; at run time the instrumentation code compares the identifier of a pointer with the identifier of the object the pointer actually points to, and thereby determines whether a vulnerability exists. The method is effective and efficient, and can detect heap object Use-After-Free vulnerabilities with low performance overhead and low memory overhead.

Description

Heap object Use-After-Free vulnerability detection method based on identifier consistency
Technical Field
The invention belongs to the field of program analysis and testing, and particularly relates to a heap object Use-After-Free vulnerability detection method based on identifier consistency.
Background
Low-level languages such as C and C++ give developers direct control over heap memory, so they can flexibly allocate and release heap objects. Owing to the flexibility and efficiency of C and C++, a large number of system programs developed in these languages, such as browsers, databases, and servers, are widely used in daily life. However, as programs grow in scale and become more modular, it is difficult for developers to always allocate and release heap objects correctly. Programs developed in C and C++ are therefore prone to temporal memory errors such as Use-After-Free vulnerabilities, which have become a significant cause of insecurity in modern software systems. One study report indicated that the number of registered Use-After-Free vulnerabilities increased year by year from 2009 to 2019, with more than 80% of them marked as high-risk or critical. Developers should therefore pay attention to heap object Use-After-Free vulnerabilities in C/C++ programs.
Currently, compared with other memory errors, heap object Use-After-Free vulnerabilities in C/C++ programs are difficult to detect manually or by static analysis, for the following reasons. First, pointers may be aliased, and it is difficult to infer all pointer aliases distributed across many data structures. Second, it is challenging to determine which memory object a pointer should point to. Last but not least, the path explosion caused by increasing program size makes inter-procedural analysis quite difficult.
Although many research works have proposed dynamic detection methods for heap object Use-After-Free vulnerabilities in C/C++ programs, most of them are based on object location. They use shadow memory to record the allocation/release status of heap objects, but do not distinguish between different heap objects that are allocated one after another at the same heap address as the program executes. This means that they can only detect a Use-After-Free vulnerability that occurs while the freed heap memory has not yet been reallocated. Once the heap memory that stored a freed heap object is reallocated to store another heap object, they cannot detect a Use-After-Free vulnerability that occurs on the reallocated memory. Prediction-based dynamic detection methods, for their part, are only suitable for detecting concurrent Use-After-Free vulnerabilities in multi-threaded programs, and cannot detect Use-After-Free vulnerabilities in sequential programs or within a single thread of a multi-threaded program. A dynamic method that can effectively detect heap object Use-After-Free vulnerabilities in C/C++ programs without being affected by memory reuse is therefore still lacking.
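The limitation of location-based detection described above can be illustrated with a small C sketch. It is a toy model under stated assumptions, not code from any existing tool: the heap has a single slot so that a second allocation is guaranteed to reuse the first address, and all names (toy_alloc, toy_free, location_check_flags_uaf) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (assumption): a heap with exactly one slot, so every new
 * allocation reuses the address of the previous one. */
enum { SLOT_FREE, SLOT_LIVE };

static char slot_storage[64];
static int  slot_state = SLOT_FREE;   /* location-based shadow state */

static void *toy_alloc(void) {
    assert(slot_state == SLOT_FREE);
    slot_state = SLOT_LIVE;
    return slot_storage;
}

static void toy_free(void *p) {
    assert(p == slot_storage);
    slot_state = SLOT_FREE;
}

/* Location-based check: it only asks whether the address is currently
 * free, not which object the pointer was created for. */
static bool location_check_flags_uaf(const void *p) {
    return p == slot_storage && slot_state == SLOT_FREE;
}

/* Returns true when the dangling pointer is caught before the memory is
 * reused but missed after the same address holds a new object. */
static bool reuse_hides_dangling_pointer(void) {
    void *old_obj = toy_alloc();
    toy_free(old_obj);
    bool caught_before_reuse = location_check_flags_uaf(old_obj);

    void *new_obj = toy_alloc();      /* same address, different object */
    bool caught_after_reuse = location_check_flags_uaf(old_obj);

    toy_free(new_obj);
    return caught_before_reuse && !caught_after_reuse;
}
```

Calling reuse_hides_dangling_pointer() returns true in this model: the stale pointer is flagged only while the slot is free, which is the gap an identifier-based check is meant to close.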
Disclosure of Invention
The invention aims to provide a heap object Use-After-Free vulnerability detection method based on identifier consistency, which efficiently and effectively detects heap object Use-After-Free vulnerabilities in C/C++ programs.
The technical solution for realizing the purpose of the invention is as follows: a heap object Use-After-Free vulnerability detection method based on identifier consistency takes the source code of a C/C++ program as input and the detected heap object Use-After-Free vulnerabilities as the output result, and comprises the following steps:
Step 1, converting the source code of the input C/C++ program into LLVM IR files by using LLVM and Clang, and finding the code locations of heap object allocation, pointer propagation and pointer dereferencing in the input C/C++ program based on the LLVM IR files;
Step 2, performing code instrumentation on the input C/C++ program so as to assign the same unique identifier to each allocated heap object and all pointers pointing to that heap object, and to insert a memory check for each pointer dereference, thereby obtaining the instrumented C/C++ program;
Step 3, running the instrumented C/C++ program, and executing the instrumentation code while the program runs to compare whether the object identifier of the pointer matches the identifier of the current object actually pointed to by the pointer, so as to detect whether a heap object Use-After-Free vulnerability occurs.
Compared with the prior art, the invention has the following notable advantages: (1) the method is not affected by memory reuse during program execution, and can effectively detect heap object Use-After-Free vulnerabilities as the program executes; (2) the method improves the accuracy of detecting heap object Use-After-Free vulnerabilities while keeping runtime overhead and memory overhead low.
Drawings
FIG. 1 is a flow chart of a heap object Use-After-Free vulnerability detection method based on identifier consistency.
FIG. 2 is an exemplary diagram of source code for a C/C++ program with a heap object Use-After-Free vulnerability.
FIG. 3 is an exemplary diagram of code in the form of LLVM IR resulting from compiling source code using LLVM and Clang.
FIG. 4 is an exemplary diagram of code in LLVM IR form after instrumentation is complete.
FIG. 5 is an exemplary diagram of diagnostic information given when a heap object Use-After-Free vulnerability is detected.
Detailed Description
The invention provides an efficient heap object Use-After-Free vulnerability detection method based on identifier consistency, which takes the source code of a C/C++ program as input and the detected Use-After-Free vulnerabilities as the output result; the overall flow is shown in FIG. 1. The detection method is implemented as follows:
Step 1, converting the source code of the input C/C++ program into LLVM IR files by using LLVM and Clang, and finding the code locations of heap object allocation, pointer propagation and pointer dereferencing in the input C/C++ program based on the LLVM IR files, with the following specific steps:
Step 1-1, compiling the C/C++ program source code by using LLVM and Clang and converting it into LLVM IR files, wherein the obtained LLVM IR files are the intermediate representation of the C/C++ program source code;
Step 1-2, traversing all LLVM IR files, searching for all heap object allocation, pointer propagation and pointer dereferencing statements, and recording the code location of each statement.
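The search in step 1-2 can be pictured with the following C sketch, which classifies single lines of textual LLVM IR. It is only a toy illustration: a real implementation would walk instructions through LLVM's own pass infrastructure rather than scan text, and the text does not state which IR instructions are treated as each kind of statement, so the mapping used here (calls to malloc/free, bitcast/getelementptr as propagation, load/store as dereference) is an assumption. The names site_kind and classify_ir_line are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Kinds of code location the static analysis records (names assumed). */
typedef enum {
    SITE_NONE,
    SITE_ALLOC,      /* heap object allocation */
    SITE_FREE,       /* heap object release */
    SITE_PROPAGATE,  /* a new pointer is derived from an existing one */
    SITE_DEREF       /* memory is accessed through a pointer */
} site_kind;

/* Classify one line of textual LLVM IR by simple substring matching.
 * Assumption: only malloc/free are recognised as allocator functions. */
static site_kind classify_ir_line(const char *line) {
    assert(line != NULL);
    if (strstr(line, "call") && strstr(line, "@malloc("))
        return SITE_ALLOC;
    if (strstr(line, "call") && strstr(line, "@free("))
        return SITE_FREE;
    if (strstr(line, "bitcast") || strstr(line, "getelementptr"))
        return SITE_PROPAGATE;
    if (strstr(line, "load ") || strstr(line, "store "))
        return SITE_DEREF;
    return SITE_NONE;
}
```

For example, classify_ir_line("%v = load i32, i32* %q") yields SITE_DEREF. A load that itself produces a pointer is both a dereference and a propagation; this sketch reports only the former.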
Step 2, performing code instrumentation on the input C/C++ program so as to assign the same unique identifier to each allocated heap object and all pointers pointing to that heap object, and to insert a memory check for each pointer dereference, thereby obtaining the instrumented C/C++ program, with the following specific steps:
Step 2-1, using the code instrumentation facilities provided by LLVM to instrument the input C/C++ program, wherein the specific instrumentation rules include:
(1) At all code locations where heap objects are allocated, replacing the heap object allocation function with a custom heap object allocation function, so that a unique identifier is assigned to each newly allocated heap object when it is allocated and the unique identifier of the heap object is invalidated when it is released;
(2) At all code locations of pointer propagation, inserting corresponding code to propagate the identifier of the heap object to all pointers pointing to the heap object, so that all pointers pointing to the heap object have the same unique identifier as the heap object;
(3) At all code locations of pointer dereferencing, inserting a corresponding memory check to determine whether a heap object Use-After-Free vulnerability occurs when the pointer dereferences the heap object;
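The three instrumentation rules can be sketched as a small runtime in C. This is a minimal illustration, not the patented implementation: the text does not say where identifiers are stored, so the sketch assumes a fixed pool whose slots carry an identifier field and a pointer representation that carries its own identifier. The names tagged_ptr, id_alloc, id_free, id_copy and id_check are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SLOTS 4
#define SLOT_BYTES 32

typedef struct {
    uint64_t      id;                 /* 0 means no live object here */
    unsigned char data[SLOT_BYTES];
} slot_t;

typedef struct {
    slot_t  *slot;                    /* where the pointer points */
    uint64_t id;                      /* identifier the pointer carries */
} tagged_ptr;

static slot_t   pool[POOL_SLOTS];
static uint64_t next_id = 1;          /* identifiers are never reused */

/* Rule (1), allocation: object and pointer receive the same fresh id. */
static tagged_ptr id_alloc(void) {
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (pool[i].id == 0) {
            pool[i].id = next_id++;
            return (tagged_ptr){ &pool[i], pool[i].id };
        }
    }
    return (tagged_ptr){ NULL, 0 };   /* pool exhausted */
}

/* Rule (1), release: the object's identifier is invalidated. */
static void id_free(tagged_ptr p) {
    assert(p.slot != NULL);
    p.slot->id = 0;
}

/* Rule (2), propagation: a copied pointer keeps the same identifier. */
static tagged_ptr id_copy(tagged_ptr p) {
    return p;
}

/* Rule (3), check before dereference: the pointer's identifier must
 * equal the identifier of the object currently stored at that address. */
static bool id_check(tagged_ptr p) {
    return p.slot != NULL && p.id != 0 && p.slot->id == p.id;
}
```

In this model an alias created with id_copy passes id_check while the object is live, fails after id_free, and still fails after a later id_alloc reuses the same slot, because the new object carries a different identifier.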
Step 2-2, optimizing the instrumented LLVM IR files by using the code optimization facilities provided by LLVM, wherein the specific optimization rules include:
(1) When the pointer dereferences a stack object or a global object, the inserted pointer dereference memory check is eliminated, since such an access cannot involve a heap object Use-After-Free vulnerability;
(2) When the pointer dereferences the same heap object multiple times in a loop, one pointer dereference memory check is inserted before the loop and the pointer dereference memory checks inside the loop are eliminated;
(3) When several pointer dereferences that access the same heap object occur in sequentially executed code, only one pointer dereference memory check is retained and the remaining checks are eliminated;
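The effect of optimization rule (2) can be shown with a short C sketch that counts how often the check runs before and after hoisting. The check body is a stand-in (a real one would compare identifiers), and the names uaf_check, sum_checked_each_access and sum_checked_once are hypothetical. The sketch also assumes that nothing inside the loop can release the object; the text does not spell out this condition.

```c
#include <assert.h>
#include <stddef.h>

static unsigned long checks_run = 0;

/* Stand-in for the inserted memory check: it only counts invocations. */
static void uaf_check(const int *p) {
    assert(p != NULL);
    checks_run++;
}

/* Unoptimized instrumentation: one check per dereference in the loop. */
static long sum_checked_each_access(const int *buf, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        uaf_check(buf);
        total += buf[i];
    }
    return total;
}

/* After rule (2): a single check before the loop, none inside it.
 * Assumption: the loop body cannot free the object behind buf. */
static long sum_checked_once(const int *buf, size_t n) {
    long total = 0;
    uaf_check(buf);
    for (size_t i = 0; i < n; i++) {
        total += buf[i];
    }
    return total;
}
```

For a five-element buffer, the first version runs the check five times and the second runs it once, while both return the same sum.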
Step 2-3, generating the instrumented C/C++ program from the obtained LLVM IR files.
Step 3, running the instrumented C/C++ program, and executing the instrumentation code while the program runs to compare whether the object identifier of the pointer matches the identifier of the current object actually pointed to by the pointer, so as to detect whether a heap object Use-After-Free vulnerability occurs, with the following specific steps:
Step 3-1, running the instrumented C/C++ program by using a command line tool;
Step 3-2, using a test case from the test suite shipped with the C/C++ program, or a test case generated by a fuzzing tool, as the input of the instrumented C/C++ program;
Step 3-3, when a pointer dereference occurs, the program executes the instrumented pointer dereference memory check code, which determines whether a heap object Use-After-Free vulnerability occurs by comparing whether the identifier of the heap object associated with the pointer matches the identifier of the current heap object actually pointed to by the pointer;
Step 3-4, if the identifiers match, no Use-After-Free vulnerability has occurred and the program continues to execute; if they do not match, the program crashes and gives the corresponding Use-After-Free vulnerability diagnostic information, namely the source code locations at which the heap object was allocated, released, and accessed.
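Steps 3-3 and 3-4 can be sketched in C as a check that either lets execution continue or produces a diagnostic naming the allocation, release, and access locations. The record layout, the report wording, and the names src_loc, object_record and check_access are assumptions made for illustration; the sketch returns a status instead of crashing so that the caller decides whether to abort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct { const char *file; int line; } src_loc;

/* Record kept for the object a pointer was created for (assumed layout). */
typedef struct {
    uint64_t id;          /* identifier shared by the object and its pointers */
    src_loc  alloc_site;  /* where the object was allocated */
    src_loc  free_site;   /* where it was released, if it has been */
} object_record;

typedef enum { ACCESS_OK, ACCESS_UAF } access_result;

/* Compare the pointer's identifier with the identifier of the object
 * currently at the accessed address; on mismatch, write a report. */
static access_result check_access(const object_record *origin,
                                  uint64_t current_id,
                                  src_loc access_site,
                                  char *report, size_t report_len) {
    assert(origin != NULL && report != NULL && report_len > 0);
    if (origin->id != 0 && origin->id == current_id) {
        report[0] = '\0';             /* identifiers match: keep running */
        return ACCESS_OK;
    }
    snprintf(report, report_len,
             "heap object use-after-free detected\n"
             "  allocated at %s:%d\n"
             "  released at %s:%d\n"
             "  accessed at %s:%d\n",
             origin->alloc_site.file, origin->alloc_site.line,
             origin->free_site.file, origin->free_site.line,
             access_site.file, access_site.line);
    return ACCESS_UAF;                /* caller would print and abort */
}
```

A caller that follows step 3-4 would print the report and terminate the program when ACCESS_UAF is returned.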
The present invention will be described in detail below with reference to examples and the accompanying drawings.
Examples
The invention relates to a heap object Use-After-Free vulnerability detection method based on identifier consistency. To detect a heap object Use-After-Free vulnerability, static analysis is first performed on the input C/C++ program to find the locations of code related to heap object allocation, pointer propagation and pointer dereferencing; the C/C++ program is then instrumented at the located code positions to assign the same unique identifier to each allocated heap object and all pointers pointing to that object, and to insert a memory check for each pointer dereference; finally, the instrumented program is executed, and the instrumentation code is executed while the program runs to compare whether the object identifier of the pointer matches the identifier of the current object actually pointed to by the pointer, so as to detect whether the program has a heap object Use-After-Free vulnerability.
In combination with the example, the method includes:
Step 1, for the source code of the input C/C++ program, obtaining the locations of code related to heap object allocation, pointer propagation and pointer dereferencing in the program source code, with the following specific steps:
Step 1-1, converting the source code of the C/C++ program into LLVM IR files by using LLVM and Clang, wherein code examples in source code form and in LLVM IR form are shown in FIG. 2 and FIG. 3 respectively;
Step 1-2, traversing all LLVM IR files and searching for all statements related to heap object allocation, pointer propagation, and pointer dereferencing; after the search is finished, the code locations of all such statements in the program are obtained;
Step 2, performing code instrumentation on the input C/C++ program at the code locations of all heap object allocation, pointer propagation and pointer dereferencing statements, with the following specific steps:
Step 2-1, using the formulated instrumentation rules to instrument the heap object allocation, pointer propagation and pointer dereferencing statements respectively, so that the same unique identifier is assigned to each allocated heap object and all pointers pointing to that object and a memory check is inserted for each pointer dereference, as shown in FIG. 4;
Step 2-2, optimizing the instrumented LLVM IR files according to the formulated optimization rules;
Step 2-3, generating the instrumented C/C++ program from the obtained LLVM IR files;
Step 3, running the instrumented C/C++ program, and executing the instrumentation code while the program runs to compare whether the object identifier of the pointer matches the identifier of the current object actually pointed to by the pointer, so as to detect whether a heap object Use-After-Free vulnerability occurs, with the following specific steps:
Step 3-1, using a command line tool to run the instrumented C/C++ program, for example ./a.out, where ./ is the path where the program to be run is located and a.out is the name of the program to be run;
Step 3-2, using a test case from the test suite shipped with the C/C++ program, or a test case generated by a fuzzing tool, as the input (optional) of the instrumented C/C++ program;
Step 3-3, when a pointer dereference occurs, the program executes the instrumented pointer dereference memory check code, which determines whether a heap object Use-After-Free vulnerability occurs by comparing whether the identifier of the heap object associated with the pointer matches the identifier of the current heap object actually pointed to by the pointer;
Step 3-4, if the identifiers match, no Use-After-Free vulnerability has occurred and the program continues to execute; if they do not match, the program crashes and gives the corresponding Use-After-Free vulnerability diagnostic information, namely the source code locations at which the heap object was allocated, released, and accessed. As shown in FIG. 5, line 1 indicates that a heap object Use-After-Free vulnerability is detected; lines 2-7 give the source code location at which the heap object was allocated; lines 8-13 give the source code location at which the heap object was released; lines 14-19 give the source code location at which the heap object was accessed.

Claims (4)

1. A heap object Use-After-Free vulnerability detection method based on identifier consistency, which takes the source code of a C/C++ program as input and the detected Use-After-Free vulnerabilities as the output result, characterized by comprising the following specific steps:
step 1, converting the input C/C++ program source code into LLVM IR files by using LLVM and Clang, and finding the code locations of heap object allocation, pointer propagation and pointer dereferencing in the input C/C++ program based on the LLVM IR files;
step 2, performing code instrumentation on the input C/C++ program, assigning the same unique identifier to each allocated heap object and all pointers pointing to the heap object, and inserting a memory check for each pointer dereference to obtain the instrumented C/C++ program; the method specifically comprises the following steps:
step 2-1, using the code instrumentation facilities provided by LLVM to perform code instrumentation on the input C/C++ program;
step 2-2, optimizing the instrumented LLVM IR files by using the code optimization facilities provided by LLVM, wherein the specific optimization rules comprise:
(1) when the pointer dereferences a stack object or a global object, eliminating the inserted pointer dereference memory check;
(2) when the pointer dereferences the same heap object multiple times in a loop, inserting one pointer dereference memory check before the loop and eliminating the pointer dereference memory checks inside the loop;
(3) when several pointer dereferences that access the same heap object occur in sequentially executed code, retaining only one pointer dereference memory check and eliminating the remaining checks;
step 2-3, generating the instrumented C/C++ program from the obtained LLVM IR files;
step 3, running the instrumented C/C++ program, and executing the instrumentation code while the program runs to compare whether the object identifier of the pointer matches the identifier of the current object actually pointed to by the pointer, so as to detect whether a heap object Use-After-Free vulnerability occurs.
2. The identifier consistency-based heap object Use-After-Free vulnerability detection method according to claim 1, wherein in step 1, LLVM and Clang are used to convert the source code of the input C/C++ program into LLVM IR files, and the code locations of heap object allocation, pointer propagation and pointer dereferencing in the input C/C++ program are found based on the LLVM IR files, which includes the following specific steps:
step 1-1, compiling the C/C++ program source code by using LLVM and Clang and converting it into LLVM IR files, wherein the obtained LLVM IR files are the intermediate representation of the C/C++ program source code;
step 1-2, traversing all LLVM IR files, searching for all heap object allocation, pointer propagation and pointer dereferencing statements, and recording the code location of each statement.
3. The identifier consistency-based heap object Use-After-Free vulnerability detection method according to claim 1, wherein in step 2-1, the specific instrumentation rules comprise:
(1) at all code locations where heap objects are allocated, replacing the called heap object allocation function with a custom heap object allocation function, so that a unique identifier is assigned to each newly allocated heap object when it is allocated and the unique identifier of the heap object is invalidated when it is released;
(2) at all code locations of pointer propagation, inserting corresponding code to propagate the identifier of the heap object to all pointers pointing to the heap object, so that all pointers pointing to the heap object have the same unique identifier as the heap object;
(3) at all code locations of pointer dereferencing, inserting a corresponding memory check to determine whether a heap object Use-After-Free vulnerability occurs when the pointer dereferences the heap object.
4. The identifier consistency-based heap object Use-After-Free vulnerability detection method according to claim 1, wherein in step 3, the instrumented C/C++ program is run and, while the program runs, the instrumentation code is executed to compare whether the object identifier of the pointer matches the identifier of the current object to which the pointer actually points, so as to detect whether a heap object Use-After-Free vulnerability occurs, which includes the following specific steps:
step 3-1, running the instrumented C/C++ program by using a command line tool;
step 3-2, using a test case from the test suite shipped with the C/C++ program, or a test case generated by a fuzzing tool, as the input of the instrumented C/C++ program;
step 3-3, when a pointer dereference occurs, the program executes the instrumented pointer dereference memory check code, which determines whether a heap object Use-After-Free vulnerability occurs by comparing whether the identifier of the heap object associated with the pointer matches the identifier of the current heap object actually pointed to by the pointer;
step 3-4, if the identifiers match, no Use-After-Free vulnerability has occurred and the program continues to execute; if they do not match, the program crashes and gives the corresponding Use-After-Free vulnerability diagnostic information, namely the source code locations at which the heap object was allocated, released, and accessed.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011453648.6A CN112487438B (en) 2020-12-12 2020-12-12 Heap object Use-After-Free vulnerability detection method based on identifier consistency

Publications (2)

Publication Number Publication Date
CN112487438A (en) 2021-03-12
CN112487438B (en) 2022-11-04

Family

ID=74916225

Country Status (1)

Country Link
CN (1) CN112487438B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant