US8839203B2 - Code coverage-based taint perimeter detection - Google Patents

Info

Publication number
US8839203B2
Authority
US
United States
Prior art keywords
tainted
code
taint
software code
perimeter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US13/115,985
Other versions
US20120304010A1 (en
Inventor
Edwin Lars Opstad
Andrew Renk
Daniel Margolis
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhigu Holdings Ltd
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp
Priority to US13/115,985
Assigned to MICROSOFT CORPORATION. Assignment of assignors interest (see document for details). Assignors: OPSTAD, EDWIN LARS; RENK, ANDREW; MARGOLIS, DANIEL
Publication of US20120304010A1
Application granted
Publication of US8839203B2
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC. Assignment of assignors interest (see document for details). Assignors: MICROSOFT CORPORATION
Assigned to ZHIGU HOLDINGS LIMITED. Assignment of assignors interest (see document for details). Assignors: MICROSOFT TECHNOLOGY LICENSING, LLC
Active legal status (current)
Adjusted expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3676 Test management for coverage analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3688 Test management for test execution, e.g. scheduling of test suites
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3604 Software analysis for verifying properties of programs
    • G06F11/3612 Software analysis for verifying properties of programs by runtime analysis

Definitions

  • Code coverage is a measure used in software testing that indicates the degree to which the source code of a computer program has been tested.
  • Current code-coverage tools typically use either a modified execution environment (virtualized execution) or rely on various types of execution instrumentation to instrument the entire binary code, such as by inserting code to log coverage at the start of every basic block. Each of these current methods, however, has a non-zero runtime overhead. Runtime is the period during which a computer program is executing.
  • a breakpoint is a means of acquiring knowledge about a program during its execution. This is normally achieved by having the programmer manually insert (by manually indicating instruction addresses/offsets, function names, and so forth) breakpoints in the code. More particularly, a breakpoint is an intentional stopping or pausing place in a program that is placed there for debugging purposes. During the pause the programmer inspects the test environment to determine whether the program is functioning as expected.
  • fuzz testing is a technique used to test for security and reliability problems in software. It is an automated or semi-automated technique that uses invalid, unexpected, or random data as inputs to a computer program. This can be achieved by mutating good input for a program into possibly bad input. For example, fuzzing may involve changing small parts of a file and delivering that content to an application in an attempt to cause the application to crash. The program then is monitored for exceptions such as crashes or failing built-in code assertions.
  • Smart fuzzing, which is similar to conventional fuzzing, uses knowledge of the structure of the input data or feedback from the program under test to inform test case generation. Smarter fuzzing often enhances the code coverage when delivering fuzzed content by providing input that will match the expected input data structure more closely. Smart fuzzing is usually achieved either by requiring an extensive input structure definition to be provided at the start of fuzzing or by using expensive runtime instrumentation and monitoring. Creating the input structure definitions requires significant engineering time. Typical runtime instrumentation and monitoring significantly increases the time needed to execute the program under test, which significantly reduces the fuzzing throughput.
  • One current technique that attempts to increase code coverage uses a constraint solver to try to solve the constraints generated from execution traces.
  • the tool logs all conditional branches in the execution flow and derives symbolic representations of the conditional (what is being compared).
  • the constraint solver can then try to solve the inverse of that conditional (to figure out what input would cause the alternate branch to be taken).
  • this constraint solver technique is expensive, degrades performance, and has limitations on what it can solve.
  • Embodiments of the code coverage-based taint perimeter detection system and method test software code by determining code coverage of the code.
  • Embodiments of the system and method examine code coverage that has been seen across the inputs that have been executed in order to determine which tainted branch targets have never been covered.
  • Embodiments of the system and method examine only tainted branches that have not already been covered or tested by any of the previous inputs. This makes embodiments of the system and method more efficient than existing techniques that examine branches that have already been covered and tested.
  • Embodiments of the code coverage-based taint perimeter detection system and method limit the scope of consideration to new code coverage that is induced by tainted input controlled by a fuzzing tool.
  • Embodiments of the system and method use tainted data flow analysis to determine code blocks that may be executed along an execution path that have not previously been executed.
  • software breakpoints are used to detect novel code execution.
  • Software breakpoints are an instruction that the central processing unit (CPU) recognizes as triggering a break.
  • Software breakpoints impose no runtime overhead except at startup and when actually triggered. This eliminates the general runtime overhead of existing solutions while providing new code execution detection of sufficient fidelity to provide feedback to an intelligent fuzz generator.
  • embodiments of the code coverage-based taint perimeter detection system and method determine tainted branches of the software code by performing tainted data flow analysis on execution traces of the code.
  • Conditional branch instructions where the branch taken is determined from tainted input are defined as “tainted branches.”
  • the code locations that result from the tainted branches are “tainted branch targets.”
  • Embodiments of the system and method then identify the tainted branch targets that have not yet been covered and detect when new inputs reach the intended tainted branch targets. This is achieved by monitoring the program under test at the locations in the tainted branch targets not covered by existing inputs. In some embodiments the monitoring uses software breakpoints that are automatically placed at the locations in the tainted branch targets at runtime.
  • embodiments of the system and method perform tainted data flow analysis on the execution traces to obtain tainted branch targets.
  • the tainted branch targets are filtered and placed in a database, called a code coverage and tainted branch database.
  • the filtering removes any tainted branch targets that are already covered.
  • a current taint perimeter is obtained using the data in the code coverage and tainted branch database.
  • the current taint perimeter is the set of tainted branch targets that have not been covered with current inputs.
  • the current taint perimeter is monitored during runtime by using the filtered tainted branch targets. In some embodiments, this monitoring yields locations in the code where software breakpoints can be inserted. These breakpoints are inserted automatically during runtime into the filtered tainted branch targets.
  • the monitoring process of embodiments of the system and method includes generating a new test case from a set of templates and then executing the new test case. From this execution of the new test case, it is determined whether new code coverage has been achieved. If so, then the set of templates is updated by adding the new test case to the set of templates. Moreover, the code coverage and tainted branch database is updated by adding the new code coverage to the database.
  • An updated taint perimeter is found from the updated set of templates and the updated database.
  • Embodiments of the system and method then monitor the updated taint perimeter using newly filtered tainted branch targets. This iterative process continues until there are no more test cases. In this manner, embodiments of the system and method efficiently and effectively measure and extend the code coverage of the software code.
  • FIG. 1 is a block diagram illustrating a general overview of embodiments of the code coverage-based taint perimeter detection system and method implemented in a computing environment.
  • FIG. 2 is a flow diagram illustrating the general operation of embodiments of the code coverage-based taint perimeter detection system shown in FIG. 1 .
  • FIG. 3 is a flow diagram illustrating the operational details of embodiments of the code coverage-based taint perimeter detection system shown in FIGS. 1 and 2 .
  • FIG. 4 illustrates a simplified example of a general-purpose computer system on which various embodiments and elements of the code coverage-based taint perimeter detection system and method, as described herein and shown in FIGS. 1-3 , may be implemented.
  • FIG. 1 is a block diagram illustrating a general overview of embodiments of the code coverage-based taint perimeter detection system and method implemented in a computing environment. As shown in FIG. 1 , embodiments of the code coverage-based taint perimeter detection system 100 and method are implemented on a computing device 110 . In general, embodiments of the code coverage-based taint perimeter detection system 100 and method input software code to be tested 120 , process and test the code, and then output the tested code 130 .
  • embodiments of the code coverage-based taint perimeter detection system 100 and method measure the blocks of code covered by a set of templates 140 .
  • a template is a sample input that covers a part of the valid input range of the code to be tested 120 .
  • One at a time these templates are used to test the code that has not yet been tested.
  • Embodiments of the code coverage-based taint perimeter detection system 100 and method include an execution trace module 150 , a tainted analysis module 160 , a filtering module 170 , and a monitoring module 180 .
  • the execution trace module 150 generates the execution traces for the template being used as input to the code to be tested 120 .
  • the tainted analysis module 160 performs tainted data flow analysis on the execution traces and determines tainted branch targets.
  • the filtering module 170 filters the tainted branch targets to ensure that any tainted branch targets that have already been covered are not included in a set of filtered tainted branch targets.
  • the monitoring module 180 monitors a taint perimeter that is found using the filtered tainted branch targets to ensure detection of execution of code not previously covered. In some embodiments, this monitoring is performed using breakpoints that are automatically placed into the taint perimeter at runtime.
  • FIG. 2 is a flow diagram illustrating the general operation of embodiments of the code coverage-based taint perimeter detection system 100 shown in FIG. 1 .
  • Embodiments of code coverage-based taint perimeter detection system 100 efficiently and effectively test software code without modification of binary code and by limiting the monitoring of branches of the code to those branches that have not been covered.
  • the operation of embodiments of the code coverage-based taint perimeter detection system 100 begins by inputting a template that is a sample input for the code being tested (box 200 ). Next, execution traces are generated for the template (box 210 ). This yields the execution traces that are used in a tainted flow analysis.
  • Embodiments of the code coverage-based taint perimeter detection system 100 then determine tainted branches of the code by performing the tainted flow analysis on the execution traces (box 220 ). This tainted flow analysis yields tainted branch targets. The tainted branch targets then are filtered in order to remove those tainted branches of the code that have already been covered (box 230 ). Filtered tainted branches are obtained from this process.
  • Embodiments of the system 100 then monitor a taint perimeter of the code during runtime by using the filtered tainted branch targets (box 240 ).
  • software breakpoints are used to monitor the taint perimeter. This is achieved by automatically placing the breakpoints into the filtered tainted branch targets during runtime (box 250 ). This facilitates the efficient and effective testing of the software code.
  • the first stage of embodiments of the code coverage-based taint perimeter detection system 100 and method is to generate execution traces. This is achieved by obtaining an existing sample input that is known as a template.
  • the template is an input that covers a part of the valid input range of the software code being tested.
  • Execution traces are obtained from the software code being tested with the template as input using existing methods.
  • FIG. 3 is a flow diagram illustrating the operational details of embodiments of the code coverage-based taint perimeter detection system 100 and method shown in FIGS. 1 and 2 .
  • the operation of the system 100 begins by inputting a set of templates (box 300 ). Next, a template is selected from the set of templates (box 305 ). Execution traces then are generated for the selected template (box 310 ). The execution traces represent the code and data flow of the program for the selected template. The output is the execution traces (box 315 ).
  • the second stage of embodiments of the code coverage-based taint perimeter detection system 100 and method is to perform tainted data flow analysis on the execution traces to find tainted branches.
  • Embodiments of the system 100 and method perform tainted data flow analysis on the execution traces to determine tainted branches (box 320). This is achieved by analyzing tainted instructions to identify tainted branches.
  • a tainted instruction is any instruction that uses tainted data, which is any data that is controlled or comes from an external source (such as files).
  • Embodiments of the system 100 and method focus not on all instructions but specifically on conditional branches.
  • tainted branches are conditional branches where the conditional is controlled by data that flows from an external source. For example, if a section of a file is read, and then if that section of the file says “ABC” go down one path and “DEF” go down another path, then this is a tainted branch.
  • the tainted data flow analysis is performed on each of the execution traces in order to determine each of the tainted branches.
  • embodiments of the system 100 and method output tainted branch targets as a result of the tainted data flow analysis (box 325 ). These tainted branch targets are stored in a code coverage and tainted branch database (box 330 ). Embodiments of the system 100 and method repeat the process of generating execution traces and determining tainted branches as long as there are more templates (box 333 ) to trace and analyze.
  • the third stage of embodiments of the code coverage-based taint perimeter detection system 100 and method is to filter the tainted branches found in the earlier stage. Filtering takes the tainted branch targets found during the second stage and strips or filters the branches that have already been covered. In other words, the tainted branch targets that have been covered within the aggregated coverage from the full template set are removed. This is filtering the tainted branch targets.
  • embodiments of the system 100 and method output tainted branch targets as a result of the tainted data flow analysis (box 325 ) and store them in a code coverage and tainted branch database (box 330 ).
  • the current taint perimeter then is determined using data in the code coverage and tainted branch database (box 335 ) by removing tainted branch targets that have been covered.
  • Filtering of the tainted branches comes into play when there are new tainted branches from new inputs that are discovered during execution in the fourth stage, or the monitoring stage, as described below. When new tainted branches are found, then the filtering process is run again.
  • Embodiments of the system 100 and method uniquely use the trace analysis to find the tainted branches and then filter the tainted branches to exclude covered blocks from all the runs. The result is that filtered tainted branches are found.
  • the fourth stage of embodiments of the code coverage-based taint perimeter detection system 100 and method is to monitor the taint perimeter of the software code.
  • the taint perimeter is generated using the filtered tainted branches found in the earlier stage.
  • With the filtered tainted branches, embodiments of the system 100 and method know both the code that will execute under test conditions and the new code that has not been seen before. Given this information, embodiments of the system 100 and method can monitor for new code coverage in a targeted way without the need to watch every block that executes when the test is actually run.
  • Embodiments of the system 100 and method generate a new test case from the set of templates (box 340 ).
  • the new test case is executed (box 345 ).
  • The stages of generating execution traces, determining tainted branches, and filtering tainted branches will be executed using the new template, resulting in an updated taint perimeter based on the execution data flow from the new template. Future test cases generated in the stage that monitors the taint perimeter will benefit from this refined taint perimeter.
  • If there is no new code coverage, then another determination is made as to whether there are more test cases (box 370). If so, then embodiments of the system 100 and method go back to the process of generating a new test case from the set of templates (box 340). The process then continues from that point as described thus far. If there are no more test cases, then the process is completed for the time being and results of the tested code are output (box 375).
  • software breakpoints are used in the monitoring process.
  • a software breakpoint is one way of monitoring when a particular piece of code actually executes.
  • Other embodiments of the system 100 and method use other types of monitoring processes.
  • For embodiments that use breakpoints, embodiments of the system 100 and method automatically determine where to insert the breakpoints. Because filtered tainted branch targets are used, embodiments of the system 100 and method use a much smaller number of breakpoints than would otherwise be used.
  • Breakpoints are automatically inserted as follows. By definition, for a tainted conditional branch instruction, there is a side of the branch that was taken and one side that was not taken. For a given filtered tainted branch target in the set of filtered tainted branch targets identified earlier, embodiments of the system 100 and method automatically place the breakpoints at every filtered tainted branch target.
  • When new coverage is detected, embodiments of the system 100 and method recalculate which pieces of code have not been covered even with the new coverage. This typically is performed in an iterative manner, as explained above. Also, when new coverage is detected, embodiments of the system 100 and method generate a new execution trace and find a new set of filtered tainted branches. Thus, new coverage may either narrow the set of breakpoints already set or expand it by providing additional filtered tainted branch targets to analyze.
  • FIG. 4 illustrates a simplified example of a general-purpose computer system on which various embodiments and elements of the code coverage-based taint perimeter detection system 100 and method, as described herein and shown in FIGS. 1-3 , may be implemented. It should be noted that any boxes that are represented by broken or dashed lines in FIG. 4 represent alternate embodiments of the simplified computing device, and that any or all of these alternate embodiments, as described below, may be used in combination with other alternate embodiments that are described throughout this document.
  • FIG. 4 shows a general system diagram showing a simplified computing device 10 .
  • Such computing devices can typically be found in devices having at least some minimum computational capability, including, but not limited to, personal computers, server computers, hand-held computing devices, laptop or mobile computers, communications devices such as cell phones and PDAs, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, audio or video media players, etc.
  • the device should have a sufficient computational capability and system memory to enable basic computational operations.
  • the computational capability is generally illustrated by one or more processing unit(s) 12 , and may also include one or more GPUs 14 , either or both in communication with system memory 16 .
  • The processing unit(s) 12 of the general computing device may be specialized microprocessors, such as a DSP, a VLIW, or other micro-controller, or can be conventional CPUs having one or more processing cores, including specialized GPU-based cores in a multi-core CPU.
  • the simplified computing device of FIG. 4 may also include other components, such as, for example, a communications interface 18 .
  • the simplified computing device of FIG. 4 may also include one or more conventional computer input devices 20 (e.g., pointing devices, keyboards, audio input devices, video input devices, haptic input devices, devices for receiving wired or wireless data transmissions, etc.).
  • the simplified computing device of FIG. 4 may also include other optional components, such as, for example, one or more conventional computer output devices 22 (e.g., display device(s) 24 , audio output devices, video output devices, devices for transmitting wired or wireless data transmissions, etc.).
  • typical communications interfaces 18 , input devices 20 , output devices 22 , and storage devices 26 for general-purpose computers are well known to those skilled in the art, and will not be described in detail herein.
  • the simplified computing device of FIG. 4 may also include a variety of computer readable media.
  • Computer readable media can be any available media that can be accessed by computer 10 via storage devices 26 and includes both volatile and nonvolatile media that is either removable 28 and/or non-removable 30 , for storage of information such as computer-readable or computer-executable instructions, data structures, program modules, or other data.
  • Computer readable media may comprise computer storage media and communication media.
  • Computer storage media includes, but is not limited to, computer or machine readable media or storage devices such as DVDs, CDs, floppy disks, tape drives, hard drives, optical drives, solid state memory devices, RAM, ROM, EEPROM, flash memory or other memory technology, magnetic cassettes, magnetic tapes, magnetic disk storage, or other magnetic storage devices, or any other device which can be used to store the desired information and which can be accessed by one or more computing devices.
  • Retention of information such as computer-readable or computer-executable instructions, data structures, program modules, etc. can also be accomplished by using any of a variety of the aforementioned communication media to encode one or more modulated data signals or carrier waves, or other transport mechanisms or communications protocols, and includes any wired or wireless information delivery mechanism.
  • The terms “modulated data signal” or “carrier wave” generally refer to a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
  • Communication media includes wired media such as a wired network or direct-wired connection carrying one or more modulated data signals, and wireless media such as acoustic, RF, infrared, laser, and other wireless media for transmitting and/or receiving one or more modulated data signals or carrier waves. Combinations of any of the above should also be included within the scope of communication media.
  • Software, programs, and/or computer program products embodying some or all of the various embodiments of the code coverage-based taint perimeter detection system 100 and method described herein, or portions thereof, may be stored, received, transmitted, or read from any desired combination of computer or machine readable media or storage devices and communication media in the form of computer executable instructions or other data structures.
  • embodiments of the code coverage-based taint perimeter detection system 100 and method described herein may be further described in the general context of computer-executable instructions, such as program modules, being executed by a computing device.
  • program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types.
  • the embodiments described herein may also be practiced in distributed computing environments where tasks are performed by one or more remote processing devices, or within a cloud of one or more devices, that are linked through one or more communications networks.
  • program modules may be located in both local and remote computer storage media including media storage devices.
  • the aforementioned instructions may be implemented, in part or in whole, as hardware logic circuits, which may or may not include a processor.
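
The four stages described in the list above (trace, taint analysis, filtering, monitoring) and the iterative update of the templates and taint perimeter can be pictured with a small simulation. This is only a sketch under stated assumptions: the two-byte "program under test", the block names, and the helpers `trace_and_analyze`, `run_with_breakpoints`, `generate_cases`, and `explore` are all hypothetical and do not come from the patent, and real tracing, taint analysis, and breakpoint handling are replaced by plain Python set operations.

```python
from collections import deque

def trace_and_analyze(data: bytes):
    """Stand-in for stages 1-2: return (covered_blocks, tainted_branch_targets)."""
    covered = {"entry"}
    targets = {"magic_ok", "magic_bad"}      # both sides of the branch on data[0]
    if data[0] == ord("A"):
        covered.add("magic_ok")
        targets |= {"mode_x", "mode_y"}      # both sides of the branch on data[1]
        covered.add("mode_x" if data[1] == ord("X") else "mode_y")
    else:
        covered.add("magic_bad")
    return covered, targets

def run_with_breakpoints(data: bytes, perimeter: set) -> set:
    """Stand-in for a monitored run: report only the perimeter locations reached."""
    covered, _ = trace_and_analyze(data)
    return covered & perimeter

def generate_cases(template: bytes, alphabet: bytes = b"ABXY"):
    """Yield single-byte variants of a template (a deliberately simple mutator)."""
    for pos in range(len(template)):
        for value in alphabet:
            case = bytearray(template)
            case[pos] = value
            yield bytes(case)

def explore(initial_templates):
    templates = list(initial_templates)
    covered, targets = set(), set()          # toy coverage and tainted branch database
    for template in templates:
        c, t = trace_and_analyze(template)
        covered |= c
        targets |= t
    perimeter = targets - covered            # stage 3: drop targets already covered
    queue = deque(templates)
    while queue:                             # stage 4: monitor until no cases remain
        template = queue.popleft()
        for case in generate_cases(template):
            if run_with_breakpoints(case, perimeter):
                # New coverage: promote the case to a template, re-trace it,
                # update the database, and recompute the perimeter.
                c, t = trace_and_analyze(case)
                covered |= c
                targets |= t
                perimeter = targets - covered
                templates.append(case)
                queue.append(case)
    return templates, covered, perimeter

templates, covered, perimeter = explore([b"BZ"])
```

Starting from the single template `b"BZ"`, the sketch promotes `b"AZ"` and then `b"AX"` to templates, at which point the perimeter is empty. Note that the perimeter first shrinks and then grows again when `b"AZ"` exposes a new tainted branch, which mirrors the narrowing and expanding behavior described for new coverage.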

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

A code coverage-based taint perimeter detection system and method for testing software code by determining code coverage and detecting new coverage of the code. Embodiments of the system and method perform tainted data flow analysis on execution traces of the code to determine tainted branch targets. The tainted branch targets may be filtered to remove any tainted branch targets that have already been covered. New coverage can be determined by monitoring the filtered tainted branch targets, which in some embodiments involves the use of software breakpoints that are automatically placed at the locations in the tainted branch targets at runtime. Embodiments of the system and method use an iterative process to ensure that only tainted branch targets that have not already been covered or tested are examined.
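
As a rough illustration of how tainted data flow analysis over an execution trace can yield tainted branch targets, the sketch below walks a toy trace and propagates taint through registers. The `Instr` record, the operation names, and the register model are assumptions made for this example only; the patent does not specify a trace format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Instr:
    addr: int            # address of the instruction
    op: str              # "read_input", "const", "mov", or "jcc" (conditional branch)
    dst: str = ""        # destination register (unused for "jcc")
    srcs: tuple = ()     # source registers
    taken: int = 0       # for "jcc": the target the trace actually followed
    not_taken: int = 0   # for "jcc": the other target

def tainted_branch_targets(trace):
    """Return both targets of every conditional branch whose operands are tainted."""
    tainted = set()      # registers currently holding externally controlled data
    targets = set()
    for ins in trace:
        if ins.op == "read_input":
            tainted.add(ins.dst)                 # data from an external source
        elif ins.op == "jcc":
            if any(src in tainted for src in ins.srcs):
                targets.update((ins.taken, ins.not_taken))
        elif any(src in tainted for src in ins.srcs):
            tainted.add(ins.dst)                 # taint flows from source to destination
        else:
            tainted.discard(ins.dst)             # overwritten with untainted data

    return targets

trace = [
    Instr(0x10, "read_input", dst="r0"),
    Instr(0x14, "const", dst="r1"),
    Instr(0x18, "mov", dst="r2", srcs=("r0",)),
    Instr(0x1C, "jcc", srcs=("r2",), taken=0x40, not_taken=0x20),  # tainted branch
    Instr(0x40, "jcc", srcs=("r1",), taken=0x60, not_taken=0x44),  # untainted branch
]
targets = tainted_branch_targets(trace)
```

Only the first branch depends on input data, so only its two targets (`0x40` and `0x20`) are reported; filtering against known coverage would then remove `0x40`, which this trace already reached.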

Description

BACKGROUND
Code coverage is a measure used in software testing that indicates the degree to which the source code of a computer program has been tested. Current code-coverage tools typically use either a modified execution environment (virtualized execution) or rely on various types of execution instrumentation to instrument the entire binary code, such as by inserting code to log coverage at the start of every basic block. Each of these current methods, however, has a non-zero runtime overhead. Runtime is the period during which a computer program is executing.
Code-coverage tools often use software breakpoints to record the execution of code deemed interesting by the user. In general, a breakpoint is a means of acquiring knowledge about a program during its execution. This is normally achieved by having the programmer manually insert (by manually indicating instruction addresses/offsets, function names, and so forth) breakpoints in the code. More particularly, a breakpoint is an intentional stopping or pausing place in a program that is placed there for debugging purposes. During the pause the programmer inspects the test environment to determine whether the program is functioning as expected.
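To make the mechanics of a software breakpoint concrete, the following sketch patches a breakpoint opcode into a byte buffer and restores the original byte when the breakpoint is hit. It assumes an x86 target, where the one-byte `INT3` opcode is `0xCC`; the `BreakpointTable` class and its method names are hypothetical, and a real tool would patch the memory of the process under test through a debugger interface rather than a local `bytearray`.

```python
INT3 = 0xCC  # x86 one-byte breakpoint opcode (assumption: x86 target)

class BreakpointTable:
    """Track one-shot breakpoints patched into a code buffer."""

    def __init__(self, code: bytearray):
        self.code = code
        self.saved = {}                      # offset -> original byte

    def insert(self, offset: int) -> None:
        """Save the original byte and overwrite it with the breakpoint opcode."""
        if offset not in self.saved:
            self.saved[offset] = self.code[offset]
            self.code[offset] = INT3

    def on_hit(self, offset: int) -> None:
        """Restore the original byte so later executions run unmodified."""
        self.code[offset] = self.saved.pop(offset)

code = bytearray(b"\x90\x90\xc3")            # toy code buffer: nop, nop, ret
table = BreakpointTable(code)
table.insert(1)
patched = bytes(code)
table.on_hit(1)
restored = bytes(code)
```

Because the patched byte is restored on the first hit, a location costs nothing on later executions, which is one way to read the idea of recording that a piece of code has executed without instrumenting every block.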
One type of testing is fuzz testing. Conventional fuzz testing, or “fuzzing,” is a technique used to test for security and reliability problems in software. It is an automated or semi-automated technique that uses invalid, unexpected, or random data as inputs to a computer program. This can be achieved by mutating good input for a program into possibly bad input. For example, fuzzing may involve changing small parts of a file and delivering that content to an application in an attempt to cause the application to crash. The program then is monitored for exceptions such as crashes or failing built-in code assertions.
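A minimal sketch of mutation-based fuzzing, assuming a toy parser as the program under test: a known-good input is mutated by overwriting a few bytes, the result is delivered to the parser, and failing built-in assertions are recorded. The names `mutate`, `parse_record`, and `fuzz`, and the record format itself, are invented for this illustration.

```python
import random

def mutate(template: bytes, rng: random.Random, max_changes: int = 3) -> bytes:
    """Overwrite a few randomly chosen bytes of a non-empty template."""
    data = bytearray(template)
    for _ in range(rng.randint(1, max_changes)):
        data[rng.randrange(len(data))] = rng.randrange(256)
    return bytes(data)

def parse_record(data: bytes) -> int:
    """Toy program under test: b'AB' magic, one length byte, then the payload."""
    if data[:2] != b"AB":
        raise ValueError("bad magic")        # graceful rejection of invalid input
    length = data[2]
    payload = data[3:3 + length]
    assert len(payload) == length, "truncated payload"   # built-in code assertion
    return length

def fuzz(template: bytes, iterations: int = 200, seed: int = 0):
    """Run mutated inputs and collect those that trip the assertion."""
    rng = random.Random(seed)
    failures = []
    for _ in range(iterations):
        case = mutate(template, rng)
        try:
            parse_record(case)
        except ValueError:
            pass                             # expected rejection, not a bug
        except AssertionError:
            failures.append(case)            # the kind of failure a fuzzer reports
    return failures

template = b"AB\x03xyz"
failures = fuzz(template)
```

Every recorded failure keeps the valid magic bytes but carries a length byte larger than the available payload, which is the only way to reach the assertion in this toy parser.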
“Smart” fuzzing, which is similar to conventional fuzzing, uses knowledge of the structure of the input data or feedback from the program under test to inform test case generation. Smarter fuzzing often enhances the code coverage when delivering fuzzed content by providing input that will match the expected input data structure more closely. Smart fuzzing is usually achieved by either requiring an extensive input structure definition to be provided at the start of fuzzing or with expensive runtime instrumentation and monitoring. Creating the input structure definitions requires significant engineering time. Typical runtime instrumentation and monitoring significantly increases the time needed to execute the program under test, which significantly reduces the fuzzing throughput.
One problem with conventional fuzzing and smart fuzzing techniques is that they are only as good as the input received. Both techniques typically start with a static set of inputs and then fuzz from this static set. This means that these techniques usually are fuzzing from the same starting point. This makes it difficult for the fuzzing to get better over time. Besides the actual crashes that are detected, one challenge is how to make progress into new areas that otherwise are not covered. Detecting new coverage is desirable because it indicates an opportunity to find new bugs in the parts of the execution code previously untested through fuzzing.
One current technique that attempts to increase code coverage uses a constraint solver to try to solve the constraints generated from execution traces. In other words, during the execution trace the tool logs all conditional branches in the execution flow and derives symbolic representations of the conditional (what is being compared). The constraint solver can then try to solve the inverse of that conditional (to figure out what input would cause the alternate branch to be taken). However, this constraint solver technique is expensive, degrades performance, and has limitations on what it can solve.
Another current technique modifies the binary code being tested to insert code. This inserted code then notifies the monitoring process of the code that is actually being executed. However, this again is expensive and has the disadvantage that it modifies the binary code.
SUMMARY
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Embodiments of the code coverage-based taint perimeter detection system and method test software code by determining code coverage of the code. Embodiments of the system and method examine code coverage that has been seen across the inputs that have been executed in order to determine which tainted branch targets have never been covered. Embodiments of the system and method examine only tainted branches that have not already been covered or tested by any of the previous inputs. This makes embodiments of the system and method more efficient than existing techniques that examine branches that have already been covered and tested.
Embodiments of the code coverage-based taint perimeter detection system and method limit the scope of consideration to new code coverage that is induced by tainted input controlled by a fuzzing tool. Embodiments of the system and method use tainted data flow analysis to determine code blocks that may be executed along an execution path that have not previously been executed. As a result, in some embodiments, software breakpoints are used to detect novel code execution. A software breakpoint is an instruction that the central processing unit (CPU) recognizes as triggering a break. Software breakpoints impose no runtime overhead except at startup and when actually triggered. This eliminates the general runtime overhead of existing solutions while providing new code execution detection of sufficient fidelity to provide feedback to an intelligent fuzz generator.
In general, embodiments of the code coverage-based taint perimeter detection system and method determine tainted branches of the software code by performing tainted data flow analysis on execution traces of the code. Conditional branch instructions where the branch taken is determined from tainted input are defined as “tainted branches.” The code locations that result from the tainted branches are “tainted branch targets.” Embodiments of the system and method then identify the tainted branch targets that have not yet been covered and detect when new inputs reach the intended tainted branch targets. This is achieved by monitoring the program under test at the locations in the tainted branch targets not covered by existing inputs. In some embodiments the monitoring uses software breakpoints that are automatically placed at the locations in the tainted branch targets at runtime.
More specifically, embodiments of the system and method perform tainted data flow analysis on the execution traces to obtain tainted branch targets. The tainted branch targets are filtered and placed in a database, called a code coverage and tainted branch database. The filtering removes any tainted branch targets that are already covered.
A current taint perimeter is obtained using the data in the code coverage and tainted branch database. The current taint perimeter is the set of tainted branch targets that have not been covered with current inputs. The current taint perimeter is monitored during runtime by using the filtered tainted branch targets. In some embodiments, this monitoring yields locations in the code where software breakpoints can be inserted. These breakpoints are inserted automatically during runtime into the filtered tainted branch targets.
The monitoring process of embodiments of the system and method includes generating a new test case from a set of templates and then executing the new test case. From this execution of the new test case, it is determined whether new code coverage has been achieved. If so, then the set of templates is updated by adding the new test case to the set of templates. Moreover, the code coverage and tainted branch database is updated by adding the new code coverage to the database.
An updated taint perimeter is found from the updated set of templates and the updated database. Embodiments of the system and method then monitor the updated taint perimeter using newly filtered tainted branch targets. This iterative process continues until there are no more test cases. In this manner, embodiments of the system and method efficiently and effectively measure and extend the code coverage of the software code.
It should be noted that alternative embodiments are possible, and steps and elements discussed herein may be changed, added, or eliminated, depending on the particular embodiment. These alternative embodiments include alternative steps and alternative elements that may be used, and structural changes that may be made, without departing from the scope of the invention.
DRAWINGS DESCRIPTION
Referring now to the drawings in which like reference numbers represent corresponding parts throughout:
FIG. 1 is a block diagram illustrating a general overview of embodiments of the code coverage-based taint perimeter detection system and method implemented in a computing environment.
FIG. 2 is a flow diagram illustrating the general operation of embodiments of the code coverage-based taint perimeter detection system shown in FIG. 1.
FIG. 3 is a flow diagram illustrating the operational details of embodiments of the code coverage-based taint perimeter detection system shown in FIGS. 1 and 2.
FIG. 4 illustrates a simplified example of a general-purpose computer system on which various embodiments and elements of the code coverage-based taint perimeter detection system and method, as described herein and shown in FIGS. 1-3, may be implemented.
DETAILED DESCRIPTION
In the following description of embodiments of a code coverage-based taint perimeter detection system and method reference is made to the accompanying drawings, which form a part thereof, and in which is shown by way of illustration a specific example whereby embodiments of the code coverage-based taint perimeter detection system and method may be practiced. It is to be understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the claimed subject matter.
I. System Overview
FIG. 1 is a block diagram illustrating a general overview of embodiments of the code coverage-based taint perimeter detection system and method implemented in a computing environment. As shown in FIG. 1, embodiments of the code coverage-based taint perimeter detection system 100 and method are implemented on a computing device 110. In general, embodiments of the code coverage-based taint perimeter detection system 100 and method input software code to be tested 120, process and test the code, and then output the tested code 130.
More specifically, embodiments of the code coverage-based taint perimeter detection system 100 and method measure the blocks of code covered by a set of templates 140. A template is a sample input that covers a part of the valid input range of the code to be tested 120. One at a time these templates are used to test the code that has not yet been tested.
Embodiments of the code coverage-based taint perimeter detection system 100 and method include an execution trace module 150, a tainted analysis module 160, a filtering module 170, and a monitoring module 180. The execution trace module 150 generates the execution traces for the template being used as input to the code to be tested 120. The tainted analysis module 160 performs tainted data flow analysis on the execution traces and determines tainted branch targets.
The filtering module 170 filters the tainted branch targets to ensure that any tainted branch targets that have already been covered are not included in a set of filtered tainted branch targets. The monitoring module 180 monitors a taint perimeter that is found using the filtered tainted branch targets to ensure detection of execution of code not previously covered. In some embodiments, this monitoring is performed using breakpoints that are automatically placed into the taint perimeter at runtime.
II. Operational Overview
FIG. 2 is a flow diagram illustrating the general operation of embodiments of the code coverage-based taint perimeter detection system 100 shown in FIG. 1. Embodiments of code coverage-based taint perimeter detection system 100 efficiently and effectively test software code without modification of binary code and by limiting the monitoring of branches of the code to those branches that have not been covered.
Referring to FIG. 2, the operation of embodiments of the code coverage-based taint perimeter detection system 100 begins by inputting a template that is a sample input for the code being tested (box 200). Next, execution traces are generated for the template (box 210). This yields the execution traces that are used in a tainted flow analysis.
Embodiments of the code coverage-based taint perimeter detection system 100 then determine tainted branches of the code by performing the tainted flow analysis on the execution traces (box 220). This tainted flow analysis yields tainted branch targets. The tainted branch targets then are filtered in order to remove those tainted branches of the code that have already been covered (box 230). Filtered tainted branches are obtained from this process.
Embodiments of the system 100 then monitor a taint perimeter of the code during runtime by using the filtered tainted branch targets (box 240). As explained in detail below, in some embodiments of the system 100 software breakpoints are used to monitor the taint perimeter. This is achieved by automatically placing the breakpoints into the filtered tainted branch targets during runtime (box 250). This facilitates the efficient and effective testing of the software code.
III. Operational Details
The operational details of embodiments of the code coverage-based taint perimeter detection system 100 and method will now be discussed. This includes the four main stages of generating execution traces, determining tainted branches, filtering the tainted branches, and monitoring the taint perimeter of the code.
III.A. Generating Execution Traces
The first stage of embodiments of the code coverage-based taint perimeter detection system 100 and method is to generate execution traces. This is achieved by obtaining an existing sample input that is known as a template. The template is an input that covers a part of the valid input range of the software code being tested. Execution traces are obtained from the software code being tested with the template as input using existing methods.
FIG. 3 is a flow diagram illustrating the operational details of embodiments of the code coverage-based taint perimeter detection system 100 and method shown in FIGS. 1 and 2. The operation of the system 100 begins by inputting a set of templates (box 300). Next, a template is selected from the set of templates (box 305). Execution traces then are generated for the selected template (box 310). The execution traces represent the code and data flow of the program for the selected template. The output is the execution traces (box 315).
III.B. Determining Tainted Branches
The second stage of embodiments of the code coverage-based taint perimeter detection system 100 and method is to perform tainted data flow analysis on the execution traces to find tainted branches. Referring to FIG. 3, embodiments of the system 100 and method perform tainted data flow analysis on the execution traces to determine tainted branches (box 320). This is achieved by analyzing tainted instructions to identify tainted branches. A tainted instruction is any instruction that uses tainted data, which is any data that is controlled by or comes from an external source (such as files). Embodiments of the system 100 and method focus not on all instructions but specifically on conditional branches.
By definition, tainted branches are conditional branches where the conditional is controlled by data that flows from an external source. For example, if a section of a file is read, and then if that section of the file says “ABC” go down one path and “DEF” go down another path, then this is a tainted branch. The tainted data flow analysis is performed on each of the execution traces in order to determine each of the tainted branches.
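The definition above can be made concrete with a minimal sketch of taint propagation over a recorded trace. The trace format used here (`TraceEvent` with `read_input`, `assign`, and `branch` kinds) is a hypothetical simplification introduced only for illustration; real execution traces record machine instructions, registers, and memory.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class TraceEvent:
    """One step of a (hypothetical, simplified) execution trace."""
    kind: str                    # "read_input", "assign", or "branch"
    dest: str = ""               # variable written (read_input / assign)
    srcs: Tuple[str, ...] = ()   # variables read by this step
    address: int = 0             # address of the branch instruction
    taken: int = 0               # branch target that was executed
    not_taken: int = 0           # branch target that was not executed


def tainted_branches(trace: List[TraceEvent]) -> List[TraceEvent]:
    """Return the branch events whose condition depends on external input."""
    tainted: Set[str] = set()
    result = []
    for ev in trace:
        if ev.kind == "read_input":
            tainted.add(ev.dest)                  # data from a file is tainted
        elif ev.kind == "assign":
            if any(s in tainted for s in ev.srcs):
                tainted.add(ev.dest)              # taint flows from source to destination
            else:
                tainted.discard(ev.dest)          # overwritten with clean data
        elif ev.kind == "branch" and any(s in tainted for s in ev.srcs):
            result.append(ev)
    return result


# A file section is read and compared ("ABC" vs. "DEF"); a loop counter is not tainted.
example_trace = [
    TraceEvent("read_input", dest="section"),
    TraceEvent("assign", dest="tag", srcs=("section",)),
    TraceEvent("assign", dest="counter"),
    TraceEvent("branch", srcs=("tag",), address=0x10, taken=0x20, not_taken=0x30),
    TraceEvent("branch", srcs=("counter",), address=0x40, taken=0x50, not_taken=0x60),
]
```

In this sketch only the branch at address 0x10 is reported, because its condition flows from the file contents; its targets 0x20 and 0x30 would be the tainted branch targets.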
Referring to FIG. 3, embodiments of the system 100 and method output tainted branch targets as a result of the tainted data flow analysis (box 325). These tainted branch targets are stored in a code coverage and tainted branch database (box 330). Embodiments of the system 100 and method repeat the process of generating execution traces and determining tainted branches as long as there are more templates (box 333) to trace and analyze.
III.C. Filtering Tainted Branches
The third stage of embodiments of the code coverage-based taint perimeter detection system 100 and method is to filter the tainted branches found in the earlier stage. Filtering takes the tainted branch targets found during the second stage and strips or filters the branches that have already been covered. In other words, the tainted branch targets that have been covered within the aggregated coverage from the full template set are removed. This is filtering the tainted branch targets.
As discussed above, embodiments of the system 100 and method output tainted branch targets as a result of the tainted data flow analysis (box 325) and store them in a code coverage and tainted branch database (box 330). The current taint perimeter then is determined using data in the code coverage and tainted branch database (box 335) by removing tainted branch targets that have been covered.
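As an illustrative sketch, the code coverage and tainted branch database can be modeled as two sets that are aggregated across all templates, with the current taint perimeter computed as their difference. The class and method names below are hypothetical, and block addresses are represented as plain integers.

```python
from typing import Iterable, Set


class CoverageDatabase:
    """Hypothetical in-memory stand-in for the code coverage and tainted branch database."""

    def __init__(self) -> None:
        self.covered: Set[int] = set()          # blocks executed by any template
        self.tainted_targets: Set[int] = set()  # targets of all tainted branches seen

    def record(self, covered_blocks: Iterable[int],
               tainted_branch_targets: Iterable[int]) -> None:
        """Merge the results of tracing and analyzing one template."""
        self.covered.update(covered_blocks)
        self.tainted_targets.update(tainted_branch_targets)

    def taint_perimeter(self) -> Set[int]:
        """Tainted branch targets not yet covered by any recorded input."""
        return self.tainted_targets - self.covered
```

Because coverage is aggregated before the difference is taken, a target that one template missed but another template reached does not appear in the perimeter.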
Filtering of the tainted branches comes into play when there are new tainted branches from new inputs that are discovered during execution in the fourth stage, or the monitoring stage, as described below. When new tainted branches are found, then the filtering process is run again. Embodiments of the system 100 and method uniquely use the trace analysis to find the tainted branches and then filter the tainted branches to exclude covered blocks from all the runs. The result is that filtered tainted branches are found.
III.D. Monitoring the Taint Perimeter of the Code
The fourth stage of embodiments of the code coverage-based taint perimeter detection system 100 and method is to monitor the taint perimeter of the software code. The taint perimeter is generated using the filtered tainted branches found in the earlier stage.
With the filtered tainted branches, embodiments of the system 100 and method know both the code that will execute under test conditions and the new code that has not been seen before. Given this information, embodiments of the system 100 and method can monitor for new code coverage in a targeted way without the need to watch every block that executes when the test is actually run.
Embodiments of the system 100 and method generate a new test case from the set of templates (box 340). Next, the new test case is executed (box 345). A determination then is made as to whether there is new code coverage (box 350). If there is new code coverage, then the new test case is added to the set of templates (box 355). In addition, the generating execution traces, determining tainted branches, and filtering tainted branches stages will be executed using the new template, resulting in an updated taint perimeter based on the execution data flow from the new template. Future test cases generated in the monitoring the taint perimeter stage will benefit from this refined taint perimeter.
If there is no new code coverage, then another determination is made as to whether there are more test cases (box 370). If so, then embodiments of the system 100 and method go back to the process of generating a new test case from the set of templates (box 340). The process then continues from that point as described thus far. If there are no more test cases, then the process is completed for the time and results of the tested code are output (box 375).
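The iterative process described above can be sketched as a small loop. Everything in this sketch is a hypothetical stand-in: `toy_analyze` plays the role of both the monitored execution and the trace-plus-taint analysis for a two-byte toy program, block names are strings, and `mutate` overwrites one byte from a three-letter alphabet.

```python
import random
from typing import Callable, List, Set, Tuple

Analysis = Tuple[Set[str], Set[str]]  # (covered blocks, tainted branch targets)


def toy_analyze(data: bytes) -> Analysis:
    """Toy program under test: branches on the first two input bytes."""
    covered, targets = {"entry"}, {"A", "not_A"}
    if data[:1] == b"A":
        covered.add("A")
        targets |= {"AB", "not_AB"}          # second branch only seen on this path
        covered.add("AB" if data[1:2] == b"B" else "not_AB")
    else:
        covered.add("not_A")
    return covered, targets


def mutate(data: bytes, rng: random.Random) -> bytes:
    out = bytearray(data)
    out[rng.randrange(len(out))] = rng.choice(b"ABX")  # overwrite one byte
    return bytes(out)


def fuzz_loop(templates: List[bytes], analyze: Callable[[bytes], Analysis],
              max_cases: int, seed: int = 0) -> Tuple[List[bytes], Set[str]]:
    rng = random.Random(seed)
    templates = list(templates)
    covered: Set[str] = set()
    targets: Set[str] = set()
    for t in templates:                       # stages one to three for each template
        c, tg = analyze(t)
        covered |= c
        targets |= tg
    perimeter = targets - covered
    for _ in range(max_cases):                # stage four: monitor the perimeter
        if not perimeter:
            break
        case = mutate(rng.choice(templates), rng)
        executed = analyze(case)[0]           # stand-in for the monitored run
        if executed & perimeter:              # new coverage detected
            templates.append(case)            # test case becomes a template
            c, tg = analyze(case)             # re-trace and re-analyze the new template
            covered |= c
            targets |= tg
            perimeter = targets - covered     # updated taint perimeter
    return templates, perimeter
```

Starting from the single template `b"XX"`, the loop first has to discover an input beginning with `A` before the second branch even enters the perimeter, which mirrors how new coverage can expand the set of monitored targets.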
In some embodiments of the system 100 and method software breakpoints are used in the monitoring process. A software breakpoint is one way of monitoring when a particular piece of code actually executes. Other embodiments of the system 100 and method use other types of monitoring processes.
For embodiments that use breakpoints, embodiments of the system 100 and method automatically determine where to insert the breakpoints. Because filtered tainted branch targets are used, embodiments of the system 100 and method use a much smaller number of breakpoints than would otherwise be used.
Breakpoints are automatically inserted as follows. By definition, for a tainted conditional branch instruction, there is a side of the branch that was taken and one side that was not taken. For a given filtered tainted branch target in the set of filtered tainted branch targets identified earlier, embodiments of the system 100 and method automatically place the breakpoints at every filtered tainted branch target.
During testing, if the new code is executed for one of the filtered tainted branches and it takes the non-taken conditional branch, then embodiments of the system 100 and method send an alert from the breakpoint that the path was taken and that new coverage was achieved. Thus, breakpoints are only placed at non-taken branches of a tainted branch. Note that the non-taken branches refer to not just the path not taken in one particular input file, but the path that was not taken in any of the files in the set of templates. If both branches of a tainted conditional branch instruction were taken, then no breakpoint is set.
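The placement rule can be simulated with a minimal sketch. The class `PerimeterBreakpoints` is hypothetical and only models the bookkeeping; an actual embodiment would set the breakpoints through a debugger or operating system interface, which is not shown here. Each tainted branch is represented as a pair of target addresses.

```python
from typing import Iterable, Set, Tuple


class PerimeterBreakpoints:
    """Simulated breakpoints armed only at never-covered tainted branch targets."""

    def __init__(self, tainted_branches: Iterable[Tuple[int, int]],
                 covered: Iterable[int]) -> None:
        self.covered: Set[int] = set(covered)   # aggregated over all templates
        # Arm a breakpoint on each side of a tainted branch that no template reached.
        # A branch whose two sides were both taken contributes no breakpoint.
        self.armed: Set[int] = {
            target
            for pair in tainted_branches
            for target in pair
            if target not in self.covered
        }

    def on_block_executed(self, address: int) -> bool:
        """Return True (an alert) if executing `address` is new coverage."""
        if address in self.armed:
            self.armed.remove(address)   # the breakpoint has served its purpose
            self.covered.add(address)
            return True
        return False                     # no breakpoint here, so nothing fires
```

For example, with branches `(0x20, 0x30)` and `(0x50, 0x60)` and prior coverage of 0x20, 0x50, and 0x60, only 0x30 is armed; the second branch had both sides taken and receives no breakpoint.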
When new coverage is detected, embodiments of the system 100 and method recalculate which pieces of code have not been covered even with the new coverage. This typically is performed in an iterative manner, as explained above. Also, when new coverage is detected, embodiments of the system 100 and method generate a new execution trace and find a new set of filtered tainted branches. Thus, new coverage may either narrow the set of breakpoints already set or expand it by providing additional filtered tainted branch targets to analyze.
IV. Exemplary Operating Environment
Embodiments of the code coverage-based taint perimeter detection system 100 and method described herein are operational within numerous types of general purpose or special purpose computing system environments or configurations. FIG. 4 illustrates a simplified example of a general-purpose computer system on which various embodiments and elements of the code coverage-based taint perimeter detection system 100 and method, as described herein and shown in FIGS. 1-3, may be implemented. It should be noted that any boxes that are represented by broken or dashed lines in FIG. 4 represent alternate embodiments of the simplified computing device, and that any or all of these alternate embodiments, as described below, may be used in combination with other alternate embodiments that are described throughout this document.
For example, FIG. 4 shows a general system diagram showing a simplified computing device 10. Such computing devices can typically be found in devices having at least some minimum computational capability, including, but not limited to, personal computers, server computers, hand-held computing devices, laptop or mobile computers, communications devices such as cell phones and PDA's, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, audio or video media players, etc.
To allow a device to implement embodiments of the code coverage-based taint perimeter detection system 100 and method described herein, the device should have a sufficient computational capability and system memory to enable basic computational operations. In particular, as illustrated by FIG. 4, the computational capability is generally illustrated by one or more processing unit(s) 12, and may also include one or more GPUs 14, either or both in communication with system memory 16. Note that the processing unit(s) 12 of the general computing device may be specialized microprocessors, such as a DSP, a VLIW, or other micro-controller, or can be conventional CPUs having one or more processing cores, including specialized GPU-based cores in a multi-core CPU.
In addition, the simplified computing device of FIG. 4 may also include other components, such as, for example, a communications interface 18. The simplified computing device of FIG. 4 may also include one or more conventional computer input devices 20 (e.g., pointing devices, keyboards, audio input devices, video input devices, haptic input devices, devices for receiving wired or wireless data transmissions, etc.). The simplified computing device of FIG. 4 may also include other optional components, such as, for example, one or more conventional computer output devices 22 (e.g., display device(s) 24, audio output devices, video output devices, devices for transmitting wired or wireless data transmissions, etc.). Note that typical communications interfaces 18, input devices 20, output devices 22, and storage devices 26 for general-purpose computers are well known to those skilled in the art, and will not be described in detail herein.
The simplified computing device of FIG. 4 may also include a variety of computer readable media. Computer readable media can be any available media that can be accessed by computer 10 via storage devices 26 and includes both volatile and nonvolatile media that is either removable 28 and/or non-removable 30, for storage of information such as computer-readable or computer-executable instructions, data structures, program modules, or other data. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes, but is not limited to, computer or machine readable media or storage devices such as DVD's, CD's, floppy disks, tape drives, hard drives, optical drives, solid state memory devices, RAM, ROM, EEPROM, flash memory or other memory technology, magnetic cassettes, magnetic tapes, magnetic disk storage, or other magnetic storage devices, or any other device which can be used to store the desired information and which can be accessed by one or more computing devices.
Retention of information such as computer-readable or computer-executable instructions, data structures, program modules, etc., can also be accomplished by using any of a variety of the aforementioned communication media to encode one or more modulated data signals or carrier waves, or other transport mechanisms or communications protocols, and includes any wired or wireless information delivery mechanism. Note that the terms “modulated data signal” or “carrier wave” generally refer to a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. For example, communication media includes wired media such as a wired network or direct-wired connection carrying one or more modulated data signals, and wireless media such as acoustic, RF, infrared, laser, and other wireless media for transmitting and/or receiving one or more modulated data signals or carrier waves. Combinations of any of the above should also be included within the scope of communication media.
Further, software, programs, and/or computer program products embodying some or all of the various embodiments of the code coverage-based taint perimeter detection system 100 and method described herein, or portions thereof, may be stored, received, transmitted, or read from any desired combination of computer or machine readable media or storage devices and communication media in the form of computer executable instructions or other data structures.
Finally, embodiments of the code coverage-based taint perimeter detection system 100 and method described herein may be further described in the general context of computer-executable instructions, such as program modules, being executed by a computing device. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types. The embodiments described herein may also be practiced in distributed computing environments where tasks are performed by one or more remote processing devices, or within a cloud of one or more devices, that are linked through one or more communications networks. In a distributed computing environment, program modules may be located in both local and remote computer storage media including media storage devices. Still further, the aforementioned instructions may be implemented, in part or in whole, as hardware logic circuits, which may or may not include a processor.
Moreover, although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (20)

What is claimed is:
1. A method implemented by at least one computing device, the method comprising:
generating multiple execution traces of software code using a set of multiple different inputs to the software code;
determining tainted branch targets by performing data flow analysis on the multiple execution traces of the software code, the tainted branch targets being associated with tainted conditional branches in the software code;
filtering the tainted branch targets to identify a taint perimeter of the software code, the taint perimeter comprising a subset of the tainted branch targets that have not been covered by the multiple different inputs;
automatically placing breakpoints into the taint perimeter during runtime of the software code while the software code is currently executing on the at least one computing device;
upon triggering of an individual breakpoint in the taint perimeter when executing the software code using a particular input, detecting that new code from the software code has been covered by the particular input;
generating a new execution trace for the new code;
performing additional data flow analysis on the new execution trace to identify additional tainted branch targets in the new code;
filtering the additional tainted branch targets to identify an updated taint perimeter of the software code;
automatically placing a new breakpoint into the updated taint perimeter; and
upon triggering of the new breakpoint when executing the software code using a further input, detecting that further new code from the software code has been covered by the further input.
2. The method of claim 1, further comprising monitoring the breakpoints in the taint perimeter to detect when the new code is covered.
3. The method of claim 1, wherein the set of multiple different inputs comprises a set of templates that are existing valid inputs for the software code.
4. The method of claim 3, further comprising generating a new test case from the set of templates, the new test case comprising the further input.
5. The method of claim 4, further comprising:
responsive to detecting that the further new code is covered by the further input, adding the further input to the set of templates.
6. The method of claim 5, further comprising adding the new code and the further new code to a code coverage and tainted branch database.
7. The method of claim 1, wherein the filtering the tainted branch targets comprises removing individual tainted conditional branch targets that have already been covered from the taint perimeter.
8. The method of claim 1, wherein the determining the tainted branch targets comprises determining that the tainted branch targets are controlled by external data and excluding, from the tainted branch targets, some other branch targets in the software code that are not controlled by the external data.
9. At least one computer-readable volatile memory, non-volatile memory, hard drive, or optical disk storing computer-executable instructions which, when executed by at least one processing unit, cause the at least one processing unit to perform acts comprising:
generating one or more execution traces of software code using a set of multiple different inputs to the software code;
determining tainted branch targets by performing data flow analysis on the one or more execution traces of the software code, the tainted branch targets being associated with one or more tainted conditional branches in the software code;
filtering the tainted branch targets to identify a taint perimeter of the software code, the taint perimeter comprising a subset of the tainted branch targets that have not been covered by the multiple different inputs;
automatically placing breakpoints into the taint perimeter during runtime of the software code;
upon triggering of a first breakpoint in the taint perimeter using a first input, detecting that additional code from the software code has been covered by the first input;
generating another execution trace for the additional code;
determining additional tainted branch targets in the additional code by performing additional data flow analysis on the another execution trace;
filtering the additional tainted branch targets to identify an updated taint perimeter of the software code;
automatically placing a second breakpoint into the updated taint perimeter; and
upon triggering of the second breakpoint when executing the software code using a second input, detecting that further additional code from the software code has been covered by the second input.
10. The at least one computer-readable volatile memory, non-volatile memory, hard drive, or optical disk of claim 9, the acts further comprising:
placing the subset of the tainted branch targets in a code coverage and tainted branch database; and
determining the taint perimeter using the code coverage and tainted branch database.
11. The at least one computer-readable volatile memory, non-volatile memory, hard drive, or optical disk of claim 9, wherein the set of multiple different inputs is a set of templates that are existing valid inputs for the software code.
12. The at least one computer-readable volatile memory, non-volatile memory, hard drive, or optical disk of claim 11, the acts further comprising:
adding the second input to the set of templates responsive to detecting that the further additional code has been covered by the second input.
13. The at least one computer-readable volatile memory, non-volatile memory, hard drive, or optical disk of claim 9, wherein the data flow analysis comprises excluding, from the tainted branch targets, some other branch targets in the software code that are not controlled by external data.
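One way to picture the exclusion in claim 13 is a forward taint pass over a recorded trace: values derived from external input are marked tainted, and only conditional branches whose condition is tainted contribute branch targets. The sketch below works on a toy trace format; the `Assign` and `Branch` event types, the variable names, and the block labels are assumptions made for illustration and are not the trace format of the patent.

```python
from typing import NamedTuple


class Assign(NamedTuple):
    dst: str       # variable written
    srcs: tuple    # variables read (empty for a constant)


class Branch(NamedTuple):
    cond: str      # variable the conditional branch tests
    targets: tuple # (taken target, fall-through target)


def tainted_branch_targets(trace, taint_sources):
    """Collect targets of branches whose condition depends on external data."""
    tainted = set(taint_sources)
    targets = set()
    for event in trace:
        if isinstance(event, Assign):
            if any(src in tainted for src in event.srcs):
                tainted.add(event.dst)
            else:
                tainted.discard(event.dst)  # overwritten with untainted data
        elif isinstance(event, Branch) and event.cond in tainted:
            targets.update(event.targets)
    return targets


trace = [
    Assign("length", ("input_byte0",)),  # derived from external input
    Assign("counter", ()),               # constant, not externally controlled
    Branch("length", ("B1", "B2")),      # tainted conditional branch
    Branch("counter", ("B3", "B4")),     # excluded: condition is not tainted
]
result = tainted_branch_targets(trace, {"input_byte0"})
print(sorted(result))
```

Both targets of a tainted branch are collected here, including the one the trace actually followed; the later filtering step is what removes targets that are already covered.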
14. A system comprising:
one or more processing units; and
at least one computer-readable volatile memory, non-volatile memory, hard drive, or optical disk storing computer-executable instructions which, when executed by the one or more processing units, cause the one or more processing units to:
obtain first execution traces of software code, the first execution traces reflecting execution of the software code using multiple inputs;
using first data flow analysis on the first execution traces, determine first tainted branch targets in the software code;
filter the first tainted branch targets to identify a first taint perimeter comprising a subset of the first tainted branch targets that have not been covered by the multiple inputs;
automatically place a first breakpoint in the first taint perimeter;
responsive to the first breakpoint in the first taint perimeter being triggered by an additional input, detect that the additional input causes additional code of the software code to be executed;
obtain a second execution trace of the additional code of the software code;
using second data flow analysis on the second execution trace, determine second tainted branch targets in the additional code of the software code;
filter the second tainted branch targets to identify a second taint perimeter comprising a subset of the second tainted branch targets that have not been covered by the multiple inputs and the additional input;
automatically place a second breakpoint in the second taint perimeter; and
responsive to the second breakpoint in the second taint perimeter being triggered by a further additional input, detect that the further additional input causes further additional code of the software code to be executed.
15. The system of claim 14, embodied as a single computer.
16. The system of claim 14, wherein the computer-executable instructions cause the one or more processing units to:
perform both the first data flow analysis and the second data flow analysis, the first data flow analysis comprising evaluating first data flow through the software code in the first execution traces and the second data flow analysis comprising evaluating second data flow through the software code in the second execution trace.
17. The system of claim 14, wherein the computer-executable instructions cause the one or more processing units to:
iteratively refine the second taint perimeter by inserting subsequent breakpoints into the software code, executing the software code using subsequent inputs, and identifying subsequent additional code coverage when the subsequent inputs trigger the subsequent breakpoints.
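The iterative refinement in claim 17 can be sketched as a loop that recomputes the perimeter each time an input lands inside it. In the toy model below, `run_traced` is a hypothetical stand-in for tracing plus data flow analysis, and the check `executed & perimeter` stands in for a breakpoint firing. A real system would rely on the breakpoints precisely so that it does not have to trace every run, so this is a simplification under stated assumptions, not the patented implementation.

```python
def run_traced(data: bytes):
    """Toy target: return (blocks executed, tainted branch targets observed)."""
    executed, tainted = {"entry"}, {"A", "not_A"}
    if data[:1] == b"A":
        executed.add("A")
        tainted |= {"B", "not_B"}
        executed.add("B" if data[1:2] == b"B" else "not_B")
    else:
        executed.add("not_A")
    return executed, tainted


def refine(templates, candidates):
    """Grow the template set whenever a candidate input hits the perimeter."""
    templates = list(templates)  # do not mutate the caller's list
    covered, tainted = set(), set()
    for template in templates:
        executed, targets = run_traced(template)
        covered |= executed
        tainted |= targets
    perimeter = tainted - covered
    for candidate in candidates:
        executed, targets = run_traced(candidate)
        if executed & perimeter:  # a simulated breakpoint in the perimeter fired
            templates.append(candidate)
            covered |= executed
            tainted |= targets
            perimeter = tainted - covered  # updated taint perimeter
    return templates, perimeter


templates, perimeter = refine([b"xx"], [b"xy", b"Ax", b"AB"])
print(templates, perimeter)
```

In this run the input `b"xy"` is discarded because it reaches nothing new, while `b"Ax"` and `b"AB"` each hit the current perimeter and are kept as new templates, in the manner of claim 12.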
18. The system of claim 14, wherein the computer-executable instructions cause the one or more processing units to:
place multiple third breakpoints into a third taint perimeter, the third taint perimeter comprising the further additional code.
19. The system of claim 14, wherein the additional code is not executed in any of the first execution traces.
20. The system of claim 19, wherein the further additional code is not executed in any of the first execution traces and also is not executed in the second execution trace.
US13/115,985 2011-05-25 2011-05-25 Code coverage-based taint perimeter detection Active 2031-11-06 US8839203B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/115,985 US8839203B2 (en) 2011-05-25 2011-05-25 Code coverage-based taint perimeter detection

Publications (2)

Publication Number Publication Date
US20120304010A1 US20120304010A1 (en) 2012-11-29
US8839203B2 US8839203B2 (en) 2014-09-16

Family

ID=47220088

Family Applications (1)

Application Number Title Priority Date Filing Date
US13/115,985 Active 2031-11-06 US8839203B2 (en) 2011-05-25 2011-05-25 Code coverage-based taint perimeter detection

Country Status (1)

Country Link
US (1) US8839203B2 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140108867A1 (en) * 2012-03-14 2014-04-17 Nec Laboratories America, Inc. Dynamic Taint Analysis of Multi-Threaded Programs
US9436586B1 (en) * 2013-10-04 2016-09-06 Ca, Inc. Determining code coverage on Z/OS® server
US9934133B1 (en) 2016-09-19 2018-04-03 International Business Machines Corporation Code coverage through overlay hooks
CN108255711A (en) * 2017-12-29 2018-07-06 湖南优利泰克自动化系统有限公司 A kind of PLC firmware fuzz testing systems and test method based on stain analysis
US10915683B2 (en) 2018-03-08 2021-02-09 Synopsys, Inc. Methodology to create constraints and leverage formal coverage analyzer to achieve faster code coverage closure for an electronic structure
US11061809B2 (en) * 2019-05-29 2021-07-13 Red Hat, Inc. Software debugging system with improved test execution and log file tracking

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9710356B2 (en) * 2011-09-19 2017-07-18 International Business Machines Corporation Assertions in a business rule management system
JP5785474B2 (en) * 2011-10-27 2015-09-30 インターナショナル・ビジネス・マシーンズ・コーポレーションInternational Business Machines Corporation Program debugging method, debugging apparatus, and debugging support GUI
US8899343B2 (en) * 2012-09-12 2014-12-02 International Business Machines Corporation Replacing contiguous breakpoints with control words
CN104572424B (en) * 2013-10-09 2018-04-20 阿里巴巴集团控股有限公司 Test method and device
WO2015080742A1 (en) 2013-11-27 2015-06-04 Hewlett-Packard Development Company, L.P. Production sampling for determining code coverage
WO2015147690A1 (en) * 2014-03-28 2015-10-01 Oracle International Corporation System and method for determination of code coverage for software applications in a network environment
US9552284B2 (en) * 2015-05-15 2017-01-24 Fujitsu Limited Determining valid inputs for an unknown binary program
US10346237B1 (en) * 2015-08-28 2019-07-09 EMC IP Holding Company LLC System and method to predict reliability of backup software
US10114737B2 (en) * 2015-09-14 2018-10-30 Salesforce.Com, Inc. Methods and systems for computing code coverage using grouped/filtered source classes during testing of an application
CN105955877B (en) * 2016-04-19 2017-03-29 西安交通大学 A kind of dynamic parallel program stain analysis method based on sign computation
US20180004635A1 (en) * 2016-06-30 2018-01-04 Fujitsu Limited Input discovery for unknown program binaries
US9990272B2 (en) 2016-08-03 2018-06-05 International Business Machines Corporation Test case generation for uncovered code paths
US11237946B2 (en) * 2018-05-03 2022-02-01 Sap Se Error finder tool
CN108932199B (en) * 2018-07-09 2020-11-17 南京网觉软件有限公司 Automatic taint analysis system based on user interface analysis
CN109542789B (en) * 2018-11-26 2022-03-25 泰康保险集团股份有限公司 Code coverage rate statistical method and device
US11782816B2 (en) 2019-03-19 2023-10-10 Jens C. Jenkins Input/output location transformations when emulating non-traced code with a recorded execution of traced code
US11281560B2 (en) 2019-03-19 2022-03-22 Microsoft Technology Licensing, Llc Input/output data transformations when emulating non-traced code with a recorded execution of traced code
US10713151B1 (en) * 2019-04-18 2020-07-14 Microsoft Technology Licensing, Llc Program execution coverage expansion by selective data capture
CN110727598B (en) * 2019-10-16 2022-03-04 西安电子科技大学 Binary software vulnerability detection system and method based on dynamic taint tracking
CN111639019B (en) * 2020-04-24 2023-08-25 北京五八信息技术有限公司 Code testing method, device and readable storage medium
US11561888B2 (en) * 2020-10-26 2023-01-24 Diffblue Ltd Initialization sequences for automatic software test generation

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5050168A (en) * 1989-12-29 1991-09-17 Paterson Timothy L Test coverage analyzer
US20020104074A1 (en) 2001-01-26 2002-08-01 Robert Hundt Providing debugging capability for program instrumented code
US6536036B1 (en) * 1998-08-20 2003-03-18 International Business Machines Corporation Method and apparatus for managing code test coverage data
US20030121011A1 (en) 1999-06-30 2003-06-26 Cirrus Logic, Inc. Functional coverage analysis systems and methods for verification test suites
US20030204836A1 (en) * 2002-04-29 2003-10-30 Microsoft Corporation Method and apparatus for prioritizing software tests
US20060225051A1 (en) 2005-04-05 2006-10-05 Cisco Technology, Inc. Method and system for code coverage
US7581209B2 (en) 2005-02-28 2009-08-25 Microsoft Corporation Method for determining code coverage

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5050168A (en) * 1989-12-29 1991-09-17 Paterson Timothy L Test coverage analyzer
US6536036B1 (en) * 1998-08-20 2003-03-18 International Business Machines Corporation Method and apparatus for managing code test coverage data
US20030121011A1 (en) 1999-06-30 2003-06-26 Cirrus Logic, Inc. Functional coverage analysis systems and methods for verification test suites
US20020104074A1 (en) 2001-01-26 2002-08-01 Robert Hundt Providing debugging capability for program instrumented code
US20030204836A1 (en) * 2002-04-29 2003-10-30 Microsoft Corporation Method and apparatus for prioritizing software tests
US7581209B2 (en) 2005-02-28 2009-08-25 Microsoft Corporation Method for determining code coverage
US20060225051A1 (en) 2005-04-05 2006-10-05 Cisco Technology, Inc. Method and system for code coverage

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
"PyDbg", Retrieved at <<http://pedram.redhive.com/PyDbg/docs/>>, Retrieved Date: Feb. 16, 2011, pp. 1-3.
Artho, et al., "Combining Test Case Generation and Runtime Verification", Theoretical Computer Science-Abstract state machines and high-level system design and analysis, vol. 336, Issue 02-03, May 26, 2005, pp. 209-234.
Cornett, Steve, "Code Coverage Analysis", Retrieved at <<http://www.bullseye.com/coverage.html>>, Retrieved Date: Jan. 11, 2006, pp. 1-12.
Godefroid, et al., "Automated Whitebox Fuzz Testing", Network Distributed Security Symposium (NDSS), Internet Society, Apr. 3, 2008, pp. 1-16.
Tikir, et al. "Efficient Instrumentation for Code Coverage Testing", Proceedings of the 2002 ACM SIGSOFT international symposium on Software testing and analysis, vol. 27, Issue 04, Jul. 2002, pp. 1-11.

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140108867A1 (en) * 2012-03-14 2014-04-17 Nec Laboratories America, Inc. Dynamic Taint Analysis of Multi-Threaded Programs
US9436586B1 (en) * 2013-10-04 2016-09-06 Ca, Inc. Determining code coverage on Z/OS® server
US9934133B1 (en) 2016-09-19 2018-04-03 International Business Machines Corporation Code coverage through overlay hooks
US10169212B2 (en) 2016-09-19 2019-01-01 International Business Machines Corporation Code coverage through overlay hooks
US10176083B2 (en) 2016-09-19 2019-01-08 International Business Machines Corporation Code coverage through overlay hooks
US10417112B2 (en) 2016-09-19 2019-09-17 International Business Machines Corporation Code coverage through overlay hooks
CN108255711A (en) * 2017-12-29 2018-07-06 湖南优利泰克自动化系统有限公司 A kind of PLC firmware fuzz testing systems and test method based on stain analysis
US10915683B2 (en) 2018-03-08 2021-02-09 Synopsys, Inc. Methodology to create constraints and leverage formal coverage analyzer to achieve faster code coverage closure for an electronic structure
US11061809B2 (en) * 2019-05-29 2021-07-13 Red Hat, Inc. Software debugging system with improved test execution and log file tracking

Also Published As

Publication number Publication date
US20120304010A1 (en) 2012-11-29

Similar Documents

Publication Publication Date Title
US8839203B2 (en) Code coverage-based taint perimeter detection
US8510842B2 (en) Pinpointing security vulnerabilities in computer software applications
Ma et al. Accurate, low cost and instrumentation-free security audit logging for windows
Bao et al. Execution anomaly detection in large-scale systems through console log analysis
US8776029B2 (en) System and method of software execution path identification
Soltani et al. A guided genetic algorithm for automated crash reproduction
US9355003B2 (en) Capturing trace information using annotated trace output
US10474565B2 (en) Root cause analysis of non-deterministic tests
US20140380280A1 (en) Debugging tool with predictive fault location
JP2006185211A (en) Program analysis system, test execution device, and analysis method and program thereof
CN109635568B (en) Concurrent vulnerability detection method based on combination of static analysis and fuzzy test
Doyle et al. An empirical study of the evolution of PHP web application security
Chen et al. A large-scale empirical study on control flow identification of smart contracts
CN110851352A (en) Fuzzy test system and terminal equipment
Azim et al. Dynamic slicing for android
Park et al. unicorn: a unified approach for localizing non‐deadlock concurrency bugs
Peters et al. How does migrating to kotlin impact the run-time efficiency of android apps?
Meinicke et al. Understanding differences among executions with variational traces
Gauthier et al. Experience: Model-Based, Feedback-Driven, Greybox Web Fuzzing with BackREST
Fedorova et al. Performance comprehension at WiredTiger
Weng et al. Argus: Debugging performance issues in modern desktop applications with annotated causal tracing
Gauthier et al. Backrest: A model-based feedback-driven greybox fuzzer for web applications
DeMott et al. Towards an automatic exploit pipeline
Xu et al. Real-Time Diagnosis of Configuration Errors for Software of AI Server Infrastructure
KR102165747B1 (en) Lightweight crash report based debugging method considering security

Legal Events

Date Code Title Description
AS Assignment

Owner name: MICROSOFT CORPORATION, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:OPSTAD, EDWIN LARS;RENK, ANDREW;MARGOLIS, DANIEL;SIGNING DATES FROM 20110523 TO 20110525;REEL/FRAME:026435/0959

STCF Information on status: patent grant

Free format text: PATENTED CASE

AS Assignment

Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034544/0001

Effective date: 20141014

AS Assignment

Owner name: ZHIGU HOLDINGS LIMITED, CAYMAN ISLANDS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT TECHNOLOGY LICENSING, LLC;REEL/FRAME:040354/0001

Effective date: 20160516

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551)

Year of fee payment: 4

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 8