CN116893957A

CN116893957A - Fuzzing using software coverage feedback through dynamic instrumentation based on connectivities of instruction blocks in a control flow graph

Info

Publication number: CN116893957A
Application number: CN202310357493.3A
Authority: CN
Inventors: C. Huth; M. C. Eisele
Original assignee: Robert Bosch GmbH
Current assignee: Robert Bosch GmbH
Priority date: 2022-04-05
Filing date: 2023-04-04
Publication date: 2023-10-17
Also published as: DE102022203356A1

Abstract

The present disclosure relates to a computer-implemented method for obtaining software coverage feedback when fuzzing software on a hardware target, the hardware target having at least one breakpoint register and being designed to stop execution of the software before an instruction is executed if that instruction is reached during execution of the software and the memory address of the instruction is set in the at least one breakpoint register. The method comprises: selecting a first instruction block of the software; setting a first breakpoint in the at least one breakpoint register before an instruction of the first instruction block; performing or continuing a fuzzing iteration of the software for the first time; checking for the first time whether the first breakpoint is reached during the first performing or continuing of the fuzzing iteration; and storing first log information, which comprises that the first instruction block has been reached in the fuzzing iteration if the first check is affirmative. The software coverage feedback includes the first log information. Selecting the first instruction block may be based on the connectivities, optionally the connectivity weights, of the instruction blocks in the control flow graph of the software.

Description

Fuzzing using software coverage feedback through dynamic instrumentation based on connectivities of instruction blocks in a control flow graph

Background

Fuzzing (also: fuzz testing) is an automated technique for testing software. Invalid, unexpected and/or random input data is used to execute the software in multiple fuzzing iterations, while the software is monitored for exceptions such as crashes, failed built-in code assertions, potential memory leaks, etc. For software whose input data must be present in a predetermined data structure, a fuzzer designed for that data structure may be used. The predetermined data structure is specified, for example, as a file format and/or protocol. An (efficient) fuzzer is designed to generate invalid, unexpected and/or random input data in the predetermined data structure, such that execution of a respective fuzzing iteration of the software can be started from the input data without parsing errors. Fuzzing makes it possible, particularly in complex software such as software for controlling, regulating and/or monitoring technical systems, to identify unexpected behavior such as unexpected (program) paths, programming errors and edge cases. With the better understanding of the software obtained in this way, the software, and in particular its security, can be improved.

A fuzzing target may be software (e.g., a program) and/or a portion of software (e.g., a function) that is to be tested by fuzzing. The fuzzing target may accept potentially untrusted input data, which the fuzzer generates for a plurality of fuzzing iterations. In this respect, fuzzing may be regarded as an automated process in which arbitrary, and in particular invalid, unexpected and/or random, input data is sent to the fuzzing target and the response of the fuzzing target is observed during execution of the fuzzing iteration. A fuzzer (or fuzzing engine) is a computer program designed to automatically generate input data for the fuzzing target for each fuzzing iteration. The fuzzer is not part of the fuzzing target but is independent of it. Typically, the fuzzer is not instrumented. Known fuzzers include, for example, AFL and libFuzzer. The combination of a fuzzing target and its associated fuzzer may be referred to as a fuzz test. The fuzz test is executable. The fuzzer may generate different input data for a plurality of fuzzing iterations, for example hundreds or thousands of fuzzing iterations per second, and may start, observe and, if necessary, stop the fuzz test with the respective input data. A fuzzing iteration comprises executing the fuzzing target/software based on the input data generated for that iteration. By storing the respective input data, in particular when unexpected behavior of the software (e.g., a previously unknown path and/or a programming error) is identified during a fuzzing iteration, the fuzzing iteration can be reproduced at a later time; the fuzzing target can then be executed without a fuzzer, based on the stored input data.

During execution of the fuzz test, information may be output from the fuzzing target. Such fuzzing guided by software coverage (coverage-guided fuzzing) can advantageously be used to discover previously unknown paths/blocks and/or to locate programming errors in the software. Software coverage feedback during fuzzing may be implemented, for example, by static instrumentation of the fuzzing target, as in AFL. With static instrumentation, the fuzzing target, i.e. the software, is modified (e.g., at compile time) so that information about, for example, the most recently executed instructions and/or (program) paths can be retrieved while the software executes, in particular during the fuzzing iterations. Alternatively or additionally, software coverage feedback may be obtained through dynamic instrumentation. In this case, execution of the software is controlled at runtime by system functions and/or emulators in order to obtain information about the control flow in the software. Software coverage feedback through dynamic instrumentation is particularly advantageous when the software is available only in compiled form (closed source).

JinSeok Oh, Sungyu Kim, Eunji Jeong, and Soo-Mook Moon, "OS-less dynamic binary instrumentation for embedded firmware", in 2015 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS XVIII), pages 1-3, IEEE, 2015, disclose dynamic instrumentation via a debugger and breakpoints that are based on software interrupts. Here, the binary code of the software is actively modified: an instruction is replaced by a software interrupt instruction.

Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd, "The PageRank citation ranking: Bringing order to the web", Technical Report, Stanford InfoLab, 1999, disclose an algorithm for calculating PageRank. The publication teaches that the importance of a web page is an inherently subjective matter that depends on the interests, knowledge and attitudes of its readers, but that much can nevertheless be said objectively about the relative importance of web pages. PageRank is a method for rating web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them. PageRank can be compared to an idealized random web surfer. Furthermore, PageRank can be computed efficiently for a large number of pages. The publication also shows how to apply PageRank to search and user navigation.

Disclosure of Invention

A first general aspect of the present disclosure relates to a computer-implemented method for obtaining software coverage feedback when fuzzing software on a hardware target, wherein the hardware target has at least one breakpoint register and is designed to stop execution of the software before an instruction is executed if that instruction is reached during execution of the software and the memory address of the instruction is set in the at least one breakpoint register. The method comprises selecting a first instruction block of the software. The method further comprises setting a first breakpoint in the at least one breakpoint register before an instruction of the first instruction block. The method further comprises performing for the first time, or continuing for the first time, a fuzzing iteration of the software. The method further comprises checking for the first time whether the first breakpoint is reached during the first performing or first continuing of the fuzzing iteration. The method further comprises storing first log information, the first log information comprising that the first instruction block has been reached in the fuzzing iteration if the first check is affirmative. The method may comprise deleting the first breakpoint if the first check is affirmative. The software coverage feedback when fuzzing the software includes the first log information. The selection of the first instruction block of the software may be based on the connectivities of the instruction blocks in a control flow graph of the software. Alternatively or additionally, the selection of the first instruction block of the software may be based on the connectivity weights of the instruction blocks in a control flow graph of the software.

A second general aspect of the present disclosure relates to a computer system designed to perform the computer-implemented method according to the first general aspect (or an embodiment thereof) for obtaining software coverage feedback when fuzzing software on a hardware target.

A third general aspect of the present disclosure relates to a computer program designed to perform the computer-implemented method according to the first general aspect (or an embodiment thereof) for obtaining software coverage feedback when fuzzing software on a hardware target.

A fourth general aspect of the present disclosure relates to a computer readable medium or signal storing and/or containing a computer program according to the third general aspect (or embodiments thereof).

The method according to the first aspect (or an embodiment thereof) presented in this disclosure aims to obtain software coverage feedback when fuzzing software on a hardware target, especially when the software cannot be statically instrumented, either in itself or on the hardware target.

For computer systems such as desktop systems (PCs, etc.), and especially for software available as source code (e.g., open source), static instrumentation is a known method for obtaining software coverage feedback when executing the software, in particular during fuzzing iterations of the software.

However, for software that executes on embedded systems, i.e. systems embedded in a technical context, static instrumentation for fuzzing can be difficult for the following reasons. The fuzzer required for fuzzing typically has to run on a different computer (e.g., because the embedded system lacks computing and/or storage capacity). It follows that software coverage feedback must first be transferred from the embedded system to the fuzzer (or to the other computer). Typically, the entire system, which may comprise several components, must also be tested. The software of the overall system may contain third-party libraries and software components from other suppliers and/or customers. Such software components are usually supplied as binaries (i.e., in compiled form, also called binary code) that cannot be modified, or can be modified only with great effort. Because such software components are not compiled again, they can be statically instrumented only with great effort. Software (or software components) available as source code, on the other hand, can easily be statically instrumented at compile time. However, static instrumentation always increases the size of the software, so that, given the typically limited resources, the statically instrumented software often no longer fits into the memory of the embedded system. The same applies to other functions by which the software might be extended for fuzzing.

In an alternative approach, the software of the embedded system may be executed in an emulator such as QEMU. In this case, the transparency and configurability of the emulator can be used to provide software coverage feedback during fuzzing. Unfortunately, setting up such an emulator for a specific hardware target requires considerable effort and expense, because the functionality of the software of an embedded system typically depends on the availability of external hardware components such as sensors and/or actuators. If such hardware components are missing in the emulator, the software cannot be tested under realistic conditions. In that case, the behavior of the software will differ from its behavior within the technical system for which it was designed. For example, the software may traverse different paths that are only of limited relevance to its actual purpose. Using an emulator to obtain software coverage during fuzzing is therefore not appropriate.

If a debugger is connected to the embedded system, breakpoints (set via at least one hardware breakpoint register) may be used to stop execution of the software in the embedded system at a target location in the code. However, nearly complete software coverage feedback cannot be obtained by setting breakpoints arbitrarily, because the number of simultaneously active breakpoints is usually severely limited. For example, an ARM Cortex-M0 microcontroller is designed for up to four simultaneously active hardware breakpoints.

The method according to the first aspect (or an embodiment thereof) presented in this disclosure is particularly suitable for fuzzing on hardware targets/embedded systems and for the (typical) case in which the number of breakpoints is severely limited. As already explained, fuzzing on a (real) hardware target, unlike fuzzing in an emulator, tests the software more realistically and therefore better. For this method it is sufficient that the hardware target allows at least one breakpoint. Starting, for example, from a start instruction that is usually specified, the method presented in this disclosure, and in particular the systematic approaching of instructions by strategically setting the at least one breakpoint, makes it possible to determine and/or test the control flow graph of the software at least for the part of the software covered by the fuzzing iterations. If unexpected behavior occurs during a fuzzing iteration, the software coverage feedback can be used to improve the software, and in particular its security, by modifying the software. The functionality, and in particular the security, of the embedded system controlled, regulated and/or monitored by the software can thereby also be improved.

In view of the present disclosure, the prior-art method of dynamic instrumentation via a debugger and breakpoints based on software interrupts (Oh et al., supra) could likewise be used to obtain software coverage feedback when fuzzing software on a hardware target. For example, the described replacement of an instruction by a software interrupt instruction could be repeated multiple times in order to set different breakpoints. However, the memory must be rewritten at each repetition, which can lead to considerable overhead when the microcontroller uses EEPROM/flash memory. Obtaining software coverage feedback via software interrupts when fuzzing software on a hardware target thus becomes inefficient or even impractical.

In contrast, in the method presented in this disclosure, hardware breakpoints are set (via at least one hardware breakpoint register). This has proven advantageous because hardware breakpoints can be set without significant overhead. Software coverage feedback when fuzzing software on a hardware target can therefore be obtained efficiently and thus quickly.

Furthermore, the method presented in this disclosure is also suitable for use cases in which the software on the hardware target is executed from read-only memory (ROM) (e.g., for testing existing products). In that case, instructions in the binary code of the software could be replaced by software interrupt instructions only at considerable expense, if at all.

In the method according to the first aspect (or an embodiment thereof) presented in this disclosure, the strategic setting (or selection) of at least one breakpoint, or for example of one breakpoint for each available hardware breakpoint register (a hardware target may have, for example, one, two, three, four, more than four, eight or more than eight hardware breakpoint registers), is based on the connectivities, in particular the connectivity weights, of the instruction blocks in the control flow graph. In one embodiment of the method, strategically setting the at least one breakpoint may comprise, for example, successively setting the breakpoint before instructions of instruction blocks that are each highly connected to other instruction blocks in the control flow graph, i.e. that have a high connectivity, in particular a high connectivity weight. Compared with successively setting breakpoints before neighboring instructions, for example in an abstract syntax tree of the software, or even before arbitrary instructions of the software, and particularly when the software is sufficiently complex (a condition that is almost always met in practice), strategically setting breakpoints based on the (high) connectivities, in particular the (high) connectivity weights, of the instruction blocks in the control flow graph significantly increases the probability of finding instruction blocks that are of interest and/or importance to the developer or user of the software. Fuzzing thereby becomes significantly more efficient, better and/or faster. As a result, the software, and in particular the system controlled, regulated and/or monitored by the software, can be improved.

In an embodiment of the method according to the first aspect (or an embodiment thereof) presented in this disclosure, the strategic setting (or selection) of the at least one breakpoint, or for example of one breakpoint for each available hardware breakpoint register, may on the one hand be guided, at least as a tendency, by the (high) connectivities, in particular the (high) connectivity weights, of the instruction blocks and may on the other hand be based on a probabilistic selection of the instruction blocks. For this purpose, a probability measure may be calculated, for example, that assigns (normalized) connectivity weights to the instruction blocks of the control flow graph or of a portion thereof. Selecting the instruction block in or before which the at least one breakpoint is to be set may then comprise drawing an instruction block once or several times according to the probability measure. In other words, the instruction block may be chosen by rolling dice according to the probability measure, in the manner of a Monte Carlo method. The advantage is that instruction blocks with a high connectivity/connectivity weight are selected preferentially, yet at least pseudo-randomly, so that less strongly connected instruction blocks can still be selected from time to time. The randomness helps, in particular, to ensure that the control flow graph is covered well during fuzzing, i.e. that the selection of instruction blocks is not limited to a small portion of the control flow graph. Fuzzing thereby also becomes significantly more efficient, better and/or faster. As a result, the software, and in particular the system controlled, regulated and/or monitored by the software, can be improved.

Drawings

FIG. 1a schematically illustrates a computer-implemented method for obtaining software coverage feedback when fuzzing software on a hardware target.

FIG. 1b schematically illustrates a continuation of the computer-implemented method for obtaining software coverage feedback when fuzzing software on a hardware target.

FIG. 2 schematically illustrates an exemplary embodiment of a computer-implemented method for obtaining software coverage feedback when fuzzing software on a hardware target.

FIG. 3 shows, by way of example and schematically, a control flow graph of software with instruction blocks and directed edges.

Detailed Description

The method 100 presented in this disclosure makes it possible to obtain software coverage feedback when fuzzing software on a hardware target. The hardware target may be, for example, an electronic control unit, and the software may be designed to control, regulate and/or monitor the electronic control unit.

The method 100 presented in this disclosure may be particularly suited to cases in which the software is not statically instrumented for fuzzing. Furthermore, the software may be (fully or partially) closed source. Instead, software coverage feedback can be obtained through dynamic instrumentation when fuzzing the software.

With a debug connection to the hardware target, software coverage feedback can be obtained during fuzzing, and in particular during a fuzzing iteration, for example as follows and as illustrated by way of example in FIG. 2. First, for example before a fuzzing iteration is performed, a zeroth breakpoint may be set before a start instruction (function to instrument). The start instruction may be characterized in that it is independent of the input data when the software is executed and is therefore executed in every fuzzing iteration. The start instruction may be identified, for example, from a specification of the software (e.g., a symbol file) and/or by a test engineer. Alternatively or additionally, a first breakpoint may be set in or before a first instruction block before the fuzzing iteration is performed.

A fuzzing iteration of the software may then be performed based on fuzzing input data for that iteration. If the zeroth or first breakpoint is reached while the fuzzing iteration is performed, the breakpoint may be marked as reached. Optionally, and in particular when the maximum number of (hardware) breakpoint registers is severely limited, the zeroth and/or first breakpoint may be deleted.

At least one second breakpoint may then be set in or before a second instruction block. If the second breakpoint is in turn reached while the fuzzing iteration is performed, the second breakpoint may be marked as reached. Optionally, and in particular when the maximum number of breakpoint registers is severely limited, the second breakpoint may be deleted. The fuzzing input data that caused the second breakpoint to be reached may be stored and associated with the corresponding instruction block in the control flow graph 10 of the software.

By successively setting (or deleting and resetting) at least one breakpoint, software coverage feedback can be obtained while fuzzing iterations are performed. The software coverage feedback may include, for example, a path in control flow graph 10, where the path may comprise a sequence of instruction blocks of control flow graph 10.

However, the control flow graph 10 of (compiled, closed-source) software is generally not known a priori. With the method 100 presented in this disclosure, the control flow graph can nevertheless be built up at reasonable expense during fuzzing, e.g., by continuously logging the breakpoints that have been set.

It is entirely possible that a set breakpoint is not reached when a fuzzing iteration is performed. It may even happen that a set breakpoint is not reached over multiple fuzzing iterations. According to a predetermined criterion (e.g., when the breakpoint has not been reached after a predetermined number of fuzzing iterations and/or when a time limit is exceeded), the breakpoint may be marked as skipped. The breakpoint, or a new breakpoint, may then be set, for example, in or before an instruction block adjacent in control flow graph 10 to the instruction block that was not reached.
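The skip criterion described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the names `BreakpointState` and `update_after_iteration`, the status strings and the threshold `max_misses` are hypothetical and do not appear in the disclosure, and only the iteration-count criterion (not the time limit) is modeled.

```python
from dataclasses import dataclass


@dataclass
class BreakpointState:
    """Bookkeeping for one hardware breakpoint (hypothetical structure)."""
    address: int
    misses: int = 0          # fuzzing iterations in which the breakpoint was not reached
    status: str = "active"   # "active", "reached" or "skipped"


def update_after_iteration(bp: BreakpointState, hit: bool, max_misses: int = 100) -> BreakpointState:
    """Update the breakpoint state after one fuzzing iteration."""
    if bp.status != "active":
        return bp                      # already resolved, nothing to do
    if hit:
        bp.status = "reached"
    else:
        bp.misses += 1
        if bp.misses >= max_misses:    # predetermined number of iterations exceeded
            bp.status = "skipped"
    return bp


# Example: a breakpoint that is never reached is marked as skipped after three iterations.
bp = BreakpointState(address=0x0800_1234)
for _ in range(3):
    update_after_iteration(bp, hit=False, max_misses=3)
print(bp.status)
```

Once a breakpoint is marked as skipped, its register becomes free for a breakpoint in or before a neighboring instruction block.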

In the method 100, the selection of an instruction block or breakpoint is based on the connectivities of the instruction blocks in control flow graph 10, optionally on the connectivity weights of the instruction blocks in control flow graph 10. However, various other strategies for setting the breakpoints available on the hardware target may additionally be applied in the method 100. Furthermore, various strategies may be combined and/or alternated in the method 100. Alternatively or additionally, the strategy may include a probabilistic search, an entropy search, a guided search and/or other search strategies. In an entropy search, for example, the instruction blocks reached (and/or the directed edges traversed) during fuzzing in control flow graph 10 may be assigned respective values of information gain (also referred to as entropy). The strategy in an entropy search may then comprise having the fuzzer generate fuzzing input data that maximizes the overall information gain. Instruction blocks (and/or directed edges) with a high information gain may thus be preferred. As a result, instruction blocks that have already been reached (and/or directed edges that have already been traversed) are found less often, and new instruction blocks (and/or directed edges) can be discovered more efficiently. In a guided search, the setting of the respective at least one breakpoint may be based on user input. If, for example, a user, programmer and/or auditor of the software knows of critical locations in the software or in the control flow graph 10 of the software, this knowledge can be used via a user interface to select breakpoints and/or fuzzing input data such that a critical location is reached in at least one fuzzing iteration and is thus tested in depth.

Disclosed is a computer-implemented method 100 for obtaining software coverage feedback when fuzzing software on a hardware target, wherein the hardware target has at least one breakpoint register and is designed to stop execution of the software before an instruction is executed if that instruction is reached during execution of the software and the memory address of the instruction is set in the at least one breakpoint register. The at least one breakpoint register may be a hardware breakpoint register. A hardware breakpoint is a breakpoint set via a hardware breakpoint register.

The method 100, shown schematically in FIG. 1a, comprises selecting 119 a first instruction block of the software.

The method 100 comprises setting 120 a first breakpoint in the at least one breakpoint register before an instruction of the first instruction block. Setting 120 the first breakpoint before the instruction of the software may comprise writing the memory address of the instruction into the at least one breakpoint register.

The method 100 may comprise performing 130 a fuzzing iteration of the software for the first time. Alternatively, the method 100 may comprise continuing 131 for the first time a fuzzing iteration of the software that has already been partially executed but was stopped.

The method 100 comprises checking 140 for the first time whether the first breakpoint is reached during the first performing 130 or first continuing 131 of the fuzzing iteration. The first breakpoint is reached if, when executing the software based on the fuzzing input data of the fuzzing iteration, the instruction would have been executed in the absence of the first breakpoint.

The method 100 comprises storing 150 first log information, wherein the first log information comprises that the first instruction block (or the corresponding instruction before or in the first instruction block) has been reached in the fuzzing iteration if the first check 140 is affirmative.

As shown as an optional step in FIG. 1a, the method 100 may comprise deleting 151 the first breakpoint if the first check 140 is affirmative.

The software coverage feedback when fuzzing the software includes the first log information. The first log information may also include, for example, the fuzzing input data of the fuzzing iteration. Storing the fuzzing input data may be used to build up (continuously) a mapping from the instruction blocks reached during fuzzing to the fuzzing input data (or vice versa).
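Steps 120 to 151 can be illustrated with a small Python sketch that uses a simulated target in place of real hardware. Everything in it is an assumption made for illustration: the class `MockTarget`, the function `probe_block`, the toy program and the block addresses are hypothetical, and a real setup would talk to a debug probe rather than to a Python object.

```python
class MockTarget:
    """Simulated hardware target with a single breakpoint register (hypothetical)."""

    def __init__(self, program):
        # program: maps fuzzing input (bytes) to the list of block addresses it executes
        self.program = program
        self.breakpoint = None  # content of the single breakpoint register

    def set_breakpoint(self, address):
        self.breakpoint = address

    def clear_breakpoint(self):
        self.breakpoint = None

    def run(self, fuzz_input):
        """Run one fuzzing iteration; return the address at which execution stopped, or None."""
        for address in self.program(fuzz_input):
            if address == self.breakpoint:
                return address  # execution stops before the instruction is executed
        return None


def probe_block(target, block_address, fuzz_input, log):
    """One pass through steps 120, 130, 140, 150 and (optionally) 151."""
    target.set_breakpoint(block_address)       # step 120: set the breakpoint
    stopped_at = target.run(fuzz_input)        # step 130: perform the fuzzing iteration
    reached = stopped_at == block_address      # step 140: check whether it was reached
    log.append({"block": block_address, "reached": reached, "input": fuzz_input})  # step 150
    if reached:
        target.clear_breakpoint()              # optional step 151: free the register
    return reached


def toy_program(data):
    # Hypothetical control flow: block 0x200 is executed only for inputs starting with b"A".
    return [0x100, 0x200] if data.startswith(b"A") else [0x100, 0x300]


log = []
target = MockTarget(toy_program)
probe_block(target, 0x200, b"AB", log)   # block 0x200 is reached
probe_block(target, 0x200, b"ZZ", log)   # block 0x200 is not reached
```

The `log` list plays the role of the first log information: each entry records the instruction block, whether it was reached, and the fuzzing input that was used, which is enough to build the block-to-input mapping mentioned above.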

The selection 119 of the first instruction block of the software may be based on the connectivities (e.g., PR(b)) of the instruction blocks 11 in the control flow graph 10 of the software. Here, b is, for example, an instruction block from the set B of instruction blocks of the control flow graph or of a portion thereof.

Alternatively or additionally, the selection 119 of the first instruction block of software may be based on connectivity weights of instruction blocks 11 in the control flow graph 10 of the software.

Control flow graph 10, shown by way of example and schematically in FIG. 3, may be a directed graph that describes the control flow of the software (or a portion thereof). The control flow graph may comprise a set of nodes 11 (e.g., b1, b2, b3, b4, b5, b6, b7) representing instruction blocks of the described program, and a set of directed edges 12 representing possible transitions between instruction blocks, i.e. the program flow. For example, instruction block b5 may be an instruction block that has been reached during fuzzing. Control flow graph 10 of sufficiently complex software may include ≥1e2, ≥1e3, ≥1e4, ≥1e5, ≥1e6 or ≥1e7 instruction blocks and ≥1e2, ≥1e3, ≥1e4, ≥1e5, ≥1e6, ≥1e7, ≥1e8 or ≥1e9 directed edges.

For example, the instruction block with the highest connectivity, optionally the highest connectivity weight, may be drawn, with or without replacement, and thus selected 119.

The connectivity (e.g., PR(b)) of an (arbitrary) instruction block 11 (e.g., instruction block b from the set B of instruction blocks of the control flow graph or of a portion thereof) may be an at least approximate measure of whether and/or how many other instruction blocks 11 of the control flow graph 10, or of a portion thereof, point to that instruction block 11 via a respective directed edge 12 in the control flow graph 10.

Connectivity may be, for example, a real number in the half-open interval [0, ∞). Connectivity may be a natural number (including zero). In particular, connectivity may be positive or zero.

In FIG. 3, for example, instruction block b6, with two incoming directed edges, has a higher connectivity than instruction block b1, which has no incoming directed edge.
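The simplest connectivity measure consistent with this comparison is the number of incoming directed edges (the in-degree). The following Python sketch counts it for a seven-block graph. The edge list is an assumption made for illustration: the text only states that b6 has two incoming edges and b1 has none, so the remaining edges are hypothetical and need not match FIG. 3.

```python
# Hypothetical edge list (source block, destination block); only the in-degrees
# of b1 (zero) and b6 (two) are taken from the description.
edges = [
    ("b1", "b2"), ("b1", "b3"),
    ("b2", "b4"), ("b3", "b5"),
    ("b4", "b6"), ("b5", "b6"),
    ("b6", "b7"),
]
blocks = ["b1", "b2", "b3", "b4", "b5", "b6", "b7"]

# Connectivity as in-degree: count the directed edges pointing to each block.
in_degree = {block: 0 for block in blocks}
for _source, destination in edges:
    in_degree[destination] += 1

# Blocks ordered from most to least connected.
ranking = sorted(blocks, key=lambda block: in_degree[block], reverse=True)
print(in_degree)
print(ranking[0])
```

With this measure, b6 would be the first candidate for a breakpoint, whereas b1 would be the last.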

The connectivity weight of an (arbitrary) instruction block 11 (e.g., instruction block b) may be based on the connectivity of that instruction block 11. For example, the connectivity weight may be large when the connectivity is large and small when the connectivity is small. Any (continuous) relationship is conceivable for this purpose. For example, the connectivity weight may be proportional to the connectivity. For example, the connectivity weight (e.g., PR(b)) may also be equal to the connectivity (e.g., PR(b)).

The connectivity weight may be a real number in the half-open interval [0, ∞). The connectivity weight may be a natural number (including zero). The connectivity weight may be positive or zero.

As shown as an optional step in FIG. 1a, the method 100 may comprise calculating 118a at least one connectivity of the instruction blocks 11. Alternatively or additionally, the method 100 may comprise calculating 118a at least one connectivity weight of the instruction blocks 11.

For example, the at least one connectivity and/or the at least one connectivity weight may be calculated 118a via a PageRank algorithm, as known in the art, provided that the control flow graph 10 or a portion thereof takes the place of the internet, the instruction blocks take the place of the websites of the internet and the directed edges 12 take the place of the hyperlinks. The PageRank algorithm not only counts incoming links, but also weights the links according to the importance of their source. A possible implementation of the PageRank algorithm can be found, for example, at https://networkx.org/documentation/stable/_modules/networkx/algorithms/link_analysis/pagerank_alg. The PageRank algorithm may have further input parameters (and be specified by such input parameters). For example, the PageRank algorithm in NetworkX has preset (default) input parameters, e.g., damping factor alpha=0.85, maximum number of iterations=100 and tolerance=1e-6. For example, the PageRank algorithm in NetworkX may be used for the calculation 118a together with the preset (default) input parameters.
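The calculation 118a might be sketched as follows. This is a simplified power iteration written purely for illustration, not the NetworkX implementation referred to above; it only reuses the default parameter values named there. The graph is a hypothetical example, and the function name `pagerank` and its argument order are assumptions of this sketch.

```python
# Hypothetical control flow graph (edges are illustrative, not taken from fig. 3).
NODES = ["b1", "b2", "b3", "b4", "b5", "b6", "b7"]
EDGES = [
    ("b1", "b2"), ("b1", "b3"),
    ("b2", "b4"), ("b2", "b5"),
    ("b3", "b6"), ("b5", "b6"),
    ("b4", "b7"), ("b6", "b7"),
]


def pagerank(nodes, edges, alpha=0.85, max_iter=100, tol=1e-6):
    """Approximate PageRank of each instruction block by power iteration."""
    n = len(nodes)
    successors = {node: [] for node in nodes}
    for src, dst in edges:
        successors[src].append(dst)

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by blocks without outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not successors[v])
        new_rank = {v: (1.0 - alpha) / n + alpha * dangling / n for v in nodes}
        # Every block passes its rank on, in equal shares, to its successors.
        for v in nodes:
            for w in successors[v]:
                new_rank[w] += alpha * rank[v] / len(successors[v])
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < n * tol:
            break
    return rank


connectivity = pagerank(NODES, EDGES)
```

In this sketch the resulting values are non-negative and sum to one, so they could be used directly as connectivity weights.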

As shown as an optional step in fig. 1a, the method 100 may include: calculating 118b a first probability measure. For example, the first probability measure may assign a normalized connectivity weight to each of at least two instruction blocks of control flow graph 10 or a portion thereof, wherein the connectivity weights may be normalized such that the sum of all connectivity weights of the at least two instruction blocks is one.

Selection 119 of the first instruction block of the software, in particular selection 119 of the first instruction block of the software based on connectivity, in particular connectivity weights, of the instruction blocks in the control flow graph 10, may comprise a random or at least pseudo-random first extraction of one instruction block, as the first instruction block, from the instruction blocks distributed according to the first probability measure.
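A minimal sketch of the calculation 118b, assuming the connectivity weights are available as a mapping from instruction block to non-negative number. The function name `probability_measure` and the example weights are hypothetical.

```python
def probability_measure(weights):
    """Normalize connectivity weights so that they sum to one."""
    if len(weights) < 2:
        raise ValueError("at least two instruction blocks are expected")
    if any(w < 0.0 for w in weights.values()):
        raise ValueError("connectivity weights must be positive or zero")
    total = sum(weights.values())
    if total <= 0.0:
        raise ValueError("at least one connectivity weight must be positive")
    return {block: w / total for block, w in weights.items()}


# Hypothetical connectivity weights for three instruction blocks.
first_measure = probability_measure({"b1": 1.0, "b5": 2.0, "b6": 5.0})
```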

This extraction may also be referred to, here or in general, as rolling dice (Monte Carlo). The (first, later second) extraction may include, for example: drawing at least one number uniformly distributed over the (quasi-)real interval [0, 1]; and evaluating the inverse distribution function of the (first, later second) probability measure at the at least one drawn number. Other implementations of this extraction are known in the art.

The first extraction of instruction blocks may be implemented such that a first plurality of instruction blocks (i.e., a first plurality of instruction blocks predetermined at the point in time of the first extraction) are not extracted. Alternatively or additionally, the first extraction of instruction blocks may be implemented such that a plurality of instruction blocks 13 (previously) reached during the fuzzy test, or a part thereof, are not extracted.

Excluding particular instruction blocks from extraction may be achieved, for example, by repeating the extraction until an instruction block is extracted that is not excluded. Alternatively, for example, the (first, later second) probability measure may be modified prior to the extraction such that only non-excluded instruction blocks can be extracted. One possibility is, for example, to set the (connectivity) weights of the excluded instruction blocks to zero and to renormalize the probability measure thus modified.
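The extraction via the inverse distribution function might look as follows in Python. This is a sketch under the assumption that the probability measure is given as a mapping from instruction block to probability; the function name `draw_block` is hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def draw_block(measure, rng=random):
    """Draw one instruction block by inverse transform sampling."""
    blocks = list(measure)
    # Cumulative distribution function over the (ordered) instruction blocks.
    cdf = list(accumulate(measure[block] for block in blocks))
    # Uniform number in [0, 1), scaled in case the sum deviates slightly from one.
    u = rng.random() * cdf[-1]
    # Inverse distribution function: first block whose cumulative value exceeds u.
    index = bisect_right(cdf, u)
    # Clamp guards against floating-point rounding at the upper end.
    return blocks[min(index, len(blocks) - 1)]


rng = random.Random(0)  # seeded only to make the sketch reproducible
first_block = draw_block({"b1": 0.125, "b5": 0.25, "b6": 0.625}, rng)
```

Because `bisect_right` skips over equal cumulative values, a block with probability zero is not drawn by this sketch, apart from the rounding case covered by the clamp.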
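Both ways of excluding instruction blocks can be sketched as follows. The function names `renormalize_without` and `draw_with_rejection`, the retry limit `max_tries` and the example values are assumptions of this sketch.

```python
import random


def renormalize_without(measure, excluded):
    """Set the weights of excluded blocks to zero and renormalize the rest."""
    kept = {b: (0.0 if b in excluded else p) for b, p in measure.items()}
    total = sum(kept.values())
    if total <= 0.0:
        raise ValueError("no instruction block with non-zero weight remains")
    return {b: p / total for b, p in kept.items()}


def draw_with_rejection(measure, excluded, rng, max_tries=10_000):
    """Repeat the extraction until a non-excluded block is drawn."""
    blocks = list(measure)
    probabilities = [measure[b] for b in blocks]
    for _ in range(max_tries):
        block = rng.choices(blocks, weights=probabilities)[0]
        if block not in excluded:
            return block
    raise RuntimeError("only excluded instruction blocks were drawn")


measure = {"b1": 0.2, "b5": 0.3, "b6": 0.5}
already_reached = {"b6"}  # hypothetical: b6 was reached in an earlier iteration
modified = renormalize_without(measure, already_reached)
block = draw_with_rejection(measure, already_reached, random.Random(0))
```

Repeating the extraction leaves the probability measure untouched but may need many attempts when the excluded blocks carry most of the weight; renormalizing needs one pass over the weights and then a single extraction.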

By excluding specific instruction blocks, the efficiency of the fuzzy test may be further improved.

As schematically shown as an optional step in fig. 1b (continuation of fig. 1a), the method 100 may comprise: if the first check 140 is affirmative (e.g., at the latest before step 160, e.g., also already in step 119), selecting 159 a second instruction block of the software. As shown as an optional step in fig. 1b, the method 100 may further include: if the first check 140 is affirmative (e.g. at the latest before step 170, e.g. also already in step 120), setting 160 a second breakpoint in the at least one breakpoint register or in another breakpoint register before an instruction of the second instruction block of the software. Setting 160 the second breakpoint before the instruction of the second instruction block of the software may include: setting the memory address of the instruction of the second instruction block in the at least one breakpoint register or in another breakpoint register.

As shown as an optional step in fig. 1b, the method 100 may include: a second execution 170 of a fuzzy test iteration of the software (based on the fuzzy test input data associated with the fuzzy test iteration). Alternatively, as shown as an optional step in fig. 1b, the method 100 may include: a second continuation 171 of the fuzzy test iteration of the software (which has been partially executed but stopped).

As shown as an optional step in fig. 1b, the method 100 may include: a second check 180 of whether the second breakpoint is reached during the second execution 170 or the second continuation 171 of the fuzzy test iteration. The second breakpoint is reached if an instruction of the second instruction block would have been executed, had the second breakpoint not been set, while executing the software based on the fuzzy test input data of the fuzzy test iteration.

As shown as an optional step in fig. 1b, the method 100 may include: storing 190 second log information, the second log information comprising: if the second check 180 is affirmative, the second instruction block has been reached in the fuzzy test iteration.

As shown as an optional step in fig. 1b, the method 100 may include: if the second check 180 is affirmative, deleting 191 the second breakpoint.

The software overlay feedback when fuzzing the software may include second log information. The second log information may include, for example, previous or next fuzzy test input data of the fuzzy test iteration.

The selection 159 of the second instruction block of software may be based on connectivity of instruction block 11 in control flow graph 10 of the software. Alternatively or additionally, the selection 159 of the second instruction block of the software may be based on connectivity weights of instruction blocks 11 in the control flow graph 10 of the software.

As shown as an optional step in fig. 1b, the method 100 may include: calculating 158a at least one connectivity of an instruction block 11. Alternatively or additionally, the method 100 may include: calculating 158a at least one connectivity weight of an instruction block 11. As in step 118a, the at least one connectivity and/or the at least one connectivity weight may be calculated 158a via, for example, a PageRank algorithm as known in the art.

Connectivity may be, but need not be, recalculated. Likewise, the connectivity weights may, but need not, be recalculated. For example, recalculation may be justified when connectivity and/or connectivity weights are to be calculated for a different portion of the control flow graph 10, i.e., when that portion has changed in the meantime. Alternatively or additionally, recalculation may be justified when the control flow graph 10 of the software is initially unknown and is determined successively by obtaining software overlay feedback.

As shown as an optional step in fig. 1b, the method 100 may include: calculating 158b a second probability measure, wherein the second probability measure assigns a normalized connectivity weight to each of at least two instruction blocks of the control flow graph 10 or a portion thereof, wherein the connectivity weights are normalized such that the sum of all connectivity weights of the at least two instruction blocks is one.

The selection 159 of the second instruction block of software, in particular the selection 159 of the second instruction block of software based on the connectivity, in particular the connectivity weight, of the instruction blocks in the control flow graph 10, may comprise a random or at least pseudo-random second extraction of one instruction block from the instruction blocks distributed according to the second probability measure as a second instruction block.

The second probability measure may, but need not, be the first probability measure. In the case where the first probability measure is the same as the second probability measure, no recalculation may be required. For example, recalculation may be justified when the portion of control flow graph 10 has changed in the meantime.

The second extraction of instruction blocks may be implemented such that a second plurality of instruction blocks (i.e., a second plurality of instruction blocks predetermined at a point in time of the second extraction) are not extracted. The second plurality of instruction blocks may, but need not, be the first plurality of instruction blocks. Alternatively or additionally, the second extraction of instruction blocks may be implemented such that the first instruction block is not extracted.

As already stated, the efficiency of the fuzzy test can be further improved by excluding specific instruction blocks.

As shown as an optional step in fig. 1b, the method 100 may include: generating 169 next fuzzy test input data for the fuzzy test iteration, optionally based on the first log information. The second execution 170 of the fuzzy test iteration may be based on these next fuzzy test input data.

The setting 120, 160 of breakpoints may be implemented via a debug connection to the hardware target. Furthermore, execution 130, 170 and/or continuation 131, 171 of the fuzzy test iteration may be implemented via a debug connection to the hardware target. As shown as an optional step in fig. 1a, the method 100 may include: initializing 110 a debug connection to the hardware target.

The first instruction block of the software may include a predetermined start function (English: function to instrument) of the software. Such a selection may be suitable for starting a fuzzy test or at least one fuzzy test iteration of the fuzzy test. Alternatively, the first instruction block may be any instruction block of the software (e.g., upon repetition 199). Alternatively or additionally, the method 100 may begin with the setting of a (zeroth) breakpoint before any instruction of the software, and in particular before an instruction of any (zeroth) instruction block in the control flow graph 10.

As shown as an optional step in fig. 1a, the method 100 may include: generating 105 the control flow graph 10 of the software based on the programming code of the software. Alternatively or additionally, the method 100 may include: generating 105 the control flow graph 10 of the software by reverse engineering of the software. Alternatively or additionally, the method 100 may include: generating 105 the control flow graph 10 of the software by successively obtaining software overlay feedback as the software is fuzzed on the hardware target.

As shown as an optional step in fig. 1a, the method 100 may include: storing 152 first log information, wherein the first log information comprises: if the first check 140 is negative, the first instruction block has not been reached in the fuzzy test iteration. The method 100 may include: if the first check 140 is negative, deleting 151 the first breakpoint. For example, the first check 140 may be negative when a predetermined criterion is met. For example, the predetermined criterion may be met when the first breakpoint has not been reached after a predetermined length of time or before the end of the fuzzy test iteration.

As shown as an optional step in fig. 1b, the method 100 may include: storing 192 second log information, wherein the second log information comprises: if the second check 180 is negative, the second instruction block has not been reached in the fuzzy test iteration. The method 100 may include: if the second check 180 is negative, deleting 191 the second breakpoint. For example, the second check 180 may be negative when a predetermined criterion is met. For example, the predetermined criterion may be met when the second breakpoint has not been reached after a predetermined length of time or before the end of the fuzzy test iteration.

The software overlay feedback when fuzzing the software may include the first log information and/or the second log information. The first log information and/or the second log information may for example also comprise (next) fuzzy test input data.

As shown as optional steps in fig. 1a or 1b, respectively, the method 100 may include: repeating 199 the method 100. For example, the method 100 may be repeated 199 until execution of the fuzzy test iteration and/or of the fuzzy test ends. By this repetition 199, breakpoints can be set successively and software coverage feedback can thus be obtained during the fuzzy test. The repetition 199 may start, for example, with a new step 110, 118a, 118b, 119 or 120. Alternatively or additionally, the repetition 199 may begin with new steps 158a, 158b, 159, 160, wherein the ordinal numbers of the breakpoints, instruction blocks, etc. are each incremented by one. Thus, the new step 160 may be, for example: if the second check is affirmative, setting a third breakpoint before an instruction of the third instruction block, and so on.
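The repeated sequence of selecting, setting a breakpoint, executing, checking, logging and deleting can be sketched with a simulated target. Everything named here is hypothetical: `SimulatedTarget` merely stands in for a hardware target reached via a debug connection, block names stand in for memory addresses, and each "trace" is the assumed list of blocks that one fuzzy test iteration would execute.

```python
import random


class SimulatedTarget:
    """Hypothetical stand-in for a hardware target with breakpoint registers."""

    def __init__(self, num_breakpoint_registers=1):
        self.registers = [None] * num_breakpoint_registers

    def set_breakpoint(self, address, register=0):
        self.registers[register] = address

    def clear_breakpoint(self, register=0):
        self.registers[register] = None

    def run(self, trace):
        """Replay one iteration; stop before the first address with a breakpoint."""
        for address in trace:
            if address in self.registers:
                return address
        return None  # iteration ended without reaching a breakpoint


def collect_coverage(target, weights, traces, rng):
    """Run one select/set/execute/check/log/delete cycle per fuzzing iteration."""
    reached, log = set(), []
    for trace in traces:
        # Exclude blocks already reached; keep only blocks that can be drawn.
        candidates = {b: w for b, w in weights.items() if b not in reached and w > 0}
        if not candidates:
            break
        blocks = list(candidates)
        block = rng.choices(blocks, weights=[candidates[b] for b in blocks])[0]
        target.set_breakpoint(block)       # cf. setting 120 / 160
        hit = target.run(trace) == block   # cf. execution 130 / 170, check 140 / 180
        log.append((block, hit, trace))    # cf. storing 150 / 152 / 190 / 192
        if hit:
            reached.add(block)
        target.clear_breakpoint()          # cf. deleting 151 / 191
    return reached, log


weights = {"b1": 0.1, "b5": 0.5, "b6": 0.4}  # hypothetical connectivity weights
traces = [["b1", "b5"]] * 10                  # hypothetical: b6 is never executed
reached, log = collect_coverage(SimulatedTarget(), weights, traces, random.Random(1))
```

With a single breakpoint register, each iteration yields one bit of coverage information, which is why the choice of the instruction block matters in this sketch.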

Also disclosed is a computer system designed to perform the computer-implemented method 100 for obtaining software overlay feedback when fuzzing software on a hardware target. The computer system may include a processor and/or a working memory. The computer system may be designed to communicate with the hardware target via a debug connection. The computer system may include a fuzzifier designed to generate and provide fuzzy test input data for at least one fuzzy test iteration of the software on the hardware target.

Also disclosed is a computer program designed to perform the computer-implemented method 100 for obtaining software overlay feedback when fuzzing software on a hardware target. The computer program may, for example, be present in an interpretable form or in a compiled form. The computer program may (also) be partly loaded into the RAM of a computer, for example as a sequence of bits or bytes, in order to be executed.

A computer readable medium or signal storing and/or containing the computer program is also disclosed. The medium may include, for example, one of RAM, ROM, EPROM, HDD, SSD.

Claims

1. A computer-implemented method (100) for obtaining software overlay feedback when fuzzing software on a hardware target, wherein the hardware target has at least one breakpoint register and is designed to: if an instruction of the software has arrived at the time of executing the software and a memory address of the instruction is set in the at least one breakpoint register, stopping the execution of the software before executing the instruction, the method comprising:

-selecting (119) a first instruction block of the software;

-setting (120) a first breakpoint in the at least one breakpoint register prior to an instruction of the first instruction block;

-performing (130) for the first time or continuing (131) a fuzziness test iteration of the software for the first time;

-checking (140) for the first time whether the first breakpoint is reached when the fuzziness test iteration is first performed (130) or is first continued (131);

-storing (150) first log information, the first log information comprising: if the first check (140) is affirmative, the first instruction block has been reached in the fuzziness test iteration-and optionally the first breakpoint is deleted (151),

wherein the software overlay feedback when fuzzing the software includes the first log information,

wherein the selection (119) of the first instruction block of the software is based on connectivity, optionally connectivity weights, of instruction blocks (11) in a control flow graph (10) of the software.

2. The method (100) according to claim 1, wherein the connectivity of an instruction block (11) is an at least approximate measure of how many and/or how important other instruction blocks (11) of the control flow graph (10) or a part thereof point to the instruction block (11) via one directed edge (12) each in the control flow graph (10).

3. The method (100) according to claim 1 or 2, wherein the connectivity weight of an instruction block (11) is based on the connectivity of the instruction block (11);

-optionally, wherein the connectivity weight is large when the connectivity is large and the connectivity weight is small when the connectivity is small;

-optionally, wherein the connectivity weight is proportional to the connectivity;

-optionally, wherein the connectivity weight is the connectivity.

4. The method (100) according to any one of the preceding claims, the method comprising:

-calculating (118a) at least one connectivity, optionally at least one connectivity weight, of the instruction block (11),

optionally via the PageRank algorithm, with the proviso that: the control flow graph (10) or a portion thereof replaces the internet, the instruction block replaces the website of the internet and the directed edge (12) replaces a hyperlink.

5. The method (100) according to any one of the preceding claims, the method comprising:

-calculating (118 b) a first probability measure, wherein the first probability measure assigns a normalized connectivity weight to at least two instruction blocks of the control flow graph (10) or a part thereof, respectively, wherein the connectivity weights are normalized such that the sum of all connectivity weights of the at least two instruction blocks is one;

wherein the selection (119) of the first instruction block of the software, in particular the selection (119) of the first instruction block of the software based on the connectivity, in particular the connectivity weights, of the instruction blocks in the control flow graph (10), comprises a random first extraction of one instruction block from the instruction blocks distributed according to the first probability measure as a first instruction block.

6. The method (100) of claim 5, wherein the first extraction of the instruction block is implemented such that:

-a first plurality of instruction blocks; and/or

-a plurality of instruction blocks (13) reached during the fuzzy test, or a part thereof,

are not extracted.

7. The method (100) according to any one of the preceding claims, the method comprising:

-if the first check (140) is affirmative, selecting (159) a second instruction block of the software;

-if said first check (140) is affirmative, setting (160) a second breakpoint in said at least one breakpoint register or in another breakpoint register before an instruction of a second instruction block of said software;

-performing (170) a second time a fuzziness test iteration of the software or continuing (171) the fuzziness test iteration a second time;

-checking (180) for the second time whether the second breakpoint is reached when the fuzziness test iteration is performed (170) a second time or continued (171) a second time;

-storing (190) second log information, the second log information comprising: if the second check (180) is affirmative, the second instruction block has been reached in the fuzzy test iteration-and optionally the second breakpoint is deleted (191),

wherein the software overlay feedback when fuzzing the software includes the second log information,

wherein the selection (159) of the second instruction block of the software is based on connectivity, optionally connectivity weights, of instruction blocks (11) in a control flow graph (10) of the software.

8. The method (100) of claim 7, the method comprising:

-calculating (158a) at least one connectivity, optionally at least one connectivity weight, of the instruction block (11).

9. The method (100) according to claim 7 or 8, the method comprising:

calculating (158 b) a second probability measure, wherein the second probability measure assigns a normalized connectivity weight to at least two instruction blocks of the control flow graph (10) or a part thereof, respectively, wherein the connectivity weights are normalized such that the sum of all connectivity weights of the at least two instruction blocks is one,

wherein the selection (159) of the second instruction block of the software, in particular the selection (159) of the second instruction block of the software based on the connectivity, in particular the connectivity weights, of the instruction blocks in the control flow graph (10), comprises a random second extraction of one instruction block from the instruction blocks distributed according to the second probability measure as a second instruction block.

10. The method (100) of claim 9, wherein the second extraction of the instruction block is implemented such that:

-a second plurality of instruction blocks; and/or

-said first instruction block

are not extracted.

11. The method (100) according to any one of claims 7 to 10, the method comprising:

generating (169) next fuzzy test input data for the fuzzy test iteration, optionally based on the first log information,

wherein a second execution (170) of the fuzziness test iteration is based on the next fuzziness test input data.

12. The method (100) according to any one of the preceding claims, wherein the first instruction block of the software comprises a predetermined start-up function of the software.

13. The method (100) according to any of the preceding claims, wherein the software is not statically instrumented for a fuzzy test and/or is closed-source.

14. The method (100) according to any one of the preceding claims, the method comprising:

-generating (105) a control flow graph (10) of the software based on programming code of the software, by reverse engineering of the software and/or by successively obtaining software overlay feedback when the software is fuzzed on the hardware target.

15. The method (100) according to any one of the preceding claims, wherein the hardware target is an electronic control unit and the software is designed to control, regulate and/or monitor the electronic control unit.

16. The method (100) according to any one of the preceding claims, the method comprising:

-storing (152) first log information, the first log information comprising: if the first check (140) is negative, the first instruction block has not been reached in the fuzzing test iteration, and optionally deleting (151) the first breakpoint, wherein the first check (140) is negative when the first breakpoint has not been reached after a predetermined length of time or before the end of the fuzzing test iteration; and/or

-storing (192) second log information, the second log information comprising: if the second check (180) is negative, the second instruction block has not been reached in the fuzzing test iteration, and optionally deleting (191) the second breakpoint, wherein the second check (180) is negative when the second breakpoint has not been reached after a predetermined length of time or before the end of the fuzzing test iteration,

wherein the software overlay feedback when fuzzing the software comprises the first log information and/or the second log information.

17. The method (100) according to any one of the preceding claims, the method comprising:

-repeating (199) the method (100) according to any of the preceding claims.

18. A computer system, the computer system being designed to perform the computer-implemented method (100) for obtaining software overlay feedback when fuzzing software on a hardware target according to any of the preceding claims.

19. A computer program, the computer program being designed to perform the computer-implemented method (100) for obtaining software overlay feedback when fuzzing software on a hardware target according to any of claims 1 to 17.

20. A computer readable medium or signal storing and/or containing a computer program according to claim 19.