CN112181426A - Assembly program control flow path detection method and device

Publication number
CN112181426A
Authority
CN
China
Prior art keywords
control flow
assembler
program
path
basic block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010990361.0A
Other languages
Chinese (zh)
Other versions
CN112181426B (en)
Inventor
许瑾晨
曹浩
郭绍忠
周蓓
刘聃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Engineering University of PLA Strategic Support Force
Original Assignee
Information Engineering University of PLA Strategic Support Force
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information Engineering University of PLA Strategic Support Force filed Critical Information Engineering University of PLA Strategic Support Force
Priority to CN202010990361.0A priority Critical patent/CN112181426B/en
Publication of CN112181426A publication Critical patent/CN112181426A/en
Application granted granted Critical
Publication of CN112181426B publication Critical patent/CN112181426B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/43 Checking; Contextual analysis
    • G06F8/433 Dependency analysis; Data or control flow analysis
    • G06F8/434 Pointers; Aliasing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/44 Encoding
    • G06F8/447 Target code generation

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention belongs to the technical field of computer program analysis and software testing, and discloses a method for detecting the control flow paths of an assembly program, comprising the following steps: detecting and deleting the comments in the code segments of the assembly program, and renaming the integer registers; obtaining the control flow graph CFG of the assembly program by means of the auxiliary table AT and the control flow raw data CRD; inserting a path label at the head of the instruction sequence of each basic block BBN to generate new control flow raw data CRD; reassembling the new control flow raw data CRD to obtain a new assembly program with path labels; compiling the new assembly program with path labels together with the probe program into an executable file to generate a result file; and reading the generated result file and counting by path to obtain the pass rate of each path. The method obtains the control flow information of an assembly program quickly and accurately, from which dynamic information such as path coverage can be derived, and it is lightweight, targeted, and easy to modify and maintain.

Description

Assembly program control flow path detection method and device
Technical Field
The invention belongs to the technical field of computer program analysis and software testing. It relates to the detection of control flow paths in assembly programs, and in particular to a method and a device for detecting the control flow paths of assembly programs on domestic platforms.
Background
Path coverage testing can be used to measure the branch coverage and statement coverage of a program, and code analysis and instrumentation are the two key steps in implementing it. Instrumentation inserts probes into the program under test while preserving its original logic. These probes are essentially code segments that gather information; they can be assignment statements or function calls that collect coverage information. When the program runs, the probes output data characterizing its execution. By analyzing this data, the control flow and data flow information of the program can be obtained, and from it dynamic information such as coverage.
Most of the core system software supporting domestic platforms, such as the domestic basic mathematics library, is written in assembly language. The performance, precision and correctness of this software directly affect the behavior of numerical computing applications, upper-layer software and user projects on domestic platforms. Testing and optimizing the core system software is therefore very important, but the testing tools currently available for domestic platforms are incomplete. The mainstream control flow analysis tools, IDA Pro and Intel Pin, are large and complex and are very difficult to port to a domestic platform, so they are poorly targeted to domestic platforms and hard to adjust and maintain. In addition, the probe feedback of Intel Pin's dynamic instrumentation of C and C++ programs works well for a program that is run once, but for a program that must be run on thousands of test inputs the feedback is hard to collect and organize, and the one-to-one correspondence between control flow and input test data is not well reflected. Besides these two tools, which are the most widely used on non-domestic platforms, other code analysis tools, such as PLC program control flow analysis, the Klocwork static code analysis tool and the LDRA Testbed coding rule checking tool, are likewise unfriendly to domestic platforms and very difficult to port.
Disclosure of Invention
To address the technical problems of how to analyze the program control flow of domestic assembly code that contains a large number of jump statements, and how to measure the path coverage of a specific input test set, the invention provides a control flow path analysis and detection method and tool that is lightweight, targeted, and easy to modify and maintain.
In order to achieve the purpose, the invention adopts the following technical scheme:
The invention provides an assembly program control flow path detection method, which comprises the following steps:
Step 1: traversing the assembly program, deleting the comments in its code segments, renaming the integer registers, storing the contents of the non-code segments and recording their positions in the assembly program;
Step 2: traversing the assembly program preprocessed in step 1, and obtaining the control flow graph CFG of the assembly program by means of an auxiliary table AT and control flow raw data CRD;
Step 3: parsing the control flow raw data CRD generated in step 2, reading the instruction sequence of each basic block BBN in it, and inserting a path label at the head of the instruction sequence of each basic block BBN to generate new control flow raw data CRD;
Step 4: reassembling the new control flow raw data CRD generated in step 3, using the contents and positions of the non-code segments obtained in step 1, to obtain a new assembly program with path labels;
Step 5: compiling the new assembly program with path labels from step 4 together with the probe program into an executable file, and generating a result file;
Step 6: reading the result file generated in step 5 and counting by path to obtain the pass rates of the different paths.
Further, step 2 specifically includes: taking the first word of each line of the preprocessed assembly program as the operator of the current instruction, determining the type of the current instruction from its operator, and performing the storage operation that corresponds to that instruction type; after the line-by-line traversal is complete, converting the line number information into node names by means of the auxiliary table AT, deleting the line number attributes, and outputting the final result in the form of control flow raw data CRD, thereby obtaining the control flow graph CFG of the assembly program.
Further, when the path tag is inserted into the head of the instruction sequence of the basic block BBN in step 3, an alias is automatically generated for each basic block and the corresponding relationship is stored in the mapping table MP.
Further, step 6 specifically includes: reading the result file generated in step 5, scanning it line by line to obtain each input and the corresponding sequence of basic block aliases, converting the basic block aliases into basic block names through the mapping table MP, and counting by path to obtain the pass rates of the different paths.
Further, step 5 may also be: compiling the new assembly program with path labels from step 4, the probe program and a specified test program together into an executable file, and generating a result file.
Further, the probe program in step 5 only includes a mark function.
The invention also provides an assembler control flow path detection device, which comprises:
the preprocessing module is used for traversing the assembler, deleting comments of code segments in the assembler, renaming an integer register, storing the content of non-code segments in the assembler and recording the positions of the non-code segments in the original assembler;
the control flow generation module is used for traversing the preprocessed assembly program and obtaining a control flow graph CFG of the assembly program through the auxiliary table AT and the control flow original data CRD;
the path label insertion module is used for analyzing the original control flow data CRD, reading the instruction sequence of each basic block BBN in the original control flow data CRD and inserting a path label at the head of the instruction sequence of the basic block BBN to generate new original control flow data CRD;
the program recombination module is used for recombining the new control flow original data CRD generated in the step 3 by using the content and the position of the non-code segment in the assembly program to obtain a new assembly program with a path label;
the path detection module is used for compiling the new assembler program with the path label and the probe program into an executable file together to generate a result file;
and the result analysis module is used for reading the generated result file, counting according to the path and obtaining the passing rate of different paths.
Further, the path detection module is further configured to compile a new assembler with a path label and the probe program into an executable file through a specified test program, and generate a result file.
Compared with the prior art, the invention has the beneficial effects that:
1. The detection method of the invention automatically analyzes the control flow of assembly programs on a domestic platform. For any given test input, the lines of code within one of the resulting code blocks are either all executed or all not executed, so each block is the smallest unit into which the control flow can be divided. The code blocks obtained from the automatic code analysis can therefore reveal the specific structure and the specific performance intervals of the code, and reflect the execution path of the control flow more accurately, which helps in optimizing the code.
2. The detection method of the invention automatically instruments mathematical function code on a domestic platform. The instrumentation positions are determined from the code analysis results of the previous step, and an external C program that reports the path marks is generated at the same time. The context of most registers is saved before the instrumentation probe calls the external C program and restored afterwards, so the automatic instrumentation is more general and does not affect the normal operation of the program.
3. The path marks fed back by the detection method of the invention correspond one to one with the input test data. Because the elements of the test data set are arranged in ascending order, the specific range of input values that takes each path can be determined from the path results obtained by the test.
4. The detection method can quickly and accurately obtain the control flow information of the assembly program so as to further obtain dynamic information such as path coverage test and the like, and has the characteristics of light weight, strong pertinence, easy modification and maintenance and the like.
Drawings
FIG. 1 is a diagram illustrating the properties of basic blocks in the present invention.
Fig. 2 is a schematic structural diagram of control flow raw data in the present invention.
FIG. 3 is a schematic diagram of an assembly queue according to the present invention.
FIG. 4 is a schematic diagram of an input file according to the present invention.
FIG. 5 is a basic flow chart of the assembler control flow path detection method of the invention.
Fig. 6 is a schematic process diagram of generating a control flow graph of an assembler in embodiment 1 of the present invention.
Fig. 7 is a control flow diagram of the assembler 3 in embodiment 1 of the present invention.
Fig. 8 is a schematic diagram of a process of generating new control flow raw data in embodiment 1 of the present invention.
Fig. 9 is a schematic diagram of a process of generating a new assembler in embodiment 1 of the present invention.
Fig. 10 is a schematic process diagram of assembler path testing in embodiment 1 of the invention.
FIG. 11 is a schematic structural diagram of an assembler control flow path detection apparatus.
Detailed Description
The following examples are intended to illustrate the invention, but are not intended to limit the scope of the invention. Unless otherwise specified, the technical means used in the examples are conventional means well known to those skilled in the art. The test methods in the following examples are conventional methods unless otherwise specified.
The embodiment of the invention adopts the following data structure:
Data structure 1: Basic Block Node (BBN). A basic block is a sequence of statements that the program executes in order, with exactly one entry statement and one exit statement: the entry is the first statement and the exit is the last statement. Execution can only enter a basic block at its entry statement and leave it at its exit statement. If the execution flow of the program enters the entry of basic block B immediately after leaving the exit of basic block A, then A is the parent or predecessor of B, and B is the child or successor of A. As shown in fig. 1, a basic block has 8 attributes: the basic block name (Name), the parent node names (Parents), the child node names (Children), the instruction sequence (Instructions), the basic block start line (StartLine), the basic block end line (EndLine), the parent node start lines (ParentsLine), and the child node start lines (ChildrenLine). The 4 attributes in the dashed box in fig. 1 are used only in the control flow generation stage; once control flow generation is finished the line numbers have no further meaning, and these 4 attributes are removed so that the subsequent functional modules are simpler.
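A minimal Python sketch of such a basic block node may help make the eight attributes concrete. The class name `BasicBlock`, the helper `strip_line_attributes` and the sample instruction are assumptions made for illustration; the patent does not show its own definition.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BasicBlock:
    """One basic block node (BBN) carrying the eight attributes of fig. 1."""
    name: str
    parents: List[str] = field(default_factory=list)
    children: List[str] = field(default_factory=list)
    instructions: List[str] = field(default_factory=list)
    # The four line-number attributes are only needed while the
    # control flow is being generated.
    start_line: Optional[int] = None
    end_line: Optional[int] = None
    parents_line: List[int] = field(default_factory=list)
    children_line: List[int] = field(default_factory=list)


def strip_line_attributes(block: BasicBlock) -> dict:
    """Keep only the four attributes that survive control flow generation."""
    return {
        "Name": block.name,
        "Parents": list(block.parents),
        "Children": list(block.children),
        "Instructions": list(block.instructions),
    }


# Hypothetical block: ends in a conditional jump, so it has two children.
bbn1 = BasicBlock(
    name="BBN1",
    children=["BBN3", "BBN2"],
    instructions=["beqz $1, BBN2"],
    start_line=0,
    end_line=3,
    children_line=[8, 4],
)
record = strip_line_attributes(bbn1)
```

Keeping the line-number fields optional reflects that they are scaffolding for the generation stage only; `record` is what a later module would see.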
Data structure 2: Control Flow Raw Data (CFG Raw Data, CRD). A data structure for storing the control flow graph (CFG); it can be regarded as a special doubly linked list. As shown in fig. 2, the elements are basic blocks BBN; Parents holds the predecessor nodes of a block and Children holds its successor nodes, and the elements are linked through Parents and Children. Unlike in an ordinary doubly linked list, a basic block BBN may have multiple predecessors or multiple successors. The control flow raw data CRD is the intermediate file through which the main functional modules of the detection method pass information to one another.
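As a rough illustration of the two-way linking, the sketch below derives the predecessor links from the successor links. The dictionary layout and the function name `link_parents` are assumptions; the sample edges follow the relations the text gives for FIG. 4 (BBN2 and BBN3 succeed BBN1, BBN3 succeeds BBN2).

```python
def link_parents(children_of):
    """Derive the Parents links of every block from the Children links."""
    parents_of = {name: [] for name in children_of}
    for name, kids in children_of.items():
        for kid in kids:
            parents_of.setdefault(kid, []).append(name)
    return parents_of


# Successor links only; a block may have several successors.
children_of = {
    "BBN1": ["BBN3", "BBN2"],
    "BBN2": ["BBN3"],
    "BBN3": [],
}
parents_of = link_parents(children_of)
```

Here BBN3 ends up with two predecessors, which is exactly the case that an ordinary doubly linked list cannot represent.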
Data structure 3: Auxiliary Table (AT). The auxiliary table AT stores all indirect successors and their start lines, as shown in the following table. During control flow generation the assembly code is read sequentially, line by line, so the target basic block of a jump instruction may not have been read yet. With the auxiliary table AT, a basic block BBN that ends in a jump instruction can directly obtain the start line of its indirect successor.
Basic block    Start line
BBN1           0
BBN2           4
BBN3           8
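The table above can be produced by one pass over the source lines. The following is a sketch under assumptions: the label pattern, the function name `build_auxiliary_table` and the sample program (with made-up mnemonics) are illustrative, and for simplicity every labelled block is recorded, which includes all jump targets.

```python
import re

# A start line is assumed to be "<name>:" optionally followed by an instruction.
LABEL = re.compile(r"^\s*([A-Za-z_.$][\w.$]*):")


def build_auxiliary_table(lines):
    """Map each labelled basic block to the 0-based line on which it starts."""
    table = {}
    for number, line in enumerate(lines):
        match = LABEL.match(line)
        if match:
            table[match.group(1)] = number
    return table


# Hypothetical program laid out so that the labels fall on lines 0, 4 and 8.
lines = [
    "BBN1:",
    "    ldi $1, 0",
    "    addl $1, $2, $3",
    "    beqz $3, BBN3",
    "BBN2:",
    "    addl $1, 1, $1",
    "    subl $2, 1, $2",
    "    j BBN3",
    "BBN3:",
    "    ret",
]
auxiliary_table = build_auxiliary_table(lines)
```

With this table available, the forward jump on line 3 can be resolved to line 8 even though BBN3 has not been read at that point.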
Data structure 4: Mapping Table (Map Table, MP). When path labels are inserted, the mapping table MP records the correspondence between basic block names and aliases, as shown in the following table. As path labels are inserted into the basic blocks one by one, the aliases are assigned as auto-incrementing positive integers 1, 2, 3, …; during path testing the alias is passed to the test function as a parameter. During path analysis, the mapping table is consulted to restore each alias to its basic block name.
Basic block    Alias
BBN1           1
BBN2           2
BBN3           3
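A small sketch of how such a mapping table could be built and used in both directions; the function names `build_mapping_table` and `restore_names` are assumptions for illustration.

```python
def build_mapping_table(block_names):
    """Assign aliases 1, 2, 3, ... to the blocks in the order given."""
    return {name: alias for alias, name in enumerate(block_names, start=1)}


def restore_names(aliases, mapping_table):
    """Translate a recorded alias sequence back into basic block names."""
    name_of = {alias: name for name, alias in mapping_table.items()}
    return [name_of[alias] for alias in aliases]


mapping_table = build_mapping_table(["BBN1", "BBN2", "BBN3"])
path = restore_names([1, 3], mapping_table)
```

Integer aliases keep the value handed to the probe small and fixed-size, while the table preserves the readable names for the analysis stage.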
Data structure 5: Assembly queue (AL). The assembly queue is the work queue used during program reorganization to record the sequence of basic blocks. Its elements are themselves queues, and each element of those queues is a basic block name. As shown in FIG. 3, if the assembly queue is viewed as a two-dimensional array, BBN1 can be located as AL[0][0] and BBN2 as AL[0][1].
Based on the above data structures, the invention adopts the following conventions for assembly program path detection:
Convention 1: input file. The input file is an assembly program for a domestic instruction set, and its code part is an instruction sequence arranged basic block by basic block. As shown in FIG. 4, the start line of a basic block consists of the block name followed by a colon, and the end line is a jump instruction, ret, or an ordinary instruction. Sometimes, depending on writing habits, the start line carries an instruction in addition to the block name, as with BBN3 in the figure. The functional modules recognize this situation automatically and handle it accordingly.
Convention 2: Direct Successor (DS) and Indirect Successor (IS). Any basic block has 0, 1 or 2 child nodes. With 0 child nodes, the basic block is the end of the path it lies on and usually ends with a ret instruction; with 1 child node, the basic block either ends with an unconditional jump instruction (such as 'j') or has no jump instruction and its tail runs into the next basic block; with 2 child nodes, the basic block ends with a conditional jump instruction (such as 'beqz') and its tail is followed by the next basic block.
Suppose the tail of a basic block A is a conditional jump instruction that points to target basic block B when the jump condition is true and to target basic block C when it is false. Then B is called the indirect successor of A and C the direct successor of A. As shown in FIG. 4, BBN2 is the indirect successor of BBN1, and BBN3 is the direct successor of BBN1.
If the tail of a basic block A' is an unconditional jump instruction that points to target basic block B', then A' has no direct successor and its indirect successor is B'. As shown in fig. 4, BBN3 is the indirect successor of BBN2.
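The direct/indirect distinction can be expressed as a small classification on the last instruction of a block. This is a sketch under assumptions: only 'beqz', 'j' and 'ret' come from the text, 'bnez' and 'addl' are illustrative mnemonics, and the jump target is assumed to be the last operand.

```python
CONDITIONAL = {"beqz", "bnez"}   # 'bnez' is an illustrative addition
UNCONDITIONAL = {"j"}
RETURN = {"ret"}


def successors(last_instruction, next_block):
    """Return (direct successor, indirect successor) of a basic block.

    last_instruction: text of the block's final instruction.
    next_block: name of the block that physically follows, or None.
    """
    parts = last_instruction.replace(",", " ").split()
    op = parts[0]
    if op in RETURN:
        return None, None              # 0 child nodes
    if op in UNCONDITIONAL:
        return None, parts[-1]         # 1 child node: the jump target
    if op in CONDITIONAL:
        return next_block, parts[-1]   # 2 child nodes
    return next_block, None            # falls through into the next block


bbn1 = successors("beqz $1, BBN2", "BBN3")
bbn2 = successors("j BBN3", "BBN3")
```

The two sample calls reproduce the FIG. 4 relations stated above: BBN1 has BBN3 as direct and BBN2 as indirect successor, and BBN2 has only the indirect successor BBN3.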
Convention 3: basic block names. By the usual habits of writing assembly code, the name of a basic block is defined on its start line. The start line of a direct successor, however, usually follows a conditional jump instruction immediately and defines no block name. In FIG. 4, for example, the instructions on lines 12 and 13 form a basic block that is the direct successor of BBN4. In this case the invention automatically cuts BBN4 into 2 basic blocks, and the newly added basic blocks are named SEG1, SEG2, SEG3, … in numerical order.
Example 1
As shown in fig. 5, the method for detecting the assembler control flow path according to the present invention includes the following steps:
Step 101: preprocess the assembly program, that is, traverse the assembly program, delete the comments in its code segments, rename the integer registers, store the contents of the non-code segments and record their positions in the assembly program.
The code segments of the assembly program contain comments, which start with the symbol "#". In practice, integer registers in instruction operands are written both with register aliases and with application binary interface names (ABIName), as is the case for the Shenwei integer registers. To simplify subsequent processing, the integer registers are renamed, that is, uniformly replaced by their application binary interface names. The contents of the non-code segments are stored and their positions in the assembly code are recorded for the later reassembly of the assembly program.
As one possible implementation, the assembly program to be tested, assembler 1, is input. Assembler 1 contains non-code segments such as data definition pseudo-instructions (.quad, .short, .byte, etc.), symbol definition pseudo-instructions (.set, .global, .extern, etc.), control pseudo-instructions (.section, .text, .align, etc.) and precompilation directives (header file inclusion, macro replacement, conditional compilation, etc.). The assembly program is traversed line by line with the help of an instruction set data table; comments in the code segments are detected by matching strings against the first word of each line and are deleted, integer registers written with aliases are replaced by their application binary interface names, and the contents of the non-code segments are stored and their positions recorded, yielding assembler 2.
Assembler 1 is shown in the following table:
[Table image: listing of assembler 1]
Assembler 2 is shown in the following table:
[Table image: listing of assembler 2]
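The preprocessing of step 101 can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the register map is invented (the real alias table depends on the target instruction set), non-code lines are recognized only by a leading "." or a preprocessor keyword, and the function name `preprocess` is hypothetical.

```python
import re

# Purely illustrative alias-to-name map, not taken from the patent.
REGISTER_MAP = {"a0": "$16", "v0": "$0"}
PREPROCESSOR = ("#include", "#define", "#if", "#else", "#endif")


def preprocess(lines):
    """Split a program into cleaned code lines and stored non-code lines.

    Returns (code, non_code); non_code keeps (original position, text)
    so the lines can be put back during reassembly.
    """
    code, non_code = [], []
    for position, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith(".") or stripped.startswith(PREPROCESSOR):
            non_code.append((position, line))
            continue
        text = stripped.split("#", 1)[0].rstrip()   # drop the comment
        if not text:
            continue
        # Replace register aliases word by word; other words are kept.
        text = re.sub(r"\b[a-z]\w*\b",
                      lambda m: REGISTER_MAP.get(m.group(0), m.group(0)),
                      text)
        code.append(text)
    return code, non_code


source = [
    ".text",
    '#include "regdef.h"',
    "BBN1:  # entry",
    "    ldi a0, 1   # load",
    "# full-line comment",
    "    ret",
]
code, non_code = preprocess(source)
```

A real implementation would need a more careful test for non-code lines, since a local label beginning with "." would be misclassified by this sketch.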
Step 102: generate the control flow graph, that is, traverse the assembly program preprocessed in step 101 and obtain the control flow graph CFG of the assembly program by means of the auxiliary table AT and the control flow raw data CRD.
As shown in fig. 6, the first word of each line of the preprocessed assembly program is taken as the operator of the current instruction, the type of the current instruction (conditional jump instruction, unconditional jump instruction, return instruction, operation instruction, and so on) is determined from the operator, and the storage operation corresponding to that instruction type is performed. After the line-by-line traversal is complete, the line number information is converted into node names by means of the auxiliary table AT, the line number attributes are deleted, and the final result is output in the form of control flow raw data CRD, from which the control flow graph CFG of the assembly program is obtained. The specific steps are as follows:
(1) Scan the assembly program line by line from the beginning to generate the auxiliary table AT.
(2) Scan the assembly program line by line from the beginning again. If the current line is the start line of a basic block, save the basic block name, the start line number and the instruction on that line; if not, go to (3).
(3) If the instruction on the current line is a conditional jump instruction, the basic block ends here and has two child nodes: save the end line number of the current basic block, the start line numbers of the two child nodes (the start line numbers of the child nodes are obtained by querying the auxiliary table AT) and the instruction; if not, go to (4).
(4) If the instruction on the current line is an unconditional jump instruction, the basic block ends here and has one child node: save the end line number of the current basic block, the start line number of the child node (obtained by querying the auxiliary table AT) and the instruction; if not, go to (5).
(5) If the instruction on the current line is ret, the basic block ends here: save the end line number of the current basic block and the instruction; if not, go to (6).
(6) If the current line ends the basic block and the block has a child node in the auxiliary table AT, save the end line number of the current basic block, the start line number of the child node (obtained by querying the auxiliary table AT) and the instruction; if not, go to (7).
(7) The current line is an ordinary instruction; save it.
(8) After the scan, perform the post-processing: generate the Parents and Children attributes of each BBN from StartLine, EndLine, ParentsLine and ChildrenLine, delete these four line number attributes, and output the final result in the form of control flow raw data CRD, from which the control flow graph CFG of the assembly program is obtained.
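The scan above can be approximated by the following compact sketch. It is a simplification rather than the patented procedure: jump targets are resolved by label name instead of through line numbers and the auxiliary table, the mnemonic sets contain only 'beqz', 'j' and 'ret' from the text plus the illustrative 'bnez', and unnamed blocks after a jump are called SEG1, SEG2, … as in convention 3. The function name `build_cfg` and the sample program are assumptions.

```python
import re

LABEL = re.compile(r"^([A-Za-z_.$][\w.$]*):\s*(.*)$")
CONDITIONAL = {"beqz", "bnez"}
UNCONDITIONAL = {"j"}
RETURN = {"ret"}
TERMINATORS = CONDITIONAL | UNCONDITIONAL | RETURN


def build_cfg(lines):
    """Return {block name: {"Instructions": [...], "Children": [...]}}."""
    blocks, order = {}, []
    current, seg_count = None, 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = LABEL.match(line)
        if match:                       # start line of a named block
            current, line = match.group(1), match.group(2)
        elif current is None:           # unnamed block after a jump
            seg_count += 1
            current = f"SEG{seg_count}"
        if current not in blocks:
            blocks[current] = {"Instructions": [], "Children": []}
            order.append(current)
        if not line:
            continue
        blocks[current]["Instructions"].append(line)
        if line.split()[0] in TERMINATORS:
            current = None              # the basic block ends here
    for index, name in enumerate(order):
        instructions = blocks[name]["Instructions"]
        last = instructions[-1].replace(",", " ").split() if instructions else [""]
        following = order[index + 1] if index + 1 < len(order) else None
        if last[0] in CONDITIONAL:
            children = [following, last[-1]]    # direct, then indirect
        elif last[0] in UNCONDITIONAL:
            children = [last[-1]]
        elif last[0] in RETURN:
            children = []
        else:
            children = [following]              # falls through
        blocks[name]["Children"] = [c for c in children if c is not None]
    return blocks


program = [
    "BBN1:",
    "    ldi $1, 0",
    "    beqz $1, BBN3",
    "    addl $1, 1, $1",
    "    j BBN3",
    "BBN3: addl $1, 2, $1",
    "    ret",
]
cfg = build_cfg(program)
```

In the sample, the two instructions after `beqz` carry no label, so they become the block SEG1, and the start line of BBN3 carries an instruction as allowed by convention 1.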
Taking the assembly program 2 as an example, firstly, the first word of each line of assembly code in the assembly program 2 is matched by using the keyword, the corresponding instruction type is obtained through the first word (operator) of the line of assembly code, and corresponding storage operation is performed according to different instruction types. The assembler 2 can thus be divided into 10 basic blocks, respectively basic blocks I-X, as shown in the assembler 3.
Assembler 3 is shown in the following table:
[Table image: listing of assembler 3, divided into basic blocks I-X]
In this way, the control flow graph of assembler 3 is obtained, as shown in fig. 7.
Step 103: insert path tags, that is, parse the control flow raw data CRD generated in step 102, read the instruction sequence of each basic block BBN in it, and insert a path tag at the head of the instruction sequence of each basic block BBN to generate new control flow raw data CRD.
When a path tag is inserted at the head of the instruction sequence of a basic block BBN, the invention automatically generates a distinct alias for the basic block and stores the correspondence in the mapping table MP. As shown in fig. 8, whenever the program flow passes through the basic block, the call to the marking function is executed, which triggers the probe program. Because the marking process itself uses integer registers, the head of the tag pushes the values of all integer registers onto the stack and the tail of the tag pops them back, so that the assembly program remains correct after the path tag is inserted.
In this embodiment the path tag is inserted at the head of the instruction sequence as shown in Python code segment 1. In line 34 of that code, ldi $16,1, the second operand, 1, is the alias defined for the basic block.
Python code segment 1 is shown in the following table:
[Table image: Python code segment 1]
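Since the original listing is not reproduced here, the following is only a sketch of what a tag insertion step of this kind might look like. The placeholder lines `SAVE_INT_REGS`, `RESTORE_INT_REGS` and `call mark` stand in for the real register save, restore and call sequences, which are not recoverable from the text; only `ldi $16,<alias>` is taken from the description. The function name `insert_path_tags` and the CRD layout are likewise assumptions.

```python
def insert_path_tags(crd):
    """Prepend a path tag to every block and record its alias.

    crd: {block name: {"Instructions": [...], "Children": [...]}}
    Returns (new CRD, mapping table); the input CRD is left unchanged.
    """
    mapping_table = {}
    tagged = {}
    for alias, (name, block) in enumerate(crd.items(), start=1):
        mapping_table[name] = alias
        tag = [
            "SAVE_INT_REGS",        # placeholder: push all integer registers
            f"ldi $16,{alias}",     # load the alias as the argument
            "call mark",            # placeholder: call the marking function
            "RESTORE_INT_REGS",     # placeholder: pop all integer registers
        ]
        tagged[name] = dict(block, Instructions=tag + list(block["Instructions"]))
    return tagged, mapping_table


crd = {
    "BBN1": {"Instructions": ["beqz $1, BBN2"], "Children": ["BBN3", "BBN2"]},
    "BBN2": {"Instructions": ["j BBN3"], "Children": ["BBN3"]},
    "BBN3": {"Instructions": ["ret"], "Children": []},
}
tagged_crd, mapping_table = insert_path_tags(crd)
```

Placing the tag before the first original instruction means the probe fires once per entry into the block, regardless of how the block is left.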
Step 104: program reorganization, that is, reassemble the new control flow raw data CRD generated in step 103, using the contents and positions of the non-code segments obtained in step 101, to obtain a new assembly program with path tags. The specific steps are as follows:
(1) Generate a new assembly queue AL.
Read the new control flow raw data CRD generated in step 103, initialize the assembly queue AL, and then start analyzing the basic blocks BBN. If the basic block BBN currently being processed has 2 successors, one of them is the direct successor "ds" and the other is the indirect successor "is". Traverse each row of the assembly queue AL: if the direct successor "ds" is not found, locate the row of the current basic block BBN in the assembly queue AL and insert "ds" after the basic block BBN; if "ds" is found, do nothing. If the indirect successor "is" is not found in any row, the assembly queue AL opens a new row and stores "is" as the head node of that row.
(2) Splicing. If the last element of row A is the same as the first element of row B, B is spliced onto the end of A and B is emptied. As shown in FIG. 9, the last element of AL[0] is the same as the first element of AL[1], so the elements of AL[1] are spliced onto the end of AL[0] and AL[1] is emptied.
(3) Removing redundancy. If the elements of row B are a subset of the elements of row A, B is emptied. As shown in FIG. 9, the elements of AL[2] are a subset of those of AL[0], so AL[2] is emptied.
(4) Assembling. Read the rows of the assembly queue AL in order and enqueue their elements in sequence into the output queue (Output Queue), skipping elements that are already in it. Once the assembly queue AL has been read, the order of the output queue is the order of the basic blocks in the final assembly program. The contents of the non-code segments obtained in step 101 are then inserted at their original positions. The new assembly program generated in this way has the same logical function as the original input assembly program.
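Steps (2) to (4) can be sketched directly on a list of lists. The function names and the sample queue are assumptions; the sample is constructed so that, as in the FIG. 9 description, AL[1] is spliced onto AL[0] and AL[2] turns out to be redundant. Step (1) is not covered, so the queue is taken as given.

```python
def splice(al):
    """Step (2): append row B to row A when A ends where B begins."""
    changed = True
    while changed:
        changed = False
        for a in al:
            for b in al:
                if a is not b and a and b and a[-1] == b[0]:
                    a.extend(b[1:])     # the shared element is kept once
                    b.clear()
                    changed = True
    return al


def remove_redundant(al):
    """Step (3): empty every row whose elements all occur in another row."""
    for b in al:
        if b and any(a is not b and set(b) <= set(a) for a in al):
            b.clear()
    return al


def assemble(al):
    """Step (4): flatten the rows into the output queue without repeats."""
    output = []
    for row in al:
        for name in row:
            if name not in output:
                output.append(name)
    return output


# Hypothetical assembly queue after step (1).
al = [
    ["BBN1", "BBN3"],
    ["BBN3", "BBN4"],
    ["BBN1", "BBN4"],
    ["BBN2", "BBN3"],
]
spliced = [list(row) for row in splice(al)]
output_queue = assemble(remove_redundant(al))
```

Emptied rows are left in place rather than deleted, which keeps the row indices stable in the same way the description speaks of emptying AL[1] and AL[2].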
The new assembly program after reorganization is assembler 4. Because the code is long, the middle part is omitted and only one basic block is retained.
The truncated assembler 4 is shown in the following table:
[Table image: truncated listing of assembler 4]
step 105: assembler path testing, i.e. compiling the new assembler with path label in step 104 and the probe program into an executable file together, and generating a result file. Specifically, two situations can be distinguished:
(1) if the assembler is itself a callable function, as shown in case one of fig. 10, the path tagged assembler is called by the specified test C program, which will call the path probe C program, thus compiling three of them into an executable file; as shown in case two of fig. 10, if the assembler itself is a complete program, both the assembler and the probe C program are compiled together into an executable file.
(2) The probe program contains only one function, mark(); its code is shown in the following table.
Figure BDA0002690661980000162
Figure BDA0002690661980000171
(3) Generating a result file. The result file contains the hexadecimal and decimal representations of each input, together with the aliases of the basic blocks that the input passed through. As shown in the following table, in each row the first value before the colon is the input in hexadecimal and the second value is the input in decimal; the part after the colon is the program path taken by the input, expressed as the codes of the basic blocks.
Figure BDA0002690661980000172
The result file corresponding to assembly program 4 is shown in the following table:
Figure BDA0002690661980000173
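The recording and result file format described in step 105 can be sketched in Python as follows. The actual probe is a C program whose code appears only in the figure, so the `PathProbe` class is merely a stand-in for its mark() function. The concrete line layout (hexadecimal value, space, decimal value, colon, space-separated integer aliases) and the use of integer inputs are assumptions made for illustration.

```python
class PathProbe:
    """Stand-in for the probe program: mark() records the alias of each basic block entered."""

    def __init__(self):
        self.path = []

    def mark(self, alias):
        self.path.append(alias)


def format_result_line(value, aliases):
    """Build one result line: '<hex> <decimal>: <alias> <alias> ...' (assumed layout)."""
    return f"{value:#x} {value}: {' '.join(str(a) for a in aliases)}"


def parse_result_line(line):
    """Split one result line back into (hex value, decimal value, alias sequence)."""
    head, _, tail = line.partition(":")
    hex_text, dec_text = head.split()
    return int(hex_text, 16), int(dec_text), [int(a) for a in tail.split()]


# Hypothetical run: input 10 enters the basic blocks with aliases 1, 3 and 4.
probe = PathProbe()
for alias in (1, 3, 4):
    probe.mark(alias)
line = format_result_line(10, probe.path)
print(line)
print(parse_result_line(line))
```

Keeping both representations of the input on each line lets a reader check the exact bit pattern in hexadecimal while still reading the value in decimal.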
step 106: result analysis, i.e. the result file generated in step 105 is read and scanned line by line to obtain each input and its corresponding basic block alias sequence; the basic block aliases are converted into basic block names through the mapping table MP, and statistics are gathered per path to obtain the pass rate of each path.
The pass rates of the different paths in the result file of assembly program 4 are shown in the following table, where the path taken is shown before the colon and its pass rate after the colon.
Figure BDA0002690661980000174
Figure BDA0002690661980000181
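The analysis in step 106 can be sketched in Python as follows. The sketch assumes that each result line has the form "hex decimal: alias alias ...", that the mapping table MP is a dictionary from integer alias to basic block name, and that the pass rate of a path is the fraction of inputs that took it. These details, along with the function name and the sample data, are assumptions for illustration and are not confirmed by the patent text.

```python
from collections import Counter


def path_pass_rates(lines, mp):
    """Count how often each path occurs and return {path: fraction of inputs}."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        _, _, tail = line.partition(":")
        # Convert each basic block alias to its name through the mapping table MP.
        path = tuple(mp[int(alias)] for alias in tail.split())
        counts[path] += 1
    total = sum(counts.values())
    return {path: n / total for path, n in counts.items()}


# Hypothetical mapping table and result lines.
mp = {1: "entry", 2: "then", 3: "else", 4: "exit"}
lines = [
    "0x1 1: 1 2 4",
    "0x2 2: 1 3 4",
    "0x3 3: 1 2 4",
    "0x4 4: 1 2 4",
]
for path, rate in path_pass_rates(lines, mp).items():
    print(" -> ".join(path) + ": " + format(rate, ".2%"))
```

A path that never appears in the result file has no entry in the returned dictionary, so paths that no test input reached have to be identified separately against the control flow graph.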
Example 2
As shown in fig. 11, the assembler control flow path detecting apparatus of the present invention includes:
the preprocessing module 201 is used for traversing the assembler, deleting comments of code segments in the assembler, renaming integer registers, storing contents of non-code segments in the assembler and recording positions of the non-code segments in the original assembler;
the control flow generation module 202 is used for traversing the preprocessed assembly program and obtaining a control flow graph CFG of the assembly program through the auxiliary table AT and the control flow original data CRD;
a path tag insertion module 203, configured to parse the original control flow data CRD, read an instruction sequence of each basic block BBN in the original control flow data CRD, and insert a path tag at a head of the instruction sequence of the basic block BBN to generate a new original control flow data CRD;
a program restructuring module 204, configured to restructure the new control flow original data CRD generated in step 3 by using the content and the position of the non-code segment in the assembly program to obtain a new assembly program with a path tag;
the path detection module 205 is configured to compile a new assembler with a path tag and a probe program into an executable file together, and generate a result file; further, the path detection module 205 is further configured to compile a new assembler with a path label and the probe program into an executable file through a specified test program, and generate a result file;
and the result analysis module 206 is configured to read the generated result file, perform statistics according to the path, and obtain the passing rates of different paths.
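As an illustration of the path tag insertion module 203, the following Python sketch prepends a tag to the instruction sequence of every basic block and records the alias-to-name correspondence in a mapping table MP. The CRD is simplified here to a dictionary from basic block name to a list of instruction strings. The `PATH_TAG` pseudo-instruction, the alias numbering starting at 1, and the sample instructions are hypothetical placeholders, since the real tag sequence depends on the target instruction set and is not reproduced in this section.

```python
def insert_path_tags(crd):
    """Return (tagged CRD, mapping table MP) for a simplified CRD.

    crd: dict mapping basic block name -> list of instruction strings.
    """
    mp = {}
    tagged = {}
    for alias, (name, instructions) in enumerate(crd.items(), start=1):
        mp[alias] = name
        # Hypothetical tag: stands in for the instructions that pass the alias to mark().
        tagged[name] = [f"PATH_TAG {alias}"] + list(instructions)
    return tagged, mp


# Hypothetical two-block program.
crd = {
    "entry": ["cmp r1, r2", "beq done"],
    "done": ["ret"],
}
tagged, mp = insert_path_tags(crd)
print(tagged["entry"])
print(mp)
```

Placing the tag at the head of each block means the alias is recorded whenever the block is entered, regardless of which predecessor transferred control to it.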
The above-mentioned embodiments are merely preferred embodiments of the present invention, which are merely illustrative and not restrictive, and it should be understood that other embodiments may be easily made by those skilled in the art by replacing or changing the technical contents disclosed in the specification, and therefore, all changes and modifications that are made on the principle of the present invention should be included in the scope of the claims of the present invention.

Claims (8)

1. An assembler control flow path detection method, comprising the steps of:
step 1: traversing the assembly program, deleting comments of code segments in the assembly program, renaming an integer register, storing contents of non-code segments in the assembly program and recording positions of the non-code segments in the assembly program;
step 2: traversing the assembly program preprocessed in the step 1, and obtaining a control flow graph CFG of the assembly program through an auxiliary table AT and control flow original data CRD;
step 3: analyzing the control flow original data CRD generated in step 2, reading the instruction sequence of each basic block BBN in the control flow original data CRD, and inserting a path label at the head of the instruction sequence of the basic block BBN to generate new control flow original data CRD;
step 4: recombining the new control flow original data CRD generated in step 3 by using the content and the position of the non-code segments in the assembly program obtained in step 1 to obtain a new assembly program with path labels;
step 5: compiling the new assembly program with path labels from step 4 and the probe program together into an executable file, and generating a result file;
step 6: reading the result file generated in step 5, and gathering statistics per path to obtain the pass rates of the different paths.
2. The assembler control flow path detection method of claim 1, wherein the step 2 specifically comprises: taking the first word of each line of the preprocessed assembly program as the operator of the current instruction, determining the current instruction type from the operator, and performing the corresponding storage operation according to the instruction type; after the line-by-line traversal, converting the line number information into node names by means of the auxiliary table AT, deleting the line number attribute, and outputting the final result in the form of control flow original data CRD, thereby obtaining the control flow graph CFG of the assembly program.
3. The assembler control flow path detection method of claim 1, wherein in step 3, when a path tag is inserted into the head of the instruction sequence of the basic block BBN, an alias is automatically generated for each basic block and the corresponding relationship is stored in the mapping table MP.
4. The assembler control flow path detection method of claim 3, wherein the step 6 specifically comprises: reading the result file generated in step 5, scanning it line by line to obtain each input and its corresponding basic block alias sequence, converting the basic block aliases into basic block names through the mapping table MP, and gathering statistics per path to obtain the pass rates of the different paths.
5. The assembler control flow path detection method of claim 1, wherein said step 5 further comprises: compiling the new assembler program with the path label in the step 4 and the probe program into an executable file through a specified test program, and generating a result file.
6. The assembler control flow path detection method of claim 1 or 5, wherein the probe program in step 5 comprises only one mark function.
7. An assembler control flow path detection device based on the assembler control flow path detection method of any of claims 1 to 6, comprising:
the preprocessing module is used for traversing the assembler, deleting comments of code segments in the assembler, renaming an integer register, storing the content of non-code segments in the assembler and recording the positions of the non-code segments in the original assembler;
the control flow generation module is used for traversing the preprocessed assembly program and obtaining a control flow graph CFG of the assembly program through the auxiliary table AT and the control flow original data CRD;
the path label insertion module is used for analyzing the original control flow data CRD, reading the instruction sequence of each basic block BBN in the original control flow data CRD and inserting a path label at the head of the instruction sequence of the basic block BBN to generate new original control flow data CRD;
the program recombination module is used for recombining the new control flow original data CRD generated in the step 3 by using the content and the position of the non-code segment in the assembly program to obtain a new assembly program with a path label;
the path detection module is used for compiling the new assembler program with the path label and the probe program into an executable file together to generate a result file;
and the result analysis module is used for reading the generated result file, counting according to the path and obtaining the passing rate of different paths.
8. The assembler control flow path detection device of claim 7, wherein the path detection module is further configured to compile the new assembly program with path labels and the probe program into an executable file through a specified test program, and to generate a result file.
CN202010990361.0A 2020-09-19 2020-09-19 Assembly program control flow path detection method and device Active CN112181426B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010990361.0A CN112181426B (en) 2020-09-19 2020-09-19 Assembly program control flow path detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010990361.0A CN112181426B (en) 2020-09-19 2020-09-19 Assembly program control flow path detection method and device

Publications (2)

Publication Number Publication Date
CN112181426A true CN112181426A (en) 2021-01-05
CN112181426B CN112181426B (en) 2021-06-25

Family

ID=73955993

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010990361.0A Active CN112181426B (en) 2020-09-19 2020-09-19 Assembly program control flow path detection method and device

Country Status (1)

Country Link
CN (1) CN112181426B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113360142A (en) * 2021-06-24 2021-09-07 广东工贸职业技术学院 Design and compiling method for numerical control embedded PLC intermediate file
CN115390915A (en) * 2022-08-15 2022-11-25 中国人民解放军战略支援部队信息工程大学 RISC-V assembly instruction level oriented key path automatic detection method
CN115390915B (en) * 2022-08-15 2023-03-31 中国人民解放军战略支援部队信息工程大学 RISC-V assembly instruction level oriented key path automatic detection method

Also Published As

Publication number Publication date
CN112181426B (en) 2021-06-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant