CN117555811A - Embedded software analysis method, device and storage medium based on static symbol execution - Google Patents

Info

Publication number
CN117555811A
Authority
CN
China
Prior art keywords
embedded
program
grammar
keyword
defect detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410039234.0A
Other languages
Chinese (zh)
Other versions
CN117555811B (en)
Inventor
梁洪亮
马冬雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN202410039234.0A
Publication of CN117555811A
Application granted
Publication of CN117555811B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/36 - Preventing errors by testing or debugging software
    • G06F11/3604 - Software analysis for verifying properties of programs
    • G06F11/3608 - Software analysis for verifying properties of programs using formal methods, e.g. model checking, abstract interpretation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 - Arrangements for software engineering
    • G06F8/70 - Software maintenance or management
    • G06F8/75 - Structural analysis for program understanding
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Stored Programmes (AREA)

Abstract

The invention provides an embedded software analysis method, device and storage medium based on static symbol execution. The method comprises the following steps: inputting an embedded program to be tested, and identifying embedded instructions based on the recognition logic for embedded grammar, grammar extensions and macro definitions pre-built in an embedded grammar recognition module, so as to generate an embedded project configuration file and a preprocessing file; generating an abstract syntax tree, identifying embedded keywords in the embedded program to be tested, adding the embedded keywords to the abstract syntax tree, generating keyword syntax nodes, and constructing a control flow graph; loading a defect detector into an execution engine, placing the defect detector at the control-flow entry of the embedded program to be tested, and having the execution engine symbolically execute the embedded program to be tested path by path, the defect detector containing embedded-related defect detection logic; and, while the embedded program to be tested is executed, detecting its defects with the defect detector and generating a defect detection result.

Description

Embedded software analysis method, device and storage medium based on static symbol execution
Technical Field
The invention relates to the technical field of embedded software analysis, in particular to an embedded software analysis method, device and storage medium based on static symbol execution.
Background
Embedded software is software specifically designed and programmed for use in embedded systems. An embedded system is a special type of computer system that is typically embedded in other devices or systems to perform specific functions or tasks. These systems are typically designed to run on a fixed hardware platform, and their functions and software are typically optimized for a particular application domain. With the development of satellite, carrier rocket, manned spaceflight and other technologies in China, the scale and complexity of embedded software keep growing, system concurrency keeps increasing, and potential defects in software systems are increasingly difficult to detect, locate and control. Defect detection for embedded software is therefore critical.
Program analysis techniques have been applied to defect detection of embedded software in academia and industry, both in China and abroad. Currently, most defect detection for embedded software uses dynamic program analysis. Fuzz testing is a dynamic program analysis method that can automatically detect software vulnerabilities. In recent years, researchers have worked to apply fuzz testing to embedded devices with strong hardware dependencies, but fuzz testing usually runs and tests binary firmware under emulation. The main testing process is as follows: a fuzzer generates random data and sends it to the firmware running in a virtual emulation environment, attempting to trigger unexpected exceptions or firmware crashes and thereby find defects in the embedded software. To improve the effectiveness of fuzz testing, firmware binaries are typically instrumented so that the collected coverage information guides the fuzzer to generate more effective seed data.
Static program analysis is a program analysis technique that infers the runtime properties of a program by analyzing its source code at compile time, with the goal of finding as many latent errors as possible before the program is run, so as to improve the reliability and security of the program. However, existing static analysis tools hardly support defect detection for embedded software. For example, the static analysis tool SaTC uses static taint analysis to locate taint sources in back-end files based on keywords shared by the front-end and back-end files, in order to automatically mine vulnerabilities in Internet of Things devices. The static analysis tool Astrée is a code analysis tool based on abstract interpretation, and has drawbacks such as a high false-positive rate and a lack of support for C/C++ keywords commonly used in embedded code. The static analysis tool CSA (Clang Static Analyzer) also does not support defect detection for embedded software.
As can be seen from the above, existing static analysis tools cannot support source code analysis and defect detection for embedded software. How to use static analysis to detect defects in embedded software while reducing false negatives and false positives is an urgent problem to be solved.
Disclosure of Invention
In view of the above, the present invention provides an embedded software analysis method and device based on static symbol execution, so as to solve at least one technical problem existing in the prior art.
One aspect of the present invention provides an embedded software analysis method based on static symbol execution, the method employing a static analysis tool, the static analysis tool including an execution engine and a defect detector, the execution engine being configured with an embedded grammar recognition module, a profile generation module, and an embedded keyword recognition module, the method comprising a preprocessing stage and an analysis stage:
the preprocessing stage comprises: inputting an embedded program to be tested; identifying embedded instructions based on the recognition logic for embedded grammar, grammar extensions and macro definitions pre-built in the embedded grammar recognition module; generating an embedded project configuration file based on the identified embedded instructions and the embedded project build method; and further generating a preprocessing file containing the embedded instruction information, the configuration file and grammar nodes, wherein the embedded project configuration file contains the directory in which each compile command is executed, the compile command parameters, and the source files and their paths;
the analysis stage comprises:
generating an abstract syntax tree from the preprocessing file and the source code of the embedded program to be tested, identifying embedded keywords in the program source code through the keyword recognition module, adding the identified embedded keywords to the abstract syntax tree, generating keyword syntax nodes, and then constructing a control flow graph;
loading a defect detector into the execution engine and placing the defect detector at the control-flow entry of the embedded program to be tested, the execution engine traversing the control flow graph in a path-sensitive manner and symbolically executing the embedded program to be tested path by path to obtain the constraint ranges of variables or expressions and to collect semantic information in the program; the defect detector comprises defect detection logic set based on various defect definitions, wherein the defect detection logic comprises pre-established embedded-related defect detection logic, which in turn comprises embedded grammar defect detection logic and embedded keyword-related defect detection logic;
while the embedded program to be tested is executed, the defect detector searches for reachable paths in the embedded program to be tested, detects defects of the embedded program based on the defect detection logic set by the various defect definitions, and generates a defect detection result.
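The path-sensitive traversal described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy control flow graph, the single tracked variable x, the interval representation of its constraint range, and the division-by-zero check standing in for a defect detector. It is not the patent's engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def meet(self, other):
        """Intersect two ranges; None means the path is infeasible."""
        lo, hi = max(self.lo, other.lo), min(self.hi, other.hi)
        return Interval(lo, hi) if lo <= hi else None


# Hypothetical CFG: node -> list of (successor, constraint on x along that edge).
CFG = {
    "entry": [("then", Interval(0, 10)), ("else", Interval(11, 100))],
    "then": [("div", Interval(0, 0)), ("exit", Interval(1, 10))],
    "else": [("exit", Interval(11, 100))],
    "div": [],   # this node is assumed to compute 1 / x
    "exit": [],
}


def explore(node, x_range, path, reports):
    """Depth-first, path-by-path traversal that carries the range of x."""
    path = path + [node]
    # Toy detector: flag the division node when 0 lies in the range of x.
    if node == "div" and x_range.lo <= 0 <= x_range.hi:
        reports.append(("division by zero", path))
    for succ, guard in CFG[node]:
        narrowed = x_range.meet(guard)
        if narrowed is not None:  # only follow reachable paths
            explore(succ, narrowed, path, reports)
    return reports


reports = explore("entry", Interval(-5, 50), [], [])
```

Each branch narrows the range of x, so the detector only reports on paths that are actually reachable under the accumulated constraints.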
In some embodiments of the present invention, the generating the embedded project configuration file based on the identified embedded instruction and the embedded project building manner includes: capturing system call of an embedded compiler command in the process of executing project construction commands by an interception technology based on a dynamic library loading environment variable, and acquiring embedded compiler compiling command line parameters; and automatically summarizing the captured compiling command line parameters to generate the embedded project configuration file.
In some embodiments of the invention, the compile command parameters include a compiler name, a compile command option, a macro definition, and a compile command path.
In some embodiments of the present invention, the identifying, by the keyword identifying module, the embedded keyword in the program source code, adding the identified embedded keyword to the abstract syntax tree, and generating a keyword syntax node, includes: identifying embedded keywords in the program source code through a keyword identification module, and adding lexical marks of the keywords into a lexical analysis module of an execution engine; receiving, by a grammar analyzer of the execution engine, a lexical marker stream from the lexical analyzer, generating an abstract syntax tree based on the lexical marker stream and the preprocessed file such that the generated abstract syntax tree contains embedded keywords; and adding a grammar analysis mode corresponding to the embedded keyword on the node of the abstract grammar tree to form an embedded keyword grammar node on the abstract grammar tree.
In some embodiments of the present invention, the execution engine further comprises a floating-point support portion, which includes a symbolic value representation supporting floating-point types, an expression representation supporting floating-point operations, an in-memory representation supporting floating-point types, a constraint solver supporting floating-point types, and a mathematical function modeling module that models mathematical functions so that they can be recognized by the symbolic execution engine; the defect detection logic in the defect detector further comprises floating-point defect detection logic set based on floating-point defect definitions; the method further comprises: generating an expansion graph that records the state of each node of the embedded program to be tested while the program is symbolically executed path by path; when the input of the current branch path contains a mathematical function, binding the mathematical function to a symbolic value; when the input parameter of the mathematical function is of floating-point type, classifying the mathematical function based on its output attributes, calculating the output value range corresponding to the input parameter based on the classification result, binding that value range to the symbolic value of the mathematical function as a constraint range, generating a new state of the branch path based on the bound value range, and adding the new state to the expansion graph.
In some embodiments of the present invention, classifying the input mathematical function based on the function output attribute, and calculating the output value range corresponding to the input parameter based on the classification result includes: classifying the mathematical functions based on their monotonicity, periodicity, parity, asymptotics, extremum and/or identity, and determining the output value range for the input parameters as a constraint range according to the classification result.
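As a rough illustration of how a classification by output attributes can yield a constraint range, the sketch below handles three classes: monotonically increasing functions, bounded periodic functions, and one even function with a minimum at zero. The classification table and the function name output_range are assumptions made for this example, and floating-point rounding at the endpoints is ignored.

```python
import math

# Illustrative classification table; the text names the attributes
# (monotonicity, periodicity, parity, ...) but not this exact table.
MONOTONIC_INCREASING = {"exp": math.exp, "sqrt": math.sqrt, "atan": math.atan}
BOUNDED_PERIODIC = {"sin": (-1.0, 1.0), "cos": (-1.0, 1.0)}


def output_range(name, lo, hi):
    """Return an output range for name(x) when x lies in [lo, hi]."""
    if name in MONOTONIC_INCREASING:
        f = MONOTONIC_INCREASING[name]
        return (f(lo), f(hi))          # endpoints map to endpoints
    if name in BOUNDED_PERIODIC:
        return BOUNDED_PERIODIC[name]  # global bounds, independent of the input
    if name == "fabs":                 # even function with its minimum at 0
        low = 0.0 if lo <= 0.0 <= hi else min(abs(lo), abs(hi))
        return (low, max(abs(lo), abs(hi)))
    return (-math.inf, math.inf)       # unknown function: no constraint


sqrt_range = output_range("sqrt", 4.0, 9.0)
```

The returned pair would then be bound to the symbolic value of the call as its constraint range; an unknown function falls back to an unconstrained range rather than a guess.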
In some embodiments of the present invention, the embedded keyword-related defect detection logic comprises defect detection logic for the following keywords: const, static, volatile, extern and interrupt.
In some embodiments of the invention, the preprocessing file includes embedded instruction information, the configuration file, and grammar nodes.
Another aspect of the present invention provides an embedded software analysis system based on static symbol execution, the system comprising: a processor and a memory, said memory having stored therein computer instructions for executing the computer instructions stored in said memory, the system implementing the steps of the method as described above when said computer instructions are executed by the processor.
Another aspect of the invention also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the steps of the method as described above.
The embedded software analysis method and device based on static symbol execution can realize the defect detection of the embedded software by adopting a static analysis technology, can effectively prevent the missing report of the defects of the embedded software, and improves the detection capability of a static analysis tool.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and drawings.
It will be appreciated by those skilled in the art that the objects and advantages that can be achieved with the present invention are not limited to the above-described specific ones, and that the above and other objects that can be achieved with the present invention will be more clearly understood from the following detailed description.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate and together with the description serve to explain the invention.
FIG. 1 is a flow chart of an embedded software analysis method based on static symbol execution in an embodiment of the invention.
FIG. 2 is a block diagram of an embedded software defect detection tool according to another embodiment of the present invention.
FIG. 3 is a flow chart of an embedded software analysis method based on static symbol execution in another embodiment of the invention.
Detailed Description
The present invention will be described in further detail with reference to the following embodiments and the accompanying drawings, in order to make the objects, technical solutions and advantages of the present invention more apparent. The exemplary embodiments of the present invention and the descriptions thereof are used herein to explain the present invention, but are not intended to limit the invention.
It should be noted here that, in order to avoid obscuring the present invention due to unnecessary details, only structures and/or processing steps closely related to the solution according to the present invention are shown in the drawings, while other details not greatly related to the present invention are omitted.
It should be emphasized that the term "comprises/comprising" when used herein is taken to specify the presence of stated features, elements, steps or components, but does not preclude the presence or addition of one or more other features, elements, steps or components.
An embedded project uses a dedicated embedded compiler and a dedicated embedded project build method. If a code analysis tool lacks support for the different embedded compilers and the different ways of building embedded projects, it can neither process the compile instructions of an embedded compiler nor generate a configuration file that fits the way the embedded project is built (e.g., CCS, Xilinx, IAR Embedded Workbench, Visual Studio, Qt, type, szide, etc.), so static analysis cannot be applied to embedded software. For example, the Clang Static Analyzer (CSA) can only process instructions compiled by gcc and clang under the x86 architecture, and cannot recognize the intermediate representation compiled by an embedded compiler. Furthermore, CSA can only generate configuration files for common source project build methods (e.g., CMake and Makefile), and then analyzes based on the resulting intermediate representation.
In order to realize defect detection for embedded programs, the invention provides an embedded software analysis method based on static symbol execution. The method is implemented using a static analysis tool, which may be a modified static analysis tool based on CSA, referred to as CSA-E. To realize the method, the invention improves CSA as follows:
(1) A recognition module for embedded compiler grammar is added to the symbolic execution engine to support compiler options specific to embedded software.
The static analysis tool mainly comprises two parts: a core symbolic execution engine (which may be referred to as a core engine, an execution engine or an analysis engine) and a defect detector that can be loaded into it. In the embodiment of the invention, an embedded grammar recognition module is added to the execution engine, with recognition logic for embedded grammar, grammar extensions and macro definitions pre-built into it. After embedded software is input into the static analysis tool, the tool can therefore use the embedded grammar recognition module to recognize the macro definitions, grammar and dialects (compiler grammar extensions) of the embedded compiler, so that the execution engine can analyze instructions compiled by the embedded compiler.
For example, macro definitions and syntax extensions differ between embedded compilers: the IAR compiler has the macro definitions "__IAR_COMPILERBASE__" and "__VERSION__" and the syntax extensions "__bit", "__no_init" and "__enable_interrupt()", while the Texas Instruments compiler has the macro definitions "__TI_COMPILER_VERSION__" and "__TI_GNU_ATTRIBUTE_SUPPORT__" and the syntax extensions "#pragma directives" and "__interrupt". None of these can be recognized by the CSA analysis engine.
The invention establishes the recognition logic of the embedded language by pre-constructing the embedded grammar recognition module which is adaptive to various embedded compilers to recognize macro definition, grammar expansion and the like in the embedded program, thereby recognizing the embedded instruction.
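A minimal sketch of such recognition logic is shown below, assuming it can be approximated by looking for vendor-specific identifiers in the source text. The marker spellings are restored from the examples in the text and are assumptions, as are the table layout and the function name detect_dialects; real compilers define many more markers.

```python
import re

# Markers taken from the examples in the text; spellings are assumed.
DIALECT_MARKERS = {
    "IAR": ["__IAR_COMPILERBASE__", "__no_init", "__enable_interrupt"],
    "TI": ["__TI_COMPILER_VERSION__", "__TI_GNU_ATTRIBUTE_SUPPORT__", "__interrupt"],
}


def detect_dialects(source):
    """Return the set of dialects whose markers occur as whole identifiers."""
    identifiers = set(re.findall(r"[A-Za-z_]\w*", source))
    return {name for name, markers in DIALECT_MARKERS.items()
            if identifiers & set(markers)}


iar_source = "__no_init int counter;\nvoid f(void) { __enable_interrupt(); }"
detected = detect_dialects(iar_source)
```

Matching whole identifiers rather than substrings keeps `__enable_interrupt` from being mistaken for `__interrupt`.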
(2) A configuration file generation module adapted to embedded projects is added to the symbolic execution engine.
The configuration file generation module generates an embedded project configuration file adapted to the build method of the embedded project, such as a compile_commands.json file, based on the embedded instructions identified by the recognition module and the embedded project build method. The generated configuration file records the exact commands used during compilation, including the directory in which each compile command is executed, the components of the compile command (compiler name, compile options, macro definitions, include paths and the like) and the specific source files compiled together with their paths. The configuration file can be parsed by the analysis tools in the execution engine.
The embedded project build method can be preconfigured in the embedded project build file corresponding to the embedded program; that is, the configuration file generation module can generate a configuration file adapted to the specific build method of the embedded project based on the embedded instructions identified by the recognition module and the embedded project build file.
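For illustration, the sketch below turns one captured compiler invocation into an entry in the style of a Clang JSON compilation database, using its "directory", "arguments" and "file" fields. The captured command line, the compiler name embedded-cc, the project directory and the helper make_entry are all hypothetical.

```python
import json
import shlex


def make_entry(directory, argv, source_file):
    """Build one compilation-database entry from a captured compiler invocation."""
    return {
        "directory": directory,   # where the compile command was run
        "arguments": argv,        # compiler name, options, macros, include paths
        "file": source_file,      # the translation unit being compiled
    }


# Hypothetical captured command line; "embedded-cc" is a placeholder compiler name.
captured = "embedded-cc -DBOARD_REV=2 -I include -c src/main.c -o main.o"
entry = make_entry("/work/project", shlex.split(captured), "src/main.c")
database = json.dumps([entry], indent=2)
```

Writing `database` to a file named compile_commands.json would give the analysis engine the directory, the full argument list and the source path for each translation unit.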
(3) A keyword recognition module is added to the execution engine, and embedded grammar defect detection logic and keyword modification logic are established in the defect detector.
The recognition module and configuration file generation module described above enable the CSA execution engine to recognize and analyze embedded software; correspondingly, the invention also establishes embedded grammar defect detection logic in the defect detector to detect embedded grammar defects. However, embedded systems use many specific keywords that have particular roles and meanings in the embedded environment, and the existing CSA cannot recognize them. When such a keyword qualifies a function or variable, CSA either treats the qualified variable as an ordinary variable or cannot explore the qualified function. As a result, these keywords lose the particular role and meaning in CSA that they have in the embedded system.
Embedded keywords are applied in two main ways. The first is qualifying a function: if the symbolic execution engine does not model the keyword, the function qualified by it cannot be explored. The second is qualifying a variable, which may trigger defects caused by missing or misused keywords. Because the execution engine of the existing CSA cannot recognize these keywords, it is not possible to write a defect detector for defects related to embedded keywords.
To solve the problems of keyword recognition and keyword-related defect detection, the invention adds a keyword recognition module to the CSA execution engine to recognize the C/C++ keywords most commonly used in current embedded systems, such as const, static, volatile, extern and interrupt. In addition, following current authoritative C/C++ secure coding standards, the invention defines defect types related to these C/C++ keywords, constructs the corresponding keyword modification logic (namely, keyword-related defect detection logic) to detect those defect types, and places the keyword modification logic in the defect detector.
In the embodiment of the invention, the operations related to the keyword recognition module added in the execution engine comprise: the embedded keywords are identified through the keyword identification module, so that the embedded keywords are added to the parsed abstract syntax tree, and the embedded keyword modifiers can be displayed on the abstract syntax tree.
More specifically, the keyword recognition module recognizes the embedded keywords in the source code of the embedded program to be tested, adds the lexical tokens (identifiers) of the keywords to the lexical analysis module of the execution engine, and adds the way the keywords are stored in the execution engine; for example, the keywords can be stored as tokens. Thus, when the abstract syntax tree is constructed, the parser receives the token stream from the lexical analyzer and the syntax analysis module generates the abstract syntax tree from it, so the generated tree contains the embedded keywords. Then, the parsing modes corresponding to the embedded keywords (such as the parsing applied when a keyword qualifies a variable, or the concrete handling when it qualifies a function) are added to the nodes of the syntax tree to form embedded keyword syntax nodes on the abstract syntax tree. While forming the abstract syntax tree, the parser examines the relationships between tokens to determine whether they follow the grammar rules, and constructs the corresponding syntax tree nodes if they do. Further, in the semantic analysis module, the relationships between keywords and variables and between keywords and functions are established and the relevant interfaces are added, which ensures that the semantics of the embedded-related source code are correct and that the keywords are attached at the corresponding positions on the syntax tree in accordance with the specification of the embedded identifiers.
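The effect of registering embedded keywords with the lexer can be sketched as follows. The token kinds, the keyword sets, and the toy function_decl_node parser are assumptions made for this example and are far simpler than a real C/C++ front end.

```python
import re

BASE_KEYWORDS = {"int", "void", "return", "const", "static", "volatile", "extern"}
# Assumed embedded extensions registered by the keyword recognition module.
EMBEDDED_KEYWORDS = {"__interrupt", "__no_init"}


def tokenize(source, extra_keywords=frozenset()):
    """Split source into (kind, text) tokens: keyword/identifier/number/punct."""
    keywords = BASE_KEYWORDS | set(extra_keywords)
    tokens = []
    for text in re.findall(r"[A-Za-z_]\w*|\d+|\S", source):
        if text in keywords:
            kind = "keyword"
        elif text[0].isalpha() or text[0] == "_":
            kind = "identifier"
        elif text.isdigit():
            kind = "number"
        else:
            kind = "punct"
        tokens.append((kind, text))
    return tokens


def function_decl_node(tokens):
    """Toy parse of '<keywords...> name (': leading keywords become specifiers."""
    specifiers = []
    for kind, text in tokens:
        if kind == "keyword":
            specifiers.append(text)
        elif kind == "identifier":
            return {"node": "FunctionDecl", "name": text, "specifiers": specifiers}
    return None


src = "__interrupt void timer_isr(void) {}"
plain_node = function_decl_node(tokenize(src))
embedded_node = function_decl_node(tokenize(src, EMBEDDED_KEYWORDS))
```

Without the extra keyword table the toy parser mistakes `__interrupt` for the function name; with it, the keyword appears as a specifier on the `timer_isr` node, which is the information a keyword-related detector would need.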
The identification of the keywords is realized, and the defect detection can be further performed in the defect detector based on the defect detection logic, especially the embedded related defect detection is performed based on the detection logic of the embedded related defects. Because embedded software has very high requirements on reliability, the software development process has a strict review link, the software design considers fault tolerance and redundancy design, and the software coding implementation complies with a strict safety programming specification. In the embodiment of the invention, the defect detector follows the detection logic of the embedded related defects established by the strict C/C++ language security programming specification, so that the introduction of program errors can be avoided from multiple aspects, and the difficulty of program analysis can be simplified.
The keyword-related defect detection logic established by embodiments of the present invention is described below with respect to some keyword examples.
1. const keyword:
Meaning: the const keyword is used to define objects in the embedded program and to declare constants, indicating that an object's value cannot change after it has been initialized.
Function: the const keyword may qualify ordinary variables, function parameters, pointer variables and the like in an embedded program. The value of an object defined with the const keyword cannot be updated.
If the execution engine cannot recognize the const keyword, it does not distinguish between objects with and without the const qualifier. This may not only cause const-related defects to be missed, but may also allow code to be modified inadvertently so that a read-only memory area is written, causing the program to terminate abnormally. For this reason, the invention establishes the following 11 const keyword modification logics for detecting const-related defects in C/C++, based on current C/C++ coding standards.
const keyword modification logic (i.e., const keyword-related defect detection logic):
1) String literals shall only be assigned to pointers qualified with const;
2) A type conversion of a const-qualified pointer or reference shall not remove the const qualifier;
3) The value of a const-qualified object shall not be modified;
4) A reference type shall not be qualified with the keyword const or the keyword volatile;
5) A member function defined as const (i.e., qualified with the const keyword) shall not return a non-const pointer or reference;
6) Object data returned by a member function must be const-qualified;
7) String constants and global variables are const;
8) A function shall not return a reference or pointer to a parameter that is passed by reference to const;
9) If operator[] is overloaded with a non-const version, a const version shall also be implemented;
10) An exception of class type shall be caught by reference or by const reference;
11) Move semantics (std::move) shall not be used on objects declared const or const&.
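As a hedged illustration of how one of these logics could be checked, the sketch below covers only rule 3 (no writes to a const-qualified object). It assumes the program has already been reduced to a flat list of declaration and assignment records; that record format and the function name check_const_writes are invented for this example and do not reflect the detector interface of CSA or of the invention.

```python
def check_const_writes(statements):
    """Report assignments to variables declared const (rule 3 above).

    statements is a hypothetical, already-parsed list of tuples:
    ("decl", name, is_const) or ("assign", name, line).
    """
    const_vars = set()
    reports = []
    for stmt in statements:
        if stmt[0] == "decl":
            _, name, is_const = stmt
            if is_const:
                const_vars.add(name)   # remember const-qualified objects
        elif stmt[0] == "assign":
            _, name, line = stmt
            if name in const_vars:
                reports.append(f"line {line}: write to const object '{name}'")
    return reports


program = [
    ("decl", "limit", True),     # const int limit = 10;
    ("decl", "count", False),    # int count = 0;
    ("assign", "count", 3),      # count = 1;
    ("assign", "limit", 4),      # limit = 20;  (violates rule 3)
]
findings = check_const_writes(program)
```

The check is only possible because the declaration record carries the const qualifier, which is exactly what is lost when the engine does not recognize the keyword.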
2. static key:
meaning: the static key is used to control the manner of storage and scope of action of the variable.
The function is as follows: the static key may modify local variables, global variables, and functions.
When a static key modifies a variable and is located in a function volume (a local variable), its value is maintained between function calls. When a static key modifies a variable and is outside of a function (global variable), it is accessible to all functions within the translation unit in which it resides, but not to functions in other translation units. When a static modifies a function, it can only be called by the function in the translation unit where it resides.
If the execution engine cannot recognize the static keyword, it treats declarations with and without the static keyword identically, so the storage and scope that the keyword specifies for a variable may be ignored. To this end, based on current C/C++ coding standards, the present invention establishes the following 12 static keyword modifier logics for detecting static keyword-related defects in C/C++.
static keyword modifier logic (i.e., static keyword-related defect detection logic):
1) Array declarations should not use the static keyword within [ ];
2) The static storage class should be used for objects and functions with internal linkage;
3) Inline functions should be declared with the static storage class;
4) Every static function must be used;
5) Global arrays must be explicitly initialized;
6) Static and thread-local objects should be constant-initialized;
7) Cycles during static object initialization should be avoided;
8) Exception handling logic in class constructors and destructors must not access non-static members;
9) All class templates, function templates, class template member functions, and class template static members must be instantiated at least once;
10) Every function defined in an anonymous namespace, every static function with internal linkage, and every private member function should be used;
11) The identifier name of a non-member object with static storage duration or of a static function must not be reused within the namespace;
12) The identifier name of a function with static storage duration or of a non-member object with external or internal linkage should not be reused.
3. extern keyword:
Meaning: the extern keyword indicates that a variable or function is defined in another file.
Function: the extern keyword can modify an object or a function. An object or function modified by the extern keyword has external linkage. Interface functions that are exposed for external use should be declared with the extern keyword.
If the execution engine cannot recognize the extern keyword, it treats declarations with and without the extern keyword identically, so the variable or function may fail to be declared externally. For this reason, based on current C/C++ coding standards, the present invention establishes the following 10 extern keyword modifier logics for detecting extern keyword-related defects in C/C++.
extern keyword modifier logic (i.e., extern keyword-related defect detection logic):
1) External identifiers should be unique.
2) Internally defined variables should not hide external variables.
3) The names of objects and functions with external linkage should be distinct.
4) A compatible declaration should be visible when an object or function with external linkage is defined.
5) An external object should be declared once and only once.
6) An identifier with external linkage must have exactly one definition.
7) Functions and objects that are referenced in only one translation unit should not be defined with external linkage.
8) Arrays with external linkage must have an explicitly specified size.
9) Objects or functions with external linkage (including members of a namespace) should be declared in a header file.
10) The identifier name of a function with static storage duration or of a non-member object with external or internal linkage should not be reused.
4. volatile keyword:
Meaning: the volatile keyword tells the compiler not to optimize accesses to the modified variable, so that accesses to and operations on hardware registers remain predictable. It signals that the variable may change at any time, so the compiled program reads from or writes to the variable's address every time the variable is accessed. Without the volatile keyword, the compiler may optimize reads and writes and temporarily use a value held in a register; if the variable is then updated by another program, the two values become inconsistent.
Function: the volatile keyword can modify variables that map external hardware registers, variables used in interrupt handlers, and variables shared between threads. Declaring a hardware register access volatile prevents the compiler from optimizing the access away and ensures that each access reads the latest value from, or writes to, the hardware. Declaring variables used in an interrupt handler volatile ensures that their reads and writes are not disturbed by compiler optimizations and that they are handled correctly in the interrupt context. Declaring variables shared between threads volatile keeps the compiler from substituting a register copy for reads and writes to them, which helps avoid inconsistencies in a multithreaded environment.
If the engine cannot recognize the volatile keyword, it treats declarations with and without the volatile keyword identically. As a result, the modified variable may be treated as optimizable, accesses to and operations on hardware registers become unpredictable, reads and writes may be optimized to use a value temporarily held in a register, and inconsistencies arise if the variable is updated by another program. Therefore, based on current C/C++ coding standards, the invention establishes the following 5 volatile keyword modifier logics for detecting volatile keyword-related defects in C/C++.
volatile keyword modifier logic (i.e., volatile keyword-related defect detection logic):
1) Removing the volatile qualifier in a pointer or reference type conversion is prohibited.
2) Variables that can be changed from outside the program must be declared with a volatile type.
3) Using volatile variables in complex expressions is prohibited.
4) Operating on multiple variables of the same volatile type within a single expression is prohibited.
5) Accessing a volatile object through a pointer to a non-volatile type is prohibited.
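The following sketch shows a polling loop that respects rules 1 and 5 in the list above. It assumes a host-side stand-in for a memory-mapped register; `g_status_reg`, `kReadyBit`, and `wait_until_ready` are hypothetical names, and on real hardware the register would be a fixed address taken from the device header.

```cpp
#include <cstdint>

// Stand-in for a memory-mapped status register. It is an ordinary variable
// here so that the sketch can run on a host machine.
volatile std::uint32_t g_status_reg = 0;

const std::uint32_t kReadyBit = 0x1u;

// The register is accessed through a pointer to volatile, so every loop
// iteration performs a real read of the register.
bool wait_until_ready(volatile std::uint32_t* reg, int max_polls) {
    for (int i = 0; i < max_polls; ++i) {
        if ((*reg & kReadyBit) != 0u) {
            return true;
        }
    }
    return false;
}

// Non-compliant pattern (rules 1 and 5), shown only as a comment because
// accessing a volatile object through a non-volatile pointer is undefined
// behaviour:
//   std::uint32_t* p = const_cast<std::uint32_t*>(&g_status_reg);
//   while ((*p & kReadyBit) == 0u) { }  // the read may be hoisted out of the loop
```

A checker for rule 1 would flag the `const_cast` in the commented-out pattern, since it strips the volatile qualifier from the pointer type.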
5. interrupt keyword:
Meaning: the interrupt keyword is a non-standard keyword in the C51 language that is used to define interrupt service routines.
Function: the interrupt keyword modifies functions. On the 6701, VC33, and 2812 processors used in embedded systems, an interrupt function is a special function defined and declared with the interrupt keyword, and it is invoked automatically when an interrupt event occurs. When the interrupt keyword is used, the compiler saves registers according to the register-saving rules required for interrupt service routines and then generates a special return code sequence. On the 6713 and SPARC V8 processors used in embedded systems, interrupts are implemented with an interrupt vector table: an external device triggers an interrupt through a hardware interrupt request (IRQ) signal, and the corresponding interrupt handler is invoked by looking up the interrupt number in the interrupt vector table.
If the execution engine cannot recognize the interrupt keyword, it treats functions with and without the interrupt keyword identically, which may mean that interrupt service routines (interrupt functions) are not identified, the interrupt vector table is not set up correctly, and the interrupt context is not saved and restored correctly. For this reason, based on current C/C++ coding standards, the present invention establishes the following 2 interrupt keyword modifier logics for detecting interrupt keyword-related defects in C/C++.
interrupt keyword modifier logic (i.e., interrupt keyword-related defect detection logic):
1) Calling signal() from within an interruptible signal handler is prohibited.
2) Data races in interrupt-driven embedded systems: if a shared variable is modified by interrupt handlers, a locking mechanism should be used to protect it, to ensure data consistency and avoid race conditions.
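Rule 2 above can be illustrated with a minimal host-side sketch. The functions `disable_interrupts` and `enable_interrupts` are hypothetical stand-ins for target-specific intrinsics, and `timer_isr` is written as a plain function because the interrupt keyword is not standard C++.

```cpp
#include <cstdint>

// Hypothetical stand-ins for target-specific intrinsics that mask and unmask
// interrupts. On a host machine they only track nesting depth.
static int g_irq_mask_depth = 0;
static void disable_interrupts() { ++g_irq_mask_depth; }
static void enable_interrupts() { --g_irq_mask_depth; }

// Shared between the handler and the main loop, hence volatile.
volatile std::uint32_t g_tick_count = 0;

// With a compiler that supports the non-standard keyword, this would be
// declared along the lines of: interrupt void timer_isr(void)
void timer_isr() {
    g_tick_count = g_tick_count + 1u;
}

// Main-loop side: read and reset inside a critical section, so that the
// handler cannot run between the read and the write.
std::uint32_t take_ticks() {
    disable_interrupts();
    std::uint32_t ticks = g_tick_count;
    g_tick_count = 0u;
    enable_interrupts();
    return ticks;
}
```

Without the critical section, an interrupt arriving between the read and the reset would lose a tick, which is the race condition the rule targets.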
The above lists only part of the embedded C/C++ keyword-related defect detection logic established according to the security coding standards currently in use. The present invention is not limited to this list; the keyword-related defect detection logic actually established goes well beyond it and numbers in the hundreds or more.
By adding to the execution engine an embedded keyword recognition module that supports multiple embedded compilers, the invention can recognize the C/C++ keywords and form embedded keyword syntax nodes on the abstract syntax tree, so that the defect detector can comprehensively detect embedded defects based on the related-defect detection logic built in beforehand. In the embodiment of the invention, the keyword recognition module models the common embedded keywords, and constraint conditions for keyword-related errors are added to the defect detector, which improves the extensibility of the engine and its detection capability for embedded software. The embedded compilers currently supported include, but are not limited to, the following: CCS 3.3, CCS 5.5, Xilinx, IAR Embedded Workbench, Visual Studio, Qt, tyide, and szide.
The embedded software analysis method implemented with the improved static analysis tool (CSA-E) described above is described below. Fig. 1 is a flow chart of the embedded software analysis method according to an embodiment of the invention. As shown in Fig. 1, the method includes a preprocessing stage and an analysis stage.
The preprocessing stage comprises:
Step S110: input the embedded program to be tested; recognize embedded instructions based on the embedded grammar, grammar extension, and macro definition recognition logic pre-built into the embedded grammar recognition module; generate an embedded project configuration file based on the recognized embedded instructions and the way the embedded project is built; and then generate a preprocessed file containing the embedded instruction information, the configuration file, and syntax nodes.
The embedded project configuration file helps the code analysis tool understand how the project is compiled and built, so that the tool can adapt to build configurations in different environments, obtain the specific file locations, compilation parameters, and other information needed for analysis, and proceed to the analysis stage.
In the embodiment of the invention, generating the embedded project configuration file based on the recognized embedded instructions and the way the embedded project is built comprises the following steps:
1) Using an interception technique based on a dynamic-library preloading environment variable (such as LD_PRELOAD), capture the system calls (such as the exec system calls) that invoke the embedded compiler while the project build command runs, and obtain the command-line parameters passed to the embedded compiler. By way of example, LD_PRELOAD is an environment variable commonly used in Unix-like operating systems (such as Linux) that lets a user specify a shared library to be loaded before the program starts. Because the custom library is loaded before the standard library, calls to certain standard library functions can be "intercepted" (that is, redefined), which makes it possible to obtain the compiler's command-line parameters. The compile command parameters may include, for example, the compiler name, compile options, macro definitions, and the path of the compile command.
2) The captured command-line parameters are then aggregated automatically, finally generating the embedded project configuration file (e.g., compile_commands.json). The code analysis tool parses the embedded project configuration file and converts it into analysis commands for the CSA.
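The aggregation in step 2) can be sketched as follows, assuming the configuration file follows the JSON compilation database layout used by Clang tooling (one object per compiler invocation with `directory`, `command`, and `file` fields). The struct `CapturedCommand` and both functions are hypothetical names, and the escaping handles only quotes and backslashes, not control characters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One captured compiler invocation (hypothetical record type).
struct CapturedCommand {
    std::string directory;  // working directory of the compiler process
    std::string command;    // full command line as captured
    std::string file;       // source file being compiled
};

// Escapes backslashes and double quotes for embedding in a JSON string.
std::string json_escape(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (c == '"' || c == '\\') {
            out += '\\';
        }
        out += c;
    }
    return out;
}

// Serializes all captured invocations as a JSON array of objects.
std::string to_compilation_database(const std::vector<CapturedCommand>& cmds) {
    std::string out = "[\n";
    for (std::size_t i = 0; i < cmds.size(); ++i) {
        out += "  {\n";
        out += "    \"directory\": \"" + json_escape(cmds[i].directory) + "\",\n";
        out += "    \"command\": \"" + json_escape(cmds[i].command) + "\",\n";
        out += "    \"file\": \"" + json_escape(cmds[i].file) + "\"\n";
        out += (i + 1 < cmds.size()) ? "  },\n" : "  }\n";
    }
    out += "]\n";
    return out;
}
```

In a real pipeline the records would come from the preloaded interception library; here they are simply passed in as a vector.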
In step S110, as described above, because the embedded grammar recognition module and the configuration file generation module have been added to the symbolic execution engine, embedded instructions can be recognized and the embedded compiler's grammar can be added to the CSA preprocessing process. This produces an embedded project configuration file that records the directory in which each compile command is executed, the compile command parameters, and the source files and their paths, and in turn a preprocessed file that supports analysis of the embedded project code. The preprocessed file may include compiler options adapted to the embedded system, macro definitions, configuration files adapted to the embedded system, and abstract syntax tree nodes. The execution engine can then obtain an intermediate representation from the preprocessed file and perform program analysis based on it.
The analysis stage comprises the following steps S120 to S140:
Step S120: generate an abstract syntax tree from the preprocessed file and the source code of the embedded program to be tested; recognize the embedded keywords in the program source code with the keyword recognition module; add the recognized embedded keywords to the abstract syntax tree to generate keyword syntax nodes; and then construct a control flow graph and a function call graph.
Recognizing the embedded keywords in the program source code with the keyword recognition module, adding the recognized embedded keywords to the abstract syntax tree, and generating the keyword syntax nodes comprises the following steps:
recognizing the embedded keywords in the program source code with the keyword recognition module, and adding lexical tokens for the keywords to the lexical analysis module of the execution engine;
receiving, by the syntax analyzer of the execution engine, the token stream from the lexical analyzer, and generating an abstract syntax tree based on the token stream and the preprocessed file, so that the generated abstract syntax tree contains the embedded keywords; and
adding, on the nodes of the abstract syntax tree, the parsing rule corresponding to each embedded keyword, to form the syntax nodes of the embedded keywords on the abstract syntax tree.
Because the keyword syntax nodes have been formed, defects related to keyword modifiers can subsequently be detected.
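The first of the steps above, adding lexical tokens for embedded keywords, can be sketched with a toy tokenizer. This is a simplified illustration rather than the lexer of any real compiler front end; `TokenKind`, `Token`, and `tokenize` are hypothetical names, and the set of standard keywords is deliberately incomplete.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

enum class TokenKind { Keyword, EmbeddedKeyword, Identifier, Other };

struct Token {
    TokenKind kind;
    std::string text;
};

// Splits the source into words and single characters. A word found in
// `embedded_keywords` is tagged as an embedded keyword instead of an
// identifier, so that a parser can later build a dedicated syntax node.
std::vector<Token> tokenize(const std::string& src,
                            const std::set<std::string>& embedded_keywords) {
    static const std::set<std::string> standard = {
        "const", "static", "extern", "volatile", "void", "int"};
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (std::isalpha(c) || c == '_') {
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_')) {
                ++i;
            }
            std::string word = src.substr(start, i - start);
            TokenKind kind = TokenKind::Identifier;
            if (embedded_keywords.count(word) != 0) {
                kind = TokenKind::EmbeddedKeyword;
            } else if (standard.count(word) != 0) {
                kind = TokenKind::Keyword;
            }
            tokens.push_back(Token{kind, word});
        } else {
            tokens.push_back(Token{TokenKind::Other, std::string(1, src[i])});
            ++i;
        }
    }
    return tokens;
}
```

With an empty keyword set, the word `interrupt` is lexed as an ordinary identifier, which mirrors the situation the patent describes for an engine without the recognition module.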
In step S120, the execution engine parses the preprocessed file and the source code of the embedded program to be tested into an abstract syntax tree, constructs a control flow graph from the parsed abstract syntax tree, and performs program analysis on these intermediate representations. The nodes of the abstract syntax tree correspond to the source code and are the basis for creating the control flow graph. The control flow graph is an abstract representation of a program that captures the parts traversed during execution. It can be represented as a directed graph in which each node represents a basic block or a line of code and each directed edge represents a possible execution path. The control flow graph shows the possible flow of execution through all basic blocks within a procedure and also reflects how the procedure executes at run time. Because constructing a control flow graph from an abstract syntax tree is prior art, it is not described further here.
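The directed-graph view of a control flow graph can be sketched as an adjacency list, with a depth-first walk that enumerates every entry-to-exit path. The sketch assumes an acyclic graph (loops would need unrolling bounds or state merging); `ControlFlowGraph`, `collect_paths`, and `all_paths` are hypothetical names.

```cpp
#include <vector>

// Control flow graph as an adjacency list: successors[i] holds the nodes
// reachable from basic block i in one step.
struct ControlFlowGraph {
    std::vector<std::vector<int>> successors;
};

// Depth-first enumeration of all paths from `node` to `exit_node`.
// Assumes the graph has no cycles.
void collect_paths(const ControlFlowGraph& cfg, int node, int exit_node,
                   std::vector<int>& current,
                   std::vector<std::vector<int>>& out) {
    current.push_back(node);
    if (node == exit_node) {
        out.push_back(current);
    } else {
        for (int next : cfg.successors[node]) {
            collect_paths(cfg, next, exit_node, current, out);
        }
    }
    current.pop_back();
}

std::vector<std::vector<int>> all_paths(const ControlFlowGraph& cfg,
                                        int entry, int exit_node) {
    std::vector<std::vector<int>> out;
    std::vector<int> current;
    collect_paths(cfg, entry, exit_node, current, out);
    return out;
}
```

For an if/else "diamond" (block 0 branches to blocks 1 and 2, which both join at block 3), the walk yields the two paths 0-1-3 and 0-2-3.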
Step S130: load the defect detector into the execution engine and place it at the control flow entry of the embedded program to be tested. The execution engine traverses the control flow graph in a path-sensitive manner, symbolically executes the embedded program to be tested path by path to obtain the constraint ranges of variables or expressions, and collects semantic information in the program.
In the embodiment of the invention, the defect detector comprises defect detection logic set according to multiple defect definitions. This includes not only integer defect detection logic (which the CSA defect detector already contains) but also pre-established embedded-related defect detection logic, which comprises embedded grammar defect detection logic and embedded keyword-related defect detection logic. For example, the defect detector contains detection logic for embedded keyword syntax-related defects designed according to current authoritative C/C++ secure coding standards, so that such defects can be detected.
Step S140: during symbolic execution of the embedded program to be tested, the defect detector explores the reachable paths in the program, detects defects in it based on the defect detection logic set according to the multiple defect definitions, and generates a defect detection result.
In the embodiment of the invention, every reachable path of the program branches can be explored by traversing the control flow graph, so that each branch of the program control flow is tracked and the defect detector analyzes the errors hidden in each branch. As an example, the defect detector executes the detection logic for embedded keyword-related defects as follows. It first matches the embedded keyword on the parsed abstract syntax tree, finds the variable or function modified by the keyword, and locates the control flow graph statement corresponding to that variable or function. It then extracts the detailed program state at the current point from the expansion graph through path-sensitive data flow analysis and control flow analysis. Finally, it performs defect detection according to the embedded-related defect detection logic. For example, under the const keyword modifier logic, assigning a string constant to an object is prohibited unless the object is a const-qualified string pointer. Assigning a string constant to any other object can lead to a write to read-only memory, which is undefined behavior and can cause a runtime error or even a crash. In this case, the defect detection logic first matches the variable assignment expression, obtains the type of the value being assigned, and determines whether it is a string constant; it then obtains the syntax structure of the assignment target and checks whether the target is a const-qualified string pointer; if it is not, a defect report is generated. The other defect detection logics are designed in the same way and are not described again. How the defect detector monitors for defects in general is prior art and is not described in detail here.
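The string-constant check described above can be sketched over a simplified record of one assignment. A real checker would read these facts from abstract syntax tree nodes; the struct `Assignment`, its fields, and the function `check_string_literal_assignment` are hypothetical stand-ins introduced for illustration.

```cpp
#include <string>

// Simplified view of one assignment found on the syntax tree.
struct Assignment {
    bool rhs_is_string_literal;  // the assigned value is a string constant
    bool lhs_is_char_pointer;    // the target has type "pointer to char"
    bool lhs_pointee_is_const;   // the pointed-to char is const-qualified
    std::string location;        // source position, e.g. "demo.c:3"
};

// Returns an empty string when the assignment is acceptable, and a report
// message when a string constant is assigned to anything other than a
// const-qualified string pointer.
std::string check_string_literal_assignment(const Assignment& a) {
    if (!a.rhs_is_string_literal) {
        return "";
    }
    if (a.lhs_is_char_pointer && a.lhs_pointee_is_const) {
        return "";
    }
    return a.location + ": string constant assigned to a non-const object";
}
```

For instance, the C statement `char *p = "abc";` would map to a record with `lhs_pointee_is_const` set to false and would be reported, whereas `const char *p = "abc";` would pass.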
Based on the above steps, the invention uses CSA-E and static analysis to detect defects in embedded programs, and the detection logic set in this way effectively prevents defects from going unreported (false negatives).
Furthermore, in the course of implementing the invention, the inventors found that some embedded programs include floating-point operations; modules for navigation, orbit control, positioning, sensing, tracking, prediction, and the like are increasingly used in embedded scenarios such as unmanned aerial vehicles, autonomous driving, and aerospace. The existing CSA, however, does not support floating-point defect detection. Because constraint solvers are limited in handling floating-point constraints, most symbolic execution engines tend to ignore floating-point types and operations, and therefore cannot analyze program behavior accurately or detect floating-point defects (including common defects and floating-point exceptions). In addition, embedded software contains a large number of mathematical function calls. Many floating-point parameters, return values, and floating-point operations pass through mathematical function calls, yet the existing CSA execution engine ignores them, so it cannot accurately reason about the paths on which mathematical function calls lie or detect the defects they cause. Mathematical function-related defects include common defects (e.g., array out-of-bounds access, arithmetic overflow, division by zero, null pointer dereference, and data races) and floating-point exceptions caused by operations on mathematical function results (overflow, underflow, division by zero, invalid operation, and inexact result). At best these problems cause false positives and false negatives; at worst they lead to serious accidents.
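The floating-point exceptions named above correspond to the IEEE 754 status flags that the standard `<cfenv>` header exposes at run time. The sketch below observes them dynamically on a host machine; it is not the static detection described in the patent, and it assumes a platform that implements these flags. The function name `divide_and_report` is hypothetical, and the volatile locals are there only to keep the compiler from folding the division at compile time.

```cpp
#include <cfenv>

// Performs a / b at run time and returns the floating-point exception flags
// raised by the division (a bitmask of FE_* values, 0 if none of the tested
// flags were raised).
int divide_and_report(double a, double b, double* result) {
    volatile double x = a;  // volatile forces a real run-time division
    volatile double y = b;
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double q = x / y;
    int raised = std::fetestexcept(FE_DIVBYZERO | FE_INVALID |
                                   FE_OVERFLOW | FE_UNDERFLOW);
    *result = q;
    return raised;
}
```

Dividing 1.0 by 0.0 is expected to raise `FE_DIVBYZERO`, and dividing 0.0 by 0.0 is expected to raise `FE_INVALID`; a static analyzer aims to predict such outcomes without running the program.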
Therefore, building on CSA-E, which supports detection in embedded programs, the invention further adds support for floating-point defect detection and for monitoring defects caused by function calls, thereby further reducing false positives and false negatives and ensuring the operational safety of the software.
To monitor floating-point defects and avoid missed reports caused by mathematical functions, in the embodiment of the invention a floating-point support component is added to the execution engine of CSA-E. It comprises a symbolic value representation for floating-point types, an expression representation for floating-point operations, a memory representation for floating-point types, a constraint solver that supports floating-point types, and a mathematical function modeling module, which models mathematical functions so that the symbolic execution engine can recognize them. Correspondingly, floating-point defect detection logic is added to the defect detector. That is, the defect pattern definitions in the defect detector of the invention include not only integer defect detection logic but also embedded grammar defect detection logic, embedded C/C++ keyword-related defect detection logic that is triggered according to the embedded keyword defect definitions, and floating-point defect detection logic that triggers floating-point exception recognition according to the floating-point defect definitions.
Fig. 2 is a schematic diagram of the overall static analysis tool for embedded software analysis according to an embodiment of the invention. As shown in Fig. 2, the tool consists mainly of two parts: a symbolic execution engine and a defect detector. The invention adds embedded program support and floating-point support to both the engine and the defect detector; that is, it introduces comprehensive support for embedded software and floating-point types. In the symbolic execution engine, the added embedded program support comprises the embedded grammar recognition module, the configuration file generation module, and the embedded keyword recognition module, which are described above and not repeated here. The added floating-point support may include a symbolic value representation for floating-point types, an expression representation for floating-point operations, a memory representation for floating-point types, a constraint solver that supports floating-point types, and a mathematical function modeling module. For symbolic values, floating-point constants, floating-point variables, and floating-point pointers are added. For symbolic expressions, expressions for floating-point operations are added, including basic operations, assignment operations, logical operations, mathematical operations, comparison operations, and rounding operations. For the memory management module, memory representations of floating-point constants, floating-point variables, and floating-point pointers are added. For the constraint solver, support for solving floating-point constraints is added.
In addition, a mathematical function modeling module is added to the symbolic execution engine. When a mathematical function call is encountered, the module constructs a symbolic value corresponding to the mathematical function so that the symbolic execution engine can recognize it. In the embodiment of the invention, the mathematical function library supports dozens of mathematical functions, such as trigonometric, exponential, logarithmic, power, rounding, inverse trigonometric, remainder, gamma, hyperbolic, trigonometric identity, and modulo functions, so as to cover floating-point mathematical functions comprehensively. When modeling a mathematical function, the corresponding function is selected from the library and a symbolic value is allocated and bound to it. Furthermore, the invention can classify a mathematical function by its output attributes (such as monotonicity, periodicity, parity, asymptotic behavior, extrema, and/or identities), compute the output value range of the variable or expression from the classification result, bind that range to the symbolic value of the mathematical function, and generate a new state for the parameters based on the bound range. Different classifications thus yield different constraint ranges; the more precisely a variable is constrained, the fewer the false positives and the higher the accuracy of the tool.
The invention adds the new state to the expansion graph, continually updates the program state in the expansion graph, and performs path condition reasoning and constraint solving, so that the paths containing mathematical functions can be explored and floating-point exceptions caused by mathematical functions can be detected, improving the accuracy and detection capability of the tool. In the embodiment of the invention, embedded defect detection logic and floating-point defect detection logic are added to the defect detector, so that embedded-related defects and floating-point defects are detected comprehensively and false negatives are further prevented. After the embedded program to be tested is input into the execution engine, the engine calls the embedded grammar recognition module and the configuration file generation module to generate the configuration file, generates the preprocessed file from the configuration file and the source program, then calls the embedded keyword recognition module to add the recognized embedded keywords to the abstract syntax tree and generate keyword syntax nodes, constructs the control flow graph, and performs program analysis (inter-procedural analysis) on these intermediate representations.
Fig. 3 is a flow chart of an embedded software analysis method that implements both embedded-related defect detection and floating-point defect detection according to an embodiment of the invention. As shown in Fig. 3, the method comprises the following steps:
(1) Input the embedded program to be tested and preprocess it. This step includes recognizing embedded grammar and mathematical functions, generating the configuration file, and generating the preprocessed file.
(2) From the preprocessed file and the software source code, construct the abstract syntax tree of the embedded program to be tested, add the embedded keywords to the abstract syntax tree to generate keyword syntax nodes, and generate the control flow graph and the function call graph.
In this step, the embedded keyword recognition module recognizes the embedded keywords and adds them to the syntax tree, and the parsing rule corresponding to each embedded keyword is added on the syntax tree nodes to form the embedded keyword syntax nodes on the abstract syntax tree.
(3) Mount the defect detector on the execution engine and place it at the control flow entry of the embedded program to be tested.
The defect detector comprises integer defect detection logic, embedded grammar defect detection logic, embedded keyword-related defect detection logic, and floating-point defect detection logic.
(4) Traverse the control flow graph in a path-sensitive manner, symbolically execute the embedded program to be tested path by path to obtain the constraint ranges of variables, expressions, and the like, collect semantic information in the embedded program to be tested, and generate an expansion graph that records the state of each program node (basic block or line of code) during path-sensitive execution.
Path sensitivity means computing different analysis information according to the different predicates (terms that express properties of, or relations between, objects) of conditional branch statements. That is, path sensitivity explicitly distinguishes between program branches: each branch (also called a path branch) of the program control flow is tracked, and the program state of each branch is recorded separately. When traversing the control flow graph in a path-sensitive manner, the symbolic execution engine symbolically executes the embedded program to be tested along the different path branches and, on encountering integer variables or functions, variables or functions modified by embedded keywords, floating-point variables or functions, and the like, obtains the constraint ranges of variables, expressions, and the like, and collects semantic information in the program.
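A minimal sketch of what path sensitivity does at a single conditional branch follows, assuming integer ranges as the constraint representation. The range of a variable is split on the predicate `x < bound` into one state per branch. `Range`, `BranchStates`, and `split_on_less_than` are hypothetical names, and the sketch assumes `bound` is greater than the smallest representable int so that `bound - 1` does not overflow.

```cpp
#include <algorithm>

// Inclusive integer range [lo, hi] for one variable.
struct Range {
    int lo;
    int hi;
};

// The two successor states produced by a branch on `x < bound`.
struct BranchStates {
    bool true_feasible;
    Range true_range;   // values of x on the branch where x < bound holds
    bool false_feasible;
    Range false_range;  // values of x on the branch where x >= bound holds
};

// Splits the range of x on the predicate `x < bound`. A branch whose range
// is empty is marked infeasible and need not be explored.
BranchStates split_on_less_than(Range x, int bound) {
    BranchStates s;
    s.true_range = Range{x.lo, std::min(x.hi, bound - 1)};
    s.true_feasible = s.true_range.lo <= s.true_range.hi;
    s.false_range = Range{std::max(x.lo, bound), x.hi};
    s.false_feasible = s.false_range.lo <= s.false_range.hi;
    return s;
}
```

A path-insensitive analysis would keep the single range for x after the branch; keeping one narrowed range per branch is what lets a detector rule out, for example, a division by zero on one path but not on the other.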
During path-by-path symbolic execution of the embedded program to be tested, if the keyword recognition module recognizes that the current program branch contains an embedded keyword, the execution engine symbolizes the embedded keyword input to obtain a symbolic value, adds the resulting new state to the expansion graph, and propagates constraints, so as to explore the functions modified by keywords and detect the types of defects that the keywords affect.
If the mathematical function recognition module recognizes that the current program branch contains a mathematical function, the execution engine symbolizes the mathematical function input to obtain a symbolic value and adds the resulting new state to the expansion graph. If no mathematical function is present, or after the symbolic value of the mathematical function has been constructed, the next step is to determine whether the current input is of a floating-point type; if so, floating-point variables and expressions are constructed.
In the embodiment of the invention, under the condition that the input parameters of the mathematical function are floating point type parameters, the input mathematical function can be classified based on the function output attribute, the output value range corresponding to the input parameters is calculated based on the classification result, the value range is used as a constraint range to be bound with the symbol value of the mathematical function, a new state of the branch path is generated based on the bound value range, and the generated new state is added into the expansion graph. In the embodiment of the present invention, classifying the input mathematical function based on the function output attribute, and calculating the output value range corresponding to the input parameter based on the classification result may include: the mathematical functions are classified based on monotonicity, periodicity, parity, asymptotics, extremum and/or identity of the mathematical functions, and the output value range of the input parameters is determined as a constraint range according to the classification result.
After the value range has been bound to the symbolic value of the mathematical function, a new state of the node is generated based on the bound value range and added to the expansion graph, so that the defect detector can subsequently perform anomaly detection on the updated expansion graph.
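As a rough illustration of how a classification by output attribute can yield a value range, the following Python sketch maps an input interval to an output interval. The class tables, the function names `output_range` and `Interval`, and the choice of example functions are assumptions for illustration only; the source does not specify which functions fall into which class, and the sketch assumes every input lies inside the function's domain.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float


# Hypothetical classification tables (not taken from the patent text).
MONOTONIC_INCREASING = {"exp": math.exp, "atan": math.atan, "tanh": math.tanh}
BOUNDED_PERIODIC = {"sin": Interval(-1.0, 1.0), "cos": Interval(-1.0, 1.0)}
# Even functions that are non-decreasing on [0, +inf).
EVEN_INCREASING_ON_POSITIVE = {"fabs": math.fabs}


def output_range(name, arg):
    """Return the output interval of math function `name` for input interval `arg`."""
    if name in MONOTONIC_INCREASING:
        f = MONOTONIC_INCREASING[name]
        # Monotonic: the endpoints of the input map to the endpoints of the output.
        return Interval(f(arg.lo), f(arg.hi))
    if name in BOUNDED_PERIODIC:
        # Periodic and bounded: use the global bounds, a sound over-approximation.
        return BOUNDED_PERIODIC[name]
    if name in EVEN_INCREASING_ON_POSITIVE:
        f = EVEN_INCREASING_ON_POSITIVE[name]
        a, b = f(arg.lo), f(arg.hi)
        # If the input straddles zero, the minimum is attained at zero.
        lo = f(0.0) if arg.lo <= 0.0 <= arg.hi else min(a, b)
        return Interval(lo, max(a, b))
    # Unclassified function: no constraint can be derived.
    return Interval(-math.inf, math.inf)


# Bind the computed range to the symbolic value of the call, e.g. "$sin_0".
constraints = {"$sin_0": output_range("sin", Interval(0.0, 100.0))}
```

Using global bounds for periodic functions trades precision for simplicity: the range is never wrong, but it may be wider than the true range when the input interval is shorter than one period.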
(5) Further, the defect detector uses a constraint solver to solve the constraint ranges of symbolic values or expressions in order to explore reachable paths in the embedded program to be tested, and, based on the defect detection logic set according to the various defect definitions, detects defects in the embedded program to be tested and generates a defect detection result.
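A simplified Python sketch of this step is given below. It replaces the constraint solver with plain interval intersection, which is an assumption made to keep the example self-contained; a real engine would typically delegate to an SMT solver. The names `take_branch` and `check_division_by_zero`, the source location `ctrl.c:42`, and the use of closed intervals to approximate strict comparisons are likewise hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def intersect(self, other):
        lo, hi = max(self.lo, other.lo), min(self.hi, other.hi)
        return Interval(lo, hi) if lo <= hi else None


def take_branch(constraints, var, condition):
    """Refine the range of `var` for one branch; return None if the path is unreachable."""
    current = constraints.get(var, Interval(-math.inf, math.inf))
    refined = current.intersect(condition)
    if refined is None:
        return None
    return {**constraints, var: refined}


def check_division_by_zero(constraints, divisor, location):
    """Report a defect if zero lies inside the divisor's constraint range."""
    rng = constraints.get(divisor, Interval(-math.inf, math.inf))
    if rng.lo <= 0.0 <= rng.hi:
        return f"{location}: possible division by zero, {divisor} in [{rng.lo}, {rng.hi}]"
    return None


# Suppose x = sin(t), so x already carries the bound range [-1, 1].
entry = {"x": Interval(-1.0, 1.0)}
then_path = take_branch(entry, "x", Interval(0.5, math.inf))   # branch x >= 0.5
else_path = take_branch(entry, "x", Interval(-math.inf, 0.5))  # branch x <= 0.5
dead_path = take_branch(entry, "x", Interval(2.0, math.inf))   # branch x >= 2

reports = [
    report
    for report in (
        check_division_by_zero(path, "x", "ctrl.c:42")
        for path in (then_path, else_path)
    )
    if report is not None
]
```

Here the branch `x >= 2` is pruned as unreachable because it contradicts the range bound to the mathematical function's result, and only the path on which `x` may still be zero produces a report.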
The embedded software analysis method based on static symbol execution addresses the limitations that existing static analysis tools face with respect to embedded features, floating-point conditions and constraints, and mathematical functions. It can analyze an embedded system, handle embedded features, explore functions modified by embedded keywords, and detect both common defects and embedded-specific defects in the system (for example, floating-point anomalies affected by embedded C/C++ keywords and keyword misuse). It can also explore the floating-point paths in the embedded system and detect the errors and floating-point anomalies they contain. Finally, it can explore the paths that call mathematical functions and detect errors in those calls as well as defects caused by the mathematical function operations.
Correspondingly, the invention also provides an embedded software analysis device based on static symbol execution, comprising a processor and a memory, wherein the memory stores computer instructions and the processor is configured to execute the computer instructions stored in the memory; when the computer instructions are executed by the processor, the device implements the steps of the method described above.
Furthermore, the present invention provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the steps of the method as described above.
Those of ordinary skill in the art will appreciate that the various illustrative components, systems, and methods described in connection with the embodiments disclosed herein can be implemented as hardware, software, or a combination of both. Whether a particular implementation is hardware or software depends on the specific application of the solution and its design constraints. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, a plug-in, a function card, or the like. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The programs or code segments may be stored in a machine-readable medium or transmitted over transmission media or communication links by a data signal carried in a carrier wave.
It should be understood that the invention is not limited to the particular arrangements and instrumentality described above and shown in the drawings. For the sake of brevity, a detailed description of known methods is omitted here. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present invention are not limited to the specific steps described and shown, and those skilled in the art can make various changes, modifications and additions, or change the order between steps, after appreciating the spirit of the present invention.
In this disclosure, features that are described and/or illustrated with respect to one embodiment may be used in the same way or in a similar way in one or more other embodiments and/or in combination with or instead of the features of the other embodiments.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, and various modifications and variations can be made to the embodiments of the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An embedded software analysis method based on static symbol execution, characterized in that the method uses a static analysis tool comprising an execution engine and a defect detector, the execution engine being configured with an embedded grammar recognition module, a configuration file generation module, and an embedded keyword recognition module, and the method comprises a preprocessing stage and an analysis stage:
the preprocessing stage comprises: inputting an embedded program to be tested; identifying embedded instructions based on the recognition logic for embedded grammar, grammar extensions, and macro definitions pre-built in the embedded grammar recognition module; generating an embedded project configuration file based on the identified embedded instructions and the embedded project build mode; and further generating a preprocessed file containing the embedded instruction information, the configuration file, and grammar nodes, wherein the embedded project configuration file contains the directory in which the compile command is executed, the compile command parameters, and the source files and their paths;
the analysis stage comprises:
generating an abstract syntax tree according to the preprocessed file and the source code of the embedded program to be tested, identifying embedded keywords in the source code of the embedded program to be tested through the embedded keyword recognition module, adding the identified embedded keywords to the abstract syntax tree, generating keyword syntax nodes, and further constructing a control flow graph;
loading the defect detector into the execution engine and placing the defect detector at the control flow entry of the embedded program to be tested; traversing the control flow graph by the execution engine in a path-sensitive manner, and symbolically executing the embedded program to be tested path by path to obtain the constraint ranges of variables or expressions and to collect semantic information in the program; wherein the defect detector comprises defect detection logic set based on various defect definitions, the defect detection logic comprises pre-established embedded-related defect detection logic, and the embedded-related defect detection logic comprises embedded grammar defect detection logic and embedded keyword related defect detection logic;
in the process of symbolically executing the embedded program to be tested, exploring, by the defect detector, reachable paths in the embedded program to be tested, detecting defects of the embedded program to be tested based on the defect detection logic set according to the various defect definitions, and generating a defect detection result.
2. The method of claim 1, wherein generating the embedded project configuration file based on the identified embedded instructions and the embedded project build mode comprises:
capturing, by an interception technique based on a dynamic-library-loading environment variable, the system calls that invoke the embedded compiler while the project build command is executed, and obtaining the command-line parameters of the embedded compiler's compile commands;
and automatically summarizing the captured compile command-line parameters to generate the embedded project configuration file.
3. The method of claim 1 or 2, wherein the compile command parameters include a compiler name, a compile command option, a macro definition, and a compile command path.
4. The method of claim 1, wherein identifying the embedded keywords in the program source code through the keyword recognition module, adding the identified embedded keywords to the abstract syntax tree, and generating keyword syntax nodes comprises:
identifying the embedded keywords in the program source code through the keyword recognition module, and adding lexical tokens for the keywords to the lexical analyzer of the execution engine;
receiving, by the parser of the execution engine, a lexical token stream from the lexical analyzer, and generating the abstract syntax tree based on the lexical token stream and the preprocessed file, such that the generated abstract syntax tree contains the embedded keywords;
and adding, on the nodes of the abstract syntax tree, the parsing rules corresponding to the embedded keywords, so as to form the syntax nodes of the embedded keywords on the abstract syntax tree.
5. The method of claim 1, wherein the execution engine further contains a floating-point support component and a mathematical function modeling module, the floating-point support component comprising a symbolic value representation that supports floating-point types, an expression representation that supports floating-point operations, an in-memory representation that supports floating-point types, and a constraint solver that supports floating-point types, and the mathematical function modeling module being used for modeling mathematical functions so that they can be identified by the symbolic execution engine; the defect detection logic in the defect detector further comprises floating-point defect detection logic set based on a floating-point defect definition; the method further comprises:
generating, in the process of symbolically executing the embedded program to be tested path by path, an expansion graph for recording the state of each node of the embedded program to be tested;
when the input of the current branch path contains a mathematical function, binding the mathematical function to a symbolic value; when the input parameters of the mathematical function are of a floating-point type, classifying the mathematical function based on its output attributes, calculating the output value range corresponding to the input parameters based on the classification result, binding the value range as a constraint range to the symbolic value of the mathematical function, generating a new state of the branch path based on the bound value range, and adding the generated new state to the expansion graph.
6. The method of claim 5, wherein classifying the input mathematical function based on the function output attribute, and calculating the output range of values corresponding to the input parameter based on the classification result comprises:
classifying the mathematical function according to its monotonicity, periodicity, parity, asymptotic behavior, extrema, and/or identity, and determining the output value range corresponding to the input parameters as the constraint range according to the classification result.
7. The method of claim 5, wherein the embedded keyword related defect detection logic comprises related defect detection logic for the following keywords: const, static, volatile, extern, and interrupt.
8. The method of claim 1, wherein the preprocessed file includes the embedded instruction information, the configuration file, and grammar nodes.
9. An embedded software analysis device based on static symbol execution, comprising a processor and a memory, characterized in that the memory stores computer instructions and the processor is configured to execute the computer instructions stored in the memory, and the device, when the computer instructions are executed by the processor, implements the steps of the method according to any one of claims 1 to 8.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 8.
CN202410039234.0A 2024-01-11 2024-01-11 Embedded software analysis method, device and storage medium based on static symbol execution Active CN117555811B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410039234.0A CN117555811B (en) 2024-01-11 2024-01-11 Embedded software analysis method, device and storage medium based on static symbol execution

Publications (2)

Publication Number Publication Date
CN117555811A (en) 2024-02-13
CN117555811B (en) 2024-03-19

Family

ID=89823553

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant