JP4693044B2 - Source code vulnerability inspection device - Google Patents


Info

Publication number
JP4693044B2
Authority
JP
Japan
Prior art keywords
transition
variable
function
source code
vulnerability
Legal status
Expired - Fee Related
Application number
JP2005237124A
Other languages
Japanese (ja)
Other versions
JP2007052625A (en)
Inventor
伸之 大浜
博 宮崎
Original Assignee
株式会社日立ソリューションズ

Description

  The present invention relates to a source code vulnerability inspection device that examines the source code of software, identifies the data input from and output to the outside, traces how that data is passed between functions and variables, and detects locations that have a vulnerability matching rules prepared in advance.

  There are two methods for inspecting vulnerabilities included in software: a method in which various data are input to the running software and the results are inspected, and a method in which the software source code is inspected.

  There are two methods for inspecting source code: one using pattern matching and one using syntax analysis. In inspection by pattern matching, the places where a “dangerous function” from the standard library of the programming language is used are detected in the source code to be inspected by pattern matching on the function name and the like. A “dangerous function” is a function that creates a vulnerability in software if the programmer is not careful about how it is used. Examples include functions that, when copying data, may write beyond the data area of the copy destination, and functions that, when using computer resources, may not behave as the programmer intended because of a race condition. The former in particular is widely known as a cause of buffer overrun (buffer overflow) vulnerabilities.
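  As an illustration of the pattern-matching inspection described above, the following Python sketch scans source text for calls to a fixed list of function names. The list `DANGEROUS_FUNCTIONS`, the helper `find_dangerous_calls`, and the sample text are assumptions made for this example; the description does not give a concrete list or implementation.

```python
import re

# Hypothetical list of "dangerous functions" (an assumption for this sketch).
DANGEROUS_FUNCTIONS = ("gets", "sprintf", "strcpy")


def find_dangerous_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, function name) for every call-like match."""
    pattern = re.compile(r"\b(" + "|".join(DANGEROUS_FUNCTIONS) + r")\s*\(")
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


# Hypothetical C fragment used only as input text.
SAMPLE = 'char buf[16];\nsprintf(buf, "%s", "constant");\ngets(buf);\n'
print(find_dangerous_calls(SAMPLE))
```

  The call on the second line of the sample copies only a constant string, yet it is reported in the same way as the call on the third line, because the match is made on the function name alone.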

  In inspection by syntax analysis, annotations for inspection are added to the source code in advance, and the source code is then parsed to analyze data structure transitions (memory management errors, type mismatches, and so on) and the use of functions in order to find logical contradictions.

  For each of these source code inspection methods, tools that perform the inspection automatically have been developed. However, every such tool produces many false detections, so the programmer must examine the source code on the basis of the inspection results to determine whether each reported location is actually a vulnerability.

  The source code inspection methods are described in Non-Patent Document 1 below, and examples of “dangerous functions” and buffer overruns are described in Non-Patent Document 2 below.
  Non-Patent Document 1: Information processing promotion business association security center: Investigation report on security assurance of open source software, Part II, Investigation of efficient inspection technology of open source software: February 2003, p. II-2 to II-4
  Non-Patent Document 2: Information Processing Promotion Corporation Security Center: Secure Programming Course: March 2002, [1-2.] Cross-site scripting, etc.

  Software vulnerabilities arise when software misprocesses input data from the outside, or outputs misprocessed (or unprocessed) input data to other software. A representative example of the former is buffer overrun, and representative examples of the latter are the cross-site scripting vulnerability and the SQL command insertion vulnerability. These are described in detail in Non-Patent Document 2 above.

  Source code inspection by pattern matching only detects the places where a “dangerous function” is used; it does not distinguish whether the data processed by the function is input data from the outside. Consequently, every place where a “dangerous function” is used is reported as a vulnerability regardless of the origin of the data passed as actual arguments, and in the end the programmer must inspect the source code to determine whether each one is really vulnerable. Because the functions that accept external input data are known, the places where they are used can also be detected in the source code. However, pattern matching cannot trace where that external input data is subsequently used, so it does not narrow down the locations the programmer has to inspect.

  In source code inspection by syntax analysis, the programmer must write annotations that support the analysis into the source code in advance. The preparatory work for the inspection is therefore large and places a heavy burden on the programmer.

  The present invention aims to solve the above problems. Specifically, its object is to detect vulnerable locations with high accuracy, to reduce the number of locations the programmer needs to inspect compared with the method of merely finding the places where a “dangerous function” is used by simple pattern matching, to remove the need for the programmer to write annotations that assist vulnerability detection into the source code to be inspected in advance, and thereby to reduce the workload of the inspection.

  In order to achieve the above object, a source code vulnerability inspection apparatus according to the present invention is an apparatus that inspects the source code to be inspected for vulnerabilities, and comprises: means for analyzing the source code to be inspected and determining how external input transitions; a vulnerability database in which functions that become vulnerable when external input is used as a parameter are registered; and means for tracing the transitions of external input in the source code to be inspected and warning of each location that matches the registered content of the vulnerability database as a location having a vulnerability.

  In the present invention, for example, a dummy-argument transition DB (database) is provided in advance for the functions included in a library used by the inspection target source code. It describes the data transitions between dummy arguments, that is, which dummy argument's value is moved or copied by internal processing to which dummy argument or returned as the return value, and which dummy argument receives input data from the outside. A vulnerability DB is also provided in advance, in which each function in the same library that causes a vulnerability when external input data is given to a specific dummy argument is registered as a pair of the function name and the position of that dummy argument.
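  The vulnerability DB described above can be pictured as a set of (function name, dummy argument position) pairs. The following Python sketch is only an illustration under that assumption; the entries in `VULNERABILITY_DB` and the helper `is_vulnerable_use` are hypothetical and are not taken from the description.

```python
# Hypothetical vulnerability DB: each entry pairs a function name with the
# 1-based position of the dummy argument that must not receive external input.
# The concrete entries are assumptions made for this sketch.
VULNERABILITY_DB = {
    ("sprintf", 2),
    ("sprintf", 3),
    ("system", 1),
}


def is_vulnerable_use(function_name: str, arg_position: int,
                      arg_holds_external_input: bool) -> bool:
    """Report a use only when the argument holds external input data AND
    the (function, position) pair is registered in the DB."""
    registered = (function_name, arg_position) in VULNERABILITY_DB
    return registered and arg_holds_external_input


print(is_vulnerable_use("sprintf", 2, True))   # registered pair, external data
print(is_vulnerable_use("sprintf", 2, False))  # registered pair, internal data
```

  Keying the DB on the argument position rather than on the function name alone is what allows a call whose arguments hold no external input data to pass without a warning.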

  First, in the inspection of the inspection target source code, the syntax analysis unit constructs a parse tree from the inspection target source code in accordance with the syntax analysis rules of the programming language in which that source code is written.

  Subsequently, the vulnerability analysis means traces this parse tree from the root node in a predetermined order, and when it reaches a function definition node, it registers the user-defined function represented by that node in the dynamic dummy-argument transition DB if the function is not yet registered.

  Next, the partial parse tree having the function definition node as its vertex is traced in a predetermined order, and when a dummy argument definition node is reached, the dummy argument represented by that node is registered in the variable table. At this point the dynamic dummy-argument transition DB is consulted for the user-defined function currently being processed, and if this dummy argument is registered there as one that receives external input data, external input data is also registered in the variable table as a type of value held by this dummy argument.

  When the variable definition node is reached, the variable represented by the variable definition node is registered in the variable table.

  Further, when a function call node is reached, the data transitions between the dummy arguments of the function represented by that node are retrieved by referring to the dummy-argument transition DB and the dynamic dummy-argument transition DB.

  If the first dummy argument of the function represented by this function call node is included in the value types of the transition source of this data transition, the parse tree is traced in a predetermined order until the first actual argument node, which corresponds to the position of the first dummy argument, is reached. Then, referring to the variable table, the types of value held by the first variable, which is represented by the first actual argument node, are read out and taken as the new value types of the transition destination. Alternatively, if external input data is included in the value types of the transition source, it is taken as the new value type of the transition destination.

  Next, if the second dummy argument of the function represented by the function call node is included in the value types of the transition destination of the data transition, the parse tree is traced in a predetermined order until the second actual argument node, which corresponds to the position of the second dummy argument, is reached. Then, for the second variable, which is represented by the second actual argument node, the new transition destination value types obtained above are registered in the variable table as the values held by the second variable.
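  The handling of a single data transition at a function call, as outlined in the two preceding paragraphs, might be sketched in Python as follows. This is a simplified illustration that ignores indirect reference depth; the function `apply_transition`, the representation of the variable table as a dictionary, and the variable names are all assumptions made for this sketch.

```python
def apply_transition(transition, actual_args, variable_table):
    """Apply one data transition at a function call.

    transition: (source, destination), where each side is a 1-based dummy
        argument position, and the source may also be "IN" (external input).
    actual_args: variable names passed at the call, in argument order.
    variable_table: maps a variable name to the set of value types it holds.
    """
    source, destination = transition
    if source == "IN":
        # External input data becomes the new transition destination type.
        new_types = {"IN"}
    else:
        # Read the value types of the variable at the source position.
        new_types = set(variable_table[actual_args[source - 1]])
    # Register the new types for the variable at the destination position.
    variable_table[actual_args[destination - 1]] = new_types
    return new_types


# Hypothetical call copy_text(buf, line), whose second dummy argument
# transitions to its first; "line" already holds external input data.
table = {"buf": {"buf"}, "line": {"IN"}}
apply_transition((2, 1), ["buf", "line"], table)
print(table["buf"])
```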

  Furthermore, if external input data is included in the new transition destination value types and the function represented by the function call node is registered in the dynamic dummy-argument transition DB, the fact that the first dummy argument of this function receives external input data is added to the dynamic dummy-argument transition DB.

  After the partial parse tree has been traced, for each dummy argument the value types recorded in the variable table are taken as the values held by the transition source dummy arguments and the dummy argument itself is taken as the transition destination, and data transitions are created from them. The dynamic dummy-argument transition DB is then overwritten with these as the new data transitions between dummy arguments of the function being processed.

  The entire parse tree is traced while the above is performed for each user-defined function in the inspection target source code. If the dynamic dummy-argument transition DB is updated even once during a trace of the entire parse tree, the parse tree is traced again from the root node and the above processing is repeated. When a trace completes with no new update, the vulnerability analysis means once more follows the parse tree from the root node in a predetermined order and, performing the above processing, creates and updates the variable table with reference to the dummy-argument transition DB and the dynamic dummy-argument transition DB. This time, however, when a function call node is processed, if the combination of the function represented by the function call node and the position of the first dummy argument is registered in the vulnerability DB, the position at which the first variable is written in the inspection target source code is output as a place where a vulnerability may occur.
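  The repetition described above, in which the whole tree is traced again until a trace leaves the dynamic DB unchanged, is a fixed-point iteration. The following Python sketch illustrates only that control structure. The function `run_until_stable`, the reduced representation of the DB as a mapping from function name to the set of argument positions that receive external input, and the call list `CALLS` are assumptions made for this example.

```python
def run_until_stable(analyze_pass, dynamic_db, max_rounds=100):
    """Repeat a full pass until it no longer changes dynamic_db.

    Returns the number of passes performed (including the final,
    unchanged one). max_rounds is a safety limit added for this sketch.
    """
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        before = {name: set(args) for name, args in dynamic_db.items()}
        analyze_pass(dynamic_db)
        if dynamic_db == before:
            break
    return rounds


# Hypothetical call sites in source order:
# (caller, caller argument position, callee, callee argument position).
CALLS = [
    ("helper", 1, "inner", 1),
    ("main", 2, "helper", 1),
]


def one_pass(db):
    """Propagate 'receives external input' from callers to callees once."""
    for caller, caller_arg, callee, callee_arg in CALLS:
        if caller_arg in db[caller]:
            db[callee].add(callee_arg)


db = {"main": {2}, "helper": set(), "inner": set()}
rounds = run_until_stable(one_pass, db)
print(rounds, db)
```

  Because `helper` is processed before the call in `main` that makes its argument external, a single pass is not enough in this example, which is the reason the trace has to be repeated.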

  When the inspection target source code is written in, for example, the C language, the source code vulnerability inspection apparatus according to the present invention records the value types separately for each pointer variable itself and for the indirect reference destinations of the pointer variable. It also registers variables in, and deletes them from, the variable table according to variable scope.

  According to the source code vulnerability inspection apparatus of the present invention, the locations detected as vulnerable can be limited to the places where a “dangerous function” receives external input data as an actual argument. Compared with the method of finding the places where a “dangerous function” is used by simple pattern matching, this reduces the number of locations the programmer must inspect to decide whether a detected location is actually vulnerable.

  That is, according to the present invention, false detection of vulnerabilities can be prevented because it is determined whether the data handled by a function originates from the outside and whether it has undergone appropriate processing to avoid the vulnerability. In addition, not only calls to functions registered in advance but also calls to functions defined by the programmer can be inspected for vulnerability. As a result, the programmer can easily identify where processing to avoid a vulnerability should be added. Furthermore, annotations for vulnerability detection, which are not directly related to what the source code does, need not be written into the source code, so the workload of source code inspection for the programmer is small.

  Because the source code vulnerability inspection apparatus of the present invention can record the origin of a variable's value separately for each pointer variable itself and for each of its indirect reference destinations, it can also inspect source code written in a programming language that supports indirect reference, such as the C language. In addition, because variables are registered in and deleted from the variable table according to variable scope, the number of entries in the variable table, and hence the time required for source code inspection, can be reduced compared with the case where variable scope is not considered.

  Hereinafter, embodiments of the present invention will be described with reference to the drawings. First, the first embodiment will be described, and then the second embodiment will be described.

  FIG. 1 is a configuration diagram of a source code vulnerability inspection apparatus according to the first embodiment of the present invention.

  Reference numeral 101 denotes the main body (computer) of the source code vulnerability inspection apparatus, which includes a CPU 102, a storage device 103, and a memory 104. The CPU 102 executes a program for performing source code inspection and performs various controls. The storage device 103 stores the program for performing source code inspection, the source code to be inspected, and the dummy-argument transition DB 204 and vulnerability DB 205, which are described later. The memory 104 temporarily stores the programs executed by the CPU 102 and the various data they handle.

  A keyboard 105 is an input device for sending instructions to the source code vulnerability inspection apparatus, and a CRT 106 is its display device. The external storage device 107 is a storage device used to input the source code to be inspected into the source code vulnerability inspection apparatus and to output the inspection result. A bus 108 connects the units 102 to 107 to each other and is the path for data moved by instructions from the CPU 102.

  FIG. 2 is a schematic diagram of functions of the source code vulnerability inspection apparatus. The source code vulnerability inspection apparatus includes a syntax analysis unit 201 and a vulnerability detection unit 202. These means are realized by the CPU 102 executing a program for performing source code inspection.

  The syntax analysis unit 201 parses the inspection target source code 208 in accordance with the syntax analysis rule 203 and constructs a parse tree 210.

  The vulnerability detection unit 202 traces the parse tree 210, enumerates the functions defined in the inspection target source code 208 (user-defined functions), and creates a dynamic dummy-argument transition DB 206. It also registers the variables defined in the inspection target source code 208 in the variable table 207. Then, in accordance with the dummy-argument transition DB 204, created in advance for the functions provided by the libraries that the inspection target source code 208 uses (library functions), and the dynamic dummy-argument transition DB 206, created and updated while the parse tree 210 is traced, it traces the data transitions between variables and successively records the types of value held by each variable in the variable table 207. If a variable that refers to data input from the outside through file I/O or communication is an actual argument of a dangerous library function (a library function regarded as a “dangerous function”) registered in advance in the vulnerability DB 205, that location is output to the vulnerability inspection result 209 as a vulnerable location in the inspection target source code.

  FIG. 3 shows the detailed processing procedure of the syntax analysis unit 201 and the vulnerability detection unit 202 of the source code vulnerability inspection apparatus. Step 301 is performed by the syntax analysis unit 201, and the subsequent steps are performed by the vulnerability detection unit 202. The details of each step and of items 203 to 207 in FIG. 2 are described below.

  In step 301, the syntax analysis unit 201 reads the inspection target source code 208 and the syntax analysis rule 203 and constructs the parse tree 210.

  Here, the processing procedure of the source code vulnerability inspection apparatus is described using, as an example, the inspection target source code 208 written in the C language and shown in FIG. 4. The line numbers at the left edge of FIG. 4 are shown for convenience of explanation. This processing procedure can also be applied to source code written in other programming languages.

  The syntax analysis rule 203 is a rule defined in advance for interpreting the grammar of the programming language in which the inspection target source code 208 is written. This embodiment does not prescribe how the rule is described, what it contains, or the syntax analysis procedure. It is assumed, however, that the syntax analysis can at least identify the global variables, local variables, dummy arguments, user-defined functions, and library functions used in the source code, and can construct the parse tree 210 by decomposing the source code down to expressions, the minimum unit of processing.

  FIG. 5 is an example of the parse tree 210 constructed from the source code of FIG. For convenience of explanation, a line number is shown at the left end. Enclosed characters in the figure represent nodes that are analysis units of source code. The characters in parentheses of each terminal node (node having no child node) of the parse tree 210 are a fragment of the source code corresponding to the terminal node. In FIG. 5, only the main nodes of the analyzed source code are shown, and delimiters such as parentheses and semicolons, operators, etc. are omitted.

  The structure of the parse tree 210 is not specifically prescribed in this embodiment. It is assumed, however, that the parse tree 210 expresses the parent-child relationships of all nodes of the analyzed source code (function definitions, expressions, variables, and so on) as a tree structure with the root node at its vertex. It is further assumed that, when the tree is traced in a predetermined order (in the example of FIG. 5, from top to bottom and from left to right starting from the root node), the source code fragments of the terminal nodes are visited in the order in which they appear in the source code.
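  The property assumed above, that tracing the tree in the predetermined order visits the terminal fragments in source order, corresponds to a depth-first, left-to-right traversal. The following Python sketch shows this on a small tree. The `Node` class, the node kind names, and the example fragment are assumptions made for this illustration and do not reproduce FIG. 5.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                       # e.g. "function call", "variable"
    text: str = ""                  # source fragment, used by terminal nodes
    children: list["Node"] = field(default_factory=list)


def terminal_fragments(node: Node) -> list[str]:
    """Collect source fragments of terminal nodes, depth-first, left to right."""
    if not node.children:
        return [node.text]
    fragments = []
    for child in node.children:
        fragments.extend(terminal_fragments(child))
    return fragments


# Hypothetical tree for the statement: puts(buf);
# (delimiters such as parentheses and semicolons are omitted)
tree = Node("expression", children=[
    Node("function call", children=[
        Node("function name", "puts"),
        Node("actual argument list", children=[
            Node("variable", "buf"),
        ]),
    ]),
])
print(terminal_fragments(tree))
```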

  In the description of this embodiment, list-structured data is written enclosed in “[” and “]”, a set is written enclosed in “(” and “)”, and the elements are separated by “,”.

  In the description of this embodiment, the values that each variable and dummy argument in the inspection target source code 208 would hold when that source code is compiled and executed are written in an abstract form. In the basic notation, the value of a variable is abstracted to the identification number assigned to the variable when it is registered in the variable table 207, and the value of a dummy argument of a library function or user-defined function is abstracted to its dummy argument number, assigned as 1, 2, 3, and so on from the left. The number is prefixed with as many '*' characters as the depth of indirect reference being expressed, up to the depth possible for that variable or dummy argument. For example, if a variable (dummy argument) whose identification number (dummy argument number) is 3 has a possible indirect reference depth of 0 (not a pointer), its value is written '3'. If the possible indirect reference depth is 1 (a pointer or a one-dimensional array), the value of the pointer variable (dummy argument) itself is written '3' and the value of its indirect reference destination is written '*3'. If the possible indirect reference depth is 2 (a pointer to a pointer or a two-dimensional array), the value of the variable (dummy argument) itself is written '3', the value of its indirect reference destination '*3', and the value of the indirect reference destination of that destination '**3'. The dummy argument number 0 represents the return value of a library function or user-defined function. A variable-length dummy argument is written by appending '...' to its dummy argument number. When a variable or dummy argument holds external input data acquired directly by calling a library function that obtains external input, such as file I/O or communication, this is written 'IN'. When the value held by a variable or dummy argument is output to the outside, this is written 'OUT'. In this embodiment, these abstract representations of the values held by variables and dummy arguments are called reference data.
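  The notation for reference data can be made concrete with a small Python helper that, for a given identification number (or dummy argument number) and possible indirect reference depth, lists the reference data strings from depth 0 upward. The function name `reference_data` and its parameters are assumptions made for this sketch.

```python
def reference_data(number: int, max_depth: int,
                   variable_length: bool = False) -> list[str]:
    """Return the reference data for every indirect reference depth
    from 0 to max_depth, e.g. ['3', '*3', '**3'] for number 3, depth 2."""
    suffix = "..." if variable_length else ""
    return ["*" * depth + str(number) + suffix
            for depth in range(max_depth + 1)]


print(reference_data(3, 0))   # not a pointer
print(reference_data(3, 1))   # pointer or one-dimensional array
print(reference_data(3, 2))   # pointer to a pointer or two-dimensional array
```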

  In step 302, which follows step 301, the vulnerability detection unit 202 reads the parse tree 210 and the dummy-argument transition DB 204.

  FIG. 6 is an example of the dummy-argument transition DB 204. The dummy-argument transition DB 204 is a DB that lists the library functions whose internal processing involves data transitions, such as copying or moving the values held, between dummy arguments. Its contents vary with the library and the compiler system used. The example here covers ANSI C standard library functions.

  The first column of the dummy-argument transition DB 204 in FIG. 6 is the function name. The second column is a list of data transitions indicating which dummy argument's value transitions to which dummy argument. This list is called the dummy-argument transition list, and each element of the list is called a dummy-argument transition element. Each transition element means that all or part of the value held by the dummy argument denoted by the reference data to the left of the arrow '→' (the transition source reference data) is copied to the dummy argument denoted by the reference data to the right of '→' (the transition destination reference data). The elements of a dummy-argument transition list are evaluated in order from the left. Because the value of a dummy argument passed by value is always the same before and after the function call, no transition element expressing this (for example '1→1') is added to the list. For a dummy argument passed by reference, however, if an indirect reference destination reachable through the dummy argument is left in place without being overwritten, transition elements expressing this (for example '*1→*1' and '**1→**1' for a dummy argument whose possible indirect reference depth is 2) must be added to the list.

  In the example of FIG. 6, for the function gets, external input data transitions to the indirect reference destination of the first dummy argument, and the pointer given as the first dummy argument is returned as the return value. For the function sprintf, the indirect reference destinations of the second dummy argument and of the variable-length dummy arguments from the third onward are unchanged, and their values transition to the indirect reference destination of the first dummy argument. For the function puts, the indirect reference destination of the first dummy argument is unchanged and is output to the outside.
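  The three entries described in this paragraph can be written down as data. The following Python sketch is a reconstruction from the prose only, not a copy of FIG. 6: the dictionary layout, the order of the elements within each list, and the helper names are assumptions made for this illustration.

```python
# Each entry maps a function name to its dummy-argument transition list,
# written as (transition source, transition destination) reference data.
# "0" is the return value, "IN" external input, "OUT" external output.
DUMMY_ARG_TRANSITION_DB = {
    "gets": [("IN", "*1"), ("1", "0")],
    "sprintf": [("*2", "*2"), ("*3...", "*3..."),
                ("*2", "*1"), ("*3...", "*1")],
    "puts": [("*1", "*1"), ("*1", "OUT")],
}


def receives_external_input(function_name: str) -> bool:
    """True if some transition of the function has 'IN' as its source."""
    return any(src == "IN"
               for src, _ in DUMMY_ARG_TRANSITION_DB[function_name])


def outputs_externally(function_name: str) -> bool:
    """True if some transition of the function has 'OUT' as its destination."""
    return any(dst == "OUT"
               for _, dst in DUMMY_ARG_TRANSITION_DB[function_name])


for name in DUMMY_ARG_TRANSITION_DB:
    print(name, receives_external_input(name), outputs_externally(name))
```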

  The functions registered in the dummy-argument transition DB 204 or the dynamic dummy-argument transition DB 206 may be limited to those that handle dummy arguments of specific data types, or may be all functions regardless of data type. In this embodiment, registration is limited to functions that handle the char * and const char * types used for storing character strings.

  In step 303, the vulnerability detection unit 202 traces the whole of the parse tree 210 from the root node in a predetermined order and registers each user-defined function it finds in the dynamic dummy-argument transition DB 206 and each global variable it finds in the variable table 207. The contents registered here are provisional and are updated in steps 305 to 312.

  FIG. 7 shows an example of the dynamic dummy-argument transition DB 206 created by tracing the parse tree 210 of FIG. 5. The first and second columns of the dynamic dummy-argument transition DB 206 are the same as those of the dummy-argument transition DB 204. The third column is the set of dummy arguments that may receive external input data (the external input dummy argument set); its elements, the external input dummy arguments, are written as reference data. Because the data transitions between dummy arguments are not yet known at step 303, the initial value of the dummy-argument transition list in the second column contains, only for dummy arguments passed by reference, transition elements indicating that the reference destination is unchanged. Whether each dummy argument receives external input data is also unknown at this point, so the initial value of the external input dummy argument set in the third column is the empty set. The one exception is the function main: the C language specification defines that its second argument (an array of pointers to char) receives the program's execution parameters, so its initial value is the external input dummy argument set '(**2)', which indicates this.
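  The initial values described above can be sketched in Python as follows. The function `initial_entry`, its dictionary result, and the example function `copy_text` are assumptions made for this illustration; the description specifies only the initial contents, not a data layout.

```python
def initial_entry(function_name: str, arg_depths: list[int]) -> dict:
    """Build the provisional entry registered for a user-defined function.

    arg_depths[i] is the possible indirect reference depth of dummy
    argument i+1 (0 for a value argument, 1 for a pointer, and so on).
    """
    transitions = []
    for position, depth in enumerate(arg_depths, start=1):
        # Only by-reference arguments get "reference destination unchanged".
        for level in range(1, depth + 1):
            ref = "*" * level + str(position)
            transitions.append((ref, ref))
    # main is the one exception: its second argument receives the
    # program's execution parameters.
    external_inputs = {"**2"} if function_name == "main" else set()
    return {"transitions": transitions, "external_inputs": external_inputs}


# int main(int argc, char *argv[]): depths 0 and 2.
print(initial_entry("main", [0, 2]))
# Hypothetical void copy_text(char *dst, const char *src): depths 1 and 1.
print(initial_entry("copy_text", [1, 1]))
```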

  Because the parse tree 210 in FIG. 5 contains no global variables, nothing is registered in the variable table 207 at this point. The initial values used when registering a global variable are the same as those for a local variable, described in step 308.

  In step 304, if step 304 is being executed for the first time, or if, on returning to step 304 after a full search of the parse tree in steps 305 to 312, the content of the dynamic dummy-argument transition DB 206 differs between before and after that search, the vulnerability detection unit 202 proceeds to step 305. Otherwise, it proceeds to step 313. In step 305, the vulnerability detection unit 202 starts searching the parse tree 210 from the root node. In step 306, the vulnerability detection unit 202 returns to step 304 if the search of the parse tree 210 has been completed, and proceeds to step 307 if it has not.

  In step 307, the vulnerability detection unit 202 follows the parse tree 210 in a predetermined order from the node after the last one searched so far, and takes the partial parse tree whose vertex is the first function definition node it reaches as the target of steps 308 to 312. Hereinafter, the user-defined function represented by this partial parse tree is called the target function.

  In the parse tree 210 of FIG. 5, the function format_string becomes the target function first. In the following description, unless otherwise specified, “variable” refers to the dummy arguments and local variables of the target function and to global variables.

  In step 308, the vulnerability detection unit 202 traces the partial parse tree of the target function in a predetermined order and, whenever it finds a dummy argument of the target function or a local variable defined in the function body, registers it in the variable table 207.

  FIG. 8 shows the contents of the variable table 207 after step 308 has been performed for the dummy arguments (the 5th to 11th lines in FIG. 5) of the function format_string of the partial parse tree shown in FIG. One row of data is added to the variable table 207 for each variable (dummy argument). The first column is the identification number of the variable registered in the variable table 207, assigned from 1 in order of registration. The second column indicates the kind of variable: “A” is stored for a dummy argument, “L” for a local variable, and “G” for a global variable. The third column is the data type of the variable and the fourth column is the variable name. The fifth column is a list (the variable reference data set list) whose elements are sets of reference data (variable reference data sets) representing the values held by the variable.
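  The five columns described above suggest a simple record per variable. The following Python sketch shows one possible shape; the class names `VariableRow` and `VariableTable`, the field names, and the example arguments `dst` and `src` are assumptions made for this illustration and are not the contents of FIG. 8.

```python
from dataclasses import dataclass, field


@dataclass
class VariableRow:
    identification_number: int   # column 1: assigned from 1 in order
    kind: str                    # column 2: "A", "L", or "G"
    data_type: str               # column 3: e.g. "char *"
    name: str                    # column 4: variable name
    reference_sets: list = field(default_factory=list)  # column 5


class VariableTable:
    def __init__(self):
        self.rows = []

    def register(self, kind, data_type, name, reference_sets):
        """Append one row and return it; numbers follow registration order."""
        row = VariableRow(len(self.rows) + 1, kind, data_type, name,
                          reference_sets)
        self.rows.append(row)
        return row


table = VariableTable()
# Hypothetical dummy arguments of a user-defined function.
table.register("A", "char *", "dst", [{"1"}, {"*1"}])
table.register("A", "const char *", "src", [{"2"}, {"*2"}])
for row in table.rows:
    print(row)
```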

  The number of elements (variable reference data sets) in the variable reference data set list of the variable table 207 in FIG. 8 is determined by the depth of indirect reference possible for each variable. The number of elements is 1 if the variable is of a basic data type (a simple data type that is neither a pointer nor an array, such as int or char), 2 if it is a pointer or a one-dimensional array, and 3 if it is a pointer to a pointer or a two-dimensional array. The position of each variable reference data set in the list corresponds to the number of times the variable is indirectly referenced: the first set in the list is the set of variable reference data representing the value of the variable itself, the second is the set representing the value of the variable's indirect reference destination, and the third is the set representing the value of the indirect reference destination of that destination. The value of the variable and the values of its indirect reference destinations are each held as a set of variable reference data in order to handle the case where the variable holds a character string formed by concatenating character strings copied from several variables. The numbers in the variable reference data in the variable table 207 are variable identification numbers.

  The initial value of the variable reference data set list in the variable table 207 in FIG. 8 depends on the kind of variable. For a local variable, the initial value when the variable is not initialized, and when it is initialized with a constant value, is as follows. First, for each indirect reference depth possible for the local variable, variable reference data is generated by prefixing the local variable's identification number in the variable table 207 with as many '*' characters as that depth. The initial value is then the list whose elements are the variable reference data sets each containing one of these variable reference data as its only element. For a variable initialized with a constant value, the variable reference data set corresponding to the deepest indirect reference is instead set to '(C)', using the reference data 'C' that represents a constant value. For example, for a local variable that is a pointer to a pointer, the possible indirect reference depths are 0, 1, and 2, so the initial value of the variable reference data set list is '[(1), (*1), (**1)]' when the variable is not initialized and '[(1), (*1), (C)]' when it is initialized with a constant value. When the initialization of the local variable involves other variables or function calls, the initial value is obtained by additionally applying the processing of step 309 to the expression that produces the initialization data.
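  The rule for the initial value of a local variable can be expressed compactly. The following Python sketch reproduces the two examples given in the paragraph above; the function name `initial_reference_sets` and the use of Python lists and sets for the '[...]' and '(...)' notation are assumptions made for this illustration.

```python
def initial_reference_sets(identification_number: int, max_depth: int,
                           constant_initialized: bool = False) -> list[set]:
    """Initial variable reference data set list of a local variable.

    One single-element set per possible indirect reference depth; when the
    variable is initialized with a constant, the deepest set becomes {"C"}.
    """
    sets = [{"*" * depth + str(identification_number)}
            for depth in range(max_depth + 1)]
    if constant_initialized:
        sets[-1] = {"C"}
    return sets


# A pointer to a pointer with identification number 1.
print(initial_reference_sets(1, 2))
print(initial_reference_sets(1, 2, constant_initialized=True))
```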

  For global variables, it is the same as for local variables.

  In the case of a dummy argument, the initial value is a variable reference data set list similar to that of an uninitialized local variable. However, for a dummy argument to which external input data may be passed, the pseudo external input data 'PIN' is added to the variable reference data set at the corresponding indirect reference depth. Whether external input data may be passed is determined as follows. First, the dummy argument number of the dummy argument to be registered in the variable table 207 is acquired. At this point the partial parse tree whose vertex is the dummy argument list node of the parse tree 210 is being traced, so the ordinal position of the dummy argument to be registered, that is, its dummy argument number, is known. Next, the dynamic dummy argument transition DB 206 is searched using the name of the target function as a key, and the external input dummy argument set of the matching row is read. All external input dummy arguments that include this dummy argument number are acquired from the set. Then, for each such external input dummy argument, the indirect reference depth (the number of '*' prefixed to the dummy argument number) is obtained, and 'PIN' is added to the variable reference data set located at the position obtained by adding 1 to that depth, counting from the left of the variable reference data set list. For example, suppose that the second dummy argument of the target function is being registered in the variable table 207, that its variable identification number in the variable table 207 is 4, and that the external input dummy argument set obtained by searching the dynamic dummy argument transition DB 206 with the name of the target function as a key is '(* 1, ** 2)'. In this set, the external input dummy argument that includes the dummy argument number 2 currently being processed is '** 2', and its indirect reference depth is 2. Therefore, the initial value of the variable reference data set list is '[(4), (* 4), (** 4, PIN)]', which is obtained by adding 'PIN' to the third set of the initial value '[(4), (* 4), (** 4)]' used when there is no external input dummy argument. Since external input data is always given to the second dummy argument of the function main, the initial value of the variable reference data set list in the variable table 207 for this second dummy argument is '[(3), (* 3), (IN)]', assuming that its variable identification number is 3.
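  The handling of 'PIN' for dummy arguments can be sketched in the same style. The function name initial_dummy_arg_list and the representation of the external input dummy argument set as a Python set of strings such as "**2" are assumptions made for illustration.

```python
import re


def initial_dummy_arg_list(var_id, max_depth, arg_number, external_inputs):
    """Initial variable reference data set list of a dummy argument.

    external_inputs -- external input dummy argument set of the target
                       function, e.g. {"*1", "**2"} (assumed representation)
    """
    # Start from the same list as an uninitialized local variable.
    ref_list = [{"*" * depth + str(var_id)} for depth in range(max_depth + 1)]
    for ref in external_inputs:
        match = re.fullmatch(r"(\**)(\d+)", ref)
        if match is None or int(match.group(2)) != arg_number:
            continue  # refers to a different dummy argument
        depth = len(match.group(1))  # number of '*' prefixed to the number
        if depth <= max_depth:
            ref_list[depth].add("PIN")  # position depth + 1 from the left
    return ref_list


# Second dummy argument, identification number 4, set '(* 1, ** 2)'.
example = initial_dummy_arg_list(4, 2, 2, {"*1", "**2"})
```

  For the example in the text, example holds the sets corresponding to '[(4), (* 4), (** 4, PIN)]'.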

  When the procedure 308 described above is performed on the target function format_string, the dummy argument strout, which is a pointer to char, is first found from the parse tree in FIG. 5 (lines 6 to 8). When the function name column of the dynamic dummy argument transition DB 206 (FIG. 7) is searched at this point with the name of the target function format_string, the external input dummy argument set of the matching row is empty. Since the dummy argument strout is the first data registered in the variable table 207, its identification number is 1, and the initial value of its variable reference data set list is '[(1), (* 1)]' (row number 1 in FIG. 8). Subsequently, the dummy argument strin is found from the parse tree of FIG. 5 (lines 9 to 11). Its identification number in the variable table 207 is 2, and since there is no external input dummy argument corresponding to this dummy argument, the initial value of its variable reference data set list is '[(2), (* 2)]' (row number 2 in FIG. 8).

  Although not included in the example source code in FIG. 4, for a target function that returns a value, a row for recording the reference data of the return value is also added to the variable table 207. Its type is 'A', so that it is treated as a dummy argument, its variable type is the type of the return value, and its name is 'return' (this is a reserved word in the C language and cannot be used as a variable name, so it never collides with a user-defined variable). Its variable reference data set list is registered as a list of empty sets, one for each indirect reference depth of the return value (for example, '[(), (), ()]' if the depth is 2).

  In the procedure 308, for each dummy argument, the pair of its dummy argument number and the variable identification number assigned when it was registered in the variable table 207 is also stored.

  In step 309, the vulnerability detection unit 202 continues tracing the parse tree 210 in a predetermined order from the node following the one searched last. When it reaches a node at which the value of a variable may transition to another variable, it traces the data transition for the partial parse tree having that node as a vertex by referring to the dummy argument transition DB 204 and the dynamic dummy argument transition DB 206, and updates the variable table 207.

  Nodes at which the value of a variable may transition to another variable include function calls and assignment statements. When a function call node is reached, the partial parse tree having that node as a vertex is processed. As an example, when the procedure 309 is performed for the function format_string of the source code shown in FIG. 4, the first node reached in the continued search of the parse tree 210 at which a value may transition is a function call node (line 15 in FIG. 5). The partial parse tree having this node as a vertex (lines 15 to 26) is the processing target here.

  Hereinafter, the detailed procedure of the variable table update process (procedure 309) when analyzing the partial parse tree of the function call will be described with reference to FIG.

  In step 901, the vulnerability detection unit 202 obtains the name of a function to be called by tracing through the partial analysis tree. In the example of the parse tree 210 of FIG. 5, the name sprintf of the function to be called is obtained from the source code fragment of the identifier node (16th line) that arrives next to the function call node.

  In procedure 902, the vulnerability detection unit 202 traces the partial analysis tree, acquires all actual arguments of this function call, and checks whether each actual argument is a variable or a constant. In the example of the parse tree 210 of FIG. 5, the first actual argument strout, the second actual argument “input =% s”, and the third actual argument strin are obtained from each actual argument node. Since the identifier node whose direct parent node is the expression node means a variable, it can be seen that the first actual argument and the third actual argument are variables. Since the node corresponding to the second actual argument is a character string node, it can be seen that it is a constant value.

  In step 903, the vulnerability detection unit 202 searches the function name columns of the dummy argument transition DB 204 and the dynamic dummy argument transition DB 206 using the previously obtained function name as a key, and acquires the dummy argument transition list from the matching row. If no matching row is found in either DB (that is, when the called function is registered in neither DB), the procedure of FIG. 9 ends. In the above example, when both DBs are searched using the function name sprintf as a key, a matching row is found in the third row of the dummy argument transition DB 204. Therefore, the dummy argument transition list '[* 2 → * 1, * 2 → * 2, * 3 ... → * 1, * 3 ... → * 3 ...]' is acquired from this row.
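  The lookup of step 903 can be sketched as below. The dictionary layout, the "->" spelling of the arrow, the names of the two dictionaries, and the empty entry for format_string are assumptions for illustration; only the sprintf transition list is taken from the text.

```python
# Stand-in for the dummy argument transition DB 204 (library functions).
DUMMY_ARG_TRANSITION_DB = {
    "sprintf": ["*2->*1", "*2->*2", "*3...->*1", "*3...->*3..."],
}

# Stand-in for the dynamic dummy argument transition DB 206
# (user-defined functions); the empty list is a hypothetical initial state.
DYNAMIC_DUMMY_ARG_TRANSITION_DB = {
    "format_string": [],
}


def lookup_transition_list(func_name, static_db, dynamic_db):
    """Return the transition list as (source, destination) pairs, or None."""
    for db in (static_db, dynamic_db):
        if func_name in db:
            return [tuple(element.split("->")) for element in db[func_name]]
    return None  # not registered in either DB: the procedure of FIG. 9 ends


sprintf_transitions = lookup_transition_list(
    "sprintf", DUMMY_ARG_TRANSITION_DB, DYNAMIC_DUMMY_ARG_TRANSITION_DB
)
```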

  In step 904, the vulnerability detection unit 202 takes the dummy argument transition elements one at a time from the left of the dummy argument transition list obtained above, and performs the processing of steps 905 to 909 on each to evaluate the data transition between variables. When all dummy argument transition elements have been processed, the procedure 910 is performed.

  In step 905, the vulnerability detection unit 202 extracts the transition source reference data and the transition destination reference data from the dummy argument transition element to be evaluated. The process proceeds to the next step if either both include a dummy argument number, or the transition source reference data is 'IN' and the transition destination reference data includes a dummy argument number, and in addition the actual argument corresponding to the dummy argument number included in the transition destination reference data is a variable. In all other cases, the process returns to step 904 and the next dummy argument transition element becomes the evaluation target. In the example of FIG. 5, for the first dummy argument transition element '* 2 → * 1', both the transition source and the transition destination include dummy argument numbers and the destination actual argument strout is a variable, so the procedure moves to step 906. The third and fourth dummy argument transition elements likewise proceed to step 906. For the second dummy argument transition element '* 2 → * 2', the actual argument corresponding to the dummy argument number included in the transition destination reference data is a constant, so the process returns to step 904.
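  The filter of step 905 can be sketched as follows, under the reading that the destination actual argument must be a variable in both accepted cases. The names arg_number and should_evaluate, and the encoding of actual arguments as ("var", name) or ("const", None) pairs keyed by dummy argument number, are assumptions.

```python
def arg_number(ref):
    """Dummy argument number inside reference data, or None if there is none."""
    digits = ref.strip("*.")  # drop leading '*' and a trailing '...'
    return int(digits) if digits.isdigit() else None


def should_evaluate(src, dst, actual_args):
    """Decide whether a transition element proceeds to step 906."""
    dst_no = arg_number(dst)
    if dst_no is None:
        return False  # the destination must name a dummy argument
    if arg_number(src) is None and src != "IN":
        return False  # the source must name a dummy argument or be 'IN'
    kind, _name = actual_args.get(dst_no, ("const", None))
    return kind == "var"  # the destination actual argument must be a variable


# Actual arguments of sprintf(strout, "...", strin) in the example.
sprintf_args = {1: ("var", "strout"), 2: ("const", None), 3: ("var", "strin")}
decisions = [
    should_evaluate(src, dst, sprintf_args)
    for src, dst in [("*2", "*1"), ("*2", "*2"), ("*3...", "*1"), ("*3...", "*3...")]
]
```

  With the sprintf example, only the second element is rejected, which agrees with the walkthrough in the text.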

  In step 906, the vulnerability detection unit 202 determines, for the dummy argument transition element being evaluated, the indirect reference depth of the transition source reference data and of the transition destination reference data that include a dummy argument number. The depth is obtained by counting the asterisks '*' prefixed to the dummy argument number. In the above example, the indirect reference depth is 1 for both the transition source reference data '* 2' and the transition destination reference data '* 1'.
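  Counting the indirect reference depth amounts to simple string handling. The following sketch assumes reference data is written without spaces and that a trailing "..." marks variable-length dummy arguments; the function name parse_ref is hypothetical.

```python
def parse_ref(ref):
    """Split reference data into (depth, dummy argument number, variadic flag).

    The number is None for reference data such as "IN" or "C" that does not
    contain a dummy argument number.
    """
    variadic = ref.endswith("...")
    core = ref[:-3] if variadic else ref
    body = core.lstrip("*")
    depth = len(core) - len(body)  # number of '*' prefixed to the number
    number = int(body) if body.isdigit() else None
    return depth, number, variadic


source = parse_ref("*2")       # transition source of the first element
destination = parse_ref("*1")  # transition destination of the first element
```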

  In step 907, if the transition source reference data includes a dummy argument number, the vulnerability detection unit 202 acquires the reference data set of the corresponding actual argument in the partial parse tree. If the actual argument corresponding to the dummy argument number is a variable, the name column of the variable table 207 is searched using the name of the variable as a key, and the variable reference data set list of the matching row is obtained. From this list, the variable reference data set corresponding to the transition source indirect reference depth obtained above is extracted (hereinafter referred to as the transition source variable reference data set). If the actual argument corresponding to the dummy argument number is a constant, '(C)' is used as the transition source variable reference data set. If the transition source reference data contains '...', which represents variable-length dummy arguments, and a plurality of variables or constants correspond to it as actual arguments, the above processing is performed for all of them and the union of the results (with duplicate elements removed) is used as the transition source variable reference data set. If the transition source reference data is 'IN', '(IN)' is used as the transition source variable reference data set. In the above example, the transition source reference data '* 2' shows that the transition source is the second argument, and since this is a constant value, the transition source variable reference data set is '(C)'.
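  A possible sketch of step 907 follows. The function name source_set, the keying of the variable table by variable name, and the treatment of '...' as "this dummy argument number and every later actual argument" are assumptions for illustration.

```python
def source_set(src, actual_args, var_table):
    """Transition source variable reference data set for one transition element.

    actual_args -- {dummy argument number: ("var", name) or ("const", None)}
    var_table   -- {variable name: variable reference data set list}
    """
    if src == "IN":
        return {"IN"}
    depth = len(src) - len(src.lstrip("*"))
    number = int(src.strip("*."))
    if src.endswith("..."):
        # Variable-length part: every actual argument from this position on.
        numbers = [n for n in sorted(actual_args) if n >= number]
    else:
        numbers = [number]
    result = set()
    for n in numbers:
        kind, name = actual_args[n]
        if kind == "const":
            result.add("C")
        else:
            result |= var_table[name][depth]  # a set union drops duplicates
    return result


table = {"strout": [{"1"}, {"*1"}], "strin": [{"2"}, {"*2"}]}
args = {1: ("var", "strout"), 2: ("const", None), 3: ("var", "strin")}
first = source_set("*2", args, table)     # the constant format string
third = source_set("*3...", args, table)  # the contents of strin
```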

  In step 908, if the function to be called is a user-defined function and 'IN' or 'PIN' is included in the transition source variable reference data set obtained in step 907, external input data is passed to the called function, so the vulnerability detection unit 202 updates the external input dummy argument set of the dynamic dummy argument transition DB 206 for this function. The dynamic dummy argument transition DB 206 is searched using the name of the called function as a key, and if the transition source reference data is not included in the external input dummy argument set of the matching row, the transition source reference data itself is added to that set. Since no such case arises in the above example, a specific example is omitted.

  In step 909, the vulnerability detection unit 202 searches the name column of the variable table 207 using, as a key, the name of the variable that is the actual argument corresponding to the dummy argument number included in the transition destination reference data, and acquires the variable identification number of the matching row. The transition source variable reference data set obtained in step 907 is then combined with the pair of this variable identification number and the indirect reference depth of the transition destination reference data obtained in step 906, and is stored temporarily as a temporary transition destination variable reference data set. One temporary transition destination variable reference data set is created for each transition destination (a combination of a variable identification number and an indirect reference depth). If there are multiple data transitions to the same transition destination, each newly obtained transition source variable reference data set is merged into the transition source variable reference data set already held in the temporary transition destination variable reference data set. In the above example, the name column of the variable table 207 in FIG. 8 is searched using the variable name strout of the transition destination actual argument as a key, and the variable identification number 1 is obtained from the number column of the matching row. The indirect reference depth of the transition destination obtained earlier is 1. These two values are combined with the transition source variable reference data set '(C)' obtained earlier and stored temporarily. For convenience, this temporary transition destination variable reference data set is written as '(1, 1, (C))', in the order of the variable identification number of the transition destination variable, its indirect reference depth, and the transition source variable reference data set. After step 909, the procedure returns to step 904.

  In the above example, processing the third dummy argument transition element yields a temporary transition destination variable reference data set with the same variable identification number and indirect reference depth as the first. Merging them gives the temporary transition destination variable reference data set '(1, 1, (C, * 2))'. The temporary transition destination variable reference data set obtained from the fourth dummy argument transition element is '(2, 1, (* 2))'.
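  The accumulation performed in step 909 can be sketched with a dictionary keyed by transition destination. The name record_transition and the dictionary representation are assumptions; the text only specifies that sets for the same destination are merged.

```python
def record_transition(pending, dest_id, dest_depth, source):
    """Merge a transition source set into the entry for one destination.

    pending -- {(variable identification number, indirect reference depth):
                transition source variable reference data set}
    """
    pending.setdefault((dest_id, dest_depth), set()).update(source)
    return pending


pending = {}
record_transition(pending, 1, 1, {"C"})   # first element:  '* 2 -> * 1'
record_transition(pending, 1, 1, {"*2"})  # third element:  '* 3 ... -> * 1'
record_transition(pending, 2, 1, {"*2"})  # fourth element: '* 3 ... -> * 3 ...'
```

  After these three calls, pending holds the equivalents of '(1, 1, (C, * 2))' and '(2, 1, (* 2))'.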

  In step 910, the vulnerability detection unit 202 performs the following processing for each temporary transition destination variable reference data set and updates the variable table 207. The variable identification number of the transition destination variable is acquired from the temporary transition destination variable reference data set, the number column of the variable table 207 is searched using this number as a key, and the variable reference data set list is obtained from the matching row. In this list, the position corresponding to the indirect reference depth obtained from the temporary transition destination variable reference data set is overwritten with the transition source variable reference data set obtained from the same temporary transition destination variable reference data set. The resulting new variable reference data set list is written back to its original location in the variable table 207, and processing moves on to the next temporary transition destination variable reference data set. When all temporary transition destination variable reference data sets have been processed, the update of the variable table 207 in step 309 ends. For example, when the variable table 207 in FIG. 8 is updated in the above example, the variable identification number 1 of the transition destination variable is acquired from the first temporary transition destination variable reference data set '(1, 1, (C, * 2))'. Using this as a key, the number column of the variable table 207 in FIG. 8 is searched to obtain the variable reference data set list '[(1), (* 1)]' of the matching row. Further, the indirect reference depth 1 of the transition destination variable is obtained from the first temporary transition destination variable reference data set. The transition source variable reference data set '(C, * 2)' obtained from the first temporary transition destination variable reference data set is written over the corresponding second position of the variable reference data set list, giving '[(1), (C, * 2)]'. This result is written back to the original position in the variable table 207 of FIG. 8. Similarly, when the second temporary transition destination variable reference data set is processed, the second element of the variable reference data set list in row number 2 of the variable table 207 in FIG. 8 is overwritten with '(* 2)', and the result is written back to the original position in the variable table 207 of FIG. 8. This completes the update of the variable table 207. The updated variable table 207 is shown in FIG. 10.
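  Step 910 then writes each accumulated set into the variable table. In this sketch the table is keyed by variable identification number and apply_pending is a hypothetical name; both are assumptions.

```python
def apply_pending(var_table, pending):
    """Overwrite the variable table with the temporary transition destination sets.

    var_table -- {variable identification number: variable reference data set list}
    pending   -- {(identification number, depth): transition source set}
    """
    for (var_id, depth), source in pending.items():
        # The set at the destination depth is replaced, not merged.
        var_table[var_id][depth] = set(source)
    return var_table


# State corresponding to FIG. 8 for strout (1) and strin (2).
table = {1: [{"1"}, {"*1"}], 2: [{"2"}, {"*2"}]}
apply_pending(table, {(1, 1): {"C", "*2"}, (2, 1): {"*2"}})
```

  The row for identification number 1 becomes the equivalent of '[(1), (C, * 2)]', while the row for number 2 keeps the value '[(2), (* 2)]'.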

  This is the end of the description of FIG.

  The processing procedure shown in FIG. 9 is for a function call. Variable updates via an operator, such as an assignment statement, can be processed in almost the same procedure, because their partial parse tree is constructed in the same way as the partial parse tree for the sprintf function call (lines 15 to 26 in FIG. 5).

  For example, in the case of the pointer assignment 'p = q' (variables p and q are assumed to be of type char *), the dummy argument number of the left side is 1, that of the right side is 2, and the dummy argument transition list is '[2 → 1, → * 1, * 2 → * 2]'. After the assignment, the pointer has been copied and the indirect reference destination of the variable p is the same as that of the variable q. Therefore, the above list expresses that the pointer itself is copied ('2 → 1') and that the indirect reference destination of the variable p is cleared ('→ * 1'). When the pointer assignment 'a = b' for the char * pointer variables a and b is processed with the variable table 207 in the state shown in FIG. 11 before the assignment, the table is updated as shown in FIG. 12. The variable reference data set corresponding to the indirect reference destination of the variable a becomes an empty set. If the reference data of the indirect reference destination of the variable a is required in function call or assignment statement processing while the variable a is in this state, the variable identification number is extracted from the variable reference data set whose indirect reference is one level shallower than the empty variable reference data set (in FIG. 12, its left neighbor in the variable reference data set list). Then, the variable reference data set of the originally required indirect reference depth is read from the variable reference data set list of the variable identified by this number and used. If this is also an empty set, the above processing is repeated until a non-empty set is found.
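  The pointer assignment and the lookup through an empty set can be sketched as follows. The identification numbers 1 and 2 for a and b are assumed (the contents of FIG. 11 and FIG. 12 are not reproduced here), and the seen set that guards against cyclic references is an addition of this sketch, not something the text describes.

```python
def assign_pointer(var_table, dst_id, src_id):
    """Apply the transitions '2 -> 1' and '-> *1' for 'dst = src' (char * only)."""
    var_table[dst_id][0] = set(var_table[src_id][0])  # copy the pointer itself
    var_table[dst_id][1] = set()                      # clear the referenced value


def resolve(var_table, var_id, depth, seen=None):
    """Read a reference data set, following pointers when the set is empty."""
    seen = set() if seen is None else seen
    if var_id in seen:
        return set()  # cycle guard (assumption of this sketch)
    seen.add(var_id)
    refs = var_table[var_id][depth]
    if refs or depth == 0:
        return set(refs)
    result = set()
    # Take the identification numbers from the set one level shallower.
    for ref in var_table[var_id][depth - 1]:
        target = ref.lstrip("*")
        if target.isdigit():
            result |= resolve(var_table, int(target), depth, seen)
    return result


table = {1: [{"1"}, {"*1"}], 2: [{"2"}, {"*2"}]}  # a is 1, b is 2 (assumed)
assign_pointer(table, 1, 2)                        # a = b
pointee_of_a = resolve(table, 1, 1)
```

  After the assignment the row for a is the equivalent of '[(2), ()]', and resolving its indirect reference destination yields the set held for b.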

  Also, when the result of another function call or of an operation by an operator is used as an actual argument of a function call or as an operand of an operator, the result is regarded as the value of a temporary local variable and registered in the variable table 207. By regarding the node in the partial parse tree corresponding to the position where this result is returned as the node of the temporary local variable, the variable table 207 can be updated by applying the processing of the procedure 309.

  When the value of the variable is returned by the return statement, the variable reference data set list of the variable is merged with the variable reference data set list of the line registered as the variable name return in the variable table 207. At this time, the union of both variable reference data sets is taken for each same indirect reference depth, and the result is used as a new variable reference data set list of the variable name return. If the return statement returns a constant value, merge '(C)'.
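  The merge performed for a return statement is a union per indirect reference depth. In this sketch the name merge_return is hypothetical, both lists are assumed to have the same length, and the constant case is omitted.

```python
def merge_return(return_list, value_list):
    """Union two variable reference data set lists depth by depth."""
    return [current | returned for current, returned in zip(return_list, value_list)]


# Row named 'return' for a char * return value, initially empty sets.
return_row = [set(), set()]
return_row = merge_return(return_row, [{"3"}, {"*3", "IN"}])  # first return
return_row = merge_return(return_row, [{"4"}, {"*4"}])        # second return
```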

  Returning to FIG. 3, in step 310, the vulnerability detection unit 202 proceeds to step 311 if the search of the partial parse tree of the target function has ended, and returns to step 308 if it has not.

  In step 311, the vulnerability detection unit 202 updates the contents of the dynamic dummy argument transition DB 206 for the target function based on the contents of the variable table 207. For each dummy argument of the target function, a variable reference data set list of the dummy argument is acquired from the variable table 207 to generate a transition element between dummy arguments, and finally, these elements are connected to form a transition list between dummy arguments.

  Details of the processing in the procedure 311 will be described. First, the rows whose type column in the variable table 207 is 'A' (dummy argument) are selected, and all of their variable identification numbers are acquired. In the present embodiment, the target C language does not permit nested function definitions, so the only dummy arguments registered in the variable table 207 are those of the target function.

  The following processing is performed for each acquired variable identification number. First, the variable reference data set list in the row of the variable identification number being processed is extracted from the variable table 207. Then, a dummy argument transition element is generated for each variable reference data in the list. The transition source reference data is the variable reference data being processed, with its variable identification number converted into a dummy argument number by referring to the pairs of dummy argument number and variable identification number stored in step 308. Similarly, the transition destination reference data is formed by converting the variable identification number of the row being processed into a dummy argument number using the same pairs, and combining it with the indirect reference depth represented by the variable reference data set that contains the variable reference data being processed. When the variable reference data set is an empty set, the dummy argument transition element has no transition source reference data. If the variable reference data being processed contains the variable identification number of a local variable, 'PIN' representing pseudo external input data, or 'C' representing a constant value, it is ignored. If it contains a global variable, 'G' is added to the variable identification number and the result is included in the dummy argument transition element.

  All the dummy argument transition elements generated as described above are connected to form the new dummy argument transition list of the target function. The dynamic dummy argument transition DB 206 is searched using the name of the target function as a key, and the dummy argument transition list in the matching row is updated. Note that a dummy argument transition element whose transition source reference data and transition destination reference data contain the same dummy argument number and whose indirect reference depth is 0 is not connected to the dummy argument transition list.

  When the processing of procedure 311 is performed on the variable table 207 in FIG. 10, the rows whose type column is 'A' are those with variable identification numbers 1 and 2. First, the variable reference data set list '[(1), (C, * 2)]' is extracted for the row with variable identification number 1. The dummy argument transition element generated for the first variable reference data '1' is '1 → 1'. In the second variable reference data set, the constant 'C' is ignored, and the dummy argument transition element generated for the variable reference data '* 2' is '* 2 → * 1'. Similarly, for the row with variable identification number 2, the generated dummy argument transition elements are '2 → 2' and '* 2 → * 2', in this order. Taking into account the elements excluded from connection, the new dummy argument transition list formed by connecting these is '[* 2 → * 1, * 2 → * 2]'. As a result, the dynamic dummy argument transition DB 206 is updated as shown in FIG. 13.
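  The generation of the dummy argument transition list in procedure 311 can be sketched as follows. The name build_transition_list is hypothetical, global variables are left out, and keeping 'IN' as a transition source is an assumption of this sketch rather than something stated in the text.

```python
def build_transition_list(rows, id_to_arg):
    """Generate (source, destination) transition elements for a target function.

    rows      -- {variable identification number: list of sets} for type 'A' rows
    id_to_arg -- {variable identification number: dummy argument number}
    """
    elements = []
    for var_id, ref_list in rows.items():
        dst_no = id_to_arg[var_id]
        for depth, refs in enumerate(ref_list):
            dst = "*" * depth + str(dst_no)
            if not refs:
                elements.append(("", dst))  # empty set: no transition source
                continue
            for ref in sorted(refs):
                if ref == "IN":
                    elements.append(("IN", dst))  # assumption: 'IN' is kept
                    continue
                ident = ref.lstrip("*")
                stars = len(ref) - len(ident)
                if not ident.isdigit() or int(ident) not in id_to_arg:
                    continue  # 'C', 'PIN' and local variables are ignored
                src_no = id_to_arg[int(ident)]
                if src_no == dst_no and stars == 0 and depth == 0:
                    continue  # e.g. '1 -> 1' is not connected to the list
                elements.append(("*" * stars + str(src_no), dst))
    return elements


# Rows of type 'A' corresponding to FIG. 10 (strout is 1, strin is 2).
rows = {1: [{"1"}, {"C", "*2"}], 2: [{"2"}, {"*2"}]}
transitions = build_transition_list(rows, {1: 1, 2: 2})
```

  For the table of the example this yields the two elements corresponding to '[* 2 → * 1, * 2 → * 2]'.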

  In step 312, the vulnerability detection unit 202 deletes the registration data regarding the dummy argument and the local variable related to the target function from the variable table 207.

  Note that if, as a result of pointer assignment processing, the variable reference data set list registered in the variable table 207 for a global variable includes reference data containing the variable identification number of a local variable or a dummy argument, the row of that variable identification number is looked up and the reference data is replaced with the variable reference data set corresponding to the indirect reference depth of the reference data.

  In the description of the present embodiment, the scope of variables is not considered for simplicity. However, local variables and dummy arguments may be registered in the variable table 207 when they enter scope and deleted when they go out of scope. The parse tree 210 is traced in a predetermined order, and the scope of a variable starts when a variable definition or variable declaration node is encountered. The depth of this node in the parse tree 210 is recorded in the variable table 207, and the search of the parse tree 210 continues. When the depth of the node being searched becomes shallower than the depth registered in the variable table 207, the variable has gone out of scope, so the corresponding data may be deleted from the variable table 207.

  Thus, the process for one target function ends, and the process returns to step 306.

  When the procedures 306 to 312 are continued for the source code of FIG. 4, the dynamic formal argument transition DB 206 and the variable table 207 are updated as follows.

  FIG. 14 shows the variable table 207 at the end of the procedure 308 for the function get_input of the source code of FIG. 4.

  For the same function, the variable table 207 at the end of step 309 is updated to FIG. 15, and the dynamic dummy argument transition DB 206 remains as in FIG. 13. Here, for the call to the function gets, the variable reference data set list of the variable in registered in the variable table 207 of FIG. 14 is updated based on the dummy argument transition DB 204.

  For the same function, the dynamic dummy argument transition DB 206 at the end of the procedure 311 is updated to FIG. 16. Here, based on the variable table 207 in FIG. 15, the dummy argument transition list for the function get_input in the dynamic dummy argument transition DB 206 in FIG. 13 is updated.

  The variable table 207 at the end of the procedure 308 for the function put_input of the source code of FIG. 4 is FIG.

  For the same function, the variable table 207 at the end of the procedure 309 remains as in FIG.

  For the same function, the dynamic dummy argument transition DB 206 at the end of the procedure 311 remains as shown in FIG. 16.

  FIG. 18 shows the variable table 207 at the end of step 308 for the 18th line of the function main of the source code of FIG. 4. Here, according to the external input dummy argument set registered in the dynamic dummy argument transition DB 206 for the function main, external input data is given to the second dummy argument. Therefore, in the variable reference data set list registered in the variable table 207 for the second dummy argument argv, '(IN)' is set as the initial value of the variable reference data set corresponding to the indirect reference depth 2.

  For the 19th line of the function, the variable table 207 at the end of step 309 is updated to FIG. 19, and the dynamic dummy argument transition DB 206 remains as in FIG. 16. Here, for the call of the user-defined function get_input, the variable reference data set list of the variable inbuf registered in the variable table 207 in FIG. 18 is updated in accordance with the dynamic dummy argument transition DB 206 in FIG. 16.

  For the 20th line of the function, the variable table 207 at the end of step 309 is updated to FIG. 20, and the dynamic dummy argument transition DB 206 is also updated. Here, for the call of the user-defined function format_string, the variable reference data set list of the variable outbuf registered in the variable table 207 of FIG. 19 is updated according to the dynamic dummy argument transition DB 206 of FIG. 16. In addition, since 'IN' is included in the variable reference data set list of the variable inbuf registered in the variable table 207, the external input dummy argument set registered in the dynamic dummy argument transition DB 206 for the user-defined function format_string is updated.

  For the 21st line of the function, the variable table 207 at the end of step 309 remains as in FIG. 20, and the dynamic dummy argument transition DB 206 is updated to FIG. 22. Since 'IN' is included in the variable reference data set list of the variable outbuf registered in the variable table 207 in FIG. 20, the external input dummy argument set registered in the dynamic dummy argument transition DB 206 for the called user-defined function is updated.

  The procedure 311 is not performed for this function, because it is registered in the dummy argument transition DB 204.

  This completes the search of the parse tree 210 in FIG. 5. Since the current dynamic dummy argument transition DB 206 (FIG. 22) differs from its state at the previous execution of the procedure 304 (FIG. 7), the procedures 305 to 312 are performed again.

  FIG. 23 shows the variable table 207 at the end of the second procedure 308 for the function format_string of the source code in FIG. 4. Here, based on the external input dummy argument set registered for the function in the dynamic dummy argument transition DB 206 in FIG. 22, the pseudo external input data 'PIN' is included in the initial value of the variable reference data set list of the dummy argument strin registered in the variable table 207.

  For the same function, the variable table 207 at the end of the second step 309 is updated to FIG. 24. Here, for the call of the function sprintf, the variable reference data set list of the variable strout in the variable table 207 of FIG. 23 is updated as shown in FIG. 24 based on the dummy argument transition DB 204.

  For the same function, the dynamic dummy argument transition DB 206 at the end of the second procedure 311 remains as shown in FIG. 22. Unlike the first execution of the same procedure, 'PIN' is included in the variable reference data set list. However, since 'PIN' is not used to generate the dummy argument transition list, the dynamic dummy argument transition DB 206 in FIG. 22 is not updated.

  Regarding the function get_input of the source code in FIG. 4, the state of change in the variable table 207 when performing the second steps 306 to 312 is the same as that in the first time (FIG. 14). Therefore, the transition list between dummy arguments registered in the dynamic dummy argument transition DB 206 for the same function is not changed from the first one, and the dynamic dummy argument transition DB 206 of FIG. 22 is maintained as it is.

  FIG. 25 shows the variable table 207 at the end of the second execution of procedure 308 for the function put_input. Here, based on the external input dummy argument set registered for this function in the dynamic dummy argument transition DB 206 of FIG. 22, the pseudo external input data 'PIN' is included in the initial value of the variable reference data set list of the dummy argument out registered in the variable table 207.

  For the same function, the variable table 207 at the end of the second execution of procedure 309 remains as in FIG. 25, and the dynamic dummy argument transition DB 206 is likewise unchanged.

  For the same function, the dynamic dummy argument transition DB 206 also remains unchanged at the end of the second execution of procedure 311.

  Regarding the function main of the source code in FIG. 4, the state of the variable table 207 when the second steps 306 to 312 are performed is the same as that of the first time (FIGS. 18 to 21). Therefore, the transition list between dummy arguments registered in the dynamic dummy argument transition DB 206 for the same function is not changed from the first one, and the dynamic dummy argument transition DB 206 of FIG. 22 is maintained as it is.

  This completes the second search of the analysis tree 210. Because the current dynamic dummy argument transition DB 206 (FIG. 22) has not been updated since the previous execution of procedure 304, processing proceeds to procedure 313.

  In step 313, the vulnerability detection unit 202 reads the vulnerability DB 205.

  An example of the vulnerability DB 205 is shown in FIG. 26. The first column is the name of a function that can cause a vulnerability. The second column is the vulnerable dummy argument list, which lists, in reference data format, the dummy arguments that may cause a vulnerability when external input data is given to them; the numbers are dummy argument numbers.

  In step 314, the vulnerability detection unit 202 traces the analysis tree 210 in a predetermined order from the root node and outputs, as a vulnerability inspection result 209, each function call in which a variable whose reference data includes 'IN' or 'PIN' is passed to a dummy argument registered in the vulnerable dummy argument list of the vulnerability DB 205.

  In step 314, steps 305 to 312 are performed on the basis of the dynamic dummy argument transition DB 206 and the variable table 207 constructed so far. However, the procedures that update the dynamic dummy argument transition DB 206 (procedure 311, and procedure 908 in FIG. 9, which details the processing of procedure 309) are omitted. In addition, immediately before procedure 910, a vulnerability detection procedure is performed: the vulnerability DB 205 is consulted, and the location is pointed out when it matches a vulnerability detection rule.

  The vulnerability detection procedure is as follows. First, the function name column of the vulnerability DB 205 in FIG. 26 is searched using the name of the called function acquired in step 901 as a key. If no matching function is found, the vulnerability detection procedure ends. If one is found, the vulnerable dummy argument list of that row is read. Next, processing similar to steps 905 to 907 is performed, and for each dummy argument reference data item in the vulnerable dummy argument list, the reference data of the corresponding actual argument is read from the variable table 207. If this reference data includes 'IN', representing external input data, or 'PIN', representing pseudo external input data, the call may be vulnerable and is output as a vulnerability inspection result 209.
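  The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the dictionary layouts, the name check_call, and the representation of a literal argument as None are all assumptions, and the variadic form '*3...' is simplified to a single third argument.

```python
# Hypothetical, simplified model of the vulnerability DB 205:
# function name -> list of (dummy argument number, indirect reference depth).
# An empty list means every call is reported, as described for gets.
VULN_DB = {
    "sprintf": [(2, 1), (3, 1)],
    "gets": [],
    "puts": [(1, 1)],
}

# Marks for external input data and pseudo external input data.
TAINT_MARKS = {"IN", "PIN"}


def check_call(func_name, actual_args, variable_table):
    """Return True if the call should be output as an inspection result.

    actual_args: list of variable names, or None for literals (assumed model).
    variable_table: variable name -> {depth: set of reference data}.
    """
    if func_name not in VULN_DB:
        return False  # no matching row: the detection procedure ends
    weak_args = VULN_DB[func_name]
    if not weak_args:
        return True  # empty list: the call is always treated as vulnerable
    for arg_no, depth in weak_args:
        if arg_no > len(actual_args):
            continue
        var = actual_args[arg_no - 1]
        if var is None:
            continue  # a literal has no reference data
        refs = variable_table.get(var, {}).get(depth, set())
        if refs & TAINT_MARKS:
            return True
    return False


# Mirrors the sprintf example in the text: strin holds (*2, PIN) at depth 1.
table = {"strin": {1: {"*2", "PIN"}}, "strout": {1: set()}}
print(check_call("sprintf", ["strout", None, "strin"], table))
```

  Keeping the empty-list case separate from the "function not registered" case reflects the distinction the text draws between gets, which is always reported, and functions that are absent from the DB.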

  A case where the vulnerability detection procedure is performed on the source code of FIG. 4 will be described.

  For the call to the function sprintf on the third line of the source code of FIG. 4, the state of the variable table 207 immediately before the vulnerability detection procedure is executed is as shown in FIG. 23. First, the function name column of the vulnerability DB 205 in FIG. 26 is searched for the function name sprintf, and the vulnerable dummy argument list '[*2, *3...]' of the matching row is read. The actual argument corresponding to the dummy argument reference data '*2' in the list is the string literal "input =% s", so no reference data exists for it. The actual argument corresponding to the next dummy argument reference data '*3...' in the list is the variable strin, and the indirect reference depth of this dummy argument reference data is 1; the variable reference data set of the variable strin acquired from the variable table 207 of FIG. 23 is '(*2, PIN)'. Because this variable reference data set includes 'PIN', the call to the function sprintf may cause a vulnerability, and this is output as a vulnerability inspection result 209.

  Similarly, the vulnerability detection procedure is applied to the call to the function gets on the eighth line of the source code of FIG. 4. When the vulnerable dummy argument list registered in the vulnerability DB 205 is read for the function gets, '[]' is obtained. An empty list means that a call to the function is always vulnerable. Therefore, the call to the function gets may cause a vulnerability, and this is output as a vulnerability inspection result 209.

  Similarly, for the call to the function puts on the 13th line of the source code of FIG. 4, the state of the variable table 207 immediately before the vulnerability detection procedure is executed is as shown in FIG. 25. When the vulnerable dummy argument list registered in the vulnerability DB 205 is read for the function puts, '[*1]' is obtained. The actual argument corresponding to this reference data is the variable out, and the indirect reference depth of the reference data is 1. When the corresponding variable reference data set is read from the variable table 207 of FIG. 25, '(*1, PIN)' is obtained. Because this variable reference data set includes 'PIN', the call to the function puts may cause a vulnerability, and this is output as a vulnerability inspection result 209.

  This completes the description of the basic procedure in this embodiment.

  Next, processing when a conditional branch or loop exists in the inspection target source code 208 will be described below.

  In conditional branching by an if statement or a switch statement, one of branch processing A, branch processing B, branch processing C,... is executed depending on the evaluation result of the conditional expression. FIG. 27 shows an example of a source code 208 to be inspected by an if statement, and FIG. 28 shows an example of an analysis tree 210 generated by the syntax analysis unit 201 from this source code. Processing in the case of an if statement will be described using these figures.

  When, in step 309, the vulnerability detection unit 202 traces the analysis tree 210 of FIG. 28, constructed from the inspection target source code 208 of FIG. 27, in the predetermined order and reaches the conditional branch node, it performs the following processing on the partial analysis tree whose vertex is this conditional branch node. First, the partial analysis tree is traced in the predetermined order, and procedure 309 is applied to the first expression node (the fourth line in FIG. 28). Next, a copy of the dynamic dummy argument transition DB 206 and the variable table 207 as they stand at this point is created. The analysis tree 210 is then traced further, and steps 308 to 310 are applied to the partial analysis tree whose vertex is the second expression node that appears next (the fifth line in FIG. 28), updating the original (replication source) dynamic dummy argument transition DB 206 and variable table 207. When the trace reaches the ELSE node, steps 308 to 310 are applied to the partial analysis tree whose vertex is the third expression node that appears next (the seventh line in FIG. 28), this time updating the copied dynamic dummy argument transition DB 206 and variable table 207. When the partial analysis tree whose vertex is the third expression node has been fully traced, the replication source dynamic dummy argument transition DB 206 and variable table 207 are merged with their copies.

  In the merge of the dynamic dummy argument transition DB 206, for each row having the same function name, the dummy argument transition lists and the external input dummy argument sets are concatenated, omitting duplicate elements. In the merge of the variable table 207, for each row having the same variable identification number, the variable reference data sets having the same indirect reference depth in the variable reference data set list are concatenated, omitting duplicate elements.
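  The merge of the variable table 207 might look like the following sketch. The table layout (variable identification number mapped to a dictionary from indirect reference depth to a list of reference data) and the function name merge_variable_tables are assumptions made for illustration only.

```python
def merge_variable_tables(source, copy):
    """Merge two variable tables after a conditional branch.

    Both tables map: variable id -> {indirect reference depth: [reference data]}.
    Lists with the same id and depth are concatenated, skipping duplicates.
    """
    merged = {}
    # Keep the order of the replication source, then ids found only in the copy.
    var_ids = list(source) + [v for v in copy if v not in source]
    for var_id in var_ids:
        depths = {}
        for table in (source, copy):
            for depth, refs in table.get(var_id, {}).items():
                bucket = depths.setdefault(depth, [])
                for ref in refs:
                    if ref not in bucket:
                        bucket.append(ref)
        merged[var_id] = depths
    return merged


then_branch = {1: {1: ["IN"]}}
else_branch = {1: {1: ["IN", "*2"], 2: ["PIN"]}, 2: {1: ["*1"]}}
print(merge_variable_tables(then_branch, else_branch))
```

  Because the result is a union over both branches, a variable is treated as possibly holding external input if either branch could have assigned it, which is a conservative choice for a static inspection.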

  In the case of a switch statement, the vulnerability detection unit 202 applies procedure 309 to the conditional expression, then prepares a copy of the dynamic dummy argument transition DB 206 and the variable table 207 for each branch, applies procedure 309 to each branch to update its copy, and merges the results to form the new dynamic dummy argument transition DB 206 and variable table 207.

  In the case of loops written with for, while, and do-while statements, the vulnerability detection unit 202 applies procedure 309 exactly once to the conditional expression (including the initialization and update expressions in the case of a for statement) and to the loop body, and updates the dynamic dummy argument transition DB 206 and the variable table 207.

  In the first embodiment as described so far, only locations that directly call a function registered in the vulnerability DB 205 are detected as vulnerabilities. However, locations that call a user-defined function containing such a call can also be pointed out as vulnerabilities by partially modifying the procedure of the first embodiment.

  One method is to store each such user-defined function together with the dummy arguments that may receive external input data as a pair, and to detect the locations where the function is used by the same processing as in step 314.

  Alternatively, for each dummy argument of a user-defined function, the reference data of the actual arguments at the locations where the user-defined function is called is classified by type (external input data 'IN', pseudo external input data 'PIN', and so on) and stored in the dynamic dummy argument transition DB 206. When step 304 finds no further update to this DB, each dummy argument that receives no reference data of any other class, and whose received external input data is used at a location detected as a vulnerability, is stored as a pair of function name and dummy argument, and the locations where it is used are detected by the same processing as in step 314.

  This is the end of the description of the first embodiment.

  Next, a second embodiment will be described. Since the configuration of the source code vulnerability inspection apparatus of the second embodiment is the same as that of FIG. 1 described in the first embodiment, the description of the configuration is omitted. However, the notation method of values held by each variable and dummy argument is different from that of the first embodiment.

  FIG. 29 is a functional configuration diagram of the source code vulnerability inspection apparatus according to the second embodiment. The apparatus of the present embodiment includes a source code inspection unit 201, a fixed function transition DB 2902, a vulnerable transition destination DB 2903, an indefinite function transition DB 2904, and a variable DB 2905. The source code inspection unit 201 includes a syntax analysis unit 2911 that analyzes the syntax of the source code, and a vulnerability detection unit 2912 that detects vulnerable locations.

  FIG. 30 shows an example of the vulnerable transition destination DB 2903 registered in advance. The first column is the vulnerability transition destination, and the second column is the vulnerability avoidance value.

  FIG. 31 shows an example of the fixed function transition DB 2902. The fixed function transition DB 2902 is a database in which the data transitions of standard library functions and the like are registered in advance. The first column is a function name, and the second column is a list of data transitions indicating which value moves where. A data transition is written with an arrow '→': the left side of '→' is the transition source and the right side is the transition destination. The transition source is set to one of the following, or a combination of them: an argument number, which is registered as a positive integer in the variable DB 2905 described later; IN, which represents data derived from externally input data; and a vulnerability avoidance value, which indicates that processing to avoid a vulnerability has been applied. The transition destination is set to an argument number, the variable number of a global variable, R representing the return value, or a vulnerable transition destination, that is, a destination that may cause a vulnerability.
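  As a small illustration of this notation, the sketch below parses a transition written as text, such as '1, SAFE_OS → 1', into a source list and a destination. The textual encoding and the name parse_transition are assumptions; the description does not say how the DB stores transitions internally.

```python
def parse_transition(text):
    """Split 'source[, source...] → destination' into (sources, destination).

    Integer tokens become int (argument or variable numbers);
    symbolic tokens such as IN, R, OSCOMM or SAFE_OS stay as strings.
    """
    def atom(token):
        token = token.strip()
        try:
            return int(token)
        except ValueError:
            return token

    left, right = text.split("→")
    return [atom(t) for t in left.split(",")], atom(right)


print(parse_transition("IN → 1"))
print(parse_transition("1, SAFE_OS → 1"))
print(parse_transition("1 → OSCOMM"))
```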

  FIG. 32 is an example of the indefinite function transition DB 2904. In the indefinite function transition DB 2904, data transition information about functions defined by the programmer is registered during source code analysis. It has the same structure as the fixed function transition DB 2902, with the addition of an analyzed flag that indicates whether analysis of the function has been completed. The analyzed flag takes the value 0 or 1: a function whose analyzed flag is 0 has not yet completed the analysis of steps 3406 to 3412 described below, while a function whose flag is 1 has completed it. The initial value is 0. Hereinafter, the state of a function whose analysis has not been completed is referred to as unanalyzed. In addition to the values that can be set in the fixed function transition DB 2902, the values that can be set as a transition destination in the indefinite function transition DB 2904 include the variable numbers of global variables, which are registered as negative integers in the variable DB 2905 described later. The transition value is given the initial value DEF at the end of procedure 3401. This value is updated in procedure 3406 or procedure 3408 described later.

  FIG. 33 is an example of the variable DB 2905. The first column is the variable number that identifies a variable registered in the variable DB 2905. For a global variable, the variable number is a negative integer assigned in descending order (-1, -2, and so on). For a dummy argument, it is a positive integer equal to the argument number. Local variables are numbered in ascending order following the variable numbers of the dummy arguments already registered; if no positive integer has been registered, numbering starts from 1. The second column indicates the type of the variable: 'A' for a dummy argument, 'L' for a local variable, and 'G' for a global variable. The third column is the variable name. The fourth column is the value held by the variable; its initial value is the same as the variable number. If the value is a constant, C is used.
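  The numbering rules above can be sketched as follows. The class name VariableDB, its methods, and the example variable names are hypothetical; only the numbering scheme and the four columns come from the description.

```python
class VariableDB:
    """Toy model of the variable DB: rows of [number, type, name, value]."""

    def __init__(self):
        self.rows = []

    def _add(self, number, kind, name):
        # The initial value of a variable is its own variable number.
        self.rows.append([number, kind, name, number])
        return number

    def add_global(self, name):
        # Globals get negative numbers in descending order: -1, -2, ...
        negatives = [row[0] for row in self.rows if row[0] < 0]
        return self._add(min(negatives, default=0) - 1, "G", name)

    def add_argument(self, arg_no, name):
        # Dummy arguments use their argument number as the variable number.
        return self._add(arg_no, "A", name)

    def add_local(self, name):
        # Locals continue after the largest positive number, or start at 1.
        positives = [row[0] for row in self.rows if row[0] > 0]
        return self._add(max(positives, default=0) + 1, "L", name)


db = VariableDB()
db.add_global("g_path")      # hypothetical names throughout
db.add_argument(1, "buf")
db.add_local("tmp")
print(db.rows)
```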

  FIG. 34 shows the processing procedure of the source code vulnerability inspection apparatus of this embodiment. In the following processing, the syntax analysis unit 2911 analyzes the syntax in the source code such as function definition, assignment, and function call.

  When the source code checking program is started, all the given source codes are traced, and the found programmer-defined function is registered in the indefinite function transition DB 2904 and the global variable is registered in the variable DB 2905, respectively. Variable numbers of global variables are assigned negative integer values in descending order from -1 (procedure 3401).

  Next, the beginning of the source code is set as the search start position (procedure 3402). When inspecting a source code composed of a plurality of files, the order of the files to be inspected is determined, and the head thereof is set as a search start position. Next, an unanalyzed function definition is searched from the source code (step 3403). If there is no unanalyzed function, the process proceeds to procedure 3413 described later (procedure 3403). If the corresponding function is found, next, it is searched in the definition of the corresponding function whether an unanalyzed function call has been made (step 3404). If an unanalyzed function call has been made, the search start position is moved to the next function definition position in the source code (step 3405), and the procedure returns to step 3403. If an unanalyzed function call has not been made, this function is set as a function to be processed, and analysis is started (procedure 3406).
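  The effect of procedures 3403 to 3406 is that a function is analyzed only after every programmer-defined function it calls has been analyzed. The sketch below reproduces that ordering under two assumptions that the text does not state: the search wraps around to the top of the source code after reaching the end, and the loop stops if a full pass makes no progress (for example, with mutually recursive functions). The name analysis_order and the input format are hypothetical.

```python
def analysis_order(definitions):
    """Return the order in which functions become the target function.

    definitions: list of (function name, [called function names]) in source order.
    Calls to functions not defined in the source (library functions) never block.
    """
    user_defined = {name for name, _ in definitions}
    analyzed = []
    progress = True
    while progress and len(analyzed) < len(definitions):
        progress = False
        for name, callees in definitions:
            if name in analyzed:
                continue
            blocked = any(
                c in user_defined and c not in analyzed and c != name
                for c in callees
            )
            if blocked:
                continue  # move on to the next function definition
            analyzed.append(name)  # this function becomes the target function
            progress = True
    return analyzed


# Call structure modeled on the walk-through of the inspection source code.
program = [
    ("main", ["call_do_system", "do_gets", "strlcpy", "sanitizeOs"]),
    ("do_gets", ["gets"]),
    ("do_system", ["system"]),
    ("call_do_system", ["do_system"]),
]
print(analysis_order(program))
```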

  First, if there is a formal argument of the target function, it is registered in the variable DB 2905 (step 3407). At this time, the variable number of the dummy argument is registered so as to match the argument number. If local variables defined in the target function are found, they are registered in the variable DB 2905 (procedure 3408). Variable numbers are given in ascending order as positive integers.

  Further, if there is a function call location, the fixed function transition DB 2902 and the indefinite function transition DB 2904 are searched by function name, and the transition value of the corresponding function is acquired. If the transition value is DEF, go to step 3410. If not, the following processing is performed. If the called function is a function with arguments, get the value of the actual argument. When the transition source of the transition associated with the function call is a positive integer (argument number), the following processing is performed after correcting the transition source to an actual argument value corresponding to the argument number. When the transition source is a negative integer (a variable number of a global variable), the corresponding variable value is referenced from the variable DB 2905 and corrected to the value, and then the following processing is performed. In other cases, the following processing is performed without doing anything.

  The following processing is performed according to the transition destination of each data transition (procedure 3409). First, if the transition destination is a variable number, the value of the corresponding variable in the variable DB 2905 is updated. If the transition destination matches one of the vulnerable transition destinations in the vulnerable transition destination DB 2903 and the transition source is the variable number of a variable of type 'A' or 'G', the transition is added to the indefinite function transition DB 2904. If the transition destination matches one of the vulnerable transition destinations in the vulnerable transition destination DB 2903, the transition source includes IN, and the transition source does not include the vulnerability avoidance value corresponding to that vulnerable transition destination, the transition is added to the indefinite function transition DB 2904 and, at the same time, a warning for the corresponding location in the source code is output as the inspection result.
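  The three cases of procedure 3409 can be summarized in a short decision function. This is a sketch under assumptions: the function name handle_transition, the action labels, and the set-based representation of a corrected transition source are illustrative, and a transition to the return value R is not modeled. The pair OSCOMM and SAFE_OS is taken from the walk-through later in the description.

```python
# Vulnerable transition destination -> its vulnerability avoidance value.
VULNERABLE_DESTINATIONS = {"OSCOMM": "SAFE_OS"}


def handle_transition(source, destination, variable_kinds):
    """Decide what procedure 3409 does with one corrected data transition.

    source: set of values; ints are variable numbers, strings are marks.
    variable_kinds: variable number -> 'A', 'L' or 'G'.
    Returns a list drawn from 'update', 'register', 'warn'.
    """
    if isinstance(destination, int):
        return ["update"]  # destination is a variable: update the variable DB
    avoidance = VULNERABLE_DESTINATIONS.get(destination)
    if avoidance is None:
        return []
    from_outside_scope = any(
        isinstance(v, int) and variable_kinds.get(v) in ("A", "G")
        for v in source
    )
    tainted = "IN" in source and avoidance not in source
    actions = []
    if from_outside_scope or tainted:
        actions.append("register")  # add to the indefinite function transition DB
    if tainted:
        actions.append("warn")      # report the location in the source code
    return actions


print(handle_transition({1}, "OSCOMM", {1: "A"}))        # argument reaches system
print(handle_transition({"IN"}, "OSCOMM", {}))           # external data, unsanitized
print(handle_transition({"IN", "SAFE_OS"}, "OSCOMM", {}))  # sanitized external data
```

  Registering a transition whose source is a dummy argument or global variable, rather than warning immediately, is what lets the decision be deferred to the caller, where the actual origin of the data is known.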

  The assignment statement is processed in the same manner as the function call as data transition from the right side to the left side.

  Steps 3408 and 3409 are repeated until the processing of the target function is completed. When the processing of the target function is finished, the analyzed flag is set to 1 and the procedure proceeds to procedure 3411 (procedure 3410).

  Next, the rows of the variable DB 2905 whose type is 'A' (dummy argument) or 'G' (global variable) are selected, and their values are acquired. If the value is IN, or is the variable number of a different variable whose type is 'A' or 'G', a transition whose source is that IN or variable number and whose destination is the variable number of the selected row is registered in the indefinite function transition DB 2904 (step 3411).

  The registration data for dummy arguments and local variables is then deleted from the variable DB 2905, and the values of the global variables are initialized (procedure 3412). After the processing for the target function is completed, processing returns to step 3403.

  In step 3413, it is checked whether, compared with the previous analysis, a new transition value whose transition destination is a global variable has been registered in the indefinite function transition DB 2904. If such a transition value has been registered, all analyzed flags in the indefinite function transition DB 2904 are reset to 0, and processing returns to procedure 3403. Otherwise, the processing ends.

  Transitions of pointer variables are not described here because the inspection source code described later contains none. They can be supported by adding a reference depth and a reference value to the variable DB 2905, adding this reference value to the values that can be set as transition values in the fixed function transition DB 2902 and the indefinite function transition DB 2904, and, when the transition source in step 3409 is the variable number of a pointer variable, acquiring its value from the variable list and correcting the transition source accordingly.

  This is the end of the description of the processing procedure.

  FIG. 46 shows inspection source code for explaining the source code vulnerability inspection apparatus of this embodiment. It is described in C language, and the line number is described on the leftmost for convenience of explanation. The definition of the sanitizeOs function is omitted, but including it does not change the following results.

  First, the case where the conventional pattern matching method is applied to the source code of FIG. 46 will be described. Inspection by the pattern matching method detects every use of a pre-registered library function that may cause a vulnerability, without determining whether the data passed to the function is derived from the outside. A warning is therefore output for the system function, a library function that can cause OS command injection. However, the system function is not called directly from the main function but through the call_do_system function, and of the three locations from which the system function is ultimately executed (the 5th, 8th, and 12th lines), only the 8th line is actually vulnerable. On the fifth line, the data handled does not come from outside. On the twelfth line, the data has already been subjected to appropriate vulnerability avoidance processing. After a vulnerability inspection by the pattern matching method, the programmer therefore has to judge by eye whether each warned location is actually vulnerable.

  Furthermore, the vulnerability avoidance processing that the programmer adds in response to a warning must be placed at the location that has been judged by eye to be actually vulnerable. If vulnerability avoidance processing is applied to the data handled immediately before the warned 22nd line, it is applied to data that does not need it (the fifth line) and is applied a second time to data that has already been processed (the twelfth line), so unexpected behavior may occur at these two locations.

  Next, the processing performed when the procedure of FIG. 34 is applied to the source code of FIG. 46 will be described. FIG. 32 shows an example of the indefinite function transition DB 2904 at the end of procedure 3401, in which the functions defined in the inspection target source code are registered. No global variable is defined in the inspection target source code.

  After procedures 3401 and 3402, an unanalyzed function is searched for from the top of the source code, and the main function is found (procedure 3403). The definition of the main function is then checked for calls to unanalyzed functions (procedure 3404). The call_do_system function is called on the fifth line, and the indefinite function transition DB 2904 shows that its analyzed flag is 0 (procedure 3404). That is, because an unanalyzed function is called, the search start position is moved to the next function definition (procedure 3405).

  Next, it can be seen that the do_gets function is unanalyzed and no unanalyzed function call is made in the definition (steps 3403 and 3404). Therefore, do_gets is set as a function to be analyzed (target function) (procedure 3406). Hereinafter, the order of the functions that are the target functions in the procedure 3406 is do_gets, do_system, call_do_system, and main.

  First, the dummy arguments and local variables of the target function do_gets are registered in the variable DB 2905 (procedures 3407 and 3408). FIG. 33 shows the variable DB 2905 at the end of step 3408 for do_gets. Data transition processing is then performed for the call to the function gets on the 17th line (procedure 3409). The transition value of gets is searched for in the fixed function transition DB 2902 and the indefinite function transition DB 2904, and the transition IN → 1 is acquired. Because the transition destination is a variable number, the value of the corresponding variable in the variable DB 2905 is updated. This completes the processing of the target function do_gets (procedure 3410). FIG. 35 shows the variable DB 2905 at the end of procedure 3410.

  Since the variable value of type “A” in the variable DB 2905 is IN, the corresponding transition IN → 1 is registered in the indefinite function transition DB 2904 (procedure 3411). An indefinite function transition DB 2904 at the end of the procedure 3411 is shown in FIG.

  Next, processing with do_system as the target function will be described. In the same manner as above, the dummy arguments and local variables are registered in the variable DB 2905 (procedures 3407 and 3408). FIG. 37 shows the variable DB 2905 at the end of step 3408. Data transition processing is then performed for the call to the function system on the 22nd line (procedure 3409). The transition value of system is searched for in the fixed function transition DB 2902 and the indefinite function transition DB 2904, and the transition 1 → OSCOMM is acquired. The value of the actual argument for the first argument is 1, and the transition source is corrected to this value. The transition destination OSCOMM is registered as a vulnerable transition destination in the vulnerable transition destination DB 2903. Because the transition source is the variable number of a variable of type 'A', the transition is registered in the indefinite function transition DB 2904 (step 3409). This completes the processing of the target function do_system (procedure 3410). FIG. 38 shows the indefinite function transition DB 2904 at the end of procedure 3410.

  Next, processing using call_do_system as a target function will be described. In the same manner as described above, dummy arguments and local variables are registered in the variable DB 2905 (procedures 3407 and 3408). FIG. 39 shows the variable DB 2905 at the end of the procedure 3408. Data transition processing is performed by calling the function do_system on the 27th line (procedure 3409). The transition value of do_system is retrieved from the fixed function transition DB 2902 and the indefinite function transition DB 2904. Then, the transition of 1 → OSCOMM is acquired. The actual argument value of the first argument is 1, and the transition source is corrected to this value. This transition destination OSCOMM is registered in the vulnerability transition destination of the vulnerability transition destination DB 2903. Since the transition source is the variable number of the variable type “A”, the transition is registered in the indefinite function transition DB 2904 (step 3409). This completes the processing of the target function call_do_system (procedure 3410). FIG. 40 shows the indefinite function transition DB 2904 at the end of the procedure 3410.

  Next, processing using main as a target function will be described. In the same manner as described above, dummy arguments and local variables are registered in the variable DB 2905 (procedures 3407 and 3408). FIG. 41 shows the variable DB 2905 at the end of the procedure 3408. Data transition processing is performed by calling the function call_do_system on the fifth line (procedure 3409). That is, first, the transition value of do_system is retrieved from the fixed function transition DB 2902 and the indefinite function transition DB 2904. Then, the transition of 1 → OSCOMM is acquired. The actual argument value of the first argument is C representing a constant, and the transition source is corrected to this value. This transition destination OSCOMM is registered in the vulnerability transition destination of the vulnerability transition destination DB 2903. However, since the transition source is the constant value C, nothing is performed (procedure 3409).

  Next, data transition processing is performed by calling the function get_input on the seventh line (procedure 3409). Process in the same way as the 17th line. FIG. 42 shows the variable DB 2905 at the end of the procedure 3409.

  Next, data transition processing is performed by calling the function call_do_system on the eighth line (procedure 3409). Similarly, a transition of 1 → OSCOMM is acquired. The actual argument value of the first argument is IN, and the transition source is corrected to this value. This transition destination OSCOMM is registered in the vulnerability transition destination of the vulnerability transition destination DB 2903. Since the transition source is IN, it is registered in the indefinite function transition DB 2904, and a warning that this part is vulnerable is output as the inspection result.

  Next, data transition processing is performed by calling the function strlcpy on the 10th line (procedure 3409). Similarly, a transition of 2 → 1 is acquired. The actual argument value of the second argument is IN, and the transition source is corrected to this value. Since this transition destination is a variable number, the variable value of the corresponding variable in the variable DB 2905 is updated. FIG. 43 shows the variable DB 2905 at the end of the procedure 3409.

  Next, data transition processing is performed by calling the function sanitizeOs on the eleventh line (step 3409). Similarly, a transition of 1, SAFE_OS → 1 is acquired. The actual argument value of the first argument is IN, and 1 of the transition source is corrected to this value. Since this transition destination is a variable number, the variable value of the corresponding variable in the variable DB 2905 is updated. FIG. 44 shows the variable DB 2905 at the end of the procedure 3409.

  Next, data transition processing is performed for the call to the function call_do_system on the 12th line (procedure 3409). Similarly, the transition 1 → OSCOMM is acquired. The actual argument value of the first argument is IN, SAFE_OS, so the transition source is replaced with this value. The transition destination OSCOMM is registered as a vulnerable transition destination in the vulnerable transition destination DB 2903. Although IN is included in the transition source, the vulnerability avoidance value SAFE_OS registered in the vulnerable transition destination DB 2903 is also included, so no warning is output.
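  The walk-through of main above can be approximated by the following minimal Python sketch. It is an illustration under stated assumptions, not the device's actual data structures: the variable names buf and cmd, the call list MAIN_CALLS, the get_input transition, and the helper run_calls are all hypothetical, and the databases 2902 to 2905 are collapsed into plain dictionaries.

```python
# Hypothetical, simplified model of the walk-through; names and layout are
# illustrative assumptions, not the patent's actual databases.
IN, C = "IN", "C"                    # external input marker / constant marker
AVOIDANCE = {"OSCOMM": "SAFE_OS"}    # vulnerable destination -> avoidance value

# Transitions per function as (sources, destination). An int is a 1-based
# argument position; a str is a fixed value such as IN or OSCOMM.
TRANSITIONS = {
    "get_input":      [((IN,), 1)],            # assumed: external input -> arg 1
    "call_do_system": [((1,), "OSCOMM")],      # 1 -> OSCOMM
    "strlcpy":        [((2,), 1)],             # 2 -> 1
    "sanitizeOs":     [((1, "SAFE_OS"), 1)],   # 1, SAFE_OS -> 1
}

# Assumed shape of main: (line number, called function, actual arguments).
MAIN_CALLS = [
    (5,  "call_do_system", ['"ls"']),          # constant argument
    (7,  "get_input",      ["buf"]),
    (8,  "call_do_system", ["buf"]),
    (10, "strlcpy",        ["cmd", "buf"]),
    (11, "sanitizeOs",     ["cmd"]),
    (12, "call_do_system", ["cmd"]),
]

def run_calls(calls):
    """Apply each call's transitions to a variable table; return warned lines."""
    variables = {}   # variable name -> set of value markers
    warnings = []
    for line, func, args in calls:
        for sources, dest in TRANSITIONS[func]:
            values = set()
            for src in sources:
                if isinstance(src, int):
                    # Replace the argument position with the actual argument's
                    # values; names not in the table are treated as constants.
                    values |= variables.get(args[src - 1], {C})
                else:
                    values.add(src)
            if isinstance(dest, int):
                # Destination is a variable: update its recorded values.
                variables[args[dest - 1]] = values
            elif IN in values and AVOIDANCE[dest] not in values:
                # External input reaches a vulnerable destination unsanitized.
                warnings.append(line)
    return warnings

print(run_calls(MAIN_CALLS))
```

  Under these assumptions only the call on the eighth line is reported: the fifth line passes a constant, and the 12th line carries SAFE_OS alongside IN.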

  This completes the processing of the target function main (procedure 3410). The indefinite function transition DB 2904 at the end of the procedure 3410 is shown in FIG.

  Since the corresponding function is not found in the unanalyzed function search (procedure 3403), the procedure proceeds to procedure 3413. Since the transition value registered in the indefinite function transition DB 2904 does not have a transition registered with the variable number of the global variable as the transition destination, the inspection is terminated.
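  The unanalyzed function search can be pictured with the short Python sketch below. It assumes, in line with claim 6, that a function becomes eligible once everything it calls is either a library function or an already analyzed function. The call graph, the library set, and the helper names next_function and analysis_order are hypothetical and only loosely modeled on the function names in this embodiment.

```python
# Hypothetical call graph; which functions call which is an assumption.
CALL_GRAPH = {
    "main":           ["call_do_system", "get_input", "strlcpy", "sanitizeOs"],
    "call_do_system": ["do_system"],
    "do_gets":        ["gets"],
    "do_system":      ["system"],
}
# Assumed set of functions whose transitions are registered in advance.
LIBRARY = {"gets", "system", "get_input", "strlcpy", "sanitizeOs"}

def next_function(call_graph, library, analyzed):
    """Return an unanalyzed function whose callees are all library functions
    or already analyzed functions, or None when no such function exists."""
    for name, callees in call_graph.items():
        if name in analyzed:
            continue
        if all(c in library or c in analyzed for c in callees):
            return name
    return None

def analysis_order(call_graph, library):
    """Repeat the search until no eligible function is found."""
    analyzed = []
    while (name := next_function(call_graph, library, analyzed)) is not None:
        analyzed.append(name)
    return analyzed

print(analysis_order(CALL_GRAPH, LIBRARY))
```

  When next_function returns None, the search has found no corresponding function, which corresponds to the point at which the inspection moves on to its termination check.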

  As described above, when this source code vulnerability inspection device is applied to the source code of FIG. 46, a warning is issued that the eighth line is vulnerable. The programmer can therefore see that vulnerability avoidance processing should be applied to the data handled immediately before the eighth line. Compared with the conventional pattern matching method, false detections are avoided and the location where the programmer should perform vulnerability avoidance processing can be easily identified. Moreover, it is not necessary to write annotations for inspection in the source code.

  Although not described in this embodiment, a warning can also be issued when vulnerability avoidance processing is performed twice on the same variable. When the variable value in the variable DB 2905 is updated in the procedure 3409, a warning may be issued at that location if the newly updated value contains the same vulnerability avoidance value from the vulnerable transition destination DB 2903 two or more times.
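  A minimal Python sketch of this double-processing check follows. It assumes that a variable value is kept as a list of markers, so that repeated avoidance values are preserved rather than merged; the function name repeated_avoidance is hypothetical.

```python
from collections import Counter

def repeated_avoidance(values, avoidance_values):
    """Return the avoidance values that occur two or more times in a
    variable's value list (assumed to be a list, so duplicates are kept)."""
    counts = Counter(v for v in values if v in avoidance_values)
    return sorted(v for v, n in counts.items() if n >= 2)

# Sanitized twice: SAFE_OS appears twice, so a warning would be issued.
print(repeated_avoidance(["IN", "SAFE_OS", "SAFE_OS"], {"SAFE_OS"}))
# Sanitized once: nothing to report.
print(repeated_avoidance(["IN", "SAFE_OS"], {"SAFE_OS"}))
```

  Note that a set-based representation of variable values would silently merge the duplicates, so this check only works if the update step records each applied avoidance value separately.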

  Further, when the transition destination is the variable number of a variable of type "A" or "G" in the variable DB 2905, the procedure 3409 of the flowchart first reflects the transition in the variable DB 2905, and the procedure 3411 then registers it in the indefinite function transition DB. Instead, the procedure 3411 may be eliminated and the transition may be registered directly in the indefinite function transition DB.

  This is the end of the description of the second embodiment.

FIG. 1 is a schematic diagram of the configuration of the source code vulnerability inspection device according to the first embodiment.
FIG. 2 is an overview of the functions of the source code vulnerability inspection device of the first embodiment.
FIG. 3 is a diagram showing the processing procedure in the syntax analysis means and the vulnerability detection means.
FIG. 4 is a diagram showing an example of inspection target source code.
FIG. 5 is a diagram showing an example of a parse tree.
FIG. 6 is a diagram showing an example of the transition DB between dummy arguments.
FIG. 7 is a diagram showing an example of the transition DB between dynamic dummy arguments (at the end of procedure 303).
FIG. 8 is a diagram showing an example of the variable table (at the end of procedure 308 for the function format_string).
FIG. 9 is a diagram showing the variable table update processing procedure when a function call part is analyzed.
FIG. 10 is a diagram showing an example of the variable table (at the end of procedure 309 for the function format_string).
FIG. 11 is a diagram showing an example of the variable table (before processing of the pointer assignment 'a = b').
FIG. 12 is a diagram showing an example of the variable table (after processing of the pointer assignment 'a = b').
FIG. 13 is a diagram showing an example of the transition DB between dynamic dummy arguments (at the end of procedure 311 for the function format_string).
FIG. 14 is a diagram showing an example of the variable table (at the end of procedure 308 for the function get_input).
FIG. 15 is a diagram showing an example of the variable table (at the end of procedure 309 for the function get_input).
FIG. 16 is a diagram showing an example of the transition DB between dynamic dummy arguments (at the end of procedure 311 for the function get_input).
FIG. 17 is a diagram showing an example of the variable table (at the end of procedure 308 for the function put_input).
FIG. 18 is a diagram showing an example of the variable table (at the end of procedure 308 for the 18th line of the function main).
FIG. 19 is a diagram showing an example of the variable table (at the end of procedure 309 for the 19th line of the function main).
FIG. 20 is a diagram showing an example of the variable table (at the end of procedure 309 for the 20th line of the function main).
FIG. 21 is a diagram showing an example of the transition DB between dynamic dummy arguments (at the end of procedure 309 for the 20th line of the function main).
FIG. 22 is a diagram showing an example of the transition DB between dynamic dummy arguments (at the end of procedure 309 for the 21st line of the function main).
FIG. 23 is a diagram showing an example of the variable table (at the end of the second procedure 308 for the function format_string).
FIG. 24 is a diagram showing an example of the variable table (at the end of the second procedure 309 for the function format_string).
FIG. 25 is a diagram showing an example of the variable table (at the end of the second procedure 308 for the function put_input).
FIG. 26 is a diagram showing an example of the vulnerability DB.
FIG. 27 is a diagram showing an example of inspection target source code including an if statement.
FIG. 28 is a diagram showing an example of a parse tree.
FIG. 29 is an overview of the functions of the source code vulnerability inspection device of the second embodiment.
FIG. 30 is a diagram showing an example of the vulnerable transition destination DB.
FIG. 31 is a diagram showing an example of the fixed function transition DB.
FIG. 32 is a diagram showing an example of the indefinite function transition DB (at the end of procedure 3401).
FIG. 33 is a diagram showing an example of the variable DB (at the end of procedure 3408 for the function do_gets).
FIG. 34 is a flowchart showing the operation of the source code vulnerability inspection of this embodiment.
FIG. 35 is a diagram showing an example of the variable DB (at the end of procedure 3410 for the function do_gets).
FIG. 36 is a diagram showing an example of the indefinite function transition DB (at the end of procedure 3411 for the function do_gets).
FIG. 37 is a diagram showing an example of the variable DB (at the end of procedure 3408 for the function do_system).
FIG. 38 is a diagram showing an example of the indefinite function transition DB (at the end of procedure 3410 for the function do_system).
FIG. 39 is a diagram showing an example of the variable DB (at the end of procedure 3408 for the function call_do_system).
FIG. 40 is a diagram showing an example of the indefinite function transition DB (at the end of procedure 3411 for the function call_do_system).
FIG. 41 is a diagram showing an example of the variable DB (at the end of procedure 3408 for the function main).
FIG. 42 is a diagram showing an example of the variable DB (at the end of procedure 3409 for the 7th line of the function main).
FIG. 43 is a diagram showing an example of the variable DB (at the end of procedure 3409 for the 10th line of the function main).
FIG. 44 is a diagram showing an example of the variable DB (at the end of procedure 3409 for the 11th line of the function main).
FIG. 45 is a diagram showing an example of the indefinite function transition DB (at the end of procedure 3411 for the function main).
FIG. 46 is a diagram showing a specific example of source code to be inspected.

Explanation of symbols

  101 ... Main body (computer) of source code vulnerability inspection device, 102 ... CPU, 103 ... Storage device, 104 ... Memory, 105 ... Keyboard, 106 ... CRT, 107 ... External storage device, 108 ... Bus.

Claims (6)

  1. A source code vulnerability inspection device for inspecting a vulnerability of a source code to be inspected,
    A syntax analysis means for analyzing the syntax of the inspection target source code and constructing an analysis tree of the inspection target source code;
    For a function included in a library used in the inspection target source code, a transition database between dummy arguments describing data transition between dummy arguments by internal processing;
    For user-defined functions defined in the inspection target source code, data transition between dummy arguments by internal processing, and from outside the program during execution of a program created by compiling the inspection source code A transition database between dynamic dummy arguments for registering a pair of dummy argument positions that may receive given input data; and
    A variable table for recording whether the type of value held in the variable used in the inspection target source code is input data given from the outside, and
    If there is a user-defined function while tracing the parse tree, the user-defined function is set as a target function to be analyzed, the partial parse tree is searched, the definition of the formal argument and the local variable is registered in the variable table, If there is a data transition between variables or an external output part of a variable, update by referring to the transition database between dummy arguments, the transition database between dynamic dummy arguments, and the variable table, and the partial analysis tree of the target function When the search is completed, the content of the transition database between the dynamic dummy arguments is updated based on the content of the variable table for the target function, and by repeating such processing, the data transition between the variables is traced through the parse tree. Means to enable tracking,
    When using external input as a parameter, a vulnerability database that registers a vulnerable function,
    A source code vulnerability inspection device comprising: means for reading the vulnerability database and warning of a location where a data transition between the variables matches the registered content of the vulnerability database as a location having a vulnerability.
  2. A source code vulnerability inspection device for inspecting a vulnerability of a source code to be inspected,
    Parsing means for constructing an analysis tree by parsing the inspection source code according to the parsing rules of the programming language used to describe the inspection source code;
    For a function included in a library used in the inspection target source code, a transition database between dummy arguments that describes data transition between dummy arguments by internal processing; and
    For a user-defined function defined in the source code to be inspected, data transition between dummy arguments by internal processing and from the outside of the program during execution of a program created by compiling the source code to be inspected A dynamic inter-argument transition database that registers pairs of positions of formal arguments that may receive given input data; and
    A variable table for recording the type of the value held in the variable used in the source code to be inspected by distinguishing whether it is input data given from outside the program; and
    A function in the library that may cause a vulnerability when receiving and processing input data given from the outside of the program with a specific dummy argument when executing the program, and the position of the specific dummy argument, A vulnerability database registered in pairs,
    Vulnerability detection means that repeatedly traverses the entire parse tree in a predetermined order from the root node of the parse tree while updating the dynamic dummy argument transition database and the variable table; that, whenever a new update is made to the dynamic dummy argument transition database, again traverses the entire parse tree from the root node in the predetermined order and updates the variable table with reference to the transition database between dummy arguments and the dynamic dummy argument transition database; and that, referring to the vulnerability database, at a location in the inspection target source code where a function registered in the vulnerability database is called, outputs that location as a vulnerability in the vulnerability inspection result if the type of the value recorded in the variable table for the variable specified as the actual argument at the position of the specific dummy argument paired with the registered function is input data from outside the program,
    When the vulnerability detection means arrives at a function definition node by following the parse tree in a predetermined order, if the user-defined function represented by the function definition node is not registered in the transition database between dynamic dummy arguments, Register a user-defined function
    Subsequently, the partial analysis tree having the function definition node at the vertex of the analysis tree is traced in a predetermined order, and when the dummy argument definition node is reached, the dummy argument represented by the dummy argument node is registered in the variable table,
    If the dummy argument is registered as a dummy argument that may receive input data from outside the program in the dynamic dummy argument transition database for the user-defined function, the dummy argument is stored in the variable table. Register external input data as the type of value
    If the variable definition node is reached, the variable represented by the variable definition node is registered in the variable table,
    When the function call node is reached, the data represented between the dummy arguments registered in the transition database between the dummy arguments or the dynamic dummy argument transition database for the function represented by the function call node is referred to. Get the value type of the transition source dummy argument and the value type of the transition destination dummy argument that make up the data transition of,
    If the type of the value of the transition source dummy argument is the first dummy argument of the function represented by the function calling node, the first parsed tree corresponding to the position of the first dummy argument is traced in a predetermined order. Reaching the actual argument node, obtaining the type of value held by the variable recorded in the variable table for the first variable represented by the first actual argument node as a new transition destination value type,
    If the value type of the transition source dummy argument is external input data, the new transition destination value type is external input data.
    If the value type of the transition destination dummy argument is the second dummy argument of the function represented by the function calling node, the second parsed tree corresponding to the position of the second dummy argument is traced in a predetermined order. Reaching the actual argument node, registering the type of the value of the new transition destination as the type of the value held in the variable recorded in the variable table for the second variable represented by the second actual argument node;
    Furthermore, if the new transition destination value type includes external input data and the function represented by the function call node is registered in the dynamic parameter transition database, the function represented by the function call node Registering that one dummy argument receives input data from outside the program in the transition database between the dynamic dummy arguments;
    Furthermore, if the type of the new transition destination value includes external input data and the combination of the function represented by the function call node and the position of the first dummy argument is registered in the vulnerability database, Outputting the description position of the first variable in the source code to be inspected as a place where vulnerability may occur,
    When the partial parse tree has been traced in the predetermined order, for each dummy argument, creating a new data transition between dummy arguments in which the type of the value held by that dummy argument as recorded in the variable table is the value type of the transition source dummy argument and the dummy argument itself is the value type of the transition destination dummy argument, and overwriting the data transitions between dummy arguments for the user-defined function in the dynamic dummy argument transition database with the new data transitions between dummy arguments; the source code vulnerability inspection device being characterized by the above.
  3. The source code vulnerability inspection device according to claim 2,
    wherein the vulnerability detection means records the type of the value held by a variable recorded in the variable table for each possible depth of indirect reference.
  4. The source code vulnerability inspection device according to claim 2 or 3,
    wherein the vulnerability detection means follows the parse tree in a predetermined order and, when a variable definition node is reached, registers the variable represented by the variable definition node and its depth in the parse tree in the variable table,
    and, when the depth of a node reached by following the parse tree in the predetermined order becomes shallower than the depth of the definition node of the variable, deletes the registered data for the variable from the variable table.
  5. A source code vulnerability inspection device for inspecting a vulnerability of a source code to be inspected,
    A syntax analysis means for analyzing the syntax of the inspection target source code;
    Fixed function transition database that pre-registers data transitions associated with library functions,
    Vulnerability transition destination database that pre-registers a transition destination that may cause a vulnerability and a value indicating that processing has been performed to avoid it,
    A variable database that records the types and values of variables defined in the source code;
    An indefinite function transition database that records data transitions associated with functions defined in the source code to be examined;
    Vulnerability detection means that, while tracing the inspection target source code, refers to the fixed function transition database and the vulnerable transition destination database, updates the variable database with the types and values of the variables defined in the inspection target source code, and updates the indefinite function transition database so that data transitions can be tracked, and that detects as a vulnerability a function call location where an external input value that has not been subjected to processing to avoid the vulnerability transitions to a transition destination where a vulnerability may occur.
  6. The source code vulnerability inspection device according to claim 5,
    further comprising means for determining, from among the functions defined in the source code to be inspected, the function to be processed next as a function whose definition calls only library functions or programmer-defined functions that have already been processed and whose data transitions have already been registered.
JP2005237124A 2005-08-18 2005-08-18 Source code vulnerability inspection device Expired - Fee Related JP4693044B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2005237124A JP4693044B2 (en) 2005-08-18 2005-08-18 Source code vulnerability inspection device

Publications (2)

Publication Number Publication Date
JP2007052625A JP2007052625A (en) 2007-03-01
JP4693044B2 true JP4693044B2 (en) 2011-06-01

Family

ID=37917035

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2005237124A Expired - Fee Related JP4693044B2 (en) 2005-08-18 2005-08-18 Source code vulnerability inspection device

Country Status (1)

Country Link
JP (1) JP4693044B2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9507933B2 (en) 2012-08-01 2016-11-29 Mitsubishi Electric Corporation Program execution apparatus and program analysis apparatus

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4951416B2 (en) * 2007-06-01 2012-06-13 株式会社 日立システムアンドサービス Program verification method and program verification apparatus
JP5186443B2 (en) 2009-06-30 2013-04-17 インターナショナル・ビジネス・マシーンズ・コーポレーションInternational Business Machines Corporation System, method and program for judging validity of character string
US8584246B2 (en) 2009-10-13 2013-11-12 International Business Machines Corporation Eliminating false reports of security vulnerabilities when testing computer software
US8468605B2 (en) 2009-11-30 2013-06-18 International Business Machines Corporation Identifying security vulnerability in computer software
KR101051600B1 (en) * 2010-03-29 2011-07-22 주식회사 소프트 포 소프트 Systems for performing code inspection on abap source code
JP2012004923A (en) 2010-06-18 2012-01-05 Funai Electric Co Ltd Television device and speaker system
US8528095B2 (en) 2010-06-28 2013-09-03 International Business Machines Corporation Injection context based static analysis of computer software applications
JP5171907B2 (en) * 2010-09-13 2013-03-27 株式会社東芝 Information processing apparatus and information processing program
KR101991687B1 (en) 2012-11-23 2019-06-24 삼성전자 주식회사 Dynamic library profiling method, computer readable recording medium storing thereof and dynamic library profiling system
KR102026959B1 (en) * 2019-04-19 2019-09-30 한화시스템(주) Security system and operation method thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004171064A (en) * 2002-11-15 2004-06-17 Mitsubishi Research Institute Inc Buffer overflow static analysys method and program
JP2006523898A (en) * 2003-04-18 2006-10-19 オンス ラブス,インク Source code vulnerability detection method and detection system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006087780A1 (en) * 2005-02-17 2006-08-24 Fujitsu Limited Vulnerability examining program, vulnerability examining device, and vulnerability examining method

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20080108

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20101124

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20101201

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20110128

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20110217

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20110217

R150 Certificate of patent or registration of utility model

Free format text: JAPANESE INTERMEDIATE CODE: R150

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20140304

Year of fee payment: 3

LAPS Cancellation because of no payment of annual fees