CN101661543B - Method and device for detecting security flaws of software source codes

Info

Publication number: CN101661543B
Application number: CN200810146955.2A
Authority: CN
Inventors: 唐文
Original assignee: Siemens Ltd China
Current assignee: Siemens Ltd China
Priority date: 2008-08-28
Filing date: 2008-08-28
Publication date: 2015-06-17
Anticipated expiration: 2028-08-28
Also published as: CN101661543A

Abstract

The invention discloses a method for detecting security flaws in software source code. The method comprises the following steps: building an abstract syntax tree (AST) corresponding to the source code of the software to be detected; determining, according to predefined exploitable points and risk points, the exploitable points and risk points among the nodes of the AST; and searching the AST for execution paths from exploitable points to risk points, wherein if the risk point on an execution path can be controlled by the exploitable point on that path, the path is determined to be a potential risk execution path that may cause a security flaw. The invention also discloses a device for detecting security flaws in software source code. The method and the device can effectively detect security flaws present in software source code.

Description

Method and device for detecting security flaws in software source code

Technical field

The present invention relates to the technical field of software security, and in particular to a method and a device for detecting security flaws in software source code.

Background

At present, software is increasingly used to process sensitive and high-value information, such as business information and financial information. This makes software an ever more frequent target for attackers who attempt to obtain such information. Attackers try to uncover security vulnerabilities in software in order to disrupt its operation and perform malicious operations on it. Vulnerabilities introduced while the source code is being written are the most common kind. It is therefore necessary to develop an effective detection method for finding the potential security vulnerabilities present in source code.

At present, automatic code security auditing can be used to detect security vulnerabilities in software source code. Automatic code security auditing is a vulnerability detection technique based on source code analysis; it is a static detection technique and belongs to white-box testing. Without running the software, it performs static analysis on the source code to obtain the software structure, and then finds security vulnerabilities according to the analysis result. Its shortcoming is that such static source code analysis is only suited to detecting syntax-related vulnerabilities, such as buffer overflows and race conditions; it is of little use for semantics-related vulnerabilities such as command injection and database injection.

As can be seen from the above, although current detection methods can find some of the security vulnerabilities in software, their detection approach is limited and their coverage is incomplete, so they cannot effectively detect the full range of vulnerabilities present in software source code.

Summary of the invention

To solve the above problems, the present invention provides, in one aspect, a method for detecting security flaws in software source code and, in another aspect, a device for detecting security flaws in software source code, both of which can effectively detect the security vulnerabilities present in software source code.

The method for detecting security flaws in software source code provided by the present invention comprises:

building the abstract syntax tree (AST) corresponding to the source code of the software to be detected;

determining, according to predefined exploitable points and risk points, the exploitable points and risk points among the nodes of the AST;

searching the AST for an execution path from an exploitable point to a risk point, and, if the risk point on the execution path can be controlled by the exploitable point on the execution path, determining the execution path to be a potential risk execution path that may cause a security vulnerability.

In the method of the present invention, the exploitable point is an input function, and the risk point is an execution function and/or an assignment statement.

In the method of the present invention, an exploitable parameter in the exploitable point and a risk parameter in the risk point are further predefined.

In the method of the present invention, whether the risk point on the execution path can be controlled by the exploitable point on the execution path is determined as follows:

if the risk parameter in the risk point on the execution path is tainted by the exploitable parameter in the exploitable point on the execution path, it is determined that the risk point on the execution path can be controlled by the exploitable point on the execution path.

In the method of the present invention, whether the risk parameter in the risk point on the execution path is tainted by the exploitable parameter in the exploitable point on the execution path is determined as follows:

taking the exploitable parameter in the exploitable point on the execution path as the initial potential exploitable variable (PEV);

determining the intermediate PEVs on the execution path that are tainted by the initial PEV;

judging whether the risk parameter in the risk point on the execution path is the initial PEV or an intermediate PEV, and, if so, judging that the risk parameter in the risk point on the execution path is tainted by the exploitable parameter in the exploitable point on the execution path.

In the method of the present invention, the intermediate PEVs tainted by the initial PEV on the execution path are obtained by data-flow analysis.

In the method of the present invention, obtaining the intermediate PEVs tainted by the initial PEV on the execution path further comprises control-flow analysis.

In the method of the present invention, the method further comprises: generating a test report according to the determined potential risk execution path.

In the method of the present invention, generating a test report according to the determined potential risk execution path comprises:

if the potential risk execution path can be expressed directly by a test input and a test condition, generating a test script according to the test input and the test condition; otherwise, generating a test report that records the potential risk execution path.

In the method of the present invention, the source code of the software to be detected is high-level programming language source code, or assembly language source code obtained by decompiling executable program code.

The device for detecting security flaws in software source code provided by the present invention comprises a source code processing unit and a path analysis unit, wherein:

the source code processing unit is configured to build the AST corresponding to the source code of the software to be detected and to determine, according to predefined exploitable points and risk points, the exploitable points and risk points among the nodes of the AST;

the path analysis unit is configured to determine, among the nodes of the AST determined by the source code processing unit, an execution path from an exploitable point to a risk point, and, if the risk point on the execution path can be controlled by the exploitable point on the execution path, to determine the execution path to be a potential risk execution path that may cause a security vulnerability.

In the device of the present invention, the source code processing unit comprises a configuration module, an analysis module, a node type locating module and an AST recording module;

the configuration module is configured to record the predefined exploitable points and risk points;

the analysis module is configured to perform lexical, syntactic and semantic analysis on the source code of the software to be detected and to build the AST;

the node type locating module is configured to determine, according to the predefined exploitable points and risk points recorded in the configuration module, the exploitable points and risk points among the nodes of the AST built by the analysis module, and to record the AST and the determined exploitable points and risk points in the AST recording module;

the AST recording module is configured to record the AST and the determined exploitable points and risk points, and to provide them to the path analysis unit.

In the device of the present invention, the source code processing unit further comprises a decompiling module, configured to decompile executable program code into assembly language source code and send it to the analysis module.

In the device of the present invention, the path analysis unit comprises an execution path locating module and a vulnerability locating module;

the execution path locating module is configured to search the AST for an execution path from an exploitable point to a risk point, and to notify the vulnerability locating module of the execution path found;

the vulnerability locating module is configured to determine, after receiving the notification from the execution path locating module, whether the risk point on the execution path can be controlled by the exploitable point on the execution path, and, if so, to determine the execution path to be a potential risk execution path.

In the device of the present invention, the device further comprises a security test case generation unit, configured to generate a test report according to the potential risk execution path determined by the path analysis unit.

As can be seen from the above scheme, the present invention detects security vulnerabilities on the basis of an AST built by analyzing the source code. Because building the AST is not limited to static lexical and syntactic analysis but also includes semantic analysis, the present invention analyzes the source code more comprehensively than automatic code security auditing does. The present invention then performs path analysis on the AST to find execution paths on which a risk point can be controlled from an exploitable point. Using such a path, an attacker can control the risk point by feeding carefully crafted data into the exploitable point, causing the software to malfunction or even crash. The execution paths found by the present invention, on which a risk point can be controlled from an exploitable point, are therefore precisely the security vulnerabilities. By searching for paths in an AST built from comprehensive source code analysis, the present invention can find the various security vulnerabilities in software source code more effectively. These findings serve as a basis for improving the software, so that risk points are protected from malicious manipulation and the security of the source code is enhanced.

Brief description of the drawings

Exemplary embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that the above and other features and advantages of the present invention become clearer to a person of ordinary skill in the art. In the drawings:

Fig. 1 is a flowchart of the software security vulnerability detection method in an embodiment of the present invention;

Fig. 2 is a flowchart of the software security vulnerability detection method in another embodiment of the present invention;

Fig. 3 is a structural diagram of an AST in an embodiment of the present invention;

Fig. 4 is a structural diagram of the software security vulnerability detection device in an embodiment of the present invention;

Fig. 5 is a structural diagram of the source code processing unit 1 in Fig. 4;

Fig. 6 is a structural diagram of the path analysis unit 2 in Fig. 4.

Detailed description of embodiments

To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in more detail below with reference to the accompanying drawings and embodiments.

In practice, an attacker feeds carefully crafted data into the software so that it executes along a particular path and eventually runs into a security problem such as a buffer overflow or code injection. Execution paths that threaten software security in this way are called potential risk execution paths (PVEP, Potential Vulnerable Execution Path), and they are what attackers look for. The present invention therefore detects security vulnerabilities by searching for the potential risk execution paths in the software source code.

Based on the above, the present invention provides a scheme for detecting software security vulnerabilities. The scheme builds an abstract syntax tree (AST, Abstract Syntax Tree) by analyzing the software source code; determines, according to predefined exploitable points (EP, Exploitable Points) and risk points (VP, Vulnerable Points), the EPs and VPs among the nodes of the AST; and then performs path analysis on the AST to find execution paths from an EP to a VP. If the VP on such a path can be controlled by the EP on that path, the path is determined to be a PVEP that may cause a security vulnerability.

An EP is an input point in the software source code. Through an EP, commands or data can be fed into the software from outside in order to control its running state and thereby complete various intended tasks. If an EP is exploited by an attacker, however, the attacker can input specially chosen commands or data through it and perform malicious operations on the software.

An EP usually appears in the source code as a function. Taking C source code as an example, the input functions provided by dynamic link libraries (DLL), system call interfaces (SCI), application programming interfaces (API), built-in functions and the like can be defined as EPs. Input functions include user input functions (such as scanf()), system environment input functions (such as getenv()) and network input functions (such as read()). In practice, an EP may also be a user-defined function.

A VP is an execution point in the software source code. Through VPs, various kinds of processing are carried out to perform intended tasks. If an attacker manipulates these execution points indirectly through an EP, however, the result can be adverse consequences such as erroneous execution or a crash.

VPs generally include functions and assignment statements. Again taking C source code as an example, the execution functions provided by DLLs, SCIs, APIs, built-in functions and the like, as well as assignment statements, can be defined as VPs. Execution functions include execution functions that access variables (such as execlp()), operating system (OS) command execution functions (such as system()), and database command execution functions (such as the embedded database query command EXEC SQL). Assignment statements include, for example, the string copy function strcpy(). Execution functions that access variables may cause code injection vulnerabilities, OS command execution functions may cause code injection vulnerabilities, database command execution functions may cause database injection vulnerabilities, and assignment statements may cause buffer overflow vulnerabilities. In addition, when accesses to a resource (a file, the network, etc.) are executed in an abnormal order and end up competing for the same resource, a race condition vulnerability can arise, with the consequence that the program cannot run normally. In practice, a VP may also be a user-defined function.

As can be seen from the above, the present invention builds an AST and performs path analysis on it in order to determine PVEPs. Because building the AST is not limited to static lexical and syntactic analysis but also includes semantic analysis, the PVEPs found by path analysis on the AST reflect the various security vulnerabilities in the source code systematically and comprehensively, which achieves effective detection of security vulnerabilities.

The analysis of the source code and the construction of the AST described above can be carried out by the front end of a compiler. A compiler reads program source code and translates it into a target language. Existing compilers comprise a front end and a back end. The front end performs lexical, syntactic and semantic analysis on the source code and builds the AST; it also converts the source code into the compiler's internal representation (IR, Intermediate Representation), i.e. a form that the compiler can process. The back end analyzes and optimizes the internal representation and finally generates object code. The present invention uses a compiler front end to perform lexical, syntactic and semantic analysis on the source code and to generate the AST.

Fig. 1 is a flowchart of the software security vulnerability detection method in an embodiment of the present invention. This embodiment uses the front-end compiler CC1 of the GNU C compiler (GCC, GNU C Compiler) as the module that analyzes the source code and builds the AST. GCC is a powerful, high-performance, multi-platform compiler released by the GNU project and commonly used on Linux systems. GNU is a recursive acronym for "GNU's Not Unix". As shown in Fig. 1, the method comprises the following steps:

Step 101: read the source code of the software to be detected and the predefined EPs and VPs.

In this step, the predefined EPs and VPs can be recorded in an EP configuration file and a VP configuration file respectively. The EP configuration file and the VP configuration file may also be the same file.

The EP configuration file consists of multiple lines, each describing one EP. Its format can be: function name: exploitable parameter. Here, the exploitable parameter is one or more of the function's input parameters. Because a function may have several input parameters and not every one of them can be exploited directly by an attacker, which input parameters of an EP are exploitable parameters can be defined as required.

Take the C string input function gets() as an example: the EP configuration file records gets:1, which indicates that the function gets() is an EP. gets() has only one input parameter, so its first and only input parameter is the exploitable parameter. The input parameter of gets() is the string variable that receives the input.
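The configuration line format described above can be illustrated with a small parser. This is only a sketch: the struct `point_entry` and the function `parse_point_line` are hypothetical names, and the sketch assumes each line names exactly one 1-based parameter index, whereas the text allows several parameters per function.

```c
#include <stdlib.h>
#include <string.h>

/* One entry of an EP or VP configuration file, e.g. "gets:1".
 * The struct and its field names are illustrative assumptions. */
struct point_entry {
    char name[64];  /* function name */
    int  param;     /* 1-based index of the exploitable (or risk) parameter */
};

/* Parse a line of the assumed form "name:index".
 * Returns 1 on success and 0 if the line is malformed. */
int parse_point_line(const char *line, struct point_entry *out)
{
    const char *colon = strchr(line, ':');
    if (colon == NULL || colon == line)
        return 0;                       /* no separator or empty name */

    size_t len = (size_t)(colon - line);
    if (len >= sizeof out->name)
        return 0;                       /* name too long for the buffer */

    char *end;
    long idx = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || idx < 1 || idx > 255)
        return 0;                       /* missing or implausible index */

    /* Only trailing whitespace may follow the index. */
    while (*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
        end++;
    if (*end != '\0')
        return 0;

    memcpy(out->name, line, len);
    out->name[len] = '\0';
    out->param = (int)idx;
    return 1;
}
```

Called on the line "gets:1", the parser fills in the name gets and the parameter index 1; the same routine would serve for a VP line such as system:1.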

For C source code, the functions that can be defined as EPs include but are not limited to: fread(), which reads data from a file; read(), which reads data from the file indicated by a file handle; pread(), which reads data at a specified offset from the file indicated by a file handle; fgets(), which reads a string from a file; gets(), which reads a string from the standard input device; fgetws(), which reads a wide-character string from a file; gethostname(), which obtains the host name from the system environment; getdomainname(), which obtains the domain name from the system environment; scanf(), which reads data from the standard input device; sscanf(), which reads data from a string; and fscanf(), which reads data from a file.

The VP configuration file consists of multiple lines, each describing one VP. Its format can be: function name: risk parameter. Here, the risk parameter is one or more of the function's input parameters. Because a function may have several input parameters and not every one of them leads to a security vulnerability when manipulated indirectly by an attacker, which input parameters of a VP are risk parameters can be defined as required. When a risk parameter is manipulated, the function it belongs to causes a security vulnerability.

Take the C command execution function system() as an example: the VP configuration file records system:1, which indicates that the function system() is a VP and that its first input parameter is the risk parameter. When the first parameter of system() is tainted, in other words manipulated, an attacker can control system() through some EP and thereby cause a code injection vulnerability. The first parameter of system() is the name of the command to be executed.

For C source code, the functions that can be defined as VPs include but are not limited to: system(), which executes a shell command; execve(), which executes a specified program; execl(), which executes a program under a specified path; execlp(), which executes a specified program file; execle(), which executes a program under a specified path; execv(), which executes a program under a specified path; execvp(), which executes a specified program file; and popen(), which creates a process and executes a command.

Step 102: perform lexical, syntactic and semantic analysis on the source code read, build the AST, and determine, according to the predefined EPs and VPs, the EPs and VPs in the AST.

In this step, building the AST by performing lexical, syntactic and semantic analysis on the source code is prior art. Lexical analysis breaks the whole source code into tokens, each of which is an independent atomic unit of the language, such as a keyword, an identifier or a symbol name in a program statement; lexical analysis is the first step in generating the AST. Syntactic analysis recognizes the syntactic structure of the program by analyzing the order of the tokens. Construction of the parse tree, i.e. the AST, begins in the syntactic analysis phase. An AST is a tree structure described by formal grammar rules, and it is used to represent the order of the tokens. Semantic analysis adds semantic information to the AST and performs semantic checks. The AST is built as syntactic and semantic analysis alternate.
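As a rough illustration of the lexical analysis stage described above, the sketch below splits a C statement into identifier, number and punctuation tokens. It is not how CC1 works internally; the names `token`, `token_kind` and `tokenize` are hypothetical, and keywords, string literals and multi-character operators are deliberately not handled.

```c
#include <ctype.h>
#include <stddef.h>

enum token_kind { TOK_IDENT, TOK_NUMBER, TOK_PUNCT };

/* A token is a slice of the source text plus its kind. */
struct token {
    enum token_kind kind;
    const char *start;
    size_t len;
};

/* Split src into at most cap tokens; returns the number of tokens written. */
size_t tokenize(const char *src, struct token *out, size_t cap)
{
    size_t n = 0;
    while (*src != '\0' && n < cap) {
        if (isspace((unsigned char)*src)) {
            src++;                      /* whitespace separates tokens */
            continue;
        }
        const char *start = src;
        enum token_kind kind;
        if (isalpha((unsigned char)*src) || *src == '_') {
            kind = TOK_IDENT;           /* identifier or keyword */
            while (isalnum((unsigned char)*src) || *src == '_')
                src++;
        } else if (isdigit((unsigned char)*src)) {
            kind = TOK_NUMBER;
            while (isdigit((unsigned char)*src))
                src++;
        } else {
            kind = TOK_PUNCT;           /* any other single character */
            src++;
        }
        out[n].kind = kind;
        out[n].start = start;
        out[n].len = (size_t)(src - start);
        n++;
    }
    return n;
}
```

For the statement system(str); this yields five tokens: the identifier system, an opening parenthesis, the identifier str, a closing parenthesis and a semicolon. A parser would then consume these tokens in order to build the tree.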

The nodes of the AST reflect the program statements in the source code. Each node can have child nodes, and each child node can in turn have child nodes of its own. Each node has at least one attribute, such as the location, type or scope of the program statement that the node records.
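The node structure described above might be represented as follows. This is a minimal sketch under assumptions: the field names, the `node_type` values and the first-child/next-sibling layout are illustrative choices, not the representation used by GCC.

```c
#include <stddef.h>
#include <stdlib.h>

/* Illustrative node categories; a real front end has many more. */
enum node_type { NODE_FUNCTION, NODE_CALL, NODE_ASSIGN, NODE_IF, NODE_OTHER };

struct ast_node {
    enum node_type type;           /* Type attribute */
    int line;                      /* Location attribute: source line number */
    int scope_depth;               /* Scope attribute: nesting depth */
    struct ast_node *first_child;  /* leftmost child, or NULL */
    struct ast_node *next_sibling; /* next child of the same parent, or NULL */
};

/* Allocate a node with no children; returns NULL if allocation fails. */
struct ast_node *ast_new(enum node_type type, int line, int scope_depth)
{
    struct ast_node *n = calloc(1, sizeof *n);
    if (n != NULL) {
        n->type = type;
        n->line = line;
        n->scope_depth = scope_depth;
    }
    return n;
}

/* Append child as the last (rightmost) child of parent. */
void ast_add_child(struct ast_node *parent, struct ast_node *child)
{
    struct ast_node **slot = &parent->first_child;
    while (*slot != NULL)
        slot = &(*slot)->next_sibling;
    *slot = child;
}

/* Count the nodes in the subtree rooted at n, including its siblings. */
size_t ast_count(const struct ast_node *n)
{
    size_t total = 0;
    for (; n != NULL; n = n->next_sibling)
        total += 1 + ast_count(n->first_child);
    return total;
}
```

The first-child/next-sibling layout lets a node have any number of children without a variable-length array, which suits statements such as function bodies.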

After the AST has been built, or while it is being built, each node of the AST is checked in turn to determine whether it is a predefined EP or VP. If a node matches a predefined EP, it is recorded as an EP; if a node matches a predefined VP, it is recorded as a VP.

When recording EPs and VPs in this step, the EPs and VPs in the AST may be recorded in a table, or an EP attribute and a VP attribute may be added to each node of the AST. For example, a single EP/VP attribute may be set to one preset value, such as 1, when a node is an EP, and to another preset value, such as 0, when a node is a VP. Alternatively, a two-bit sequence may be used, with the two bits representing the EP attribute and the VP attribute respectively.
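The two-bit marking mentioned above could look like the following sketch. The flag names and the function `classify_call` are hypothetical, and matching is done on the callee name only, which is an assumption about how a node is compared with a predefined EP or VP.

```c
#include <stddef.h>
#include <string.h>

/* Two independent bits: one for the EP attribute, one for the VP attribute. */
enum point_flag {
    POINT_EP = 1 << 0,
    POINT_VP = 1 << 1
};

/* Return 1 if name appears in list, 0 otherwise. */
static int name_in_list(const char *name, const char *const *list, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(name, list[i]) == 0)
            return 1;
    return 0;
}

/* Compute the EP/VP bits for a call node from the configured name lists. */
unsigned classify_call(const char *callee,
                       const char *const *eps, size_t n_eps,
                       const char *const *vps, size_t n_vps)
{
    unsigned flags = 0;
    if (name_in_list(callee, eps, n_eps))
        flags |= POINT_EP;
    if (name_in_list(callee, vps, n_vps))
        flags |= POINT_VP;
    return flags;
}
```

Unlike a single attribute that holds either 1 or 0, two separate bits can also express that a node is neither an EP nor a VP, or that it is both.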

Step 103: search the AST for execution paths from an EP to a VP; if the VP on an execution path can be controlled by the EP on that path, determine that the execution path is a PVEP.

In this step, each EP in the AST is taken in turn as the current EP and processed as follows:

a1. Take the input variables defined as exploitable parameters in the current EP as the initial PEVs (Potential Exploitable Variable); there may be one or more initial PEVs.

b1. Starting from the current EP, search for paths in the AST using a preset search algorithm. For each node visited, determine whether the current search node is a VP and which intermediate PEVs appear in it. If the current search node is a VP, determine whether the risk parameter of this VP is an initial PEV or an intermediate PEV on the execution path it lies on; if so, determine that the execution path between the current EP and this VP is a PVEP.

Steps a1 and b1 are repeated for each EP in the AST until the path search has been completed for every EP.
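The bookkeeping behind steps a1 and b1 can be sketched as a small set of PEV names with a membership test. The names `pev_set`, `pev_add`, `pev_contains` and `is_pvep` are hypothetical, and identifying variables by name alone is a simplifying assumption that ignores scoping.

```c
#include <string.h>

#define MAX_PEV 32

/* The set of initial and intermediate PEVs collected along one path. */
struct pev_set {
    const char *names[MAX_PEV];
    int count;
};

/* Return 1 if name is already a PEV, 0 otherwise. */
int pev_contains(const struct pev_set *s, const char *name)
{
    for (int i = 0; i < s->count; i++)
        if (strcmp(s->names[i], name) == 0)
            return 1;
    return 0;
}

/* Step a1 (and later propagation): record name as a PEV.
 * Returns 0 only if the fixed-size set is full. */
int pev_add(struct pev_set *s, const char *name)
{
    if (pev_contains(s, name))
        return 1;
    if (s->count == MAX_PEV)
        return 0;
    s->names[s->count++] = name;
    return 1;
}

/* Step b1 check at a VP: the path from the current EP to this VP is a
 * PVEP when the VP's risk parameter is an initial or intermediate PEV. */
int is_pvep(const struct pev_set *s, const char *risk_param)
{
    return pev_contains(s, risk_param);
}
```

For instance, after scanf("%s", str) adds str as the initial PEV, a later call system(str) satisfies the check, while system(A) does not unless A has been added as an intermediate PEV along the way.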

In step b1, the search algorithm used for the path search is generally depth-first search. It starts from the root node of the tree and searches depth first until it reaches a leaf node of the tree. When the search meets a fork in the tree, it searches the left branch first or the right branch first, according to a preset rule. Taking left-first search as an example: starting from a root node of the AST, the left child of the root is searched first, then the left child of that left child, and so on until a leaf node is reached. The search then returns to the parent of that leaf and checks whether the parent has a right child. If it has, the search continues with the right child and then with the left child of that right child, until a leaf node is reached; otherwise, the search returns one level further up. This continues until all paths below the root node have been traversed, at which point the path search for that root node is finished.
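The left-first depth-first traversal described above can be sketched on a binary tree as follows. The node layout and the function `dfs_left_first` are illustrative assumptions, and recursion stands in for the explicit return to the parent node that the text describes.

```c
#include <stddef.h>

/* A minimal binary tree node; label stands in for the statement recorded. */
struct tree_node {
    char label;
    struct tree_node *left;
    struct tree_node *right;
};

/* Visit the subtree rooted at n depth first, left branch before right.
 * Labels are written to out in visiting order; used is the number of
 * labels already written, and the updated count is returned. */
size_t dfs_left_first(const struct tree_node *n, char *out, size_t cap,
                      size_t used)
{
    if (n == NULL || used >= cap)
        return used;
    out[used++] = n->label;                          /* visit the node */
    used = dfs_left_first(n->left, out, cap, used);  /* then left branch */
    used = dfs_left_first(n->right, out, cap, used); /* then right branch */
    return used;
}
```

In the detection method, the visit step is where a node would be tested for being a VP and where PEV propagation would be applied.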

In step b1, every current search node visited during the path search is checked to see whether it is a VP; if it is, an execution path from the current EP to this VP has been found.

In step b1, the intermediate PEVs that appear in the current search node can be determined by data-flow analysis. In addition, to make the PEV analysis more accurate, control-flow analysis can be used on top of data-flow analysis. Both analyses can be carried out as each node is visited. Data-flow analysis analyzes the definitions and uses of variables. For example, when a variable is a PEV, the variable can be regarded as tainted; when this PEV is copied or assigned to another variable, that variable becomes tainted as well and becomes another PEV. This analysis process is data-flow analysis. Suppose, however, that the program performs a validity check before copying or assigning one variable to another, for example by checking whether the length of the first variable exceeds the permitted length or whether its value contains illegal characters (such as "/"), and that when the check fails it either skips the copy or assignment or performs it only after appropriate processing. In that case the second variable is not tainted even though a copy or assignment statement is present. Determining whether the program contains such a validity check is one form of control-flow analysis.
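The propagation rule described above can be sketched over a straight-line sequence of assignments. The sketch makes several assumptions: variables are identified by small integer ids (0 to 31) so that the PEV set fits in a bit mask, the names `stmt` and `propagate_pev` are hypothetical, and a validity check is modelled simply as a flag on the assignment rather than being discovered by real control-flow analysis.

```c
#include <stddef.h>
#include <stdint.h>

enum stmt_kind {
    STMT_ASSIGN,         /* dst = src with no validity check */
    STMT_CHECKED_ASSIGN  /* dst = src guarded by a validity check */
};

struct stmt {
    enum stmt_kind kind;
    int dst;  /* id of the variable written, 0..31 */
    int src;  /* id of the variable read, 0..31 */
};

/* Walk the statements of one execution path in order.  pev has bit i set
 * when variable i is a PEV.  An unchecked assignment from a PEV makes the
 * destination an intermediate PEV; a checked assignment does not. */
uint32_t propagate_pev(uint32_t pev, const struct stmt *path, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t src_bit = UINT32_C(1) << path[i].src;
        uint32_t dst_bit = UINT32_C(1) << path[i].dst;
        if (path[i].kind == STMT_ASSIGN && (pev & src_bit) != 0)
            pev |= dst_bit;
    }
    return pev;
}
```

With str as variable 0 and A as variable 1, the single statement A = str turns the initial set {str} into {str, A}, which is what makes a later system(A) a risk point under attacker control.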

Step 104: generate test cases according to the PVEPs detected, and report them to the tester.

In practice, if the execution of a PVEP is simple enough to be expressed directly by a test input and a test condition, the test input and test condition are reported as the content of a test script. The test input is the parameter value supplied by the user, the test environment or the network; the test condition is the condition under which the VP on this PVEP can be controlled by the EP.

When the execution of a PVEP is more complicated, so that the test input and test condition for testing it cannot be determined easily and have to be set manually, a description of the PVEP's path can be written into a test report and given to the tester, who then carries out follow-up testing according to the report. The path description of a PVEP can consist of the line numbers of the program statements recorded by each node on the execution path; it may take the form of a table or of a stack.
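A path description built from line numbers, as mentioned above, could be rendered like this. The output format ("4 -> 5 -> 6") and the function name `format_pvep` are assumptions made for illustration; the text only says that the description may be a table or a stack.

```c
#include <stddef.h>
#include <stdio.h>

/* Write the line numbers of a PVEP into buf as "l1 -> l2 -> ...".
 * Returns the length of the text written, or -1 if buf is too small. */
int format_pvep(const int *lines, size_t n, char *buf, size_t cap)
{
    size_t used = 0;
    if (cap == 0)
        return -1;
    buf[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        int w = snprintf(buf + used, cap - used, "%s%d",
                         i ? " -> " : "", lines[i]);
        if (w < 0 || (size_t)w >= cap - used)
            return -1;          /* output error or truncated text */
        used += (size_t)w;
    }
    return (int)used;
}
```

A tester reading such a line can open the source file at the listed lines and follow the path from the EP to the VP by hand.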

The following concrete examples illustrate the case in which the execution of a PVEP is simple enough for a test script to be generated.

Example 1 uses a test input as the test script. In example 1, the risk parameter in the VP is the exploitable parameter in the EP. The C source code of the software to be detected is as follows:

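#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char str[100];

    scanf("%s", str);   /* EP: str is the exploitable parameter */
    system(str);        /* VP: str is the risk parameter */
    return 0;
}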

After CC1 reads this source code, it generates the AST. The node scanf() is an EP and its input parameter str is the exploitable parameter, so the variable str is the initial PEV; system() is a VP and its input parameter str is the risk parameter. Because the variable str is the initial PEV, the execution path from scanf("%s", str) to system(str) is a PVEP. Analysis of this PVEP shows that str is entered directly by the user and executed as an operating system command without any check, which leads to a code injection vulnerability. One can therefore set str="cat /etc/password"; when execution follows this PVEP, the password file stored under /etc, i.e. the system user name file, is displayed. To attack a system, a hacker first needs the list of valid users and can then try to crack their passwords, so the user name file holding that list is sensitive information. This example therefore uses str="cat /etc/password" as the content of the test script; it is the test input.

Based on the PVEP detection result, the tester can improve the software source code. For example, a check can be added before the call to system(str) to judge whether the value of str is safe, such as a rule that str must not be the access-denied command cat /etc/password. If the value of str is safe, system(str) is executed; otherwise it is not. The test script above is then run again against the modified source code; if the system user name file is no longer displayed, the original vulnerability has been fixed successfully. Generating test scripts thus makes automated software testing possible, reduces the tester's workload and improves test efficiency.
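A check of the kind suggested above might be sketched as follows. Note that this sketch swaps in an allow-list, which accepts only known commands, in place of the deny-list the text describes; the command names and the function `is_safe_command` are hypothetical.

```c
#include <string.h>

/* Hypothetical allow-list: only these exact commands may be executed. */
static const char *const allowed_commands[] = { "date", "uptime" };

/* Return 1 if cmd matches an allowed command exactly, 0 otherwise. */
int is_safe_command(const char *cmd)
{
    size_t n = sizeof allowed_commands / sizeof allowed_commands[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(cmd, allowed_commands[i]) == 0)
            return 1;
    return 0;
}

/* Intended use in the example program:
 *     if (is_safe_command(str))
 *         system(str);
 */
```

An allow-list is used here because a deny-list that blocks one known string is easy to sidestep with a slightly different command; either way, the guard is the kind of validity check that the control-flow analysis in step b1 looks for.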

Example 2 also uses a test input as the test script. In example 2, the risk parameter in the VP is not the exploitable parameter in the EP, but it is tainted by the exploitable parameter in the EP. The C source code of example 2 is as follows:

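#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char str[100], *A;

    scanf("%s", str);   /* EP: str is the initial PEV */
    A = str;            /* A becomes an intermediate PEV */
    system(A);          /* VP: A is the risk parameter */
    return 0;
}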

Here, scanf() is an EP and its input parameter str is the exploitable parameter, so the variable str is the initial PEV; system() is a VP and its input parameter A is the risk parameter. Because A is assigned from str, A is an intermediate PEV, and the execution path from scanf("%s", str) to system(A) is therefore a PVEP. When the user sets str to "cat /etc/password", the password file stored under /etc is displayed. The test script generated in step 104 is therefore str="cat /etc/password", which is the test input.

Example 3 uses a test input and a test condition as the test script. In example 3, the C source code is as follows:

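#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char str[100];
    int X;
    scanf("%d%s", &X, str);   /* EP: X and str are both read from the user */
    if (X > 0)
        system(str);          /* VP: reached only when X is greater than zero */
    else
        return 0;
    return 0;
}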

In example 3, system(str) is executed only when X is greater than zero, so the test script generated in this example can be: str="cat /etc/password", X=1. Here, str="cat /etc/password" is the test input and X=1 is the test condition.
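A test script made of a test input and a test condition could be represented as sketched below. The text does not define a concrete script format, so the structure test_script, its field names, and the function format_test_script are assumptions introduced purely for illustration.

```c
#include <stdio.h>

/* Hypothetical representation of a test script: one test input
   (variable = string value) and one optional test condition
   (variable = integer value). */
struct test_script {
    const char *input_var;
    const char *input_value;
    const char *cond_var;    /* NULL when the path has no condition */
    int cond_value;
};

/* Writes the script as text into buf and returns the value reported by
   snprintf (the length of the full text, excluding the terminating NUL). */
int format_test_script(const struct test_script *ts, char *buf, size_t len)
{
    if (ts->cond_var == NULL)
        return snprintf(buf, len, "%s=\"%s\"",
                        ts->input_var, ts->input_value);
    return snprintf(buf, len, "%s=\"%s\", %s=%d",
                    ts->input_var, ts->input_value,
                    ts->cond_var, ts->cond_value);
}
```

For example 3, the fields would hold the input variable str with the value cat /etc/password and the condition variable X with the value 1; for examples 1 and 2 the condition variable would be left as NULL.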

This concludes the flow.

The embodiment shown in Fig. 1 is described in detail below with reference to Fig. 2. As shown in Fig. 2, the method comprises:

Step 201: Open the EP configuration file and the VP configuration file, and read the EPs and VPs defined in them.

Step 202: Read and analyze the source code and build the AST; identify the EPs and VPs that occur in the AST and mark them in the AST.

Analyzing source code and building an AST are known techniques that can be performed by CC1, the compiler front end in GCC, and are not described in detail here. However, the existing CC1 cannot identify the EPs and VPs that occur in the AST and mark them in the AST; CC1 must be modified to add a function for identifying EPs and VPs.

In this step, the identified EPs and VPs are marked in the AST as follows. While the AST is being built, each node of the AST is checked against the EP configuration file to determine whether it is an EP; in the node description of an EP, the EP attribute is set to 1 and the VP attribute is set to 0, indicating that the node is only an EP. Each node is likewise checked against the VP configuration file to determine whether it is a VP; in the node description of a VP, the VP attribute is set to 1 and the EP attribute is set to 0, indicating that the node is only a VP. If a node is both an EP and a VP, both the EP attribute and the VP attribute in its node description are set to 1.
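The marking scheme can be sketched in a few lines of C. This is a simplified illustration, not the modified CC1: the structure ast_node, the helper listed, and the function mark_node are hypothetical, and the two configuration files are assumed to have already been read into arrays of function names.

```c
#include <string.h>

/* Hypothetical node description; only the two attributes come from the text. */
struct ast_node {
    const char *name;   /* for example, the name of the called function */
    int ep;             /* EP attribute: 1 if the node is an EP */
    int vp;             /* VP attribute: 1 if the node is a VP */
};

/* Returns 1 if name occurs in list, 0 otherwise. */
static int listed(const char *name, const char *const *list, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(name, list[i]) == 0)
            return 1;
    }
    return 0;
}

/* Sets the EP and VP attributes of node from the two configured name lists.
   A node found in both lists ends up with both attributes set to 1. */
void mark_node(struct ast_node *node,
               const char *const *ep_list, size_t ep_count,
               const char *const *vp_list, size_t vp_count)
{
    node->ep = listed(node->name, ep_list, ep_count);
    node->vp = listed(node->name, vp_list, vp_count);
}
```

With scanf in the EP list and system in the VP list, a scanf node receives the attributes EP=1, VP=0 and a system node receives EP=0, VP=1, matching the examples earlier in the description.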

Alternatively, the EPs and VPs in the AST can be recorded in a table.

In this embodiment, before analyzing the source code, CC1 must read the source code and convert it into a representation that the compiler can recognize. Usually, when the source code is program code written in a high-level programming language such as C/C++, PASCAL or JAVA, CC1 can read it directly and complete the conversion. When the source code of a program is difficult to obtain in practice, however, the executable program (Executable Software) can be decompiled into assembly language in this embodiment; CC1 then reads the program code written in assembly language and completes the conversion. Both high-level languages and assembly language are programming languages with data structures, and the compiler recognizes these languages and converts them into its internal representation by means of configuration files input by the user. Each language has its own configuration file, which describes the data structures of the language and is the basis on which the compiler performs the conversion. These data structures are described by lexical analyzer (LEX) and syntax analyzer (YACC) definitions, so that LEX can parse the lexicon of the language and YACC can parse its grammar.

Step 203: Select one node marked as an EP in the AST as the current EP.

Step 204: Record the position information of the current EP; take the controllable parameter of the current EP as the initial PEV and record it in the PEV check list of the current round of path search; then start searching from the current EP, taking the next node found as the current search node.

In this step, the line number in the source code of the program statement described by the current EP is pushed onto a stack to record the position information of the current EP. Each time a new round of path search starts from an EP, a new PEV check list is created.

Step 205: Record the position information of the current search node, determine the intermediate PEVs that occur in the current search node, and record them in the PEV check list of the current round of path search.

In this step, the line number in the source code of the program statement described by the current search node is pushed onto the stack, and the intermediate PEVs that occur in the current search node are determined in the manner described in step 103 above.

Note that the scope of a PEV can also be taken into account when PEVs are recorded. The scope of a PEV is the set of nodes that the PEV can pollute, and it can be taken directly from the scope of the node in which the PEV occurs. Obtaining the scope of a node is a known capability of the compiler front end CC1 and works basically as follows. First, the value of the scope (Scope) attribute of the node is obtained. If the value of Scope is GLOBAL, the node is a global node, a PEV obtained from it is a global variable, and the scope of that PEV is all nodes in the AST. If the value of Scope is LOCAL, the node is a local node, a PEV obtained from it is a local variable, and the scope of that PEV is all sibling nodes in the AST that have the same parent node as the node in which it occurs. If the program statement described by one of these sibling nodes is a function, the scope of the PEV does not include the child nodes of that sibling node; in other cases, for example when the program statement described by a sibling node is an expression, the scope of the PEV includes the child nodes of that sibling node.

Therefore, when PEVs are recorded in step 205, it can first be determined whether the scope of each existing PEV in the PEV check list includes the current search node, and any PEV whose scope does not include the current search node is deleted from the PEV check list. Because a deleted PEV can no longer pollute variables in the nodes searched subsequently in the current round of path search, it does not need to be considered when later PEVs are determined.

The following example illustrates the scope of a PEV and its effect when PEVs are recorded. Fig. 3 is a schematic structural diagram of an AST. As shown in Fig. 3, each circle represents a node, the N in a circle is the abbreviation of node (Node), the number in a circle is the node number, and a straight line between two circles represents the connection between two nodes. The nodes and their connections constitute the AST. In the AST shown in Fig. 3, N1 is the root node and is an EP, N2 is the left child node of N1, N5 is the right child node of N1, N3 and N4 are child nodes of N2, and N6 is a child node of N5. Suppose that PEV2 is determined when N3 is analyzed. The scope of N3 is its sibling nodes with the same parent node, that is, the child node N4 of N2. When N4 is analyzed, the scope of PEV2 includes N4, so a variable in N4 that is polluted by PEV2 is an intermediate PEV occurring in N4.

However, once N2 and its child nodes have been analyzed and N5 is to be analyzed, the scope of PEV2 does not include N5. PEV2 can then be deleted from the PEV check list, and the PEVs remaining in the PEV check list are used to determine whether an intermediate PEV occurs in N5.
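The scope rule and the Fig. 3 example can be sketched as a small C function. This is a simplified model under stated assumptions, not CC1 code: the structure node, the enumeration scope_kind and the function pev_scope_includes are hypothetical names, and the rule for child nodes of a sibling is assumed to extend to all descendants of that sibling.

```c
#include <stddef.h>

enum scope_kind { SCOPE_GLOBAL, SCOPE_LOCAL };

/* Hypothetical AST node carrying only what the scope rule needs. */
struct node {
    struct node *parent;     /* NULL for the root node */
    enum scope_kind scope;   /* value of the Scope attribute */
    int is_function;         /* 1 if the node describes a function */
};

/* Returns 1 if a PEV obtained from origin can pollute target, 0 otherwise. */
int pev_scope_includes(const struct node *origin, const struct node *target)
{
    if (origin->scope == SCOPE_GLOBAL)
        return 1;                        /* global PEV: every node in the AST */

    /* Walk up from target until a child of origin's parent is reached. */
    for (const struct node *n = target; n != NULL; n = n->parent) {
        if (n->parent == origin->parent) {
            if (n == target)
                return 1;                /* target is origin or one of its siblings */
            return !n->is_function;      /* target lies below sibling n */
        }
    }
    return 0;                            /* target is outside the sibling subtree */
}
```

Applied to Fig. 3 with N3 marked LOCAL, the function reports that the scope of PEV2 includes N4 but not N5 or N6, so PEV2 would be removed from the PEV check list before N5 is analyzed.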

Step 206: Determine whether the current search node is a VP; if so, perform step 207; otherwise, perform step 209.

In this step, it is determined whether the VP attribute of the current search node is 1; if so, the current search node is a VP and step 207 is performed.

Step 207: Determine whether the risk parameter of the VP is a PEV in the PEV check list; if so, perform step 208; otherwise, perform step 209.

In this embodiment, when the current search node is analyzed, the intermediate PEVs occurring in it are first recorded in the PEV check list; it is then determined whether the current search node is a VP and, if so, whether the risk parameter of the VP is a PEV in the PEV check list. In an alternative implementation, it can first be determined whether the current search node is a VP; if so, it is then determined whether the risk parameter in the VP is polluted by a PEV in the PEV check list, and if it is, the polluted risk parameter is added to the PEV check list and step 208 is performed.

Step 208: Determine the path from the current EP to the VP as a PVEP, generate a test case according to the PVEP, and report it to the tester.

Thus, two conditions must hold for a path in the AST to be determined as a PVEP: first, the current search node is a VP; second, the risk parameter of that node is a PEV.

Step 209: Determine whether the current search node is the last node in the current round of path search; if so, perform step 210; otherwise, perform step 212.

Step 210: Determine whether all EPs in the AST have been traversed; if so, end the flow; otherwise, perform step 211.

Step 211: Select the next node marked as an EP in the AST as the current EP, and return to step 204.

Step 212: Take the next node in the current round of path search as the current search node, and return to step 205.

In step 212, the next node is determined according to the predetermined search algorithm.
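One round of path search (steps 204 to 209) can be sketched in C as follows. This is a deliberately reduced model and not the modified CC1: the path is assumed to be already linearized into an array, the scope-based pruning of the PEV check list and the line-number stack are omitted, and the names stmt, is_pev and find_pvep are hypothetical.

```c
#include <string.h>

enum kind { NODE_EP, NODE_ASSIGN, NODE_VP, NODE_OTHER };

/* Hypothetical, simplified node on one execution path. */
struct stmt {
    enum kind kind;
    const char *dst;   /* EP: controllable parameter; ASSIGN: assigned
                          variable; VP: risk parameter */
    const char *src;   /* ASSIGN: source variable; otherwise NULL */
};

#define MAX_PEV 16

/* Returns 1 if name is recorded in the PEV check list, 0 otherwise. */
static int is_pev(const char *const *pev, int n, const char *name)
{
    for (int i = 0; i < n; i++) {
        if (strcmp(pev[i], name) == 0)
            return 1;
    }
    return 0;
}

/* Searches the path that starts at the EP path[0]. Returns the index of the
   first VP whose risk parameter is a PEV, so that path[0..index] is a PVEP,
   or -1 if no such VP is found. */
int find_pvep(const struct stmt *path, int len)
{
    const char *pev[MAX_PEV];
    int npev = 0;

    if (len <= 0 || path[0].kind != NODE_EP)
        return -1;
    pev[npev++] = path[0].dst;                 /* step 204: initial PEV */

    for (int i = 1; i < len; i++) {            /* step 212: next node */
        const struct stmt *s = &path[i];
        if (s->kind == NODE_ASSIGN && is_pev(pev, npev, s->src)
            && !is_pev(pev, npev, s->dst) && npev < MAX_PEV)
            pev[npev++] = s->dst;              /* step 205: intermediate PEV */
        if (s->kind == NODE_VP && is_pev(pev, npev, s->dst))
            return i;                          /* steps 206 to 208: PVEP found */
    }
    return -1;                                 /* step 209: last node reached */
}
```

For example 2, the path consists of an EP with parameter str, an assignment of str to A, and a VP with risk parameter A; the function returns the index of the VP because A was recorded as an intermediate PEV. Repeating the call for every EP in the AST corresponds to steps 210 and 211.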

This concludes the flow.

To implement the software security vulnerability detection method described in the above embodiment, an embodiment of the present invention further provides a device for detecting software security vulnerabilities. Fig. 4 is a schematic structural diagram of the software security vulnerability detection device in the embodiment of the present invention. As shown in Fig. 4, the device comprises a source code processing unit 1, a path analysis unit 2 and a security test case generation unit 3. Each component is described in detail below.

First, the source code processing unit 1 builds the AST corresponding to the source code of the software to be detected and, according to the predefined EPs and VPs, determines the EPs and VPs among the nodes of the AST that has been built.

Fig. 5 is a schematic structural diagram of the source code processing unit 1 in Fig. 4. As shown in Fig. 5, the source code processing unit 1 can specifically comprise a configuration module 11, an analysis module 12, a node type locating module 13 and an AST recording module 14, where:

The configuration module 11 records the predefined EPs and VPs. Specifically, it can store the EP configuration file and the VP configuration file, whose format has been described above.

The analysis module 12 performs lexical, syntactic and semantic analysis on the source code of the software to be detected and builds the AST.

The node type locating module 13 determines, according to the predefined EPs and VPs recorded in the configuration module 11, the EPs and VPs among the nodes of the AST built by the analysis module 12, and records the AST and the determined EPs and VPs in the AST recording module 14.

The AST recording module 14 records the AST and the EPs and VPs in the AST, and provides them to the path analysis unit 2. Specifically, it can store the AST together with a table of the EPs and VPs in the AST, or store the AST with EP and VP marks.

The source code processing unit 1 can further comprise a decompilation module 15, which decompiles executable program code into source code written in assembly language and sends it to the analysis module 12. If the source code is program code written in a high-level programming language, it can be read directly by the analysis module 12.

The path analysis unit 2 in the detection device determines, among the nodes of the AST determined by the source code processing unit 1, the execution paths from EPs to VPs; if the VP on an execution path can be controlled by the EP on that execution path, the execution path is determined as a PVEP that may cause a security vulnerability. Fig. 6 is a schematic structural diagram of the path analysis unit 2 in Fig. 4. As shown in Fig. 6, the path analysis unit 2 can specifically comprise an execution path locating module 21 and a vulnerability locating module 22, where:

The execution path locating module 21 traverses the nodes of the AST using a preset search algorithm, searches for the execution paths from each EP to the VPs, and notifies the vulnerability locating module 22 of each execution path found. As described above, the preset search algorithm can be a depth-first search algorithm.

The vulnerability locating module 22, after receiving the above notification from the execution path locating module 21, determines whether the VP on the execution path can be controlled by the EP on the execution path; if so, it determines the execution path as a PVEP.

When the predefined EPs and VPs include controllable parameters and risk parameters, the execution path locating module 21 is further configured, when searching for an execution path, to take the controllable parameter in each EP as the initial PEV, determine the intermediate PEVs polluted by the initial PEV on the execution path, and send the determined initial PEV and intermediate PEVs to the vulnerability locating module 22. As described above, the intermediate PEVs can be determined by data flow analysis, or by data flow analysis combined with control flow analysis.

Accordingly, the vulnerability locating module 22 is specifically configured, after receiving the above notification, the initial PEV and the intermediate PEVs from the execution path locating module 21, to determine whether the risk parameter in the VP on the execution path is the initial PEV or an intermediate PEV; if so, it determines the execution path as a PVEP.

Specifically, the vulnerability locating module 22 can comprise a judging submodule 61 and a locating submodule 62, where:

The judging submodule 61 receives the notification, the initial PEV and the intermediate PEVs sent by the execution path locating module 21, determines whether the risk parameter in the VP on the execution path found by the execution path locating module 21 is the initial PEV or an intermediate PEV, and sends the judgment result to the locating submodule 62.

The locating submodule 62, when the received judgment result is affirmative, determines the execution path found by the execution path locating module 21 as a PVEP.

The security test case generation unit 3 in the detection device generates test cases according to the PVEPs received from the path analysis unit 2. Specifically, the security test case generation unit 3 can determine whether a received PVEP can be expressed by a test input and a test condition; if so, it generates a test script according to the test input and the test condition and reports it to the tester; otherwise, it records the path of the PVEP in a test report and reports it to the tester.

As can be seen from the above, the present invention detects security vulnerabilities on the basis of source code analysis. The analysis is not limited to static lexical and syntactic analysis; it also includes semantic analysis and can analyze execution paths by simulating the execution of the software. It can therefore effectively detect various potential risk execution paths and thus effectively detect the security vulnerabilities present in software source code.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A method for detecting security vulnerabilities in software source code, characterized in that the method comprises:

2. The method according to claim 1, characterized in that the controllable point is an input-class function, and the risk point is an execution-class function and/or an assignment statement.

3. The method according to claim 2, characterized in that the controllable parameter in the controllable point and the risk parameter in the risk point are further predefined.

4. The method according to claim 3, characterized in that whether the risk point on the execution path can be controlled by the controllable point on the execution path is determined as follows:

5. The method according to claim 4, characterized in that whether the risk parameter in the risk point on the execution path is polluted by the controllable parameter in the controllable point on the execution path is determined as follows:

6. The method according to claim 5, characterized in that the intermediate PEVs polluted by the initial PEV on the execution path are obtained by data flow analysis.

7. The method according to claim 6, characterized in that obtaining the intermediate PEVs polluted by the initial PEV on the execution path further comprises control flow analysis.

8. The method according to claim 1, characterized in that the method further comprises: generating a test report according to the determined potential risk execution path.

9. The method according to claim 8, characterized in that generating a test report according to the determined potential risk execution path comprises:

10. The method according to any one of the preceding claims, characterized in that the source code of the software to be detected is high-level programming language source code, or assembly language source code obtained by decompiling executable program code.

11. A device for detecting security vulnerabilities in software source code, characterized in that the device comprises a source code processing unit (1) and a path analysis unit (2);

the source code processing unit (1) is configured to build the AST corresponding to the source code of the software to be detected and, according to predefined controllable points and risk points, determine the controllable points and risk points among the nodes of the AST that has been built;

the path analysis unit (2) is configured to determine, among the nodes of the AST determined by the source code processing unit (1), the execution path from a controllable point to a risk point and, if the risk point on the execution path can be controlled by the controllable point on the execution path, determine the execution path as a potential risk execution path that may cause a security vulnerability.

12. The device according to claim 11, characterized in that the source code processing unit (1) comprises a configuration module (11), an analysis module (12), a node type locating module (13) and an AST recording module (14);

the configuration module (11) is configured to record the predefined controllable points and risk points;

the analysis module (12) is configured to perform lexical, syntactic and semantic analysis on the source code of the software to be detected and build the AST;

the node type locating module (13) is configured to determine, according to the predefined controllable points and risk points recorded in the configuration module (11), the controllable points and risk points among the nodes of the AST built by the analysis module (12), and record the AST and the determined controllable points and risk points in the AST recording module (14);

the AST recording module (14) is configured to record the AST and the determined controllable points and risk points, and provide them to the path analysis unit (2).

13. The device according to claim 12, characterized in that the source code processing unit (1) further comprises a decompilation module (15), configured to decompile executable program code into assembly language source code and send it to the analysis module (12).

14. The device according to claim 11, characterized in that the path analysis unit (2) comprises an execution path locating module (21) and a vulnerability locating module (22);

the execution path locating module (21) is configured to search the AST for the execution path from a controllable point to a risk point, and notify the vulnerability locating module (22) of the execution path found;

the vulnerability locating module (22) is configured, after receiving the notification from the execution path locating module (21), to determine whether the risk point on the execution path can be controlled by the controllable point on the execution path and, if so, determine the execution path as a potential risk execution path.

15. The device according to claim 11, characterized in that the device further comprises a security test case generation unit (3), configured to generate a test report according to the potential risk execution path determined by the path analysis unit (2).