CN101630350A - Method and device for detecting buffer overflow and code instrumentation method and device - Google Patents

Info

Publication number
CN101630350A
Authority
CN
China
Prior art keywords
code
buffer zone
buffer
program source
length
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN200810132880A
Other languages
Chinese (zh)
Inventor
唐文
胡建钧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Siemens Ltd China
Original Assignee
Siemens Ltd China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Siemens Ltd China
Priority to CN200810132880A
Publication of CN101630350A

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a device and a method for detecting buffer overflow, and a code instrumentation device and a code instrumentation method. The detection method comprises the following steps: identifying buffer-related code in program source code, and inserting into the program source code buffer length information code corresponding to the buffer-related code; and performing model checking on the program code into which the buffer length information code has been inserted, judging from the buffer length information code whether a buffer overflow exists in the program code, and, when a buffer overflow exists, reporting the code execution trace that causes it. The disclosed technical scheme detects buffer overflows that actually exist.

Description

Method and device for detecting buffer overflow vulnerabilities, and code instrumentation method and device
Technical field
The present invention relates to the field of software security testing, and in particular to a method and a device for detecting buffer overflow vulnerabilities in software, and to a code instrumentation method and a code instrumentation device.
Background
Software is increasingly used to handle sensitive and high-value information, such as business and financial information, which makes it an increasingly attractive target for attackers who want to obtain that information. Attackers try to exploit security vulnerabilities in software in order to disturb its operation and perform malicious operations on it. Among these, buffer overflow vulnerabilities introduced while the program source code is being written are the most common security vulnerabilities; buffer overflows can affect buffers of all types, including character, pointer and integer buffers. It is therefore necessary to develop an efficient method for detecting the potential buffer overflow vulnerabilities present in source code.
Some existing methods for detecting buffer overflow vulnerabilities detect and report all code in which a buffer overflow might exist, regardless of whether an overflow actually exists. This produces many unnecessary reports and a high false-positive rate, and a great deal of time is wasted when buffer overflow vulnerabilities are handled on the basis of these reports.
Summary of the invention
To solve the above problems, one aspect of the present invention provides a code instrumentation method and a code instrumentation device for inserting code before buffer overflow detection is carried out, and another aspect provides a method and a device for detecting buffer overflow vulnerabilities, so that only real buffer overflow vulnerabilities are detected.
The code instrumentation method provided by the present invention comprises:
identifying buffer-related code in program source code; and
inserting into the program source code buffer length information code corresponding to the buffer-related code.
Preferably, the buffer-related code comprises buffer definition code, buffer passing code and buffer access code;
the buffer definition code comprises code that statically declares a buffer and code that dynamically allocates a buffer;
the buffer passing code comprises assignment expressions involving a buffer, code that passes a buffer as a parameter of a function or procedure, and code that returns a buffer as the return value of a function;
the buffer access code comprises code that accesses a buffer through a subscript or a pointer, and code that accesses a buffer through a vulnerable function.
Preferably, inserting into the program source code the buffer length information code corresponding to the buffer-related code comprises:
for buffer definition code, inserting into the program source code a definition of a length attribute variable for the buffer and code that assigns it a value;
for an assignment expression involving a buffer, inserting into the program source code a length attribute assignment expression between the destination buffer and the source buffer; for code that passes a buffer as a parameter of a function or procedure, adding to that function or procedure a parameter that carries the buffer length attribute; for code that returns a buffer as the return value of a function, inserting into the program source code a newly defined length attribute variable for that return value, together with code that assigns it the actual length of the buffer;
for buffer access code, inserting a bounds assertion for the buffer before that code.
Preferably, before the buffer-related code is identified in the program source code, the method further comprises: providing a configuration file that specifies some or all of the buffer-related code to be identified, and the buffer length information code to be inserted for that buffer-related code;
identifying buffer-related code in the program source code then comprises: identifying some or all of the buffer-related code in the program source code according to the configuration file;
and inserting the buffer length information code then comprises: inserting into the program source code, according to the configuration file, the buffer length information code corresponding to the buffer-related code.
The method for detecting buffer overflow vulnerabilities provided by the present invention comprises:
identifying buffer-related code in program source code, and inserting into the program source code buffer length information code corresponding to the buffer-related code;
performing model checking on the program code into which the buffer length information code has been inserted, judging from the buffer length information code whether a buffer overflow vulnerability exists in the program code, and, when a buffer overflow vulnerability exists, reporting the code execution trace that causes it.
Preferably, the buffer-related code comprises buffer definition code, buffer passing code and buffer access code;
the buffer definition code comprises code that statically declares a buffer and code that dynamically allocates a buffer;
the buffer passing code comprises assignment expressions involving a buffer, code that passes a buffer as a parameter of a function or procedure, and code that returns a buffer as the return value of a function;
the buffer access code comprises code that accesses a buffer through a subscript or a pointer, and code that accesses a buffer through a vulnerable function.
Inserting into the program source code the buffer length information code corresponding to the buffer-related code comprises:
for buffer definition code, inserting into the program source code a definition of a length attribute variable for the buffer and code that assigns it a value;
for an assignment expression involving a buffer, inserting into the program source code a length attribute assignment expression between the destination buffer and the source buffer; for code that passes a buffer as a parameter of a function or procedure, adding to that function or procedure a parameter that carries the buffer length attribute; for code that returns a buffer as the return value of a function, inserting into the program source code a newly defined length attribute variable for that return value, together with code that assigns it the actual length of the buffer;
for buffer access code, inserting a bounds assertion for the buffer before that code.
The code instrumentation device provided by the present invention comprises:
a location identification module, configured to identify buffer-related code in program source code;
a code insertion module, configured to insert into the program source code buffer length information code corresponding to the buffer-related code.
Preferably, the buffer-related code comprises buffer definition code, buffer passing code and buffer access code;
the location identification module comprises:
a first recognition submodule, configured to identify in the program source code buffer definition code, comprising code that statically declares a buffer or code that dynamically allocates a buffer;
a second recognition submodule, configured to identify in the program source code buffer passing code, comprising buffer assignment expressions, code that passes a buffer as a parameter of a function or procedure, or code that returns a buffer as the return value of a function;
a third recognition submodule, configured to identify in the program source code buffer access code, comprising code that accesses a buffer through a subscript, a pointer or a vulnerable function.
Preferably, the code insertion module comprises:
a first insertion submodule, configured to insert into the program source code, when the first recognition submodule identifies buffer definition code, a length attribute variable definition and assignment for the buffer corresponding to that code;
a second insertion submodule, configured to: when the second recognition submodule identifies a buffer assignment expression, insert into the program source code a length attribute assignment expression between the destination buffer and the source buffer corresponding to that code; when the second recognition submodule identifies code that passes a buffer as a parameter of a function or procedure, add to that function or procedure a parameter that carries the buffer length attribute; and when the second recognition submodule identifies code that returns a buffer as the return value of a function, insert into the program source code a newly defined length attribute variable for that return value, together with code that assigns it the actual length of the buffer;
a third insertion submodule, configured to insert a bounds assertion for the buffer before the buffer access code when the third recognition submodule identifies such code.
Preferably, the device further comprises a configuration module;
the configuration module comprises one or more configuration files, each of which specifies some or all of the buffer-related code to be identified, and the buffer length information code to be inserted for that buffer-related code;
the location identification module identifies some or all of the buffer-related code in the program source code according to the configuration files in the configuration module;
the code insertion module inserts into the program source code, according to the configuration files in the configuration module, the buffer length information code corresponding to the buffer-related code identified by the location identification module.
The device for detecting buffer overflow vulnerabilities provided by the present invention comprises:
a code instrumentation unit, configured to identify buffer-related code in program source code and to insert into the program source code buffer length information code corresponding to the buffer-related code;
a model checking unit, configured to perform model checking on the program code into which the buffer length information code has been inserted, to judge from the buffer length information code whether a buffer overflow vulnerability exists in the program code, and, when a buffer overflow vulnerability exists, to report the code execution trace that causes it.
Preferably, the code instrumentation unit comprises a location identification module and a code insertion module, wherein
the location identification module comprises:
a first recognition submodule, configured to identify in the program source code the buffer definition code among the buffer-related code, comprising code that statically declares a buffer or code that dynamically allocates a buffer;
a second recognition submodule, configured to identify in the program source code the buffer passing code among the buffer-related code, comprising buffer assignment expressions, code that passes a buffer as a parameter of a function or procedure, or code that returns a buffer as the return value of a function;
a third recognition submodule, configured to identify in the program source code the buffer access code among the buffer-related code, comprising code that accesses a buffer through a subscript, a pointer or a vulnerable function.
The code insertion module comprises:
a first insertion submodule, configured to insert into the program source code, when the first recognition submodule identifies buffer definition code, a length attribute variable definition and assignment for the buffer corresponding to that code;
a second insertion submodule, configured to: when the second recognition submodule identifies a buffer assignment expression, insert into the program source code a length attribute assignment expression between the destination buffer and the source buffer corresponding to that code; when the second recognition submodule identifies code that passes a buffer as a parameter of a function or procedure, add to that function or procedure a parameter that carries the buffer length attribute; and when the second recognition submodule identifies code that returns a buffer as the return value of a function, insert into the program source code a newly defined length attribute variable for that return value, together with code that assigns it the actual length of the buffer;
a third insertion submodule, configured to insert a bounds assertion for the buffer before the buffer access code when the third recognition submodule identifies such code.
As can be seen from the above scheme, the present invention first identifies buffer-related code in the program source code and inserts buffer length information code at the location of that code. The buffer length information code may be, for example, a length attribute of the buffer and/or a bounds assertion for the buffer. Model checking is then performed on the program code into which the buffer length information code has been inserted, and whether a buffer overflow vulnerability exists in the program code is judged from the buffer length information code. When a buffer overflow vulnerability is detected, the code execution trace that causes it is reported. Real buffer overflow vulnerabilities can thus be detected so that maintainers can repair the code, which reduces the false-positive rate and makes vulnerability handling more targeted and more efficient.
Brief description of the drawings
Exemplary embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that the above and other features and advantages of the present invention become clearer to those of ordinary skill in the art. In the drawings:
Fig. 1 is an exemplary block diagram of the device for detecting buffer overflow vulnerabilities in an embodiment of the invention;
Fig. 2 is a schematic structural diagram of the code instrumentation unit in the device shown in Fig. 1;
Fig. 3 is a schematic diagram of the internal structure of the location identification module and the code insertion module in the code instrumentation unit shown in Fig. 2;
Fig. 4 is an exemplary flowchart of the method for detecting buffer overflow vulnerabilities in an embodiment of the invention.
Detailed description of the embodiments
In the embodiments of the invention, in order to detect only buffer overflows that actually exist, model checking can be performed on the program code. That is, the program code is executed virtually: it is converted into a finite state machine, the state space of that finite state machine is searched exhaustively to judge whether a buffer overflow can occur, and the code execution trace that causes the overflow is found. Here, a code execution trace that causes a buffer overflow is the series of code statements along an execution path that leads to the overflow. Judging whether a buffer overflow occurs, however, requires the length information of the buffer. Code written in programming languages that are prone to buffer overflow vulnerabilities, such as C/C++, usually provides no such length information (for example a length attribute or a bounds limit), so the model checker has nothing on which to base its judgment.
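The exhaustive search described above can be sketched on a toy model. This is only an illustration under stated assumptions, not the model checker used by the invention: the state is a single index i, each step either increments i or leaves it unchanged, and every state accesses a buffer of the given length. The names find_overflow_trace and MAX_STEPS are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STEPS 32  /* hypothetical bound on the length of an execution path */

/* Depth-first search over every execution path of the toy model.
 * trace[k] records the choice taken at step k (1 = increment i, 0 = keep i).
 * Returns the length of the first trace found that drives i out of bounds,
 * or -1 if no path of at most `steps` steps overflows.
 * The caller must provide room for at least `steps` entries in trace. */
int find_overflow_trace(size_t i, size_t length, int step, int steps, int *trace) {
    assert(steps <= MAX_STEPS);
    if (i >= length) return step;   /* overflow state reached: trace[0..step) leads here */
    if (step == steps) return -1;   /* path exhausted without an overflow */
    for (int choice = 0; choice <= 1; ++choice) {
        trace[step] = choice;
        int found = find_overflow_trace(i + (size_t)choice, length,
                                        step + 1, steps, trace);
        if (found >= 0) return found;
    }
    return -1;
}
```

Called as find_overflow_trace(0, 4, 0, 4, trace), the search reports a trace only because a path with four increments really exists; with at most three steps no path reaches index 4 and nothing is reported, which is the sense in which only actually existing overflows are flagged.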
Take the C language as an example. Suppose a buffer bufferB of 16 elements of character type (char) is declared as "char bufferB[16]". C does not require the code to define a corresponding length attribute for bufferB, that is, no integer (int) statement such as "int bufferB_length=16" is needed. Likewise, when the i-th element of bufferB is accessed with "bufferB[i]", C does not require the code to state a bounds limit for bufferB, so there is no assertion statement such as "assert (i<16)". Consequently, if an access beyond the length of bufferB occurs during model checking, for example through "bufferB[17]", the model checker cannot determine that bufferB contains no element bufferB[17], because buffer length information such as the length attribute "int bufferB_length=16" or the bounds assertion "assert (i<16)" is missing, and it therefore fails to detect that a buffer overflow has occurred. Model checking applied directly to the program source code thus cannot determine whether a buffer overflow vulnerability exists.
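The missing information in the bufferB example can be made explicit as follows. This is a minimal sketch; the helper names bufferB_index_ok and bufferB_store and the macro BUFFERB_LENGTH are hypothetical and stand in for the length attribute and the assertion quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Length attribute that the language itself does not require for
 * "char bufferB[16]" (written as a macro here so it can size the parameter). */
#define BUFFERB_LENGTH 16

/* The condition that an inserted "assert (i<16)" would encode. */
int bufferB_index_ok(size_t i) {
    return i < BUFFERB_LENGTH;
}

/* Access with the bounds assertion placed before the original statement. */
void bufferB_store(char bufferB[BUFFERB_LENGTH], size_t i, char value) {
    assert(bufferB_index_ok(i));  /* inserted bounds assertion */
    bufferB[i] = value;           /* original access "bufferB[i]" */
}
```

With the assertion in place, an access such as index 17 becomes a violated, checkable condition instead of silent undefined behavior.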
For this reason, the program source code should contain buffer-related length information code before model checking is performed on it. To meet this requirement, programmers could add the corresponding length information code while writing the source code, but this would inevitably increase their workload. Moreover, such a manual approach can hardly guarantee that length information code is added at every place where it is needed, so its feasibility is poor. In the embodiments of the invention, the requirement is met as follows: buffer-related code is identified in the finished program source code, and buffer length information code corresponding to the identified buffer-related code is inserted into the program source code automatically. Model checking is then performed on the program code into which the buffer length information code has been inserted, whether a buffer overflow vulnerability exists in the program code is judged from the buffer length information code, and, when a buffer overflow vulnerability exists, the code execution trace that causes it is reported.
To implement this process, an embodiment of the invention can provide a device for detecting buffer overflow vulnerabilities. The device identifies buffer-related code in the source code and inserts into the program source code buffer length information code corresponding to the identified buffer-related code. It then performs model checking on the program code into which the buffer length information code has been inserted, judges from the buffer length information whether a buffer overflow vulnerability exists in the program code, and, when a buffer overflow vulnerability exists, reports the code execution trace that causes it.
To make the purpose, technical scheme and advantages of the present invention clearer, the invention is described in more detail below with reference to the accompanying drawings and embodiments.
Fig. 1 is an exemplary block diagram of the device for detecting buffer overflow vulnerabilities in an embodiment of the invention. As shown in Fig. 1, the device comprises a code instrumentation unit 100 and a model checking unit 200.
The code instrumentation unit 100 is configured to identify buffer-related code in the program source code and to insert into the program source code buffer length information code corresponding to the buffer-related code. The buffer length information code here may be, for example, a length attribute of the buffer and/or a bounds assertion for the buffer.
The model checking unit 200 is configured to perform model checking on the program code into which the buffer length information code has been inserted, to judge from the buffer length information code whether a buffer overflow vulnerability exists in the program code, and, when a buffer overflow vulnerability exists, to report the code execution trace that causes it.
In a specific implementation, the code instrumentation unit 100 can be realized in various ways; Fig. 2 shows the structure of one such implementation. As shown in Fig. 2, the code instrumentation unit 100 comprises a location identification module 110 and a code insertion module 120.
The location identification module 110 is configured to identify buffer-related code in the program source code. In a specific implementation, the buffer-related code may differ between programming languages.
In general, buffer-related code falls roughly into the following classes, which are illustrated below using the C language as an example:
(1) Buffer definition class, comprising:
Code that statically declares a buffer, such as char buffer1[16];
Code that dynamically allocates a buffer, that is, code containing a buffer allocation function such as alloc(), malloc() or realloc(), for example buffer2=malloc(32).
(2) Buffer passing class, in which the address of a buffer is passed on. It comprises:
Assignment expressions involving a buffer, such as the code pointer1=buffer3, which assigns the start address of a buffer (here buffer3) to a pointer (here pointer1);
Code that passes a buffer as a parameter of a function or procedure. For example, for a procedure named foo declared as void foo(char *str), the character pointer parameter str corresponds, in a specific call, to a character buffer such as buffer4, so that the actual call is foo(buffer4). Such code is also buffer-related code that needs to be identified;
Code that returns a buffer as the return value of a function. For example, a function named foo whose return value is a character pointer, char *foo(), containing the following code:
char *foo() {
    char *tmp;
    ......
    tmp = malloc(25);
    ......
    return tmp;
}
That is, the return value of the function foo corresponds to the buffer allocated by tmp=malloc(25). Such code is also buffer-related code that needs to be identified.
(3) Buffer access class, in which one or more specific elements of a buffer are accessed. It comprises:
Code that accesses a buffer through a subscript or a pointer, for example buffer1[i]='a' (access through a subscript) or *(pointer1+i)='a' (access through a pointer);
Code that accesses a buffer through a vulnerable function (VF). A vulnerable function here is a function that can cause a buffer overflow, and the set of vulnerable functions may differ between programming languages. In C, vulnerable functions include the string copy function strcpy(), the bounded string copy function strncpy(), the string concatenation function strcat() and the host name retrieval function gethostname(), among others. Accordingly, code that accesses a buffer through a vulnerable function may take a form such as strcpy(buffer2, buffer3).
In a specific implementation, one form of the location identification module 110 can comprise, as shown in Fig. 3, a first recognition submodule 111, a second recognition submodule 112 and a third recognition submodule 113.
The first recognition submodule 111 is configured to identify in the program source code buffer definition code, comprising code that statically declares a buffer or code that dynamically allocates a buffer.
The second recognition submodule 112 is configured to identify in the program source code buffer passing code, comprising buffer assignment expressions, code that passes a buffer as a parameter of a function or procedure, or code that returns a buffer as the return value of a function.
The third recognition submodule 113 is configured to identify in the program source code buffer access code, comprising code that accesses a buffer through a subscript, a pointer or a vulnerable function.
The code insertion module 120 is configured to insert into the program source code buffer length information code corresponding to the buffer-related code. In a specific implementation, the buffer length information code to be inserted for the buffer-related code may also differ between programming languages. It can comprise length attributes of buffers, bounds assertions for buffers, and other related code.
Taking the C language as an example, the following buffer length information code can be inserted for each of the classes of buffer-related code listed above:
(1) Buffer length information code that can be inserted for buffer definition code:
For a defined buffer, whether statically declared or dynamically allocated, a length attribute variable for the buffer can be defined in the program source code and assigned the actual length of the buffer. That is, a length attribute variable definition and assignment are inserted into the program source code. For example, for char buffer1[16], the code int buffer1_length=16 can be inserted; for buffer2=malloc(32), the code int buffer2_length=32 can be inserted.
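The two definition cases above might look as follows after instrumentation. This is a sketch only: the wrapper functions allocate_buffer2 and check_buffer1_length are hypothetical, and setting the length to 0 when malloc fails is an assumption that the text does not address.

```c
#include <assert.h>
#include <stdlib.h>

/* Static declaration and its inserted length attribute. */
char buffer1[16];
int buffer1_length = 16;              /* inserted */

/* Dynamic allocation and its inserted length attribute. */
char *buffer2;
int buffer2_length;                   /* inserted definition */

/* Hypothetical wrapper around the original "buffer2=malloc(32)".
 * Returns 0 on success and -1 if the allocation failed. */
int allocate_buffer2(void) {
    buffer2 = malloc(32);
    /* inserted assignment; recording 0 on failure is an assumption */
    buffer2_length = (buffer2 != NULL) ? 32 : 0;
    return (buffer2 != NULL) ? 0 : -1;
}

/* Sanity check that the inserted attribute matches the declaration. */
void check_buffer1_length(void) {
    assert((size_t)buffer1_length == sizeof buffer1);
}
```

Keeping the length in an ordinary int variable, as the text does, makes it visible to any later assertion that refers to the buffer.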
(2) Buffer length information code that can be inserted for buffer passing code:
For an assignment expression involving a buffer, a length attribute assignment expression between the destination buffer and the source buffer can be inserted into the program source code. The left-hand side of this expression is the length attribute variable of the destination buffer, and the right-hand side is the length attribute variable of the source buffer. The length attribute variable of the destination buffer is newly defined, whereas the length attribute variable of the source buffer has normally already been assigned the actual length of the source buffer. For example, for pointer1=buffer3, the code int pointer1_length=buffer3_length can be inserted. Here buffer3_length is the variable that was defined with the actual length of buffer3 when buffer3 was defined (for example, for char buffer3[16], the code int buffer3_length=16 was inserted).
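The propagation of the length attribute through a pointer assignment can be sketched as below. The wrapper functions alias_buffer3 and store_via_pointer1 are hypothetical, and the extra lower-bound check i >= 0 is an assumption added for the signed index; the text itself only states the upper bound.

```c
#include <assert.h>

char buffer3[16];
int buffer3_length = 16;              /* inserted when buffer3 was defined */

char *pointer1;
int pointer1_length;                  /* newly defined length attribute for the destination */

/* Original statement "pointer1=buffer3" followed by the inserted length assignment. */
void alias_buffer3(void) {
    pointer1 = buffer3;
    pointer1_length = buffer3_length; /* inserted */
}

/* A later access through the pointer can now be checked against the propagated length. */
void store_via_pointer1(int i, char value) {
    assert(i >= 0 && i < pointer1_length);
    *(pointer1 + i) = value;
}
```

Because the length travels with the address, an access through pointer1 is checked against the size of the buffer it actually points to.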
For code that passes a buffer as a parameter of a function or procedure, an additional parameter that carries the buffer length attribute can be inserted into that function or procedure in the program source code. For example, for void foo(char *str), a further parameter int str_length can be added, giving void foo(char *str, int str_length); the actual call then becomes foo(buffer4, buffer4_length).
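A sketch of the extended signature in use is given below. The body of foo is hypothetical (the text shows only its signature); here it simply terminates the buffer at its last element so that the passed length has something to guard. The size 8 chosen for buffer4 and the wrapper call_foo are likewise assumptions.

```c
#include <assert.h>

/* Instrumented form of "void foo(char *str)" with the added length parameter.
 * Hypothetical body: writes a terminator into the last element of str. */
void foo(char *str, int str_length) {
    int last = str_length - 1;
    assert(last >= 0 && last < str_length);  /* bounds assertion using the passed length */
    str[last] = '\0';
}

char buffer4[8];                      /* size chosen for illustration only */
int buffer4_length = 8;               /* inserted when buffer4 was defined */

/* The call site passes the length attribute alongside the buffer. */
void call_foo(void) {
    foo(buffer4, buffer4_length);
}
```

Inside foo the buffer is only a pointer, so the added parameter is the only way the assertion can know how large the caller's buffer is.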
For code in which a buffer is the return value of a function, a newly defined length attribute variable for that return value (which can be a global variable) can be inserted into the program source code and assigned the actual length of the returned buffer. Take the following code as an example:
char *foo(){
    char *tmp;
    ......
    tmp=malloc(25);
    ......
    return tmp;
}
The code int foo_return_length; is first inserted into the program source code, and this variable is then assigned the actual length of the buffer inside the body of the function foo, which gives the following code:
char *foo(){
    char *tmp;
    ......
    tmp=malloc(25);
    foo_return_length=25;
    ......
    return tmp;
}
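A compilable sketch of the return-value case is shown below. The global foo_return_length and the body of foo follow the listing above with the elided statements left out; the caller return_demo, and the way it copies the global into p_length, are assumptions that simply apply the assignment rule described earlier.

```c
#include <stdlib.h>

int foo_return_length;              /* inserted global length variable */

char *foo(void)
{
    char *tmp;
    tmp = malloc(25);
    foo_return_length = 25;         /* inserted: actual length of the buffer */
    return tmp;
}

/* Hypothetical caller: the receiving pointer gets its own length variable. */
int return_demo(void)
{
    char *p = foo();
    int p_length = foo_return_length;   /* assumed call-site rewrite */
    free(p);
    return p_length;
}
```

A single global per function is enough here because the value is read immediately after the call returns.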
(3) Buffer length information code that can be inserted for buffer access code:
For code that accesses a buffer through a subscript, a pointer, or a vulnerable function, a bounds assertion (assert) for the buffer can be inserted before that code in the program source code. For example, for buffer1[i]='a', the code assert(i<buffer1_length); can be inserted before it; for *(pointer1+i)='a', the code assert(i<pointer1_length); can be inserted before it; and for strcpy(buffer2, buffer3), the code assert(buffer3_length<buffer2_length); can be inserted before it.
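The three assertions above can be placed in one small function to show how they sit in front of the accesses they guard. The buffer names and assertion expressions are taken from the text; the buffer sizes, the index value 5, the string "hello", and the function name access_demo are assumptions chosen so that every assertion holds.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demo: each access is preceded by its inserted bounds assertion. */
int access_demo(void)
{
    char buffer1[16];
    int buffer1_length = 16;
    char buffer2[32];
    int buffer2_length = 32;
    char buffer3[16] = "hello";
    int buffer3_length = 16;
    char *pointer1 = buffer1;
    int pointer1_length = buffer1_length;
    int i = 5;

    assert(i < buffer1_length);               /* inserted before subscript access */
    buffer1[i] = 'a';

    assert(i < pointer1_length);              /* inserted before pointer access */
    *(pointer1 + i) = 'a';

    assert(buffer3_length < buffer2_length);  /* inserted before the vulnerable function */
    strcpy(buffer2, buffer3);

    return (int)strlen(buffer2);
}
```

With i set to 16 the first assertion would fail; that failing condition is what the model checking stage searches for. Note that the strcpy assertion compares declared buffer lengths rather than the length of the stored string, so it is conservative.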
Accordingly, in one specific implementation, shown in Fig. 3, the code insertion module 120 can include a first insertion submodule 121, a second insertion submodule 122, and a third insertion submodule 123.
The first insertion submodule 121 is used, when the first identification submodule 111 identifies buffer definition code, to insert into the program source code the definition and assignment code of the buffer's length attribute variable for that code.
The second insertion submodule 122 is used, when the second identification submodule 112 identifies buffer assignment expression code within the buffer transfer code, to insert into the program source code the length attribute assignment expression code for the destination buffer and the source buffer; when the second identification submodule 112 identifies code in which a buffer is a parameter of a function or procedure, to insert into that function or procedure a parameter that carries the buffer's length attribute; and when the second identification submodule 112 identifies code in which a buffer is the return value of a function, to insert into the program source code a newly defined length attribute variable for that return value together with code that assigns it the actual length of the buffer.
The third insertion submodule 123 is used, when the third identification submodule 113 identifies buffer access code, to insert bounds assertion code for the buffer before that code.
In a specific implementation, the code instrumentation unit 100 itself, or the location identification module 110 and the code insertion module 120 within it, can directly identify the buffer-related code described above and insert the corresponding buffer length information code for the identified code. In addition, configuration files can be provided in the code instrumentation unit 100 so that the user can extend both the buffer-related code to be identified and the buffer length information code to be inserted. In that case the code instrumentation unit 100 can also include a configuration module 130, shown by the dashed portion in Fig. 2. The configuration module 130 includes one or more configuration files, which contain some or all of the buffer-related code that needs to be identified, together with the buffer length information code that needs to be inserted for that code. Correspondingly, the location identification module 110 can identify some or all of the buffer-related code in the program source code according to the configuration files in the configuration module 130, and the code insertion module 120 can insert, according to the same configuration files, the corresponding buffer length information code for the code identified by the location identification module 110. Buffer-related code that needs to be identified but is not covered by the configuration files is identified directly by the location identification module 110, and the code insertion module 120 inserts the corresponding buffer length information code into the program source code.
For example, the configuration module 130 can include an allocation function (AF) configuration file and a vulnerable function (VF) configuration file. The AF configuration file can contain functions that dynamically allocate buffers, such as the allocation functions alloc() and malloc() and the reallocation function realloc(), together with the buffer length information code to be inserted for code that contains these functions. The VF configuration file can contain functions such as the string copy function strcpy(), the bounded string copy function strncpy(), the string concatenation function strcat(), and the host name retrieval function gethostname(), together with the buffer length information code to be inserted for code that contains these functions. Correspondingly, the location identification module 110 can identify code in the program source code that contains an AF or a VF according to the AF and VF configuration files in the configuration module 130, and the code insertion module 120 then inserts the corresponding buffer length information code into the program source code according to the same configuration files. Buffer-related code that needs to be identified but is not covered by the configuration files is identified by the location identification module 110, and the code insertion module 120 inserts the corresponding buffer length information code for it.
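The text does not specify a format for the AF and VF configuration files. Purely as an illustration, the sketch below models their contents as an in-memory table that maps a function name to its category and to a template for the code to insert. The type names, the find_rule function, and the $-placeholder template syntax are all hypothetical; only the two entries whose inserted code is spelled out in the text (malloc and strcpy) are listed.

```c
#include <stddef.h>
#include <string.h>

typedef enum { RULE_AF, RULE_VF } rule_kind;   /* allocation vs. vulnerable function */

typedef struct {
    const char *function_name;    /* function to look for in the source code */
    rule_kind   kind;
    const char *insert_template;  /* hypothetical template for the inserted code */
} config_rule;

/* Assumed table contents; a real configuration file could list more functions. */
static const config_rule rules[] = {
    { "malloc", RULE_AF, "int $dst_length=$arg1;" },
    { "strcpy", RULE_VF, "assert($arg2_length<$arg1_length);" },
};

/* Returns the rule for function_name, or NULL if the configuration has none. */
const config_rule *find_rule(const char *function_name)
{
    size_t i;
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        if (strcmp(rules[i].function_name, function_name) == 0)
            return &rules[i];
    }
    return NULL;
}
```

A NULL result corresponds to the case described above in which code is not covered by the configuration files and has to be handled by the built-in identification logic.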
The code instrumentation unit in this embodiment of the invention can be implemented with cc1, the front-end compiler of the GNU C Compiler (GCC).
In a specific implementation, the model checking unit 200 can likewise take several forms; for example, it can be implemented with a model checking tool such as Blast (Berkeley Lazy Abstraction Software Verification Tool) or MOPS (MOdel Checking Programs for Security properties). After converting the program into a finite state machine, the model checking unit 200 exhaustively searches the state space of the state machine and, according to the buffer length information code, determines whether any state violates the length information of the corresponding buffer. If no such state exists, the program code is considered free of buffer-overflow vulnerabilities. If one exists, the program code is considered to contain a buffer-overflow vulnerability, and the code containing the vulnerability is reported; in this embodiment, what is usually reported is the code execution trace that leads to the buffer-overflow vulnerability.
Take the following C program source code as an example:
void main(){
    char buffer[3];
    int i;
    for(i=0;i<=3;i++){
        buffer[i]='a';
    }
}
After processing by the code instrumentation unit 100 in this embodiment of the invention, the source code becomes:
void main(){
    char buffer[3];
    int buffer_length=3;
    int i;
    for(i=0;i<=3;i++){
        assert(i<3);
        buffer[i]='a';
    }
}
When this code is processed by the model checking unit 200 in this embodiment of the invention, the buffer named buffer overflows when i equals 3. The model checking tool underlying the model checking unit 200 therefore detects the buffer overflow in the instrumented code, and the code execution trace it might report for the vulnerability is:
char buffer[3];
for(i=0;i<=3;i++)
buffer[0]='a'
for(i=0;i<=3;i++)
buffer[1]='a'
for(i=0;i<=3;i++)
buffer[2]='a'
for(i=0;i<=3;i++)
buffer[3]='a'    (the overflow occurs here)
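As a rough illustration of how such a trace arises, the toy function below steps through the states of the example loop and stops at the first state in which the inserted assertion i<buffer_length is false. It is only a sketch for this one deterministic loop: tools such as Blast work on an abstraction of arbitrary programs rather than by direct simulation, and the function name find_overflow and its parameters are hypothetical.

```c
#include <stdio.h>

/* Steps through: for (i = 0; i <= loop_bound; i++) { assert(i < buffer_length); buffer[i] = 'a'; }
 * Prints the statements visited and returns the value of i at which the
 * assertion is violated, or -1 if no reachable state violates it. */
int find_overflow(int buffer_length, int loop_bound)
{
    int i;
    for (i = 0; i <= loop_bound; i++) {
        printf("for(i=0;i<=%d;i++)\n", loop_bound);
        if (!(i < buffer_length)) {
            printf("buffer[%d]='a'    (the overflow occurs here)\n", i);
            return i;
        }
        printf("buffer[%d]='a'\n", i);
    }
    return -1;
}
```

Calling find_overflow(3, 3) corresponds to the example program; changing the loop condition to i<3, that is find_overflow(3, 2), leaves no violating state.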
In this embodiment of the invention, the code instrumentation unit 100 and the model checking unit 200 can also each be provided as a separate device; that is, their functions can be performed by a code instrumentation device and a model checking device, respectively.
The device for detecting buffer-overflow vulnerabilities in this embodiment of the invention has been described in detail above; the corresponding detection method is described in detail below.
Fig. 4 is an exemplary flowchart of the method for detecting buffer-overflow vulnerabilities in this embodiment of the invention. As shown in Fig. 4, the flow includes the following steps:
Step 401: identify buffer-related code in the program source code, and insert into the program source code the buffer length information code corresponding to that buffer-related code. The buffer length information code here can be code such as the buffer's length attribute and/or a bounds assertion for the buffer.
The specific operations in this step can be the same as those described for the detection device of this embodiment. The specific buffer-related code can differ between programming languages, but it broadly falls into the following classes: buffer definition code, buffer transfer code, and buffer access code. Buffer definition code can include code that statically declares a buffer and code that dynamically allocates a buffer. Buffer transfer code can include buffer assignment expression code, code in which a buffer is a parameter of a function or procedure, and code in which a buffer is the return value of a function or procedure. Buffer access code can include code that accesses a buffer through a subscript or a pointer and code that accesses a buffer through a vulnerable function.
Likewise, the buffer length information code to be inserted for the buffer-related code can differ between programming languages, but in general the following applies. For buffer definition code, the definition and assignment code of the buffer's length attribute variable can be inserted. For buffer assignment expression code within the buffer transfer code, length attribute assignment expression code for the destination buffer and the source buffer can be inserted; for code in which a buffer is a parameter of a function or procedure, a parameter that carries the buffer's length attribute can be inserted into that function or procedure; for code in which a buffer is the return value of a function, a newly defined length attribute variable for that return value can be inserted together with code that assigns it the actual length of the buffer. For buffer access code, bounds assertion code for the buffer can be inserted before that code.
In a specific implementation, the operations in this step can be performed by the code instrumentation unit of the device shown in Fig. 1, and this code instrumentation unit can be cc1, the front-end compiler of the GNU C Compiler.
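To make the identify-then-insert idea of step 401 concrete, the following toy routine recognizes one kind of buffer definition, a line of the form char NAME[N];, and produces the length variable definition to insert. This is a deliberately simplified, line-based sketch: the unit described above is built on a compiler front end and works on the parsed program, and the function name instrument_definition and its interface are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* If line declares a static char array ("char NAME[N];"), writes the length
 * variable definition "int NAME_length=N;" to out and returns 1.
 * Returns 0 for any line this toy recognizer does not handle. */
int instrument_definition(const char *line, char *out, size_t out_size)
{
    char name[64];
    int length;

    if (strncmp(line, "char ", 5) != 0)
        return 0;                                   /* not a char declaration */
    if (sscanf(line, "char %63[A-Za-z0-9_][%d]", name, &length) != 2)
        return 0;                                   /* not of the form NAME[N] */
    snprintf(out, out_size, "int %s_length=%d;", name, length);
    return 1;
}
```

For the line char buffer[3]; from the example program this yields int buffer_length=3;, matching the instrumented listing; pointer declarations such as char *tmp; are left for other rules.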
Step 402: perform model checking on the program code into which the buffer length information code has been inserted, determine from the buffer length information code whether a buffer-overflow vulnerability exists in the program code, and, when one exists, report the code execution trace that leads to the buffer-overflow vulnerability.
In a specific implementation, the operations in this step can be performed by the model checking unit of the device shown in Fig. 1, and this model checking unit can be implemented with a model checking tool such as Blast or MOPS.
Further, before step 401 the method can also include: setting up a configuration file that contains some or all of the buffer-related code that needs to be identified, together with the buffer length information code that needs to be inserted for that code. In step 401, some or all of the buffer-related code in the program source code can then be identified according to this configuration file, and the corresponding buffer length information code can be inserted for the identified code according to the same configuration file. Buffer-related code that needs to be identified but is not covered by the configuration file is identified directly in step 401, and the corresponding buffer length information code is inserted for it directly in the same step.
The above describes only preferred embodiments of the present invention and is not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (12)

1. A code instrumentation method, characterized in that the method comprises:
identifying buffer-related code in program source code; and
inserting into the program source code buffer length information code corresponding to the buffer-related code.
2. The method according to claim 1, characterized in that the buffer-related code comprises: buffer definition code, buffer transfer code, and buffer access code;
the buffer definition code comprises code that statically declares a buffer and code that dynamically allocates a buffer;
the buffer transfer code comprises buffer assignment expression code, code in which a buffer is a parameter of a function or procedure, and code in which a buffer is the return value of a function;
the buffer access code comprises code that accesses a buffer through a subscript or a pointer and code that accesses a buffer through a vulnerable function.
3. The method according to claim 2, characterized in that inserting into the program source code the buffer length information code corresponding to the buffer-related code comprises:
for the buffer definition code, inserting into the program source code the definition and assignment code of the buffer's length attribute variable;
for the buffer assignment expression code within the buffer transfer code, inserting into the program source code length attribute assignment expression code for the destination buffer and the source buffer; for the code within the buffer transfer code in which a buffer is a parameter of a function or procedure, inserting into that function or procedure a parameter that carries the buffer's length attribute; for the code within the buffer transfer code in which a buffer is the return value of a function, inserting into the program source code a newly defined length attribute variable for that return value together with code that assigns it the actual length of the buffer;
for the buffer access code, inserting bounds assertion code for the buffer before that code.
4. The method according to any one of claims 1 to 3, characterized in that, before identifying the buffer-related code in the program source code, the method further comprises: setting up a configuration file that contains some or all of the buffer-related code that needs to be identified and the buffer length information code that needs to be inserted for that buffer-related code;
identifying the buffer-related code in the program source code comprises: identifying some or all of the buffer-related code in the program source code according to the configuration file;
inserting into the program source code the buffer length information code corresponding to the buffer-related code comprises: inserting into the program source code, according to the configuration file, the buffer length information code corresponding to the buffer-related code.
5. A method for detecting a buffer-overflow vulnerability, characterized in that the method comprises:
identifying buffer-related code in program source code, and inserting into the program source code buffer length information code corresponding to the buffer-related code; and
performing model checking on the program code into which the buffer length information code has been inserted, determining from the buffer length information code whether a buffer-overflow vulnerability exists in the program code, and, when a buffer-overflow vulnerability exists, reporting the code execution trace that leads to the buffer-overflow vulnerability.
6. The method according to claim 5, characterized in that the buffer-related code comprises: buffer definition code, buffer transfer code, and buffer access code;
the buffer definition code comprises code that statically declares a buffer and code that dynamically allocates a buffer;
the buffer transfer code comprises buffer assignment expression code, code in which a buffer is a parameter of a function or procedure, and code in which a buffer is the return value of a function;
the buffer access code comprises code that accesses a buffer through a subscript or a pointer and code that accesses a buffer through a vulnerable function;
inserting into the program source code the buffer length information code corresponding to the buffer-related code comprises:
for the buffer definition code, inserting into the program source code the definition and assignment code of the buffer's length attribute variable;
for the buffer assignment expression code within the buffer transfer code, inserting into the program source code length attribute assignment expression code for the destination buffer and the source buffer; for the code within the buffer transfer code in which a buffer is a parameter of a function or procedure, inserting into that function or procedure a parameter that carries the buffer's length attribute; for the code within the buffer transfer code in which a buffer is the return value of a function, inserting into the program source code a newly defined length attribute variable for that return value together with code that assigns it the actual length of the buffer;
for the buffer access code, inserting bounds assertion code for the buffer before that code.
7. A code instrumentation device, characterized in that the device comprises:
a location identification module (110), used to identify buffer-related code in program source code; and
a code insertion module (120), used to insert into the program source code buffer length information code corresponding to the buffer-related code.
8. The device according to claim 7, characterized in that the buffer-related code comprises: buffer definition code, buffer transfer code, and buffer access code;
the location identification module (110) comprises:
a first identification submodule (111), used to identify in the program source code buffer definition code, which comprises code that statically declares a buffer or code that dynamically allocates a buffer;
a second identification submodule (112), used to identify in the program source code buffer transfer code, which comprises buffer assignment expression code, code in which a buffer is a parameter of a function or procedure, or code in which a buffer is the return value of a function;
a third identification submodule (113), used to identify in the program source code buffer access code, which comprises code that accesses a buffer through a subscript, a pointer, or a vulnerable function.
9. The device according to claim 8, characterized in that the code insertion module (120) comprises:
a first insertion submodule (121), used, when the first identification submodule (111) identifies buffer definition code, to insert into the program source code the definition and assignment code of the buffer's length attribute variable corresponding to that code;
a second insertion submodule (122), used, when the second identification submodule (112) identifies buffer assignment expression code within the buffer transfer code, to insert into the program source code length attribute assignment expression code for the destination buffer and the source buffer corresponding to that code; when the second identification submodule (112) identifies code in which a buffer is a parameter of a function or procedure, to insert into that function or procedure a parameter that carries the buffer's length attribute corresponding to that code; and when the second identification submodule (112) identifies code in which a buffer is the return value of a function, to insert into the program source code, corresponding to that code, a newly defined length attribute variable for the return value together with code that assigns it the actual length of the buffer;
a third insertion submodule (123), used, when the third identification submodule (113) identifies buffer access code, to insert bounds assertion code for the buffer before that code.
10. The device according to any one of claims 7 to 9, characterized in that the device further comprises a configuration module (130);
the configuration module (130) comprises one or more configuration files, which contain some or all of the buffer-related code that is configured to be identified and the buffer length information code that needs to be inserted for that buffer-related code;
the location identification module (110) identifies some or all of the buffer-related code in the program source code according to the configuration files in the configuration module (130);
the code insertion module (120) inserts into the program source code, according to the configuration files in the configuration module (130), the buffer length information code corresponding to the buffer-related code identified by the location identification module (110).
11. A device for detecting a buffer-overflow vulnerability, characterized in that the device comprises:
a code instrumentation unit (100), used to identify buffer-related code in program source code and to insert into the program source code buffer length information code corresponding to the buffer-related code; and
a model checking unit (200), used to perform model checking on the program code into which the buffer length information code has been inserted, to determine from the buffer length information code whether a buffer-overflow vulnerability exists in the program code, and, when a buffer-overflow vulnerability exists, to report the code execution trace that leads to the buffer-overflow vulnerability.
12. The device according to claim 11, characterized in that the code instrumentation unit (100) comprises a location identification module (110) and a code insertion module (120), wherein
the location identification module (110) comprises:
a first identification submodule (111), used to identify in the program source code the buffer definition code among the buffer-related code, which comprises code that statically declares a buffer or code that dynamically allocates a buffer;
a second identification submodule (112), used to identify in the program source code the buffer transfer code among the buffer-related code, which comprises buffer assignment expression code, code in which a buffer is a parameter of a function or procedure, or code in which a buffer is the return value of a function;
a third identification submodule (113), used to identify in the program source code the buffer access code among the buffer-related code, which comprises code that accesses a buffer through a subscript, a pointer, or a vulnerable function;
the code insertion module (120) comprises:
a first insertion submodule (121), used, when the first identification submodule (111) identifies buffer definition code, to insert into the program source code the definition and assignment code of the buffer's length attribute variable corresponding to that code;
a second insertion submodule (122), used, when the second identification submodule (112) identifies buffer assignment expression code within the buffer transfer code, to insert into the program source code length attribute assignment expression code for the destination buffer and the source buffer corresponding to that code; when the second identification submodule (112) identifies code in which a buffer is a parameter of a function or procedure, to insert into that function or procedure a parameter that carries the buffer's length attribute corresponding to that code; and when the second identification submodule (112) identifies code in which a buffer is the return value of a function, to insert into the program source code, corresponding to that code, a newly defined length attribute variable for the return value together with code that assigns it the actual length of the buffer;
a third insertion submodule (123), used, when the third identification submodule (113) identifies buffer access code, to insert bounds assertion code for the buffer before that code.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200810132880A CN101630350A (en) 2008-07-14 2008-07-14 Method and device for detecting buffer overflow and code instrumentation method and device

Publications (1)

Publication Number Publication Date
CN101630350A true CN101630350A (en) 2010-01-20

Family

ID=41575455

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200810132880A Pending CN101630350A (en) 2008-07-14 2008-07-14 Method and device for detecting buffer overflow and code instrumentation method and device

Country Status (1)

Country Link
CN (1) CN101630350A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103294517A (en) * 2012-02-22 2013-09-11 国际商业机器公司 Stack overflow protection device, stack protection method, related compiler and calculation device
CN103455364A (en) * 2013-09-05 2013-12-18 北京航空航天大学 System and method for online obtaining Cache performance of parallel program under multi-core environment
CN104035862A (en) * 2013-03-08 2014-09-10 腾讯科技(深圳)有限公司 Method and device for closure testing
CN104766015A (en) * 2015-04-10 2015-07-08 北京理工大学 Function call based dynamic detection method for buffer overflow vulnerability
CN105718799A (en) * 2015-09-10 2016-06-29 哈尔滨安天科技股份有限公司 Method and system for identifying file overflow vulnerability
US9626368B2 (en) 2012-01-27 2017-04-18 International Business Machines Corporation Document merge based on knowledge of document schema
CN107016286A (en) * 2016-12-30 2017-08-04 深圳市安之天信息技术有限公司 A kind of malicious code randomization recognition methods and system based on random-tracking
WO2018058414A1 (en) * 2016-09-29 2018-04-05 Intel Corporation Overflow detection
CN116226673A (en) * 2023-05-05 2023-06-06 中国人民解放军国防科技大学 Training method of buffer region vulnerability recognition model, vulnerability detection method and device

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9626368B2 (en) 2012-01-27 2017-04-18 International Business Machines Corporation Document merge based on knowledge of document schema
US9740698B2 (en) 2012-01-27 2017-08-22 International Business Machines Corporation Document merge based on knowledge of document schema
US9734039B2 (en) 2012-02-22 2017-08-15 International Business Machines Corporation Stack overflow protection device, method, and related compiler and computing device
CN103294517A (en) * 2012-02-22 2013-09-11 国际商业机器公司 Stack overflow protection device, stack protection method, related compiler and calculation device
CN103294517B (en) * 2012-02-22 2018-05-11 国际商业机器公司 Stack overflow protective device, stack protection method, dependent compilation device and computing device
CN104035862A (en) * 2013-03-08 2014-09-10 腾讯科技(深圳)有限公司 Method and device for closure testing
WO2014134990A1 (en) * 2013-03-08 2014-09-12 Tencent Technology (Shenzhen) Company Limited Method, device and computer-readable storage medium for closure testing
US9507693B2 (en) 2013-03-08 2016-11-29 Tencent Technology (Shenzhen) Company Limited Method, device and computer-readable storage medium for closure testing
CN103455364A (en) * 2013-09-05 2013-12-18 北京航空航天大学 System and method for online obtaining Cache performance of parallel program under multi-core environment
CN103455364B (en) * 2013-09-05 2016-08-17 北京航空航天大学 A kind of multi-core environment concurrent program Cache performance online obtains system and method
CN104766015A (en) * 2015-04-10 2015-07-08 北京理工大学 Function call based dynamic detection method for buffer overflow vulnerability
CN104766015B (en) * 2015-04-10 2018-02-13 北京理工大学 A kind of buffer-overflow vulnerability dynamic testing method based on function call
CN105718799A (en) * 2015-09-10 2016-06-29 哈尔滨安天科技股份有限公司 Method and system for identifying file overflow vulnerability
WO2018058414A1 (en) * 2016-09-29 2018-04-05 Intel Corporation Overflow detection
CN107016286A (en) * 2016-12-30 2017-08-04 深圳市安之天信息技术有限公司 Malicious code randomization recognition method and system based on random tracking
CN107016286B (en) * 2016-12-30 2019-09-24 深圳市安之天信息技术有限公司 Malicious code randomization recognition method and system based on random tracking
CN116226673A (en) * 2023-05-05 2023-06-06 中国人民解放军国防科技大学 Training method of buffer region vulnerability recognition model, vulnerability detection method and device
CN116226673B (en) * 2023-05-05 2023-07-07 中国人民解放军国防科技大学 Training method of buffer region vulnerability recognition model, vulnerability detection method and device

Similar Documents

Publication Publication Date Title
CN101630350A (en) Method and device for detecting buffer overflow and code instrumentation method and device
JP5992622B2 (en) Malicious application diagnostic apparatus and method
CN101661543B (en) Method and device for detecting security flaws of software source codes
US7860842B2 (en) Mechanism to detect and analyze SQL injection threats
CN103699480B (en) Java-based dynamic detection method for web security vulnerabilities
US20170223040A1 (en) Identifying device, identifying method and identifying program
CN106020873B (en) Patch package loading method and device
CN101853200B (en) High-efficiency dynamic software vulnerability exploiting method
US9392011B2 (en) Web vulnerability repair apparatus, web server, web vulnerability repair method, and program
CN105550594A (en) Security detection method for android application file
CN110909358A (en) Integer vulnerability detection method based on dynamic and static analysis
CN102868699A (en) Method and tool for vulnerability detection of server providing data interaction services
CN104573503A (en) Method and device for detecting memory access overflow
US20100131472A1 (en) Detection and utilization of inter-module dependencies
US11868465B2 (en) Binary image stack cookie protection
CN111919214A (en) Automatic generation of patches for security violations
CN110443039A (en) Plug-in security detection method, device and electronic equipment
CN104239801A (en) Identification method and device for 0day bug
KR102269174B1 (en) Apparatus and method for verification of smart contracts
US8843908B2 (en) Compiler validation via program verification
CN106354624B (en) Automatic testing method and device
CN109299610B (en) Method for verifying and identifying unsafe and sensitive input in android system
CN102984229B (en) Method and system for configuring a trusted machine
EP2535813B1 (en) Method and device for generating an alert during an analysis of performance of a computer application
CN104680043A (en) Method and device for protecting executable file

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20100120