CN113407442B - Pattern-based Python code memory leak detection method - Google Patents
- Publication number
- CN113407442B (application number CN202110586274.3A)
- Authority
- CN
- China
- Prior art keywords
- type
- child node
- mode
- belongs
- memory
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/362—Software debugging
- G06F11/366—Software debugging using diagnostics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/362—Software debugging
- G06F11/3644—Software debugging by instrumenting at runtime
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention discloses a pattern-based method for detecting memory leaks in Python code. The method obtains type information for the Python code through type inference and then applies a set of predefined patterns to find the cyclic references that cause memory leaks. The method is accurate and fast, can detect memory leaks before the code is run, and allows the relevant developers to be notified promptly so that they can fix them. Unlike existing detection methods, which only analyze memory usage while the code is running, the pattern-based method can be applied during the coding stage of software development and helps find defective code as early as possible.
Description
Technical Field
The invention relates to the technical field of software, and in particular to a pattern-based method for detecting memory leaks in Python code.
Background
A memory leak is a common error in software engineering: a program dynamically allocates memory but does not release it after it has finished using it, so the memory remains occupied for a long time. Memory leaks usually show no obvious symptoms early in a project. Leakage is a continuous process: as leaks accumulate, the number of leaked objects grows and the memory available in the system steadily shrinks. When memory is exhausted, applications may be suspended temporarily while they wait for memory to be reallocated. If reallocation takes a long time, it can cause a system crash.
In recent years, programming languages have adopted dynamic memory management mechanisms such as garbage collection to help programmers avoid leak defects. However, garbage collection handles memory leaks by analyzing the program at run time, so finding leaks requires the program to be instrumented, monitored, or tested, and enough test cases are needed to ensure that the leaking code is actually triggered during testing. Researchers have also used static analysis to detect memory leaks. Static analysis treats memory leak detection as a reachability problem, so leaks can be detected without running the program. However, Python code has no explicit memory allocation and release statements, so statically determining the reachability of objects can be very expensive.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a pattern-based method for detecting memory leaks in Python code, which uses code patterns to detect the leaks. The technical scheme adopted by the invention is as follows:
A pattern-based Python code memory leak detection method comprises the following steps:
S1, inputting the source code of a project, traversing all code files in the project, loading each Python code file, and obtaining the abstract syntax tree corresponding to each Python code file by using the ast module in the Python standard library;
S2, performing type inference on the abstract syntax trees obtained in step S1 by using an abstract interpreter to obtain a type tree T, wherein the nodes of the type tree T represent abstract types and the relationships between nodes represent dependencies;
S3, traversing each instance type i in the type tree T obtained in step S2, obtaining all child nodes of i in the type tree, and checking each child node b against the predefined memory leak patterns; if any pattern is satisfied, recording the cyclic reference that causes the memory leak, wherein each cyclic reference is recorded as a node sequence [v_0, v_1, ..., v_n] in which each node v_i is a node of the type tree, a reference relationship exists between adjacent nodes, and v_0 = v_n.
Preferably, the specific steps of deriving the type tree through type inference in step S2 are as follows:
S21, encapsulating a plurality of abstract types according to the types defined by Python;
S22, for the abstract syntax tree corresponding to each Python code file, performing type inference with the abstract interpreter to derive the type tree T, wherein each node of T represents an abstract type, the module type corresponding to the abstract syntax tree is added to T as a root node, and the relationships between nodes represent dependencies, that is, a child node is defined within its parent node;
S23, traversing the nodes of the type tree T in order to collect all function types; for each function type, determining whether it was called during type inference, that is, whether at least one call type calls this function type; if so, skipping it and moving on to the next function type; if not, treating it as a new function type, creating a call type for it with unknown types as the incoming parameters, and using the abstract interpreter to invoke the newly created call type once.
Preferably, the abstract types packaged in step S21 include the following 11 types:
1) module type: mod<id>, where id is the unique identifier of the module;
2) function type: fun<id>, where id is the unique identifier of the function;
3) call type: invoke<fun, [τ], τ>, where fun is the function type being called, [τ] is the types of the parameters required by the call, and τ is the type of the return value of the call;
4) class type: cls<id, [cls]>, where id is the unique identifier of the class and [cls] is the types of the parent classes of the class;
5) instance type: ins<cls>, where cls is the class type to which the instance belongs;
6) method type: meth<fun, ins>, where fun is the function type and ins is the instance type to which the method belongs;
7) combination type: a set of arbitrary types;
8) dictionary type: dict<τ, τ>, where the two τ are the types of the keys and the values in the dictionary, respectively;
9) list type: list<τ>, where τ is the type of the elements in the list;
10) tuple type: tuple<τ>, where τ is the type of the elements in the tuple;
11) set type: set<τ>, where τ is the type of the elements in the set.
Preferably, the specific steps of checking each child node b against the predefined memory leak patterns in step S3 are as follows:
S31, pattern 1 is predefined as a memory leak caused by self-reference; if child node b is the same as the instance type i being traversed, pattern 1 is satisfied and the cyclic reference causing the memory leak is recorded;
S32, pattern 2 is predefined as a memory leak caused by a cyclic reference between an instance and a container; if child node b strictly contains i, pattern 2 is satisfied and the cyclic reference causing the memory leak is recorded;
S33, pattern 3 is predefined as a memory leak caused by a cyclic reference between an instance and a method; whether child node b satisfies pattern 3 is determined by S331 and S332:
S331, if child node b is a method type, checking the instance type b.ins to which it belongs; if b.ins is the same as i, pattern 3 is satisfied and the cyclic reference causing the memory leak is recorded;
S332, if child node b is a combination type, checking each type t contained in b; if some t is a method type whose instance type t.ins is the same as i, pattern 3 is satisfied and the cyclic reference causing the memory leak is recorded;
S34, pattern 4 is predefined as a memory leak caused by a cyclic reference between two instances; whether child node b satisfies pattern 4 is determined by S341 and S342:
S341, if child node b is an instance type and b is not the same as i, checking each child node of b; if some child node c is an instance type and is equivalent to i, pattern 4 is satisfied and the cyclic reference causing the memory leak is recorded;
S342, if child node b is a combination type, checking each type t contained in b; if some t is an instance type, further checking each child node c of t; if some child node c is an instance type and is equivalent to i, pattern 4 is satisfied and the cyclic reference causing the memory leak is recorded;
S35, pattern 5 is predefined as a memory leak caused by a cyclic reference between two instances and a container; whether child node b satisfies pattern 5 is determined by S351 and S352:
S351, if child node b is an instance type and b is not the same as i, further checking each child node of b; if some child node c is an instance type and contains i, pattern 5 is satisfied and the cyclic reference causing the memory leak is recorded;
S352, if child node b is a combination type, checking each type t contained in b; if some t is an instance type, further checking each child node c of t; if some child node c is an instance type and contains i, pattern 5 is satisfied and the cyclic reference causing the memory leak is recorded;
S36, pattern 6 is predefined as a memory leak caused by a cyclic reference between two instances and a method; whether child node b satisfies pattern 6 is determined as follows:
if child node b is an instance type, further checking each child node c of b; if some child node c is a method type and the instance type c.ins to which it belongs is the same as i, pattern 6 is satisfied and the cyclic reference causing the memory leak is recorded.
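To make the six patterns concrete, the following sketch shows ordinary Python classes whose attributes form each kind of cycle. The class and attribute names (`Node`, `Parent`, `Child`, `owners`, `notify`, and so on) are hypothetical and do not come from the patent, and the mapping of each line to a pattern is one reading of S31 to S36 rather than the authors' own examples.

```python
class Node:
    """One instance type whose own attributes close a cycle."""

    def __init__(self):
        self.me = self            # pattern 1: self-reference
        self.items = [self]       # pattern 2: instance <-> container
        self.callback = self.run  # pattern 3: instance <-> bound method

    def run(self):
        return None


class Child:
    """Second instance type that refers back to its parent."""

    def __init__(self, parent):
        self.parent = parent            # pattern 4: two instances
        self.owners = [parent]          # pattern 5: two instances and a container
        self.notify = parent.on_change  # pattern 6: two instances and a method


class Parent:
    def __init__(self):
        self.child = Child(self)

    def on_change(self):
        return None
```

In each case the instance can be reached again by following references that start at the instance itself, which is the situation the node sequence with v_0 = v_n describes.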
Preferably, in step S32, for any two types x and y, whether x strictly contains y is determined as follows:
1) if x is a dictionary type, checking whether the key type or the value type of x is equivalent to y; if so, type x contains type y;
2) if x is a set type, list type or tuple type, checking whether y is one of the types in x; if so, type x contains type y;
3) if x is a combination type, checking whether some type t in x contains type y; if so, type x contains type y.
Preferably, in step S34, for any two types x and y, whether x and y are equivalent is determined as follows:
1) if x is not a combination type, checking whether x and y are the same; if so, x and y are equivalent;
2) if x is a combination type, checking whether some type t in x is equivalent to y; if so, x and y are equivalent.
Preferably, in step S35, for any two types x and y, whether x contains y is determined as follows:
1) if x is a dictionary type, checking whether the key type or the value type of x is the same as y; if so, type x contains type y;
2) if x is a set type, list type or tuple type, checking whether y belongs to x; if so, type x contains type y;
3) if x is a combination type, checking whether some type t in x contains type y; if so, type x contains type y.
The invention uses an abstract interpreter to infer types and applies pattern-based detection to the resulting type tree to find memory leaks in code. It has the following benefits: 1. type inference supplies the type information of the code, which makes memory leak detection applicable to dynamic languages such as Python; 2. the pattern-based approach makes memory leak detection both accurate and fast.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings.
The invention is intended to cover any alternatives, modifications, and equivalents that fall within the spirit and scope of the invention as defined by the appended claims. Furthermore, in the following detailed description of the present invention, certain specific details are set forth in order to provide a better understanding of the present invention. It will be apparent to one skilled in the art that the present invention may be practiced without these specific details.
As shown in FIG. 1, the pattern-based Python code memory leak detection method of the present invention includes the following steps:
S1, code extraction: inputting the source code of a project, traversing all code files in the project, loading each Python code file, and obtaining the abstract syntax tree corresponding to each Python code file by using the corresponding function of the ast module in the Python standard library.
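Step S1 can be sketched with the standard library alone. The function name `load_project_asts`, the choice of `ast.parse` as "the corresponding function", and the decision to skip files that fail to parse are assumptions made for illustration; the patent does not specify them.

```python
import ast
from pathlib import Path


def load_project_asts(project_root):
    """Parse every .py file under project_root into an ast.Module.

    Returns a dict mapping the file path (as a string) to its syntax tree.
    """
    trees = {}
    for path in sorted(Path(project_root).rglob("*.py")):
        source = path.read_text(encoding="utf-8")
        try:
            trees[str(path)] = ast.parse(source, filename=str(path))
        except SyntaxError:
            # Assumption: files that do not parse are skipped.
            continue
    return trees
```

Each returned tree has an `ast.Module` root, which matches the later step in which a module type becomes a root node of the type tree.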
S2, type analysis: performing type inference on the abstract syntax trees obtained in step S1 by using an abstract interpreter to obtain a type tree T, wherein the nodes of T represent abstract types and the relationships between nodes represent dependencies.
In this embodiment, the specific steps of using the abstract interpreter to perform type inference on the abstract syntax trees obtained in step S1 and obtain the type tree are as follows:
S21, encapsulating a plurality of abstract types according to the types defined by Python;
In this embodiment, the encapsulated abstract types include the following 11 types:
1) module type: mod<id>, where id is the unique identifier of the module;
2) function type: fun<id>, where id is the unique identifier of the function;
3) call type: invoke<fun, [τ], τ>, where fun is the function type being called, [τ] is the types of the parameters required by the call, and τ is the type of the return value of the call;
4) class type: cls<id, [cls]>, where id is the unique identifier of the class and [cls] is the types of the parent classes of the class;
5) instance type: ins<cls>, where cls is the class type to which the instance belongs;
6) method type: meth<fun, ins>, where fun is the function type and ins is the instance type to which the method belongs;
7) combination type: a set of arbitrary types, i.e., a collection of multiple abstract types;
8) dictionary type: dict<τ, τ>, where the two τ are the types of the keys and the values in the dictionary, respectively;
9) list type: list<τ>, where τ is the type of the elements in the list;
10) tuple type: tuple<τ>, where τ is the type of the elements in the tuple;
11) set type: set<τ>, where τ is the type of the elements in the set.
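One possible encoding of the 11 abstract types is a family of immutable dataclasses. This is a minimal sketch: the class names (`ModType`, `InsType`, and so on) and field names are hypothetical, and the patent does not say how its implementation represents these types.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


class AbstractType:
    """Common base for the abstract types of step S21."""


@dataclass(frozen=True)
class ModType(AbstractType):     # mod<id>
    id: str


@dataclass(frozen=True)
class FunType(AbstractType):     # fun<id>
    id: str


@dataclass(frozen=True)
class InvokeType(AbstractType):  # invoke<fun, [τ], τ>
    fun: FunType
    params: Tuple[AbstractType, ...]
    ret: AbstractType


@dataclass(frozen=True)
class ClsType(AbstractType):     # cls<id, [cls]>
    id: str
    bases: Tuple["ClsType", ...] = ()


@dataclass(frozen=True)
class InsType(AbstractType):     # ins<cls>
    cls: ClsType


@dataclass(frozen=True)
class MethType(AbstractType):    # meth<fun, ins>
    fun: FunType
    ins: InsType


@dataclass(frozen=True)
class UnionType(AbstractType):   # combination type: a set of types
    members: FrozenSet[AbstractType]


@dataclass(frozen=True)
class DictType(AbstractType):    # dict<τ, τ>
    key: AbstractType
    value: AbstractType


@dataclass(frozen=True)
class ListType(AbstractType):    # list<τ>
    elem: AbstractType


@dataclass(frozen=True)
class TupleType(AbstractType):   # tuple<τ>
    elem: AbstractType


@dataclass(frozen=True)
class SetType(AbstractType):     # set<τ>
    elem: AbstractType
```

Frozen dataclasses compare by value and are hashable, so a test such as "b.ins is the same as i" becomes a plain `==` and types can be stored in the set held by a combination type.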
S22, for the abstract syntax tree corresponding to each Python code file, performing type inference with the abstract interpreter to derive the type tree T, wherein each node of T represents an abstract type, the relationships between nodes represent dependencies, that is, a child node is defined within its parent node, and the module type corresponding to the abstract syntax tree is added to T as a root node;
S23, traversing the nodes of the type tree T in order to collect all function types; for each of these function types, determining whether it was called during type inference, that is, whether at least one call type calls this function type; if so, skipping it and moving on to the next function type; if not, treating it as a new function type, creating a call type for it with unknown types as the incoming parameters, and using the abstract interpreter to invoke the newly created call type once.
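The bookkeeping in S23 can be sketched as follows. The representation is an assumption made for illustration: functions are identified by strings, `function_arity` gives each function's parameter count, and `UNKNOWN` stands in for the unknown type; the abstract interpreter that would then run each synthetic call is not shown.

```python
UNKNOWN = "unknown"  # hypothetical placeholder for the unknown abstract type


def synthesize_calls(function_arity, observed_calls):
    """Create one synthetic call for every function never called in inference.

    function_arity: dict mapping a function id to its number of parameters.
    observed_calls: iterable of function ids that some call type refers to.
    Returns a list of (function id, parameter types) pairs.
    """
    called = set(observed_calls)
    synthetic = []
    for fun_id, arity in function_arity.items():
        if fun_id in called:
            continue  # already analyzed through a real call
        synthetic.append((fun_id, [UNKNOWN] * arity))
    return synthetic
```

The point of the step is coverage: a function that no code in the project calls would otherwise never be interpreted, and the instance types it creates would be missing from the type tree.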
S3, detecting memory leaks: traversing each instance type i in the type tree T obtained in step S2, obtaining all child nodes of i in the type tree, and checking each child node b against the predefined memory leak patterns; if any pattern is satisfied, recording the cyclic reference that causes the memory leak, wherein each cyclic reference is recorded as a node sequence [v_0, v_1, ..., v_n] in which each node v_i is a node of the type tree, a reference relationship exists between adjacent nodes, and v_0 = v_n.
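The two conditions on a recorded node sequence (adjacent nodes are linked by a reference, and the first and last nodes coincide) can be checked mechanically. In this sketch the reference relation is given as a set of `(source, target)` pairs, which is an assumed representation, and `is_cyclic_reference` is a hypothetical helper name.

```python
def is_cyclic_reference(nodes, references):
    """Return True if nodes is a valid cyclic reference [v_0, ..., v_n].

    nodes: list of type-tree nodes.
    references: set of (source, target) pairs, one per reference relation.
    """
    if len(nodes) < 2 or nodes[0] != nodes[-1]:
        return False  # v_0 = v_n is required
    # Every adjacent pair must be connected by a reference.
    return all((a, b) in references for a, b in zip(nodes, nodes[1:]))
```

For example, with references `{("i", "b"), ("b", "i")}` the sequence `["i", "b", "i"]` is accepted, while `["i", "b"]` is rejected because it does not return to its starting node.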
In this embodiment, the specific steps of checking each child node b against the predefined memory leak patterns are as follows:
S31, pattern 1 is predefined as a memory leak caused by self-reference; if child node b is the same as the instance type i being traversed, pattern 1 is satisfied and the cyclic reference causing the memory leak is recorded;
S32, pattern 2 is predefined as a memory leak caused by a cyclic reference between an instance and a container; if child node b strictly contains i, pattern 2 is satisfied and the cyclic reference causing the memory leak is recorded;
Determining whether b strictly contains i requires a rule for the strict containment relation between the two. Since b and i both represent abstract types, in this embodiment the strict containment relation between any two abstract types x and y is determined as follows:
1) if x is a dictionary type, checking whether the key type or the value type of x is equivalent to y; if so, type x contains type y;
2) if x is a set type, list type or tuple type, checking whether y is one of the types in x; if so, type x contains type y;
3) if x is a combination type, checking whether some type t in x contains type y; if so, type x contains type y.
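The three strict containment rules translate directly into a recursive function. The type classes below are simplified stand-ins with hypothetical names (`Seq` covers list, tuple and set types), and rule 2 is read here as "the element type is y, or is a combination type that includes y", which is an interpretation rather than something the text states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ins:
    cls_id: str


@dataclass(frozen=True)
class Dict:
    key: object
    value: object


@dataclass(frozen=True)
class Seq:
    kind: str  # "list", "tuple" or "set"
    elem: object


@dataclass(frozen=True)
class Union:
    members: frozenset


def equivalent(x, y):
    """Equivalence: sameness, or membership when x is a combination type."""
    if isinstance(x, Union):
        return any(equivalent(t, y) for t in x.members)
    return x == y


def strictly_contains(x, y):
    if isinstance(x, Dict):   # rule 1: key or value equivalent to y
        return equivalent(x.key, y) or equivalent(x.value, y)
    if isinstance(x, Seq):    # rule 2: y is one of the types in x
        return equivalent(x.elem, y)
    if isinstance(x, Union):  # rule 3: some member contains y
        return any(strictly_contains(t, y) for t in x.members)
    return False
```

With this helper, the pattern 2 test for a child node b of instance type i is `strictly_contains(b, i)`; an attribute such as `self.items = [self]` would give b the type `Seq("list", i)`.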
S33, pattern 3 is predefined as a memory leak caused by a cyclic reference between an instance and a method; whether child node b satisfies pattern 3 is determined by S331 and S332:
S331, if child node b is a method type, checking the instance type b.ins to which it belongs; if b.ins is the same as the instance type i being traversed, pattern 3 is satisfied and the cyclic reference causing the memory leak is recorded;
S332, if child node b is a combination type, checking each type t contained in b; if some t is a method type whose instance type t.ins is the same as i, pattern 3 is satisfied and the cyclic reference causing the memory leak is recorded;
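The run-time effect that pattern 3 targets can be observed directly on CPython. In the sketch below, the hypothetical class `Button` stores one of its own bound methods; the bound method refers back to the instance, so reference counting alone does not free the object and it stays in memory until the cycle collector runs. The behavior described applies to CPython; other interpreters manage memory differently.

```python
import gc
import weakref


class Button:
    def __init__(self):
        # The bound method keeps a reference to self (its __self__),
        # and self keeps the bound method: an instance <-> method cycle.
        self.handler = self.on_click

    def on_click(self):
        return "clicked"


def survives_refcounting():
    """Return (alive after del, alive after gc.collect()) for a Button."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic cycle collector out of the way
    try:
        button = Button()
        probe = weakref.ref(button)
        del button
        alive_after_del = probe() is not None
        gc.collect()  # an explicit collection still works while disabled
        alive_after_gc = probe() is not None
        return alive_after_del, alive_after_gc
    finally:
        if was_enabled:
            gc.enable()
```

On CPython the first value is `True` and the second is `False`: the object outlives its last named reference and is reclaimed only by the explicit collection.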
S34, pattern 4 is predefined as a memory leak caused by a cyclic reference between two instances; whether child node b satisfies pattern 4 is determined by S341 and S342:
S341, if child node b is an instance type and b is not the same as the instance type i being traversed, checking each child node of b; if some child node c is an instance type and is equivalent to i, pattern 4 is satisfied and the cyclic reference causing the memory leak is recorded;
S342, if child node b is a combination type, checking each type t contained in b; if some t is an instance type, further checking each child node c of t; if some child node c is an instance type and is equivalent to i, pattern 4 is satisfied and the cyclic reference causing the memory leak is recorded;
Determining whether c is equivalent to i requires a rule for the equivalence relation between the two. Since c and i both represent abstract types, in this embodiment the equivalence relation between any two abstract types x and y is determined as follows:
1) if x is not a combination type, checking whether x and y are the same; if so, x and y are equivalent;
2) if x is a combination type, checking whether some type t in x is equivalent to y; if so, x and y are equivalent.
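The equivalence rule and the S341 check can be sketched together. The classes `Ins` and `Union` and the `children` mapping (type-tree node to list of child nodes) are hypothetical simplifications used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ins:
    cls_id: str


@dataclass(frozen=True)
class Union:
    members: frozenset


def equivalent(x, y):
    """Rule 1: plain sameness; rule 2: some member of a combination type."""
    if isinstance(x, Union):
        return any(equivalent(t, y) for t in x.members)
    return x == y


def matches_pattern_4(i, b, children):
    """S341: b is another instance type with a child equivalent to i."""
    if not isinstance(b, Ins) or b == i:
        return False
    return any(
        isinstance(c, Ins) and equivalent(c, i)
        for c in children.get(b, [])
    )
```

For a parent type `a = Ins("A")` whose child `b = Ins("B")` holds a back-reference, `matches_pattern_4(a, b, {b: [a]})` is true, while a self-reference (`b == i`) is left to pattern 1.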
S35, pattern 5 is predefined as a memory leak caused by a cyclic reference between two instances and a container; whether child node b satisfies pattern 5 is determined by S351 and S352:
S351, if child node b is an instance type and b is not the same as the instance type i being traversed, further checking each child node of b; if some child node c is an instance type and contains i, pattern 5 is satisfied and the cyclic reference causing the memory leak is recorded;
S352, if child node b is a combination type, checking each type t contained in b; if some t is an instance type, further checking each child node c of t; if some child node c is an instance type and contains i, pattern 5 is satisfied and the cyclic reference causing the memory leak is recorded;
Determining whether c contains i requires a rule for the containment relation between the two. Since c and i both represent abstract types, in this embodiment the containment relation between any two types x and y is determined as follows:
1) if x is a dictionary type, checking whether the key type or the value type of x is the same as y; if so, type x contains type y;
2) if x is a set type, list type or tuple type, checking whether y belongs to x; if so, type x contains type y;
3) if x is a combination type, checking whether some type t in x contains type y; if so, type x contains type y.
S36, pattern 6 is predefined as a memory leak caused by a cyclic reference between two instances and a method; whether child node b satisfies pattern 6 is determined as follows:
if child node b is an instance type, further checking each child node c of b; if some child node c is a method type and the instance type c.ins to which it belongs is the same as the instance type i being traversed, pattern 6 is satisfied and the cyclic reference causing the memory leak is recorded.
Steps S31 to S36 above define 6 different memory leak patterns. A child node b is considered to have a memory leak if it satisfies any one of the patterns, and not to have one if it satisfies none. Through steps S1 to S3, memory leaks in Python code can be detected and the cyclic references causing them located, finally yielding the set of all cyclic references that cause memory leaks. The method is suited to the coding stage of software development: it can detect memory leaks before the code is run, so that the relevant developers can be notified promptly and fix them.
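How the pattern checks combine into one pass over the children of an instance type can be sketched on a toy type tree. Only patterns 1, 3 (S331) and 6 are covered here to keep the example short. The classes `Ins` and `Meth`, the `children` mapping, and in particular the node sequences recorded for each pattern are assumptions, since the source text does not preserve the recorded sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ins:
    cls_id: str


@dataclass(frozen=True)
class Meth:
    fun_id: str
    ins: Ins


def detect(i, children):
    """Check the children of instance type i against patterns 1, 3 and 6.

    children maps a type-tree node to the list of its child nodes.
    Returns a list of (pattern number, node sequence) pairs.
    """
    found = []
    for b in children.get(i, []):
        if b == i:                                 # S31: self-reference
            found.append((1, [i, i]))
        elif isinstance(b, Meth) and b.ins == i:   # S331: own method
            found.append((3, [i, b, i]))
        elif isinstance(b, Ins):                   # S36: method via b
            for c in children.get(b, []):
                if isinstance(c, Meth) and c.ins == i:
                    found.append((6, [i, b, c, i]))
    return found
```

Every sequence produced this way starts and ends with i, consistent with the requirement that v_0 = v_n.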
The above-mentioned steps S1-S3 are applied to an embodiment to show the technical effects thereof.
Examples
The steps of this example are the same as those described above and are not repeated here. Parts of the implementation process and results are shown below:
Data source acquisition: the code used in this example is the source code of 4 real open source projects obtained from the GitHub open source community. Since the projects do not record information about real memory leaks, the test cases of each project must be run to collect the memory leaks produced at run time; the relevant statistics for each project are shown in Table 1. The collected memory leaks are then manually classified into the 6 patterns of the technical scheme of the invention.
Result verification: in this example, the set of cyclic references causing memory leaks collected while actually running the code is compared with the set of cyclic references that the technical scheme of the invention detects in the project source code as possible causes of memory leaks, in order to evaluate the validity of the scheme. Four indexes are used to measure detection performance:
SP is the number of cyclic references found by the technical scheme of the invention that also exist in actual operation.
SN is the number of cyclic references found by the technical scheme of the invention that do not exist in actual operation.
DP is the number of cyclic references existing in actual operation that are found by the technical scheme of the invention.
DN is the number of cyclic references existing in actual operation that are not found by the technical scheme of the invention.
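Given the two sets of cyclic references and some way to decide whether two of them are the same, the four indexes reduce to simple counts. The function `compare` and its `same` parameter are hypothetical names; this is a sketch of the definitions above, not the evaluation script used for the reported results.

```python
def compare(static_cycles, dynamic_cycles, same):
    """Compute SP, SN, DP and DN.

    static_cycles: cyclic references reported by the static method.
    dynamic_cycles: cyclic references collected from actual runs.
    same: function (s, d) -> bool deciding whether two cycles match.
    """
    sp = sum(any(same(s, d) for d in dynamic_cycles) for s in static_cycles)
    sn = len(static_cycles) - sp
    dp = sum(any(same(s, d) for s in static_cycles) for d in dynamic_cycles)
    dn = len(dynamic_cycles) - dp
    return {"SP": sp, "SN": sn, "DP": dp, "DN": dn}
```

SP and DP are counted from different sides, so they need not be equal: several statically reported cycles may match the same collected cycle, and the reverse can also happen.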
To calculate these four indexes, it is necessary to determine whether a cyclic reference S_i found by the technical scheme of the invention as a possible cause of a memory leak is the same as a cyclic reference D_i collected in actual operation as a cause of a memory leak. Let the set of cyclic references causing actual memory leaks collected while running the code be D = {D_1, D_2, ..., D_n}, and let the set of cyclic references detected by the technical scheme of the invention in the project source code as possible causes of memory leaks be S = {S_1, S_2, ..., S_m}. Suppose S_i = [s_1, s_2, ..., s_k] and D_i = [d_1, d_2, ..., d_k] are cyclic references that each contain k types and belong to the same pattern. If there is a constant l (0 ≤ l < k) such that s_i = d_{i+l}, then S_i and D_i are judged to be the same. Here, s_i = d_j is determined as follows:
(1) s_i and d_j are the same;
(2) if d_j and s_i are the same, then the subtypes of d_j and s_i are the same
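The comparison with a constant offset l amounts to asking whether one cycle is a rotation of the other. The sketch below makes two assumptions that the text does not state: the index i + l wraps around modulo k, and two elements are compared with plain equality instead of the full type comparison rules. The cycles are given without the repeated closing node.

```python
def same_cycle(s, d):
    """Return True if cycle s equals cycle d up to a rotation by l positions.

    s, d: lists of k types each, without the repeated closing node.
    """
    k = len(s)
    if k != len(d):
        return False
    if k == 0:
        return True
    return any(
        all(s[i] == d[(i + l) % k] for i in range(k))  # s_i = d_{i+l}
        for l in range(k)
    )
```

For instance, `["A", "B", "C"]` and `["C", "A", "B"]` describe the same cycle entered at different nodes (l = 1), whereas `["A", "C", "B"]` traverses the nodes in a different order and does not match.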
Table 2 compares the memory leaks detected by the technical scheme of the invention on the 4 data sets with the memory leaks collected from actual code runs. As the table shows, DN is 0 for all 4 projects, that is, the technical scheme of the invention finds every memory leak that occurred in actual operation, which demonstrates the effectiveness of the pattern-based memory leak detection method. In addition, for the three projects boto, djblets and libNeuroML, the method finds 3, 3 and 6 cyclic references, respectively, that did not appear in the actual code runs; manual verification confirmed that these 12 memory leaks were not collected in actual operation but do exist. This is because the memory leaks collected by running the code depend on the test cases, and it is difficult to collect all memory leaks when the test cases are insufficient, whereas the technical scheme of the invention is a static analysis and does not depend on test cases. Overall, SP is larger than DP, which shows that the method is accurate: besides the memory leaks collected in actual operation, it also finds memory leaks that could not be collected in actual operation because the test cases do not cover them.
Table 1. Statistics of the memory leak data sets obtained by actually running the 4 real projects
Table 2. Comparison of the results detected by the method of the invention with the results collected from actual code runs
Project | SP | SN | DP | DN |
boto | 315 | 3 | 274 | 0 |
djblets | 26 | 3 | 25 | 0 |
libNeuroML | 256 | 6 | 63 | 0 |
pymtl | 56 | 0 | 37 | 0 |
The above-described embodiment is merely a preferred embodiment of the present invention and should not be construed as limiting it. Various changes and modifications may be made by one of ordinary skill in the art without departing from the spirit and scope of the present invention. Therefore, any technical scheme obtained by equivalent replacement or equivalent transformation falls within the protection scope of the invention.
Claims (6)
1. A pattern-based method for detecting memory leaks in Python code, characterized by comprising the following steps:
S1, inputting the source code of a project, traversing all code files in the project, loading each Python code file, and using the ast module of the Python standard library to obtain the abstract syntax tree corresponding to each Python code file;
S2, performing type inference on the abstract syntax tree obtained in step S1 by using an abstract interpreter to obtain a type tree, wherein the nodes of the type tree represent abstract types and the relationships among the nodes represent dependency relationships;
S3, traversing the type tree obtained in step S2 to obtain all child nodes of each instance type i in the type tree, then checking each child node b for a memory leak by using the predefined memory leak patterns; if one memory leak pattern is satisfied, recording the cyclic reference that causes the memory leak, each cyclic reference being recorded as a node sequence $[v_0, v_1, \ldots, v_n]$, wherein each node $v_i$ in the sequence is a node of the type tree, a reference relationship exists between adjacent nodes, and $v_0 = v_n$;
The specific steps of deriving the type tree by type inference in step S2 are as follows:
S21, encapsulating a plurality of abstract types according to the types defined by Python;
S22, for the abstract syntax tree corresponding to each Python code file, performing type inference with the abstract interpreter to derive a type tree T, wherein each node of the type tree T represents an abstract type, the module type corresponding to the abstract syntax tree is added to the type tree T as the root node, and the relationships among the nodes represent dependency relationships, namely the definition of a child node lies within its parent node;
S23, traversing the nodes of the type tree in sequence to obtain all function types; for each function type, judging whether the function type has been called during type inference, namely whether at least one call type calls that function type; if so, skipping to the judgment of the next function type; if not, regarding the function type as a new function type, creating a call type for it with the unknown type as the incoming parameters, and using the abstract interpreter to invoke the newly created call type once.
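Step S1 of claim 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the helper names `load_syntax_trees` and `defined_names` are hypothetical, and skipping files that fail to parse is an assumption, since the claim does not say how such files are handled.

```python
import ast
from pathlib import Path


def load_syntax_trees(project_root):
    """Parse every .py file under project_root; return {path: ast.Module}."""
    trees = {}
    for path in sorted(Path(project_root).rglob("*.py")):
        try:
            source = path.read_text(encoding="utf-8")
            trees[str(path)] = ast.parse(source, filename=str(path))
        except (SyntaxError, UnicodeDecodeError):
            # Assumption: unparsable files are skipped; the claim is silent here.
            continue
    return trees


def defined_names(tree):
    """List the classes and functions defined in one syntax tree.

    These definitions are the raw material a later type-inference
    pass (step S2) would turn into class and function types.
    """
    return sorted(
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef))
    )
```

Calling `load_syntax_trees("path/to/project")` returns one `ast.Module` per parsable file, which is the input that step S2 would consume.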
2. The method according to claim 1, wherein the abstract types encapsulated in step S21 include the following 11 types:
1) module type: mod<id>, where id represents the unique identifier of the module;
2) function type: fun<id>, where id represents the unique identifier of the function;
3) call type: invoke<fun, [τ], τ>, where fun represents the function type being called, [τ] represents the types of the parameters required by the call, and τ represents the type of the return value of the call;
4) class type: cls<id, [cls]>, where id represents the unique identifier of the class and [cls] represents the types of the parent classes of the class;
5) instance type: ins<cls>, where cls represents the class type to which the instance belongs;
6) method type: meth<fun, ins>, where fun represents the function type and ins represents the instance type to which the method belongs;
7) combination type: a collection of arbitrary types;
8) dictionary type: dict<τ, τ>, where the two τ represent the types of the keys and of the values in the dictionary, respectively;
9) list type: list<τ>, where τ represents the type of the elements in the list;
10) tuple type: tuple<τ>, where τ represents the type of the elements in the tuple;
11) set type: set<τ>, where τ represents the type of the elements in the set.
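The 11 abstract types of claim 2 could be modelled as immutable Python dataclasses. The sketch below is only one possible encoding: all class and field names are hypothetical, τ is represented loosely as `Any`, and the field `klass` stands for the cls component of ins<cls>.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet, Tuple


@dataclass(frozen=True)
class Mod:                      # mod<id>
    id: str

@dataclass(frozen=True)
class Fun:                      # fun<id>
    id: str

@dataclass(frozen=True)
class Cls:                      # cls<id, [cls]>
    id: str
    bases: Tuple["Cls", ...] = ()

@dataclass(frozen=True)
class Ins:                      # ins<cls>
    klass: Cls

@dataclass(frozen=True)
class Meth:                     # meth<fun, ins>
    fun: Fun
    ins: Ins

@dataclass(frozen=True)
class Invoke:                   # invoke<fun, [τ], τ>
    fun: Fun
    params: Tuple[Any, ...]
    ret: Any

@dataclass(frozen=True)
class Combination:              # combination type: a collection of types
    members: FrozenSet[Any]

@dataclass(frozen=True)
class DictType:                 # dict<τ, τ>
    key: Any
    value: Any

@dataclass(frozen=True)
class ListType:                 # list<τ>
    elem: Any

@dataclass(frozen=True)
class TupleType:                # tuple<τ>
    elem: Any

@dataclass(frozen=True)
class SetType:                  # set<τ>
    elem: Any


# Example: a Widget instance, one of its bound methods, and a list that
# may hold either of them.
widget = Ins(Cls("Widget"))
handler = Meth(Fun("Widget.on_click"), widget)
listeners = ListType(Combination(frozenset({handler, widget})))
```

Frozen dataclasses compare by value and are hashable, so the "same type" checks used by the patterns reduce to `==` in this encoding.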
3. The method according to claim 1, wherein the specific steps of checking each child node b for a memory leak by using the predefined memory leak patterns in step S3 are as follows:
S31, pattern 1 is predefined as a memory leak caused by self-reference; judging whether the child node b satisfies pattern 1: if b is the same as i, pattern 1 is considered satisfied, and the cyclic reference causing the memory leak is recorded as [i];
S32, pattern 2 is predefined as a memory leak caused by a cyclic reference between an instance and a container; judging whether the child node b satisfies pattern 2: if b strictly contains i, pattern 2 is considered satisfied, and the cyclic reference causing the memory leak is recorded as [b, i];
S33, pattern 3 is predefined as a memory leak caused by a cyclic reference between an instance and a method; judging whether the child node b satisfies pattern 3 according to steps S331 and S332:
S331, if the child node b belongs to the method type, checking the instance type b.ins to which it belongs; if b.ins is the same as i, pattern 3 is considered satisfied, and the cyclic reference causing the memory leak is recorded as [b.ins, i];
S332, if the child node b belongs to the combination type, checking each type t contained in b; if there is a method type t whose instance type t.ins is the same as i, pattern 3 is considered satisfied, and the cyclic reference causing the memory leak is recorded as [t.ins, i];
S34, pattern 4 is predefined as a memory leak caused by a cyclic reference between two instances; judging whether the child node b satisfies pattern 4 according to steps S341 and S342:
S341, if the child node b belongs to the instance type and b is different from i, checking each child node of b; if there is a child node c that is of the instance type and is equivalent to i, pattern 4 is considered satisfied, and the cyclic reference causing the memory leak is recorded as [c, i];
S342, if the child node b belongs to the combination type, checking each type t contained in b; if a type t belongs to the instance type, further checking each child node c of t; if there is a child node c that is of the instance type and is equivalent to i, pattern 4 is considered satisfied, and the cyclic reference causing the memory leak is recorded as [c, i];
S35, pattern 5 is predefined as a memory leak caused by a cyclic reference between two instances and a container; judging whether the child node b satisfies pattern 5 according to steps S351 and S352:
S351, if the child node b belongs to the instance type and b is different from i, further checking each child node of b; if there is a child node c that is of the instance type and contains i, pattern 5 is considered satisfied, and the cyclic reference causing the memory leak is recorded as [c, i];
S352, if the child node b belongs to the combination type, checking each type t contained in b; if a type t belongs to the instance type, further checking each child node c of t; if there is a child node c that is of the instance type and contains i, pattern 5 is considered satisfied, and the cyclic reference causing the memory leak is recorded as [c, i];
S36, pattern 6 is predefined as a memory leak caused by a cyclic reference between two instances and a method; judging whether the child node b satisfies pattern 6 as follows:
if the child node b belongs to the instance type, further checking each child node c of b; if there is a child node c that is of the method type and the instance type c.ins to which it belongs is the same as i, pattern 6 is considered satisfied, and the cyclic reference causing the memory leak is recorded as [c.ins, i].
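The pattern checks of claim 3 can be illustrated with a deliberately simplified sketch. Several assumptions apply: the type tree is reduced to a hypothetical `children` mapping from each instance type to the types of its attributes, only a list container is modelled, the combination-type branches (S332, S342, S352) are omitted, and for pattern 5 the child node c is read as a container that holds i. All names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ins:          # instance type, identified by class name in this sketch
    cls_name: str

@dataclass(frozen=True)
class Meth:         # method type bound to an instance type
    fun: str
    ins: Ins

@dataclass(frozen=True)
class ListOf:       # list<τ>, the only container modelled here
    elem: object


def check_child(i, b, children):
    """Return the recorded cyclic reference for child b of instance i, or None."""
    if b == i:                                      # pattern 1: self-reference
        return [i]
    if isinstance(b, ListOf) and b.elem == i:       # pattern 2: instance and container
        return [b, i]
    if isinstance(b, Meth) and b.ins == i:          # pattern 3 (S331): instance and method
        return [b.ins, i]
    if isinstance(b, Ins):                          # b is another instance: look one level deeper
        for c in children.get(b, ()):
            if c == i:                              # pattern 4 (S341): two instances
                return [c, i]
            if isinstance(c, ListOf) and c.elem == i:   # pattern 5 (assumed reading of S351)
                return [c, i]
            if isinstance(c, Meth) and c.ins == i:  # pattern 6: two instances and a method
                return [c.ins, i]
    return None


def detect(children):
    """Apply the pattern checks to every child of every instance type."""
    leaks = []
    for i, kids in children.items():
        for b in kids:
            cycle = check_child(i, b, children)
            if cycle is not None:
                leaks.append(cycle)
    return leaks
```

For example, a class whose constructor stores `self.callback = self.on_click` would appear as `{Ins("Widget"): [Meth("on_click", Ins("Widget"))]}`, and `detect` would report it under pattern 3.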
4. The method according to claim 1, wherein in step S32, for any two types x and y, the strict containment relation between them is determined as follows:
1) if x belongs to the dictionary type, checking whether the key or the value of x is equivalent to y; if so, type x is considered to contain type y;
2) if x belongs to the set type, the list type or the tuple type, checking whether y is one of the types in x; if so, type x is considered to contain type y;
3) if x belongs to the combination type, checking whether there is a type t in x that contains type y; if so, type x is considered to contain type y.
5. The method according to claim 1, wherein in step S34, for any two types x and y, the equivalence relation between them is determined as follows:
1) if x does not belong to the combination type, judging whether x and y are the same; if so, x and y are considered equivalent;
2) if x belongs to the combination type, checking whether there is a type t in x that is equivalent to y; if so, x and y are considered equivalent.
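The equivalence relation of claim 5 is short enough to sketch directly. The classes `Ins` and `Combo` below are hypothetical stand-ins for the instance type and the combination type, and "the same" is modelled as value equality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ins:              # instance type, identified by class name in this sketch
    cls_name: str

@dataclass(frozen=True)
class Combo:            # combination type: a collection of types
    members: frozenset


def equivalent(x, y):
    """Rule 1: a non-combination x is equivalent to y only if they are the same.
    Rule 2: a combination x is equivalent to y if some member of x is."""
    if not isinstance(x, Combo):
        return x == y
    return any(equivalent(t, y) for t in x.members)
```

As written in the claim, the relation only unpacks x, so `equivalent(Combo(frozenset({Ins("A")})), Ins("A"))` holds while `equivalent(Ins("A"), Combo(frozenset({Ins("A")})))` does not.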
6. The method according to claim 1, wherein in step S35, for any two types x and y, the containment relation between them is determined as follows:
1) if x belongs to the dictionary type, checking whether the key or the value of x is the same as y; if so, type x is considered to contain type y;
2) if x belongs to the set type, the list type or the tuple type, checking whether y belongs to x; if so, type x is considered to contain type y;
3) if x belongs to the combination type, checking whether there is a type t in x that contains type y; if so, type x is considered to contain type y.
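The containment relation of claim 6 can be sketched in the same style. The classes are hypothetical: `SeqOf` stands in for the list, tuple and set types alike, and "y belongs to x" is read, as an assumption, as "the element type of x is the same as y".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ins:              # instance type, identified by class name in this sketch
    cls_name: str

@dataclass(frozen=True)
class DictOf:           # dict<τ, τ>
    key: object
    value: object

@dataclass(frozen=True)
class SeqOf:            # list<τ>, tuple<τ> or set<τ>, distinguished by kind
    kind: str
    elem: object

@dataclass(frozen=True)
class Combo:            # combination type: a collection of types
    members: frozenset


def contains(x, y):
    """Return True if type x contains type y under the three rules of claim 6."""
    if isinstance(x, DictOf):                       # rule 1: key or value is y
        return x.key == y or x.value == y
    if isinstance(x, SeqOf):                        # rule 2 (assumed reading)
        return x.elem == y
    if isinstance(x, Combo):                        # rule 3: some member contains y
        return any(contains(t, y) for t in x.members)
    return False                                    # non-containers contain nothing
```

The strict containment of claim 4 differs in rule 1, which compares the key and value with y by equivalence rather than sameness; swapping the `==` in the dictionary branch for an equivalence check would be one way to model that variant.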
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110586274.3A CN113407442B (en) | 2021-05-27 | 2021-05-27 | Pattern-based Python code memory leak detection method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113407442A (en) | 2021-09-17 |
CN113407442B (en) | 2022-02-18 |
Family
ID=77674737
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110586274.3A Active CN113407442B (en) | 2021-05-27 | 2021-05-27 | Pattern-based Python code memory leak detection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113407442B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101017458A (en) * | 2007-03-02 | 2007-08-15 | 北京邮电大学 | Software safety code analyzer based on static analysis of source code and testing method therefor |
CN105912381A (en) * | 2016-04-27 | 2016-08-31 | 华中科技大学 | Compile-time code security detection method based on rule base |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103198260B (en) * | 2013-03-28 | 2016-06-08 | 中国科学院信息工程研究所 | A kind of binary program leak automatization localization method |
US20150363196A1 (en) * | 2014-06-13 | 2015-12-17 | The Charles Stark Draper Laboratory Inc. | Systems And Methods For Software Corpora |
CN107967208B (en) * | 2016-10-20 | 2020-01-17 | 南京大学 | Python resource sensitive defect code detection method based on deep neural network |
CN111736980B (en) * | 2019-03-25 | 2024-01-16 | 华为技术有限公司 | Memory management method and device |
CN111352829A (en) * | 2019-11-21 | 2020-06-30 | 杭州迪普科技股份有限公司 | Memory leak test method, device and equipment |
Also Published As
Publication number | Publication date |
---|---|
CN113407442A (en) | 2021-09-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||