CN110297639B - Method and apparatus for detecting code - Google Patents

Publication number
CN110297639B
Authority
CN
China
Prior art keywords
code
elements
detected
candidate
library
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910584002.2A
Other languages
Chinese (zh)
Other versions
CN110297639A (en)
Inventor
刘志伟
邢永旭
张克鹏
白伟
李涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910584002.2A
Publication of CN110297639A
Application granted
Publication of CN110297639B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/44 Encoding
    • G06F8/443 Optimisation
    • G06F8/4434 Reducing the memory space required by the program code
    • G06F8/4435 Detection or removal of dead or redundant code

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

The embodiments of the present application disclose a method and an apparatus for detecting code. The method includes: parsing a code library to be detected to determine the calling relationships among the code elements in that library; determining the complexity of those calling relationships, and screening out, as candidate code elements, the code elements whose calling-relationship complexity with respect to other code elements in the library does not exceed a preset complexity; parsing the dependency relationships between the code library to be detected and other code libraries; and, in response to determining that a candidate code element is not called by any other code library having a dependency relationship with the code library to be detected, determining that the candidate code element is an invalid code element. The method achieves accurate detection of invalid code.

Description

Method and apparatus for detecting code
Technical Field
The embodiments of the present application relate to the field of computer technology, specifically to the field of data processing, and more specifically to a method and an apparatus for detecting code.
Background
Invalid code is redundant code: among code that is not commented out, it is code that never takes part in actual execution, or code that is executed but whose results are never used. Invalid code is usually interleaved with normal code and, unlike commented-out code, cannot be recognized by a programmer directly from the syntax.
As an enterprise develops, the amount of code it holds keeps growing, and maintaining that code consumes huge cost. Invalid code not only occupies machine resources and degrades product performance, but also makes the code harder for engineers to read and understand, which wastes resources.
Disclosure of Invention
Embodiments of the present disclosure propose a method, an apparatus, an electronic device, and a computer-readable medium for detecting a code.
In a first aspect, an embodiment of the present disclosure provides a method for detecting a code, including: analyzing the code base to be detected to determine the calling relationship among the code elements in the code base to be detected; determining the complexity of calling relations among the code elements, and screening out the code elements with the complexity of calling relations with other code elements in the code library to be detected not exceeding a preset complexity as candidate code elements; analyzing the dependency relationship between the code library to be detected and other code libraries; in response to determining that the candidate code element is not called by other codebases having dependencies with the codebase to be detected, determining that the candidate code element is an invalid code element.
In some embodiments, the determining the complexity of the call relationship between the code elements includes: for one code element in the code library to be detected, determining the complexity of the calling relationship between the code element and other code elements in the code library to be detected based on whether the code element has the calling relationship with other code elements in the code library to be detected and whether other code elements having the calling relationship with the code element are candidate code elements.
In some embodiments, screening out, as candidate code elements, the code elements whose calling-relationship complexity with respect to other code elements in the code library to be detected does not exceed the preset complexity includes at least one of the following: screening out, as candidate code elements, code elements that have no calling relationship with other code elements in the code library to be detected; and screening out, as candidate code elements, code elements that are not called by other code elements in the code library to be detected and are not the execution entry of the program corresponding to the code library to be detected.
In some embodiments, screening out the candidate code elements further includes: screening out the code elements called by an already determined candidate code element as the called code elements of that candidate code element; and, in response to determining that a called code element is not called by any code element other than the determined candidate code elements, determining the called code element to be a candidate code element.
In some embodiments, the parsing the to-be-detected code library to determine the call relationship between the code elements in the to-be-detected code library includes: analyzing a code base to be detected by adopting a code analyzer to obtain an abstract syntax tree; and extracting the calling relation among the code elements in the code library to be detected based on the abstract syntax tree.
In some embodiments, the above method further comprises: determining the candidate code element as an invalid code element in response to determining that the candidate code element is called by the other code libraries having a dependency relationship with the code library to be detected and that all code elements calling the candidate code element in the other code libraries having a dependency relationship with the code library to be detected are invalid code elements.
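The two determination rules above (a candidate that no dependent library calls, and a candidate whose callers in dependent libraries are all themselves invalid) can be combined into a single subset test. The following is a minimal sketch only; the function name `is_invalid`, the qualified element names, and the data layout are hypothetical and are not taken from the disclosure.

```python
def is_invalid(candidate: str,
               external_callers: dict[str, set[str]],
               known_invalid: set[str]) -> bool:
    """Return True if every external caller of `candidate` is already invalid.

    external_callers maps an element of the library under test to the
    elements of dependent libraries that call it (hypothetical layout).
    A candidate with no external callers trivially satisfies the test.
    """
    callers = external_callers.get(candidate, set())
    return callers <= known_invalid  # subset test; the empty set passes


# Hypothetical data: lib_b depends on lib_a.
external_callers = {
    "lib_a.format_date": {"lib_b.render_old"},
    "lib_a.parse": {"lib_b.load"},
}
known_invalid = {"lib_b.render_old"}

results = {
    name: is_invalid(name, external_callers, known_invalid)
    for name in ("lib_a.format_date", "lib_a.parse", "lib_a.unused")
}
```

Under these assumptions, `lib_a.format_date` is treated as invalid because its only external caller is invalid, while `lib_a.parse` is not, because a valid element still calls it.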
In a second aspect, an embodiment of the present disclosure provides an apparatus for detecting a code, including: the first analysis unit is configured to analyze the code base to be detected so as to determine the calling relationship among the code elements in the code base to be detected; the screening unit is configured to determine the complexity of calling relations among the code elements, and screen out the code elements with the complexity of calling relations with other code elements in the code library to be detected not exceeding a preset complexity as candidate code elements; the second analysis unit is configured to analyze the dependency relationship between the code library to be detected and other code libraries; a determination unit configured to determine the candidate code element as an invalid code element in response to determining that the candidate code element is not called by another code library having a dependency relationship with the code library to be detected.
In some embodiments, the filtering unit is configured to determine the complexity of the call relationship between the code elements as follows: for one code element in the code library to be detected, determining the complexity of the calling relationship between the code element and other code elements in the code library to be detected based on whether the code element has the calling relationship with other code elements in the code library to be detected and whether other code elements having the calling relationship with the code element are candidate code elements.
In some embodiments, the screening unit is configured to screen out the candidate code elements in at least one of the following ways: screening out code elements which do not have a calling relation with other code elements in the code base to be detected as candidate code elements; and screening out code elements which are not called by other code elements in the code base to be detected and are not the execution entries of the program corresponding to the code base to be detected, and taking the code elements as candidate code elements.
In some embodiments, the screening unit is further configured to screen out the candidate code elements as follows: screening out code elements called by the determined candidate code elements as called code elements of the candidate code elements; in response to determining that the called code element is not called by other code elements than the determined candidate code element, determining the called code element as a candidate code element.
In some embodiments, the first parsing unit is configured to parse the codebase to be detected to determine a calling relationship between code elements in the codebase to be detected, as follows: analyzing a code base to be detected by adopting a code analyzer to obtain an abstract syntax tree; and extracting the calling relation among the code elements in the code library to be detected based on the abstract syntax tree.
In some embodiments, the determining unit is further configured to: determining the candidate code element as an invalid code element in response to determining that the candidate code element is called by the other code libraries having a dependency relationship with the code library to be detected and that all code elements calling the candidate code element in the other code libraries having a dependency relationship with the code library to be detected are invalid code elements.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; a storage device for storing one or more programs which, when executed by one or more processors, cause the one or more processors to implement a method for detecting code as provided in the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method for detecting code provided in the first aspect.
According to the method, apparatus, electronic device, and computer-readable medium for detecting code of the embodiments of the present disclosure, the code library to be detected is first parsed to determine the calling relationships among its code elements. The complexity of those calling relationships is then determined, and the code elements whose calling-relationship complexity with respect to other code elements in the library does not exceed a preset complexity are screened out as candidate code elements. Next, the dependency relationships between the code library to be detected and other code libraries are parsed. Finally, in response to determining that a candidate code element is not called by any other code library having a dependency relationship with the code library to be detected, the candidate code element is determined to be an invalid code element. This achieves accurate detection of invalid code and can provide a reliable basis for automatic code cleanup, thereby reducing code maintenance cost.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which embodiments of the present disclosure may be applied;
FIG. 2 is a flow diagram for one embodiment of a method for detecting code, according to the present disclosure;
FIG. 3 is a schematic flow chart of screening candidate code elements in the method for detecting code according to the present disclosure;
FIG. 4 is a flow diagram of another embodiment of a method for detecting code according to the present disclosure;
FIG. 5 is a schematic block diagram illustrating one embodiment of an apparatus for detecting a code according to the present disclosure;
FIG. 6 is a schematic block diagram of a computer system suitable for use in implementing an electronic device of an embodiment of the present disclosure.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein merely illustrate the relevant invention and do not limit it. It should also be noted that, for convenience of description, only the portions related to the relevant invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 illustrates an exemplary system architecture 100 to which the method for detecting code or the apparatus for detecting code of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include a terminal device 101, a network 102, and a server 103. The network 102 is the medium that provides a communication link between the terminal device 101 and the server 103. The network 102 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
User 110 may use terminal device 101 to interact with server 103 over network 102 to receive or send messages and the like. The terminal device 101 may have an application program such as a software development tool, a software management application, and a code management application installed thereon.
The server 103 may be a backend server that provides code management services. The server 103 may acquire code written by engineers during software development and store it in an organized manner (e.g., according to the product to which it belongs). The server 103 may provide code query and management services. The user 110 may be a code library manager, who may use the terminal device 101 to issue a code management instruction to the server 103, for example an instruction to detect code and delete invalid code. In response to receiving the instruction, the server 103 may perform the corresponding management operations on the stored code, such as detecting invalid code and cleaning it up. The server 103 may also return the code management progress and the result of the code management operation to the terminal device 101.
The terminal device 101 may be hardware or software. When the terminal device 101 is hardware, it may be any of various electronic devices that have a display screen and support running a code management application, including but not limited to a tablet computer, a laptop computer, a desktop computer, and the like. When the terminal device 101 is software, it can be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services) or as a single piece of software or software module. No specific limitation is imposed here.
The server 103 may be hardware or software. When the server 103 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or as a single server. When the server 103 is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module. No specific limitation is imposed here.
The method for detecting the code provided by the embodiment of the application can be executed by the terminal device 101 or the server 103, and accordingly, the device for detecting the code can be arranged in the terminal device 101 or the server 103.
It should be understood that the number of terminal devices, networks, servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for detecting code in accordance with the present application is shown. The method for detecting a code includes the steps of:
Step 201, parsing the code library to be detected to determine the calling relationships among the code elements in the code library to be detected.
In this embodiment, an execution subject of the method for detecting code (e.g., the server shown in fig. 1) may first acquire the code library to be detected. The code library to be detected may be a designated code library, or each code library stored in a code database may in turn be taken as the code library to be detected. A code library is typically a collection of code written in the same programming language. In practice, code files are organized into code libraries, and a product or project typically has one or more code libraries that implement its functions. Code may be stored in different code libraries according to the product or project whose functions it implements.
The code in the code library to be detected may then be parsed. Specifically, lexical, syntactic, and similar analysis can be performed on the code text to obtain the calling relationships among the code elements in the code library. A code element is an element of the code that has an independent syntactic meaning, and may include but is not limited to at least one of the following: function, variable, method, class.
A calling relationship between code elements is the relationship between a calling code element and a called code element. Examples include the calling relationships between a function and variables, classes, or other functions, and between a method and functions, classes, or variables.
Methods such as lexical analysis, syntactic analysis, control flow analysis, and data flow analysis can be used to parse the code library to be detected and extract the calling relationships among the code elements. Taking syntactic analysis as an example, each programming language typically has a predefined, standardized syntax, and calls between code elements follow a predetermined call syntax. The calling relationships in the code library can therefore be parsed according to the syntax of the programming language, recording for each calling relationship the code element that acts as the caller and the code element that acts as the callee.
Alternatively, a code parser may be used to parse the code base to be detected to obtain an abstract syntax tree, and then the call relationship between the code elements in the code base to be detected is extracted based on the abstract syntax tree.
Different programming languages have corresponding code parsing tools. For example, C++ code can be parsed with compilers such as LLVM (Low Level Virtual Machine) and GCC (GNU Compiler Collection), and code written on the Java-based Eclipse platform can be parsed with the open-source CDT (C/C++ Development Tooling) plug-in.
After the code library to be detected has been parsed with the code parser, its abstract syntax tree is obtained. The abstract syntax tree is an abstract representation of the syntactic structure of the source code, in the form of a tree. Each node in the abstract syntax tree corresponds to a code element. Analyzing the abstract syntax tree in turn yields the calling relationships among the code elements.
In the abstract syntax tree, two code elements with direct calling relation are connected with each other, and the direct calling relation and the indirect calling relation between the code elements can be determined in sequence by analyzing the connection relation and the arrow direction of the nodes in the abstract syntax tree.
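The abstract-syntax-tree approach described above can be sketched as follows. The disclosure names C++ and Java tooling; purely for illustration, this sketch substitutes Python's standard `ast` module, restricts code elements to top-level functions, and resolves calls by simple name only. The sample source and the name `extract_calls` are hypothetical.

```python
import ast
from collections import defaultdict

# Hypothetical code library contents, held in a string for illustration.
SOURCE = '''
def helper():
    return 1

def unused():
    return helper()

def main():
    return helper()
'''


def extract_calls(source: str) -> dict[str, set[str]]:
    """Map each top-level function to the names it calls directly."""
    tree = ast.parse(source)
    calls: dict[str, set[str]] = defaultdict(set)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            calls[node.name]  # register the function even if it calls nothing
            for inner in ast.walk(node):
                # Only plain-name calls such as helper() are recorded here.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    calls[node.name].add(inner.func.id)
    return dict(calls)


call_graph = extract_calls(SOURCE)
```

A production parser would also need to handle methods, classes, variables, and qualified names, which this sketch deliberately leaves out.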
Step 202, determining the complexity of the calling relationship among the code elements, and screening out the code elements with the complexity of the calling relationship between the code elements and other code elements in the code library to be detected not exceeding the preset complexity as candidate code elements.
The complexity of the calling relationships among the code elements can be determined from the calling relationships obtained in step 201. Here, each code element may have a corresponding attribute describing the complexity of its calling relationships with other code elements. The complexity depends on factors such as whether the code element is called by other code elements, whether it calls other code elements, how many times it calls other code elements, and whether it is a program execution entry.
A scheme for grading the complexity may be preset, with a higher level corresponding to higher complexity. For example, when a code element neither calls nor is called by other code elements, the complexity level of its calling relationship is the first level, or 0; when a code element is not called by other code elements and is not a program execution entry, the complexity level of its calling relationship is 1; when a code element is called by other code elements and the code elements calling it are not invalid code elements, the complexity level of its calling relationship is 3; and so on.
It can then be judged whether the complexity level of the calling relationship corresponding to each code element exceeds a preset complexity level, which determines whether the preset complexity is exceeded. For example, if the preset complexity level is 1, code elements whose calling-relationship complexity level is 0 or 1 do not exceed the preset complexity and may be taken as candidate code elements. Here, a candidate code element is a candidate invalid code element; whether it actually is an invalid code element must be determined in subsequent steps.
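The grading and threshold comparison described above can be sketched as follows. Levels 0, 1, and 3 and the preset level of 1 come from the example in the text; the `ElementInfo` record, the function names, and the choice to assign level 3 to every remaining case are assumptions made only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementInfo:
    """Hypothetical summary of one code element's calling relationships."""
    calls_others: bool
    called_by_others: bool
    is_entry: bool


PRESET_LEVEL = 1  # preset complexity level used in the example above


def complexity_level(info: ElementInfo) -> int:
    if not info.calls_others and not info.called_by_others:
        return 0  # neither calls nor is called
    if not info.called_by_others and not info.is_entry:
        return 1  # never called and not the program execution entry
    return 3      # assumption: all remaining cases get the higher level


def is_candidate(info: ElementInfo) -> bool:
    """A code element is a candidate if its level does not exceed the preset."""
    return complexity_level(info) <= PRESET_LEVEL


isolated = ElementInfo(calls_others=False, called_by_others=False, is_entry=False)
uncalled = ElementInfo(calls_others=True, called_by_others=False, is_entry=False)
in_use = ElementInfo(calls_others=False, called_by_others=True, is_entry=False)
entry = ElementInfo(calls_others=True, called_by_others=False, is_entry=True)
```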
Alternatively, other code elements in the codebase to be detected that are not determined to be candidate code elements may be determined to be non-invalid code elements.
Step 203, parsing the dependency relationships between the code library to be detected and other code libraries.
Candidate code elements in a code base that may be invalid code elements are obtained via step 202. The dependency relationships between different codebases may be further resolved. Here, the dependency relationship between the code libraries characterizes that one code library depends on another code library, that is, one code library needs to use code elements in another code library when executed, for example, one code library calls a function or a method in another code library when executed.
The dependencies between code libraries may be obtained by reading and parsing a file that declares them. For example, for C++, the dependencies between code libraries can be resolved by reading a "makefile" file, and for Java, they can be obtained by parsing the fields describing dependencies in a "pom.
Step 204, in response to determining that the candidate code element is not called by other code libraries having a dependency relationship with the code library to be detected, determining that the candidate code element is an invalid code element.
After the dependency relationship between the code libraries is analyzed, whether the code library having the dependency relationship with the code library to be detected calls the candidate code elements in the code library to be detected can be judged. If the candidate code element is not called by any other code library having a dependency relationship with the code library to be detected, the candidate code element can be determined to be an invalid code element.
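The check in step 204 amounts to removing from the candidate set every element that some dependent code library calls. A minimal sketch, assuming the calls made by each dependent library have already been collected into a dictionary (the names `find_invalid` and `external_calls`, and the sample data, are hypothetical):

```python
def find_invalid(candidates: set[str],
                 external_calls: dict[str, set[str]]) -> set[str]:
    """Return the candidates that no dependent code library calls.

    external_calls maps the name of each dependent code library to the
    elements of the library under test that it calls.
    """
    externally_used = set().union(*external_calls.values())
    return candidates - externally_used


# Hypothetical data: two libraries depend on the library under test.
candidates = {"unused", "orphan", "legacy_api"}
external_calls = {
    "lib_b": {"legacy_api", "parse"},
    "lib_c": set(),
}

invalid = find_invalid(candidates, external_calls)
```

Here `legacy_api` stays out of the result because `lib_b` still calls it, even though nothing inside its own library does.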
Invalid code elements can then be cleaned up. This avoids the resource waste caused by executing invalid code elements when the code library runs, and it also streamlines the code library, making it easier for engineers to read and understand and reducing cost.
According to the method for detecting code of this embodiment, the code library to be detected is first parsed to determine the calling relationships among its code elements. The complexity of those calling relationships is then determined, and the code elements whose calling-relationship complexity with respect to other code elements in the library does not exceed a preset complexity are screened out as candidate code elements. Next, the dependency relationships between the code library to be detected and other code libraries are parsed. Finally, in response to determining that a candidate code element is not called by any other code library having a dependency relationship with the code library to be detected, the candidate code element is determined to be an invalid code element, which achieves accurate detection of invalid code. The method can provide a reliable basis for automatic code cleanup, thereby reducing code maintenance cost.
Optionally, when step 202 in the flow 200 of the method for detecting code is executed, for one code element in the code library to be detected, the complexity of the calling relationship between the code element and other code elements in the code library to be detected may be determined based on whether the code element has a calling relationship with other code elements in the code library to be detected and whether other code elements having a calling relationship with the code element are candidate code elements. If the code element has a calling relationship with other code elements and other code elements having the calling relationship with the code element are not candidate code elements, the complexity of the calling relationship corresponding to the code element is higher; if the code element has no calling relationship with other code elements (does not call and is not called by other code elements), or other code elements having calling relationship with the code element are determined as candidate code elements, the calling relationship corresponding to the code element has lower complexity.
Further, referring to fig. 3, a schematic flow chart of screening candidate code elements in the method for detecting code according to the present disclosure is shown, that is, a schematic flow chart of an optional implementation of step 202. As shown in fig. 3, the flow 300 for screening candidate code elements includes the following steps:
Step 301, screening out, as candidate code elements, code elements that have no calling relationship with other code elements in the code library to be detected.
Each code element in the code library to be detected can be checked in turn for calling relationships with other code elements in the same code library. If a code element neither calls other code elements in the same code library nor is called by them, it is an "isolated" element in the code library to be detected. The complexity of its calling relationships with other code elements in the code library is then determined not to exceed the preset complexity; that is, the code element may be an invalid code element and can be determined to be a candidate code element.
Step 302, screening out, as candidate code elements, code elements that are not called by other code elements in the code library to be detected and are not the execution entry of the program corresponding to the code library to be detected.
If a code element is not called by other code elements in the same code library and the code element is not an execution entry of a program, the code element will not be executed when executing code in the code library, may be an invalid code element, and may be determined to be a candidate code element.
The execution entry of a program is typically predefined, for example a main function. A code element that serves as the execution entry of a program is generally not called by any other code element, so checking whether a code element is an execution entry avoids misjudging the program's execution entry as invalid code.
It should be noted that, in the embodiment of the present disclosure, the candidate code elements that may be invalid code elements may be screened according to any one of the screening manners of step 301 and step 302, or the candidate code elements may be screened by combining the two methods of step 301 and step 302.
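Steps 301 and 302 can be sketched together as follows, assuming the calling relationships are available as a dictionary from each code element to the set of elements it calls. The function name `screen_candidates`, the sample call graph, and the decision to ignore self-calls are assumptions made for this illustration.

```python
def screen_candidates(calls: dict[str, set[str]], entries: set[str]) -> set[str]:
    """Screen candidate code elements within a single code library.

    calls maps each code element to the elements it calls; entries holds
    the program execution entries (e.g. a main function).
    """
    # Elements called by some other element (self-calls are ignored).
    called_by_others = {
        callee
        for caller, callees in calls.items()
        for callee in callees
        if callee != caller
    }
    candidates = set()
    for element, callees in calls.items():
        calls_others = bool(callees - {element})
        is_called = element in called_by_others
        isolated = not calls_others and not is_called                   # step 301
        uncalled_non_entry = not is_called and element not in entries   # step 302
        if isolated or uncalled_non_entry:
            candidates.add(element)
    return candidates


# Hypothetical call graph of a code library to be detected.
calls = {
    "main": {"helper"},
    "helper": set(),
    "unused": {"helper"},
    "orphan": set(),
}
candidates = screen_candidates(calls, entries={"main"})
```

In this example `main` is kept out of the candidates because it is the execution entry, and `helper` because `main` calls it.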
Optionally, after some candidate code elements have been screened out in a code library, further candidate code elements may be screened out as follows:
Step 303, screening out the code elements called by the determined candidate code elements as the called code elements of those candidate code elements, and, in response to determining that a called code element is not called by any code element other than the determined candidate code elements, determining the called code element to be a candidate code element.
The candidate code elements screened out in steps 301 and 302 are the determined candidate code elements. Specifically, on the basis of step 302, after the code elements that are not called by other code elements in the code library to be detected and are not the execution entry of the corresponding program have been determined to be candidate code elements, the code elements called by those candidate code elements may be analyzed one by one. If a code element called by a candidate code element is not called by any other code element, or optionally, if it is not called by any non-candidate code element in the code library to be detected, it may also be determined to be a candidate code element.
If a code element is called only by invalid code elements, it will never be executed and is itself an invalid code element. Further analyzing the calling relations of the code elements called by the candidate code elements allows additional candidate code elements that may be invalid to be identified quickly from the calling relations, so that invalid code elements are detected more comprehensively.
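The propagation described in step 303 can be sketched as a worklist over the call graph. This is a minimal illustration, not the patented implementation: the `calls` dictionary (caller name to set of callee names), the function name `expand_candidates`, and the sample element names are all assumptions made for the example. Self-recursive calls are ignored when checking callers, which is one reading of "other code elements".

```python
from typing import Dict, Set


def expand_candidates(calls: Dict[str, Set[str]], candidates: Set[str]) -> Set[str]:
    """Add callees whose callers are all already candidates (step 303 sketch)."""
    # Invert the call graph: callee -> set of callers.
    callers: Dict[str, Set[str]] = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    result = set(candidates)
    worklist = list(candidates)
    while worklist:
        current = worklist.pop()
        for callee in calls.get(current, set()):
            if callee in result:
                continue
            # The callee qualifies only if no non-candidate element calls it
            # (a call from the callee to itself is not counted).
            if callers[callee] - {callee} <= result:
                result.add(callee)
                worklist.append(callee)
    return result


# Hypothetical call graph: "old_report" was screened out by steps 301/302.
calls = {
    "main": {"parse"},
    "old_report": {"format_row", "parse"},
    "format_row": {"pad"},
}
expanded = expand_candidates(calls, {"old_report"})
print(sorted(expanded))  # ['format_row', 'old_report', 'pad']
```

Here `parse` stays out of the candidate set because `main`, which is not a candidate, also calls it, while `pad` is picked up transitively through `format_row`.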
With continued reference to FIG. 4, shown is a flow diagram of another embodiment of a method for detecting code in accordance with the present disclosure. As shown in fig. 4, a flow 400 of the method for detecting a code of the present embodiment includes the following steps:
Step 401, parsing the code library to be detected to determine a calling relationship between code elements in the code library to be detected.
In this embodiment, the code library to be detected may be analyzed by lexical analysis, syntactic analysis, control flow analysis, data flow analysis, and the like, and the call relationship between code elements in the code library may thereby be extracted.
Alternatively, a code analyzer may be used to analyze the code library to be detected to obtain an abstract syntax tree, and then the call relationship between the code elements in the code library to be detected is extracted based on the abstract syntax tree.
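As an illustration of extracting call relationships from an abstract syntax tree, the sketch below uses Python's standard `ast` module as a stand-in for the unspecified code analyzer. The function name `extract_calls` and the sample source are assumptions. The sketch only records direct calls by plain name inside top-level functions; method calls, attribute calls, and dynamic dispatch would need additional handling.

```python
import ast
from typing import Dict, Set


def extract_calls(source: str) -> Dict[str, Set[str]]:
    """Map each top-level function to the plain names it calls."""
    tree = ast.parse(source)
    graph: Dict[str, Set[str]] = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            callees: Set[str] = set()
            # Walk the function body and collect calls of the form name(...).
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    callees.add(inner.func.id)
            graph[node.name] = callees
    return graph


# Hypothetical source for a tiny code library.
SAMPLE = '''
def helper():
    return 1

def unused():
    return helper()

def main():
    return helper() + 1
'''

call_graph = extract_calls(SAMPLE)
print(call_graph)
```

The resulting dictionary has one entry per function; `unused` appears as a caller of `helper` but is itself never called, which is the kind of fact the later screening steps rely on.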
Step 402, determining the complexity of the calling relationship among the code elements, and screening out the code elements with the complexity of the calling relationship between the code elements and other code elements in the code library to be detected not exceeding the preset complexity as candidate code elements.
The complexity of the calling relationship between a code element and other code elements in the code library to be detected may be determined based on indicators such as whether the code element is called, whether it calls other code elements, whether it is an execution entry of a program, and whether the code elements that call it or are called by it are invalid code elements. The complexity level corresponding to the code element can be looked up in a preset complexity level mapping according to these indicators. It is then judged whether the complexity level exceeds a preset level, that is, whether the complexity exceeds a preset complexity threshold. If the complexity level does not exceed the preset level, the corresponding code element is determined as a candidate code element.
Optionally, for one code element in the to-be-detected code library, the complexity of the calling relationship between the code element and other code elements in the to-be-detected code library may be determined based on whether the code element has a calling relationship with other code elements in the to-be-detected code library and whether other code elements having a calling relationship with the code element are candidate code elements.
Furthermore, code elements which do not have calling relations with other code elements in the code library to be detected can be screened out and used as candidate code elements; and/or code elements which are not called by other code elements in the code library to be detected and are not the execution entries of the program corresponding to the code library to be detected can be screened out and used as candidate code elements.
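Step 402 can be sketched as a table lookup followed by a threshold comparison. The mapping table `LEVELS`, the default level, the threshold `PRESET_LEVEL`, and the element names below are hypothetical values chosen for illustration; the source does not specify concrete levels. Only three of the indicators named above are used here (is called, calls others, is an entry).

```python
from typing import Dict, Set, Tuple

CallGraph = Dict[str, Set[str]]

# Hypothetical mapping: (is_called, calls_others, is_entry) -> complexity level.
LEVELS: Dict[Tuple[bool, bool, bool], int] = {
    (False, False, False): 0,  # no calling relation with any other element
    (False, True, False): 1,   # calls others, is never called, is not an entry
}
DEFAULT_LEVEL = 2   # every other combination, including execution entries
PRESET_LEVEL = 1    # assumed threshold: levels 0 and 1 become candidates


def complexity_level(name: str, calls: CallGraph, entries: Set[str]) -> int:
    """Look up the complexity level of one code element."""
    is_called = any(
        name in callees for caller, callees in calls.items() if caller != name
    )
    calls_others = bool(calls.get(name, set()) - {name})
    return LEVELS.get((is_called, calls_others, name in entries), DEFAULT_LEVEL)


def screen_candidates(calls: CallGraph, entries: Set[str]) -> Set[str]:
    """Return the elements whose level does not exceed the preset level."""
    elements = set(calls)
    for callees in calls.values():
        elements |= callees
    return {e for e in elements if complexity_level(e, calls, entries) <= PRESET_LEVEL}


calls = {"main": {"load"}, "legacy": {"load"}, "orphan": set()}
candidates = screen_candidates(calls, entries={"main"})
print(sorted(candidates))  # ['legacy', 'orphan']
```

With this table, `main` is excluded because it is an execution entry and `load` because it is called, which matches the two screening manners described above.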
Step 403, parsing the dependency relationship between the code library to be detected and other code libraries.
Files for describing the dependency relationship between the code libraries, such as "makefile" files in C++, or "pom.
Step 404, in response to determining that the candidate code element is not called by another code library having a dependency relationship with the code library to be detected, determining that the candidate code element is an invalid code element.
If the candidate code element is not called by any other code library having a dependency relationship with the code library to be detected, the candidate code element can be determined to be an invalid code element.
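Steps 403 and 404 together can be sketched as follows. The data shapes are assumptions for the example: `depends_on` maps each library to the libraries it declares a dependency on (as might be read from its build files), and `calls_into[lib][target]` lists the elements of `target` that `lib` calls. The sketch also assumes that "having a dependency relationship" means the other library depends on the library to be detected.

```python
from typing import Dict, Set


def find_invalid(
    target: str,
    candidates: Set[str],
    depends_on: Dict[str, Set[str]],
    calls_into: Dict[str, Dict[str, Set[str]]],
) -> Set[str]:
    """Return the candidates of `target` that no dependent library calls."""
    # Step 403: keep only libraries that declare a dependency on the target.
    dependents = {lib for lib, deps in depends_on.items() if target in deps}
    # Step 404: collect every target element called from a dependent library.
    called: Set[str] = set()
    for lib in dependents:
        called |= calls_into.get(lib, {}).get(target, set())
    return candidates - called


# Hypothetical libraries and calls.
depends_on = {"web": {"core"}, "cli": {"core", "utils"}, "docs": set()}
calls_into = {
    "web": {"core": {"render"}},
    "cli": {"core": {"parse_args"}, "utils": {"log"}},
}
invalid = find_invalid(
    "core", {"render", "legacy_export", "parse_args"}, depends_on, calls_into
)
print(invalid)  # {'legacy_export'}
```

Restricting the scan to dependent libraries is the point of step 403: libraries such as `docs` that do not depend on `core` cannot call into it and need not be analyzed.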
Steps 401, 402, 403, and 404 in this embodiment are respectively the same as steps 201, 202, 203, and 204 in the foregoing embodiment, and specific implementation manners of steps 401, 402, 403, and 404 may also refer to descriptions of specific implementation manners of steps 201, 202, 203, and 204 in the foregoing embodiment, which are not described herein again.
Step 405, in response to determining that the candidate code element is called by other code libraries having a dependency relationship with the code library to be detected and that all code elements calling the candidate code element in the other code libraries having a dependency relationship with the code library to be detected are invalid code elements, determining that the candidate code element is an invalid code element.
In this embodiment, candidate code elements that are called by other code libraries having a dependency relationship with the code library to be detected may be used as target candidate code elements for further detection. It may be determined whether the code elements that call the target candidate code element in the other code libraries having a dependency relationship with the code library to be detected are invalid code elements. If all of them are, the target candidate code element, being called only by invalid code elements, may also be determined to be an invalid code element.
In this way, when a calling relation exists across different code libraries but the calling code element is itself invalid, the called code element can still be detected as invalid. This further reduces the rate of missed invalid code elements and improves the accuracy of invalid code detection.
Alternatively, if the candidate code element is called by another code library having a dependency relationship with the code library to be detected, and at least one code element that calls the candidate code element in that other code library is not an invalid code element, it may be determined that the candidate code element is not an invalid code element. In this case, although the code element is not called by any valid code element in its own code library, it is called by a valid code element in another code library, so the candidate code element is executed at runtime and is not an invalid code element.
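The decision in step 405 and its counterpart reduces to checking whether every external caller is itself invalid. The sketch below assumes the external callers have already been collected as `(library, element)` pairs and that the set of invalid elements in the other libraries is known; both inputs, and all names used, are hypothetical.

```python
from typing import Dict, Set, Tuple

Caller = Tuple[str, str]  # (library name, code element name)


def is_invalid(
    candidate: str,
    external_callers: Dict[str, Set[Caller]],
    invalid_callers: Set[Caller],
) -> bool:
    """True if every external caller of the candidate is an invalid element."""
    callers = external_callers.get(candidate, set())
    # all() over an empty set is True, which coincides with step 404:
    # a candidate with no external callers is invalid.
    return all(caller in invalid_callers for caller in callers)


external_callers = {
    "export_csv": {("reports", "old_dump")},
    "render": {("reports", "old_dump"), ("web", "index")},
}
invalid_callers = {("reports", "old_dump")}

print(is_invalid("export_csv", external_callers, invalid_callers))  # True
print(is_invalid("render", external_callers, invalid_callers))      # False
```

`render` is kept because `("web", "index")` is a valid caller, whereas `export_csv` is reachable only from an invalid element and is therefore reported as invalid.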
With further reference to fig. 5, as an implementation of the methods shown in the above-mentioned figures, the present application provides an embodiment of an apparatus for detecting a code, which corresponds to the method embodiments shown in fig. 2 and fig. 4, and which is particularly applicable to various electronic devices.
As shown in FIG. 5, the apparatus 500 for detecting code of the present embodiment includes: a first parsing unit 501, a screening unit 502, a second parsing unit 503, and a determining unit 504. The first parsing unit 501 is configured to parse the code library to be detected to determine a call relationship between code elements in the code library to be detected; the screening unit 502 is configured to determine the complexity of the call relationships between code elements, and screen out, as candidate code elements, code elements whose call relationships with other code elements in the code library to be detected have a complexity not exceeding a preset complexity; the second parsing unit 503 is configured to parse the dependency relationship between the code library to be detected and other code libraries; the determining unit 504 is configured to determine the candidate code element as an invalid code element in response to determining that the candidate code element is not called by any other code library having a dependency relationship with the code library to be detected.
In some embodiments, the screening unit 502 is configured to determine the complexity of the call relationship between the code elements as follows: for one code element in the code library to be detected, determining the complexity of calling relations between the code element and other code elements in the code library to be detected based on whether the code element has a calling relation with other code elements in the code library to be detected and whether other code elements having a calling relation with the code element are candidate code elements.
In some embodiments, the screening unit 502 is configured to screen out the candidate code elements in at least one of the following manners: screening out code elements which do not have a calling relation with other code elements in the code library to be detected as candidate code elements; and screening out code elements which are not called by other code elements in the code library to be detected and are not the execution entries of the program corresponding to the code library to be detected, and taking the code elements as candidate code elements.
In some embodiments, the screening unit 502 is further configured to screen out the candidate code elements as follows: screening out code elements called by the determined candidate code elements as called code elements of the candidate code elements; in response to determining that the called code element is not called by any code element other than the determined candidate code elements, determining the called code element as a candidate code element.
In some embodiments, the first parsing unit 501 is configured to parse the codebase to be detected to determine the call relationship between the code elements in the codebase to be detected as follows: analyzing a code base to be detected by adopting a code analyzer to obtain an abstract syntax tree; and extracting the calling relation among the code elements in the code library to be detected based on the abstract syntax tree.
In some embodiments, the determining unit 504 is further configured to: determining the candidate code element as an invalid code element in response to determining that the candidate code element is called by the other code libraries having a dependency relationship with the code library to be detected and that all code elements calling the candidate code element in the other code libraries having a dependency relationship with the code library to be detected are invalid code elements.
It should be understood that the units recited in the apparatus 500 correspond to the respective steps of the methods described with reference to FIG. 2 and FIG. 4. Thus, the operations and features described above for the methods are equally applicable to the apparatus 500 and the units included therein, and are not described in detail here.
The apparatus 500 for detecting code according to the embodiment of the application parses the code library to be detected to determine the calling relationship among its code elements; determines the complexity of those calling relationships and screens out, as candidate code elements, the code elements whose calling relationships with other code elements in the code library to be detected have a complexity not exceeding the preset complexity; parses the dependency relationship between the code library to be detected and other code libraries; and, in response to determining that a candidate code element is not called by any other code library having a dependency relationship with the code library to be detected, determines the candidate code element to be an invalid code element, thereby achieving accurate detection of invalid code.
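One way to picture how the four units of the apparatus 500 fit together is as four pluggable callables chained by a single method. This is only an architectural sketch: the class name `CodeDetector`, the callable signatures, and the stub lambdas are assumptions, and the stubs return fixed data rather than performing any real analysis.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set

CallGraph = Dict[str, Set[str]]


@dataclass
class CodeDetector:
    # Unit 501: library name -> call graph of its code elements.
    first_parsing_unit: Callable[[str], CallGraph]
    # Unit 502: call graph -> candidate code elements.
    screening_unit: Callable[[CallGraph], Set[str]]
    # Unit 503: library name -> names of libraries with a dependency relationship.
    second_parsing_unit: Callable[[str], Set[str]]
    # Unit 504: (candidates, related libraries) -> invalid code elements.
    determining_unit: Callable[[Set[str], Set[str]], Set[str]]

    def detect(self, library: str) -> Set[str]:
        """Run the four units in the order used by the method embodiments."""
        graph = self.first_parsing_unit(library)
        candidates = self.screening_unit(graph)
        related = self.second_parsing_unit(library)
        return self.determining_unit(candidates, related)


# Stub units with fixed, hypothetical results.
detector = CodeDetector(
    first_parsing_unit=lambda lib: {"main": {"run"}, "stale": set()},
    screening_unit=lambda graph: {"stale"},
    second_parsing_unit=lambda lib: {"web"},
    determining_unit=lambda cands, libs: set(cands),  # assumes no library calls them
)
result = detector.detect("core")
print(result)  # {'stale'}
```

Keeping each unit behind a plain callable means a different parser or screening rule could be swapped in without touching the overall flow.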
Referring now to FIG. 6, a schematic diagram of an electronic device (e.g., the server of FIG. 1) 600 suitable for use in implementing embodiments of the present disclosure is shown. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 6, the electronic device 600 may include a processing means (e.g., central processing unit, graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic apparatus 600 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; a storage device 608 including, for example, a hard disk; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device 600 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 6 may represent one device or may represent multiple devices as desired.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of embodiments of the present disclosure. It should be noted that the computer readable medium described in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. 
In embodiments of the present disclosure, however, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: analyzing the code base to be detected to determine the calling relationship among the code elements in the code base to be detected; determining the complexity of calling relations among the code elements, and screening out the code elements with the complexity of calling relations with other code elements in the code library to be detected not exceeding a preset complexity as candidate code elements; analyzing the dependency relationship between the code library to be detected and other code libraries; in response to determining that the candidate code element is not called by other codebases having dependencies with the codebase to be detected, determining that the candidate code element is an invalid code element.
Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor includes a first parsing unit, a screening unit, a second parsing unit, and a determination unit. Where the names of these units do not in some cases constitute a limitation on the unit itself, for example, the first parsing unit may also be described as a "unit that parses the codebase to be detected to determine the calling relationship between the code elements in the codebase to be detected".
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements in which any combination of the features described above or their equivalents does not depart from the spirit of the invention disclosed above. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (12)

1. A method for detecting a code, comprising:
analyzing a code library to be detected to determine a calling relationship among code elements in the code library to be detected;
determining the complexity of the calling relationship among the code elements, and screening out the code elements with the complexity of the calling relationship between the code elements and other code elements in the code library to be detected not exceeding the preset complexity as candidate code elements;
analyzing the dependency relationship between the code library to be detected and other code libraries;
in response to determining that the candidate code element is not called by other code libraries having a dependency relationship with the code library to be detected, determining that the candidate code element is an invalid code element;
wherein the method further comprises:
determining the candidate code element as an invalid code element in response to determining that the candidate code element is called by other code libraries having a dependency relationship with the code library to be detected and that all code elements calling the candidate code element in the other code libraries having a dependency relationship with the code library to be detected are invalid code elements;
determining that the candidate code element is not an invalid code element in response to determining that the candidate code element is called by other code libraries having dependencies with the code library to be detected and that at least one code element that calls the candidate code element in the other code libraries having dependencies with the code library to be detected is not an invalid code element.
2. The method of claim 1, wherein the determining a complexity of call relationships between the code elements comprises:
and for one code element in the code library to be detected, determining the complexity of the calling relationship between the code element and other code elements in the code library to be detected based on whether the code element has a calling relationship with other code elements in the code library to be detected and whether other code elements having a calling relationship with the code element are candidate code elements.
3. The method according to claim 2, wherein the screening out code elements whose calling relationships with other code elements in the code library to be detected have a complexity not exceeding a preset complexity as candidate code elements includes at least one of:
screening out code elements which do not have a calling relation with other code elements in the code base to be detected as candidate code elements;
and screening out code elements which are not called by other code elements in the code base to be detected and are not the execution entries of the program corresponding to the code base to be detected, and taking the code elements as candidate code elements.
4. The method according to claim 3, wherein the screening out code elements whose calling relationships with other code elements in the to-be-detected code library have a complexity not exceeding a preset complexity as candidate code elements further comprises:
screening out code elements called by the determined candidate code elements as called code elements of the candidate code elements;
determining the called code element as a candidate code element in response to determining that the called code element is not called by other code elements other than the determined candidate code element.
5. The method of claim 1, wherein the parsing the codebase to be detected to determine call relationships between code elements in the codebase to be detected comprises:
analyzing the code base to be detected by adopting a code analyzer to obtain an abstract syntax tree;
and extracting the calling relation among the code elements in the code library to be detected based on the abstract syntax tree.
6. An apparatus for detecting a code, comprising:
a first parsing unit configured to parse a code library to be detected so as to determine a calling relationship among code elements in the code library to be detected;
a screening unit configured to determine the complexity of the calling relationship among the code elements, and screen out the code elements with the complexity of the calling relationship between the code elements and other code elements in the code library to be detected not exceeding a preset complexity as candidate code elements;
a second parsing unit configured to parse the dependency relationship between the code library to be detected and other code libraries;
a determining unit configured to determine the candidate code element as an invalid code element in response to determining that the candidate code element is not called by another code library having a dependency relationship with the code library to be detected;
wherein the determining unit is further configured to:
in response to determining that the candidate code element is called by other code libraries having a dependency relationship with the code library to be detected and that all code elements calling the candidate code element in the other code libraries having a dependency relationship with the code library to be detected are invalid code elements, determining that the candidate code element is an invalid code element;
determining that the candidate code element is not an invalid code element in response to determining that the candidate code element is called by other code libraries having dependencies with the code library to be detected and that at least one code element that calls the candidate code element in the other code libraries having dependencies with the code library to be detected is not an invalid code element.
7. The apparatus of claim 6, wherein the screening unit is configured to determine the complexity of call relationships between the code elements as follows:
and for one code element in the code library to be detected, determining the complexity of the calling relationship between the code element and other code elements in the code library to be detected based on whether the code element has a calling relationship with other code elements in the code library to be detected and whether other code elements having a calling relationship with the code element are candidate code elements.
8. The apparatus of claim 7, wherein the screening unit is configured to screen out candidate code elements in at least one of:
screening out code elements which do not have a calling relation with other code elements in the code base to be detected as candidate code elements;
and screening out code elements which are not called by other code elements in the code base to be detected and are not the execution entries of the program corresponding to the code base to be detected, and taking the code elements as candidate code elements.
9. The apparatus of claim 8, wherein the screening unit is further configured to screen out candidate code elements as follows:
screening out code elements called by the determined candidate code elements as called code elements of the candidate code elements;
determining the called code element as a candidate code element in response to determining that the called code element is not called by other code elements other than the determined candidate code element.
10. The apparatus according to claim 6, wherein the first parsing unit is configured to parse the codebase to be detected to determine call relations between code elements in the codebase to be detected as follows:
analyzing the code base to be detected by adopting a code analyzer to obtain an abstract syntax tree;
and extracting the calling relation among the code elements in the code library to be detected based on the abstract syntax tree.
11. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-5.
12. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-5.
CN201910584002.2A 2019-07-01 2019-07-01 Method and apparatus for detecting code Active CN110297639B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910584002.2A CN110297639B (en) 2019-07-01 2019-07-01 Method and apparatus for detecting code

Publications (2)

Publication Number Publication Date
CN110297639A CN110297639A (en) 2019-10-01
CN110297639B true CN110297639B (en) 2023-03-21

Family

ID=68029670

Country Status (1)

Country Link
CN (1) CN110297639B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112732335A (en) * 2021-01-12 2021-04-30 平安资产管理有限责任公司 Object code extraction method, device, computer equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101164062A (en) * 2005-01-13 2008-04-16 Hsbc北美控股有限公司 Framework for group systems software configuration and release management
CN101976318A (en) * 2010-11-15 2011-02-16 北京理工大学 Detection method of code similarity based on digital fingerprints
CN105930162A (en) * 2016-04-24 2016-09-07 复旦大学 Subgraph search-based feature location method
CN108549538A (en) * 2018-04-11 2018-09-18 深圳市腾讯网络信息技术有限公司 A kind of code detection method, device, storage medium and test terminal
US10133557B1 (en) * 2013-01-11 2018-11-20 Mentor Graphics Corporation Modifying code to reduce redundant or unnecessary power usage

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6029005A (en) * 1997-04-01 2000-02-22 Intel Corporation Method for identifying partial redundancies in a new processor architecture
CN102231134A (en) * 2011-07-29 2011-11-02 哈尔滨工业大学 Method for detecting redundant code defects based on static analysis
CN106021116B (en) * 2016-06-07 2018-07-13 北京信息科技大学 Unreachable function call path detection method in complication system
CN106294139B (en) * 2016-08-02 2018-08-31 上海理工大学 A kind of Detection and Extraction method of repeated fragment in software code
CN108614707B (en) * 2018-04-27 2023-05-02 深圳市腾讯网络信息技术有限公司 Static code checking method, device, storage medium and computer equipment
CN108776643B (en) * 2018-06-04 2021-10-22 腾讯科技(武汉)有限公司 Target code merging control method and system based on version control process

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Luis-J. Saiz-Adalid. MCU Tolerance in SRAMs Through Low-Redundancy Triple Adjacent Error Correction. 2014, pp. 2332-2336. *
方登辉 (Fang Denghui). Development of a static code defect detection tool based on abstract syntax trees. 2018, (No. 11), pp. I138-113. *

Also Published As

Publication number Publication date
CN110297639A (en) 2019-10-01

Similar Documents

Publication Publication Date Title
US9852015B2 (en) Automatic discovery of a JavaScript API
US10481884B2 (en) Systems and methods for dynamically replacing code objects for code pushdown
US10481964B2 (en) Monitoring activity of software development kits using stack trace analysis
US9720798B2 (en) Simulating black box test results using information from white box testing
CN112100072B (en) Static detection method, device, equipment and medium for application program code
US8799874B2 (en) Static analysis of computer software applications
US8819644B2 (en) Selective data flow analysis of bounded regions of computer software applications
WO2023061874A1 (en) Checking source code validity at time of code update
CN111506900B (en) Vulnerability detection method and device, electronic equipment and computer storage medium
CN110659210A (en) Information acquisition method and device, electronic equipment and storage medium
CN110297639B (en) Method and apparatus for detecting code
CN113821486B (en) Method and device for determining dependency relationship between pod libraries and electronic equipment
CN116578282A (en) Code generation method, device, electronic equipment and medium
US9710360B2 (en) Optimizing error parsing in an integrated development environment
CN111382017A (en) Fault query method, device, server and storage medium
US20230115334A1 (en) Identifying computer instructions enclosed by macros and conflicting macros at build time
CN114328090A (en) Program monitoring method and device, electronic equipment and storage medium
CN111124423B (en) Compiling detection method, device, server and medium based on multiple platforms
CN114238130A (en) Performance test method, device, equipment and storage medium
CN114090514A (en) Log retrieval method and device for distributed system
CN111240728A (en) Application program updating method, device, equipment and storage medium
US20240103853A1 (en) Code maintenance system
CN113190453A (en) User interface testing method, device, server and medium
CN117539492A (en) Method and device for deleting redundant sentences in codes, electronic equipment and storage medium
CN116775099A (en) Program data processing method, program data processing device, electronic equipment, medium and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant