CN115016840A - Dependency conflict repairing method and device based on call graph - Google Patents

Info

Publication number
CN115016840A
Authority
CN
China
Prior art keywords
call graph
package
conflict
dependency
packages
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210768989.5A
Other languages
Chinese (zh)
Inventor
吴荣鑫
王超
林立
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen University
Original Assignee
Xiamen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen University filed Critical Xiamen University
Priority to CN202210768989.5A priority Critical patent/CN115016840A/en
Publication of CN115016840A publication Critical patent/CN115016840A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/70Software maintenance or management
    • G06F8/71Version control; Configuration management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/60Software deployment
    • G06F8/61Installation

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Stored Programmes (AREA)

Abstract

The present disclosure provides a dependency conflict repairing method based on a call graph, including: inputting a Python project path and the package involved in a dependency conflict, and obtaining a first set after processing by a call graph generator, wherein the first set comprises all methods of the conflicting package that the project calls; inputting all versions of the conflicting package, and obtaining a second set after processing by the call graph generator, wherein the second set comprises all methods contained in the other versions of the conflicting package; and comparing the first set with the second set and, if the first set is a subset of the second set, acquiring the alternative versions of the conflicting package and outputting an alternative version list. The disclosure also provides a dependency conflict repairing device based on the call graph, an electronic device and a readable storage medium.

Description

Dependency conflict repairing method and device based on call graph
Technical Field
The present disclosure relates to a Python library dependency conflict repair technology, and in particular, to a dependency conflict repair method and apparatus based on a call graph, an electronic device, and a readable storage medium.
Background
Indexes covering millions of Python libraries have been built so that developers can automatically download and install the dependencies of their projects according to specified version constraints. Despite the convenience of this automation, version constraints in a Python project can easily conflict, resulting in build failures. This type of conflict is known as the dependency conflict (DC) problem. Conflicts caused by remote dependency updates are divided into conflicts between direct dependencies, conflicts between a direct dependency and a transitive dependency, and conflicts between two transitive dependencies.
The existing package management tool pip can automatically install third-party libraries from the server-side central repository PyPI. In this setting, the developer specifies the dependencies in setup.py or requirements.txt and hands them to pip for installation; pip scans the package dependencies on PyPI and selects appropriate versions. Library names and versions typically follow de facto conventions such as semantic versioning. However, even the new version of pip leaves one problem unsolved: the package version ranges that the project developer specifies in requirements.txt may conflict in a way that admits no solution. This typically happens for conflicts between a direct dependency and a transitive dependency, or between two transitive dependencies, because a conflict between direct dependencies is easy for developers to spot, whereas transitive dependencies are well hidden and their conflicts are hard for developers to find.
Disclosure of Invention
In order to solve at least one of the above technical problems, the present disclosure provides a dependency conflict repairing method, apparatus, electronic device and readable storage medium based on a call graph.
According to one aspect of the present disclosure, there is provided a dependency conflict repairing method based on a call graph, including:
inputting a Python project path and the package involved in the dependency conflict, and obtaining a first set after processing by a call graph generator, wherein the first set comprises all methods of the conflicting package that the project calls;
inputting all versions of the conflicting package, and obtaining a second set after processing by the call graph generator, wherein the second set comprises all methods contained in the other versions of the conflicting package;
and comparing the first set with the second set and, if the first set is a subset of the second set, acquiring the alternative versions of the conflicting package and outputting the alternative version list.
According to the dependency conflict repairing method based on the call graph, the processing procedure of the call graph generator comprises the following steps:
acquiring the package/project to be installed and the name of the package to be analyzed;
downloading the packages required by the project to a specified directory;
constructing a dependency tree between packages;
processing the package or the package name through Pycg to obtain a functional relationship call graph of a single package;
taking out the edge set of the functional relationship call graph of the single package and processing the namespace to obtain a global functional relationship call graph;
based on the global functional relationship call graph, obtaining the reachability between packages through depth-first or breadth-first search;
outputting the set of called methods of the package.
According to the dependency conflict repairing method based on the call graph, the functional relationship call graph is represented as follows:
G = (V, E), where V denotes the set of vertices, whose size is denoted |V|, and E denotes the set of edges, whose size is denoted |E|; the vertex set is the set of methods in each package, and the edge set is the set of call relationships between the methods in each package.
According to the dependency conflict repairing method based on the call graph, in the case that the functional relation call graph is the directed graph, the functional relation call graph is represented by the adjacency matrix.
According to the dependency conflict repairing method based on the call graph, the dependency tree between the packages is constructed, and the method comprises the following steps:
acquiring an entry function for generating a dependency tree from a pipdeptree;
and converting the data structure of the pipdeptree, and removing information including the version and the installation time to obtain a simplified version dependency tree.
According to the dependency conflict repairing method based on the call graph of at least one embodiment of the disclosure, processing the namespace includes:
adding the namespace to the functional relationship call graph obtained for the single package.
According to still another aspect of the present disclosure, there is provided a dependency conflict repairing apparatus based on a call graph, including:
a first set acquisition module, configured to acquire a Python project path and the package involved in the dependency conflict, and obtain a first set after processing by the call graph generator, where the first set includes all methods of the conflicting package that the project calls;
a second set acquisition module, configured to acquire all versions of the conflicting package, and obtain a second set after processing by the call graph generator, where the second set includes all methods contained in the other versions of the conflicting package;
and a comparison module, configured to compare the first set with the second set, acquire the alternative versions of the conflicting package if the first set is a subset of the second set, and output the alternative version list.
According to the dependency conflict repairing device based on the call graph, the call graph generator comprises:
an input module, configured to acquire the package/project to be installed and the name of the package to be analyzed;
a downloader, configured to download the packages required by the project to the specified file directory;
a dependency tree generator, configured to construct a dependency tree between packages;
an integrator, configured to obtain the functional relationship call graph of a single package after the package or the package name is processed by Pycg, take out the edge set of the call graph, and process the namespace to obtain a global functional relationship call graph;
a reachability analysis module, configured to acquire the reachability relation between packages through depth-first or breadth-first search based on the global functional relationship call graph;
and an output module, configured to output the set of called methods of the package.
According to still another aspect of the present disclosure, there is provided an electronic device including:
a memory storing execution instructions;
a processor executing execution instructions stored by the memory to cause the processor to perform any of the methods described above.
According to yet another aspect of the present disclosure, there is provided a readable storage medium having stored therein execution instructions for implementing any of the above methods when executed by a processor.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the disclosure and together with the description serve to explain the principles of the disclosure.
FIG. 1 is a flowchart illustrating a dependency conflict remediation method based on a call graph according to one embodiment of the present disclosure.
FIG. 2 is a call graph generator process schematic according to one embodiment of the present disclosure.
FIG. 3 is a schematic diagram of a dependency conflict repairing apparatus based on a call graph according to an embodiment of the present disclosure.
FIG. 4 is a schematic diagram of a call graph generator structure according to one embodiment of the present disclosure.
FIG. 5 is a schematic diagram of dependencies, according to one embodiment of the present disclosure.
FIG. 6 is a program fragment containing a transitive dependency according to one embodiment of the present disclosure.
FIG. 7 is a dependency tree diagram of the Archer project, according to one embodiment of the present disclosure.
FIG. 8 is a conflict chain diagram according to one embodiment of the present disclosure.
FIG. 9 is a flowchart illustrating a dependency conflict repairing method based on a call graph according to another embodiment of the present disclosure.
Description of the reference numerals
1000 dependency conflict repairing device based on call graph
1002 first set acquisition module
1004 second set acquisition module
1006 comparison module
1100 bus
1200 processor
1300 memory
1400 other circuits
Detailed Description
The present disclosure will be described in further detail with reference to the drawings and embodiments. It is to be understood that the specific embodiments described herein are for purposes of illustration only and are not to be construed as limitations of the present disclosure. It should be further noted that, for the convenience of description, only the portions relevant to the present disclosure are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict. Technical solutions of the present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Unless otherwise indicated, the illustrated exemplary embodiments/examples are to be understood as providing exemplary features of various details of some ways in which the technical concepts of the present disclosure may be practiced. Accordingly, unless otherwise indicated, features of the various embodiments may be additionally combined, separated, interchanged, and/or rearranged without departing from the technical concept of the present disclosure.
The use of cross-hatching and/or shading in the drawings is generally used to clarify the boundaries between adjacent components. As such, unless otherwise noted, the presence or absence of cross-hatching or shading does not convey or indicate any preference or requirement for a particular material, material property, size, proportion, commonality between the illustrated components and/or any other characteristic, attribute, property, etc., of a component. Further, in the drawings, the size and relative sizes of components may be exaggerated for clarity and/or descriptive purposes. While example embodiments may be practiced differently, the specific process sequence may be performed in a different order than that described. For example, two processes described consecutively may be performed substantially simultaneously or in reverse order to that described. In addition, like reference numerals denote like parts.
When an element is referred to as being "on," "connected to," or "coupled to" another element, it can be directly on, connected to, or coupled to the other element, or intervening elements may be present. However, when an element is referred to as being "directly on," "directly connected to," or "directly coupled to" another element, there are no intervening elements present. For purposes of this disclosure, the term "connected" may refer to a physical connection, an electrical connection, and the like, with or without intermediate components.
The terminology used herein is for the purpose of describing particular embodiments and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Furthermore, when the terms "comprises" and/or "comprising" and variations thereof are used in this specification, the presence of stated features, integers, steps, operations, elements, components and/or groups thereof are stated but does not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. It is also noted that, as used herein, the terms "substantially," "about," and other similar terms are used as approximate terms and not as degree terms, and as such, are used to interpret inherent deviations in measured values, calculated values, and/or provided values that would be recognized by one of ordinary skill in the art.
FIG. 1 is a flowchart illustrating a dependency conflict remediation method based on a call graph according to one embodiment of the present disclosure.
As shown in fig. 1, a call graph-based dependency conflict repairing method S100 includes:
S102: inputting a Python project path and the package involved in the dependency conflict, and processing them with the call graph generator Smart_Pycg to obtain a first set, wherein the first set comprises all methods of the conflicting package that the project calls;
S104: inputting all versions of the conflicting package, and obtaining a second set after processing by the call graph generator Smart_Pycg, wherein the second set comprises all methods contained in the other versions of the conflicting package;
S106: comparing the first set with the second set; if the first set is a subset of the second set, an alternative version of the conflicting package can be found, so the alternative versions of the conflicting package are acquired and an alternative version list is output.
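The comparison in S106 amounts to a subset test applied to each candidate version. The following is a minimal sketch under the assumption that both sets have already been extracted; the function name `find_alternative_versions`, the dictionary layout, and the sample method names are hypothetical and do not come from the disclosure.

```python
from typing import Dict, List, Set


def find_alternative_versions(
    called_methods: Set[str],
    methods_by_version: Dict[str, Set[str]],
) -> List[str]:
    """Return the versions whose method set contains every called method."""
    alternatives = []
    for version, available in methods_by_version.items():
        # S106: the first set must be a subset of the second set.
        if called_methods <= available:
            alternatives.append(version)
    return alternatives


# Hypothetical first set: methods of the conflicting package the project calls.
first_set = {"pkg.api.get", "pkg.api.post"}
# Hypothetical second sets: methods found in each candidate version.
second_sets = {
    "1.0": {"pkg.api.get"},
    "1.1": {"pkg.api.get", "pkg.api.post"},
    "2.0": {"pkg.api.get", "pkg.api.post", "pkg.api.put"},
}
print(find_alternative_versions(first_set, second_sets))
```

Version 1.0 is rejected in this example because it lacks one of the called methods, which is the situation the subset test is meant to detect.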
FIG. 2 is a call graph generator process schematic according to one embodiment of the present disclosure.
As shown in fig. 2, the processing procedure S200 of the call graph generator includes the following steps.
S202: acquiring the package/project to be installed and the name of the package to be analyzed.
S204: downloading the packages required by the project to a specified directory. First, a clean environment is ensured to avoid conflicts with locally installed packages. Whether the dependencies of a package also need to be downloaded is decided according to the user's requirements; downloading only the packages that the user's analysis needs can greatly reduce installation time and improve efficiency.
S206: building a dependency tree between the packages.
S208: processing the package or the package name through Pycg to obtain the functional relationship call graph of a single package. Pycg is a call graph generation tool based on static analysis.
S210: extracting the edge set of the functional relationship call graph of the single package, and processing the namespace to obtain the global functional relationship call graph.
S212: based on the global functional relationship call graph, obtaining the reachability between packages through depth-first or breadth-first search. While still solving the method-level reachability problem between packages, |V| and |E| can be reduced where possible, which lowers both the overhead of Pycg and the space complexity of storing the entire functional relationship call graph.
S214: outputting the set of called methods of the package.
In S208, S210, and S212, the structure of the functional relationship call graph is represented as follows.
G = (V, E), where V denotes the vertex set, whose size is denoted |V|, and E denotes the edge set, whose size is denoted |E|; the vertex set is the set of methods in each package, and the edge set is the set of call relationships between the methods in each package. The specific definition is as follows: let V = {v1, v2, ..., vn} be a non-empty set of n vertices and E = {e1, e2, ..., em} be a set of m arcs, where each element e of E is an ordered pair (vi, vj). The two sets V and E then form a directed graph, denoted D = (V, E), and the ordered pair (vi, vj) indicates that there is an edge pointing from vi to vj.
When the functional relationship call graph is a directed graph, it is represented by an adjacency matrix in order to reduce storage cost.
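As an illustration of the definition D = (V, E), the sketch below derives the vertex set from a list of ordered pairs and fills an adjacency matrix. It is a minimal example rather than the disclosed implementation; the function name `build_adjacency_matrix` and the sample method names are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def build_adjacency_matrix(edges: List[Edge]) -> Tuple[List[str], List[List[int]]]:
    """Build the vertex list V and a |V| x |V| adjacency matrix from E."""
    vertices = sorted({name for edge in edges for name in edge})
    index: Dict[str, int] = {name: i for i, name in enumerate(vertices)}
    matrix = [[0] * len(vertices) for _ in vertices]
    for caller, callee in edges:
        # matrix[i][j] == 1 means there is an arc from vertex i to vertex j.
        matrix[index[caller]][index[callee]] = 1
    return vertices, matrix


# Hypothetical call edges (vi, vj): vi calls vj.
edges = [("a.main", "b.helper"), ("b.helper", "c.util")]
vertices, matrix = build_adjacency_matrix(edges)
print(vertices)
print(matrix)
```

The matrix is not symmetric, which reflects the direction of each call.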
In step S206, building the dependency tree between the packages includes:
acquiring the entry function for generating the dependency tree from pipdeptree. From the source code of pipdeptree, the entry function PackageDAG for generating the dependency tree is found.
converting the data structure of pipdeptree and removing information such as the version and the installation time to obtain a simplified dependency tree.
In step S210, processing the namespace includes:
adding the namespace to the functional relationship call graph obtained for a single package.
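The simplification step can be pictured as reducing each dependency record to package names only. The sketch below assumes a hypothetical record layout (a `package` entry and a `dependencies` list, each carrying a `key` and an `installed_version`); it does not call pipdeptree itself, and the function name `simplify_dependency_tree` is an illustration only.

```python
from typing import Dict, List


def simplify_dependency_tree(records: List[dict]) -> Dict[str, List[str]]:
    """Keep only package names, dropping versions and other details."""
    tree: Dict[str, List[str]] = {}
    for record in records:
        name = record["package"]["key"]
        # Only the names of the direct dependencies are retained.
        tree[name] = [dep["key"] for dep in record["dependencies"]]
    return tree


# Hypothetical records in the assumed layout.
records = [
    {
        "package": {"key": "requests", "installed_version": "2.28.0"},
        "dependencies": [
            {"key": "idna", "installed_version": "3.3"},
            {"key": "urllib3", "installed_version": "1.26.9"},
        ],
    },
    {"package": {"key": "idna", "installed_version": "3.3"}, "dependencies": []},
    {"package": {"key": "urllib3", "installed_version": "1.26.9"}, "dependencies": []},
]
simple_tree = simplify_dependency_tree(records)
print(simple_tree)
```

The result is an adjacency-list style mapping from each package to the packages it depends on.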
FIG. 3 is a schematic diagram of a dependency conflict repairing apparatus based on a call graph according to the present disclosure.
As shown in fig. 3, the dependency conflict repairing apparatus 1000 based on a call graph includes:
a first set obtaining module 1002, configured to obtain a Python project path and the package involved in the dependency conflict, and obtain a first set after processing by the call graph generator, where the first set includes all methods of the conflicting package that the project calls;
a second set obtaining module 1004, configured to obtain all versions of the conflicting package, and obtain a second set after processing by the call graph generator, where the second set includes all methods contained in the other versions of the conflicting package;
a comparison module 1006, configured to compare the first set with the second set, obtain the alternative versions of the conflicting package if the first set is a subset of the second set, and output an alternative version list.
FIG. 4 is a schematic diagram of a call graph generator structure according to one embodiment of the present disclosure.
As shown in fig. 4, the call graph generator includes the following components.
The input module acquires the package/project to be installed and the name of the package to be analyzed.
The downloader downloads the packages required by the project to the specified file directory.
The discriminator determines whether the package downloaded by the downloader has dependency relationships with the other libraries it depends on; if so, a call graph is generated through Pycg, and if not, a dependency tree is generated through the dependency tree generator.
The dependency tree generator constructs the dependency tree between the packages.
Pycg, a call graph generation tool based on static analysis, generates the call graph.
The integrator, shown in fig. 4 as the Smart_Pycg integrator, obtains the functional relationship call graph of a single package after Pycg processing, extracts the edge set of the call graph, and processes the namespace to obtain a global functional relationship call graph.
The reachability analysis module acquires the reachability relation between packages through depth-first or breadth-first search based on the global functional relationship call graph. Method-level reachability analysis between packages finds out which methods of another package a given package calls. After Smart_Pycg processing, a global functional relationship call graph is obtained; depth-first search or breadth-first search is carried out on this graph to explore the reachability between packages.
The output module outputs the set of called methods of the package.
The specific implementation of the downloader is as follows. The download path is designated as smart_pycg/temp, and all files under temp are emptied before installation to avoid interference. The pip install command is then invoked from Python code through os.system, and the installation path is set to smart_pycg/temp through the target parameter. Installation can also be done using requirements.txt. If the installation is unsuccessful, os.system returns -1; exception handling is then performed, an InputError exception is thrown, the user is prompted that the package to be installed has an error, and the program is terminated.
The specific implementation of the dependency tree generator is as follows. The dependency tree is stored in the same data structure as the call graph; both use an adjacency list. First, the entry function PackageDAG for generating the dependency tree is found in the source code of pipdeptree; then the data structure of pipdeptree is converted so that specific information such as the version and the installation time of each package is omitted, and a simplified dependency tree is constructed.
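The downloader behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed code: it uses `subprocess.run` instead of `os.system`, treats any non-zero exit code as a failure, and the helper names `reset_directory`, `build_pip_command` and `download` are hypothetical. The `--target` and `-r` options are standard `pip install` options.

```python
import shutil
import subprocess
import sys
from pathlib import Path
from typing import List, Optional


class InputError(Exception):
    """Raised when a requested package cannot be installed."""


def reset_directory(target: Path) -> None:
    """Empty the download directory so earlier installs cannot interfere."""
    if target.exists():
        shutil.rmtree(target)
    target.mkdir(parents=True)


def build_pip_command(
    target: Path,
    package: Optional[str] = None,
    requirements: Optional[str] = None,
) -> List[str]:
    """Assemble a pip install command that installs into the target directory."""
    command = [sys.executable, "-m", "pip", "install", "--target", str(target)]
    if requirements is not None:
        command += ["-r", requirements]
    elif package is not None:
        command.append(package)
    else:
        raise InputError("either a package or a requirements file is needed")
    return command


def download(
    target: Path,
    package: Optional[str] = None,
    requirements: Optional[str] = None,
) -> None:
    """Clear the target directory, run pip, and raise InputError on failure."""
    reset_directory(target)
    result = subprocess.run(build_pip_command(target, package, requirements))
    if result.returncode != 0:
        raise InputError(f"installation failed for {package or requirements}")
```

A call such as `download(Path("smart_pycg/temp"), package="requests")` would need network access, so it is not executed here.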
Pycg is a command-line tool. When a call graph is generated through Pycg, there are two cases: the entry function is determined, or it is not.
In the case where the entry function of the package is determined, the call graph may be generated by the following command:
[Command listing: provided as an image in the original publication]
In the case where the entry function is not determined, all .py files within the package may be traversed, with the following command:
[Command listing: provided as an image in the original publication]
To embed the Pycg call result into the code, reading the Pycg source code shows that the calling interface of Pycg has the following four parameters: entry_point, package, max_iter, operation. The first parameter, entry_point, is the entry from which pycg parses the package, and there are two cases. In the first case the entry is determined, and a .py file can be specified directly as the entry of the package to generate the call graph. In the second case, which is the more common one, the entry is not determined, and all .py files under the package can be traversed and used as entries when calling pycg. The related code is as follows. As the code shows, exception handling is also performed to prevent a pycg error from aborting the program. The second parameter, package, is the package to be analyzed, given as either an absolute or a relative path. The third parameter is the maximum number of iterations over the source code, which pycg sets to -1 by default. The fourth parameter is the operation to be performed, for which call graph generation is selected by default. The Pycg interface is defined as follows.
[Code listing: provided as an image in the original publication]
By calling pycg, the full edge set of the functional relationships of a single package can be obtained.
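The two entry cases can be illustrated with a small helper that prepares the list of entry files before a call graph tool is invoked. This sketch only collects file paths and does not call pycg; the function name `collect_entry_points` and its parameters are hypothetical.

```python
from pathlib import Path
from typing import List, Optional


def collect_entry_points(package_dir: str, entry: Optional[str] = None) -> List[str]:
    """Return the files to use as entries for call graph generation.

    If an entry file is given, it is used alone (entry determined).
    Otherwise every .py file under the package is used (entry not determined).
    """
    root = Path(package_dir)
    if entry is not None:
        return [str(root / entry)]
    # rglob walks the package directory recursively.
    return sorted(str(path) for path in root.rglob("*.py"))
```

The resulting list would then be passed as the entry_point argument described above, with the call wrapped in exception handling so that a failure on one package does not abort the whole analysis.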
The integrator is implemented as follows. The integrator is the Smart_Pycg integrator. On top of the edge set obtained by pycg, namespace processing is also performed. FIG. 6 is a program fragment containing a transitive dependency. In the example of FIG. 6, calling pycg to generate the call graph of PA yields the following edge set:
EA = [<'a1', 'b.b1.test_Cat'>, <'a1', 'c.c1.test_Dog'>], where a1 is the a1 module under PA. Because the namespace is missing, a1 would not be recognized as a method of package a in the subsequent reachability analysis.
Through regular expression matching, the processed edge set is obtained as follows:
EA' = [<'a.a1', 'b.b1.test_Cat'>, <'a.a1', 'c.c1.test_Dog'>], and these edges are then added to the call graph.
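The namespace step can be sketched with a regular expression that prefixes a package's own modules with the package name. This is one possible reading of the description, not the disclosed code; the function name `add_namespace`, its parameters, and the cleaned-up edge names modelled on the FIG. 6 example are assumptions.

```python
import re
from typing import List, Tuple

Edge = Tuple[str, str]


def add_namespace(edges: List[Edge], package: str, modules: List[str]) -> List[Edge]:
    """Prefix names that start with one of the package's modules."""
    # Match a module name at the start, either alone or followed by a dot.
    alternatives = "|".join(re.escape(module) for module in modules)
    pattern = re.compile(r"^(?:%s)(?=\.|$)" % alternatives)

    def qualify(name: str) -> str:
        return f"{package}.{name}" if pattern.match(name) else name

    return [(qualify(src), qualify(dst)) for src, dst in edges]


# Edges of package a before namespace processing (a1 is a module of a).
edges_a = [("a1", "b.b1.test_Cat"), ("a1", "c.c1.test_Dog")]
print(add_namespace(edges_a, "a", ["a1"]))
```

Names that already carry the namespace of another package are left unchanged, so only the unqualified module a1 is rewritten.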
The reachability analysis module is implemented as follows. Assuming that there are two packages PA and PB, method-level reachability analysis between PA and PB finds all reachable paths that start from PA and lead to PB; in short, it finds all methods of package PB that package PA calls, including both direct calls and calls through implicit transitive relationships. On the constructed global call graph, two strategies are adopted: depth-first search and breadth-first search.
Breadth First Search (BFS): given a graph G = (V, E) and a vertex s, the edges of G are searched systematically to find all vertices belonging to package des that method s can call. The specific algorithm is as follows:
[Algorithm listing BFS(G, s, des): provided as an image in the original publication]
[Algorithm listing BFS(G, s, des), continued: provided as an image in the original publication]
In the BFS(G, s, des) algorithm, color[u] records the color of each vertex: white (White) indicates that the vertex has not been searched; gray (Gray) indicates that the vertex has been searched and the next level of the search is being carried out around it; dark gray (DarkGray) indicates that the vertex and all vertices adjacent to it have been searched. π[u] denotes the parent of u, that is, the nearest vertex already traversed before u was reached, and the queue Q records the gray vertices currently being searched. The first three lines initialize the vertices: every vertex other than s is colored white, meaning not yet searched, and its parent is set to NIL. The starting point s is then colored gray and enqueued. While Q is not empty, a vertex u is taken from Q; in the for loop at line 9, each vertex v adjacent to u is examined, and if v is white it is colored gray, added to Q, and its parent is set to u. At this point it is determined whether v is a method in the namespace of package des, and if so it is added to the set Set. When the for loop ends, all vertices adjacent to u have been searched, and u is colored dark gray. The algorithm returns Set when it finishes.
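The BFS(G, s, des) procedure described above can be sketched in Python as follows. This is a minimal illustration, assuming the graph is stored as an adjacency list and that a method belongs to package des when its name starts with `des.`; a visited set stands in for the three colors, and the function name `bfs_reachable_methods` is hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Set

Graph = Dict[str, List[str]]


def bfs_reachable_methods(graph: Graph, start: str, des: str) -> Set[str]:
    """Collect the methods of package `des` reachable from method `start`."""
    prefix = des + "."
    visited = {start}  # vertices that are no longer white
    parent: Dict[str, Optional[str]] = {start: None}  # plays the role of π[u]
    queue = deque([start])  # gray vertices waiting to be expanded
    found: Set[str] = set()
    while queue:
        u = queue.popleft()
        for v in graph.get(u, []):
            if v not in visited:
                visited.add(v)
                parent[v] = u
                queue.append(v)
                # Keep v if it lies in the namespace of the target package.
                if v.startswith(prefix):
                    found.add(v)
    return found


# Hypothetical global call graph with a direct and a transitive call.
call_graph = {
    "a.a1": ["b.b1.test_Cat", "a.helper"],
    "a.helper": ["c.c1.test_Dog"],
    "b.b1.test_Cat": ["c.c1.test_Dog"],
}
print(bfs_reachable_methods(call_graph, "a.a1", "c"))
```

In this example the method of package c is reached only through intermediate calls, which is the implicit transitive case the reachability analysis is meant to cover.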
Depth First Search (DFS) is a recursive algorithm that starts from a vertex v and then recursively visits the vertices adjacent to v in turn until every vertex has been searched. The specific algorithm is as follows:
[Algorithm listing DFS: provided as an image in the original publication]
[Algorithm listing DFS, continued: provided as an image in the original publication]
The initialization of depth-first search is the same as for breadth-first search. Starting from vertex u, DFSVisit(u) colors u gray and then takes each vertex v adjacent to u; if v is white, v is visited recursively until all paths through the vertices connected to v have been visited; finally, u is colored dark gray.
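The coloring scheme of the depth-first search can be sketched as below. This is a minimal recursive illustration over an adjacency list, not the listing from the original publication; the function name `dfs_order` and the sample graph are hypothetical. For very deep call chains, Python's recursion limit would make an explicit stack preferable.

```python
from typing import Dict, List

WHITE, GRAY, DARK_GRAY = "white", "gray", "dark_gray"


def dfs_order(graph: Dict[str, List[str]], start: str) -> List[str]:
    """Return the vertices in the order depth-first search first reaches them."""
    vertices = set(graph) | {v for targets in graph.values() for v in targets}
    color = {v: WHITE for v in vertices}
    color.setdefault(start, WHITE)
    order: List[str] = []

    def visit(u: str) -> None:
        color[u] = GRAY  # u is being explored
        order.append(u)
        for v in graph.get(u, []):
            if color[v] == WHITE:
                visit(v)
        color[u] = DARK_GRAY  # u and everything below it are finished

    visit(start)
    return order


# Hypothetical graph: a calls b and c, and both call d.
sample_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
print(dfs_order(sample_graph, "a"))
```

Vertex d is reached through b before c is explored, so it is not visited a second time from c.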
The method-level reachability analysis algorithm between packages is a modification of depth-first search or breadth-first search. Taking breadth-first search as an example:
[Algorithm listing: provided as an image in the original publication]
First, a set Set is created; BFS is called for each method in the namespace of package Start, and the methods found are added to Set. In this way, all methods of package Target that are called by package Start can be found.
Because the program calling relationship is very complex, the vertex set generated by Pycg is recorded as | V |, the edge set is recorded as | E |, and the vertex set belonging to the Start packet is recorded as | V |, so the time complexity of the algorithm is O (| V | + | E |).
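The package-level analysis repeats a graph search from every method in the namespace of the Start package and keeps the results that fall in the Target namespace. The sketch below is an illustration under the assumption that namespaces are dotted name prefixes; the function names `reachable_from` and `package_reachability` and the sample graph are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set

Graph = Dict[str, List[str]]


def reachable_from(graph: Graph, source: str) -> Set[str]:
    """Breadth-first search returning every vertex reachable from source."""
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def package_reachability(graph: Graph, start_pkg: str, target_pkg: str) -> Set[str]:
    """Find all methods of target_pkg called, directly or not, by start_pkg."""
    start_prefix = start_pkg + "."
    target_prefix = target_pkg + "."
    vertices = set(graph) | {v for targets in graph.values() for v in targets}
    result: Set[str] = set()
    for method in vertices:
        if method.startswith(start_prefix):
            # One search per method in the Start namespace.
            for v in reachable_from(graph, method):
                if v.startswith(target_prefix):
                    result.add(v)
    return result


# Hypothetical global call graph spanning three packages.
global_graph = {
    "a.main": ["b.f"],
    "b.f": ["c.g"],
    "a.other": ["c.h"],
}
print(sorted(package_reachability(global_graph, "a", "c")))
```

Because each search is started separately, the running time grows with the number of methods in the Start package in this sketch.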
The present disclosure provides an integrator, smart_pycg. smart_pycg and its dependent packages can be installed with the command: pip install smart_pycg. Different operations can then be executed through different command-line parameters, as follows.
The -g 0 parameter may be used to obtain a relationship graph between packages, for example the dependency graph of the requests package, after which a png file is generated under the smart_pycg/ directory: python -m entry -cl 1 -p requests -g 0
The -g 1 parameter may be used to obtain the full functional relationship call graph, for example that of the requests package, whose edge set is generated under output: python -m entry -cl 1 -p requests -g 1 -o 1
The -s parameter may be used to perform reachability analysis lookups, for example to look up all methods of urllib3 called by the requests package, with the result written under output: python -m entry -cl 1 -p requests -s requests urllib3 -o 1
The -r parameter may be used to specify a requirements file for installation: python -m entry -cl 1 -r test_demo2/reqtest/requirements.txt -g 1 -o 1
FIG. 5 is a schematic diagram of dependencies according to one embodiment of the present disclosure. As shown in FIG. 5, requests has a dependency relationship only with idna, so only the mutual calling relationship between requests and idna needs to be analyzed. Pycg therefore only needs to be called for the requests and idna packages, which reduces the number of Pycg calls and greatly reduces the overhead of program analysis.
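The pruning idea can be illustrated with a small sketch that uses the dependency tree to decide which packages need a Pycg run at all. The function name, the dictionary form of the dependency tree, and the extra packages in the toy data are assumptions for illustration; following dependencies transitively is also an assumption, since the text above only discusses a direct dependency.

```python
def packages_needing_analysis(dependency_tree, root):
    """Collect root and every package reachable from it through dependency edges."""
    needed, seen = [], set()
    stack = [root]
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        needed.append(pkg)
        # Push dependencies in reverse so they are popped in alphabetical order.
        stack.extend(sorted(dependency_tree.get(pkg, ()), reverse=True))
    return needed


# Hypothetical environment: only idna is connected to requests.
dependency_tree = {
    "requests": ["idna"],
    "idna": [],
    "flask": ["click"],
    "click": [],
}
print(packages_needing_analysis(dependency_tree, "requests"))  # ['requests', 'idna']
```

Packages that are installed but not connected to the package of interest (flask and click in this toy data) are skipped, which is where the saving in analysis time comes from.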
FIG. 9 is a flowchart illustrating a dependency conflict remediation method based on a call graph according to yet another embodiment of the present disclosure.
As shown in fig. 9, the method for repairing a dependency conflict based on a call graph includes the following steps:
inputting a project path and the package with the dependency conflict;
finding all packages on the conflict chain according to the dependency tree provided by WatchMan, and installing each of them;
constructing a functional relationship call graph over the conflict chain;
finding all methods of the conflict package that the project calls, and merging them into set1;
loading label.json, wherein the label.json file is obtained by crawling the versions of all packages on PyPI, to obtain the version list (version_list) of the conflict package; traversing version_list in order and, for each version, calling Pycg to find all methods under that version and merging them into set2;
judging whether set1 is a subset of set2; if so, a suitable version has been found, and it is added to the recommendation list (support_list);
and after the traversal of version_list is finished, obtaining the recommended version range of the package.
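The comparison at the core of these steps can be sketched as follows. This is a simplified illustration, not the implementation of the disclosure: the function recommend_versions and the callback methods_of_version are hypothetical names, and the callback stands in for installing one version and running Pycg on it. The version numbers and method names are made-up toy data.

```python
def recommend_versions(set1, version_list, methods_of_version):
    """Return the versions whose method set (set2) contains every method in set1."""
    support_list = []
    for version in version_list:
        set2 = methods_of_version(version)  # stand-in for install + Pycg run
        if set1 <= set2:                    # set1 is a subset of set2
            support_list.append(version)
    return support_list


# Toy data: methods exposed by each version of a hypothetical package "lib".
methods_by_version = {
    "0.1": {"lib.parse"},
    "0.2": {"lib.parse", "lib.format"},
    "0.3": {"lib.parse", "lib.format", "lib.split"},
}
set1 = {"lib.parse", "lib.format"}  # methods the project calls in the conflict package

print(recommend_versions(set1, ["0.1", "0.2", "0.3"], methods_by_version.__getitem__))
# ['0.2', '0.3']
```

A name-level subset check only shows that the called methods still exist in a version; it does not show that their behavior is unchanged, so a recommended version still needs to be confirmed by rebuilding the project.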
Optionally, the method for repairing a dependency conflict based on a call graph further includes: if the full functional relationship call graph construction is not successful or the package installation is not successful, a log report can be generated.
The project paths and the dependency-conflict package data used are described in detail below.
A suitable data set was screened from the existing work of WatchMan. The projects on WatchMan classified under Pattern (conflicts caused by remote dependencies) were sorted out, and the conflicting packages were identified from the bug reports provided by WatchMan. Then, using the dependency trees of the WatchMan online detection system, projects whose installation produces a complete conflict with no solution were selected, and installation was attempted in a local environment to ensure that the dependency conflict had not been eliminated by later updates to PyPI. In this way, 39 projects with unsolvable dependency conflicts were obtained and crawled. These projects satisfy the following conditions:
(1) more than 50 stars or forks;
(2) used as a library: has more than three direct downstream projects;
(3) the packages on which the project depends have a complete conflict with no solution.
The 39 completely conflicting projects obtained in this way are used in the next step of the experiment, whose goal is to find a recommended version of the conflict package to install.
Take Archery, one of the 39 completely conflicting projects, as an example. The Archery project dependency tree is shown in FIG. 7. It can be clearly seen that the chain Archery -> mybatis-mapper2sql -> sqlparse requires sqlparse version 0.2.4, while the chain Archery -> sqlparse requires version 0.3.0.
FIG. 8 is a conflict chain diagram according to one embodiment of the present disclosure. As shown in FIG. 8, the packages on the two chains are installed, namely sqlparse 0.2.4 as required on the chain Archery -> mybatis-mapper2sql -> sqlparse, and sqlparse 0.3.0 as required on the chain Archery -> sqlparse. A functional relationship call graph is then generated for each chain, the sqlparse methods called from Archery are found, and the two sets are merged, which gives all sqlparse methods called by Archery. Next, all versions of sqlparse are looked up in the label.json file and installed one after another; Pycg is called once for each installed version to obtain all methods of that version. If the methods of a version include all sqlparse methods called by Archery, that version meets the requirements, and the recommended versions are finally obtained.
Applying the embodiment of FIG. 9 to the 39 projects with complete dependency conflicts yields the repair results shown in Table 1:
TABLE 1 results of the experiment
Figure BDA0003723207920000151
Figure BDA0003723207920000161
These results fall into four categories: Success, where a recommended version was matched (17 projects, 44%); No match version, where no suitable recommended version was found (6 projects, 15%); Pycg Err (13 projects, 33%); and pip install error (3 projects, 8%).
Further analysis of the Pycg Err cases shows that some are caused by syntax errors in the code submitted by developers, such as incorrect indentation. The pip install Err cases arise because the required package version was manually removed as PyPI was updated, so installation fails. After these invalid cases are removed, 23 valid data sets remain. Among these 23 data sets, a conflict package version was successfully recommended for 17 projects (74%).
Finally, for the 17 projects with a successful recommendation, a rebuild was attempted after modifying the requirements according to the recommended version; 14 of them were built successfully (83%).
According to yet another aspect of the present disclosure, there is provided an electronic device including:
a memory storing execution instructions;
a processor executing execution instructions stored by the memory to cause the processor to perform the method of any of the above.
According to yet another aspect of the present disclosure, there is provided a readable storage medium having stored therein execution instructions for implementing the method of any one of the above when executed by a processor.
The call graph-based dependency conflict repairing method and device provided by the present disclosure build on Pycg, a static-analysis tool that generates the call graph of a single package, to implement a cross-package Python functional relationship call graph generation tool. This tool generates a global functional relationship call graph on which reachability analysis is performed, so that function call relationships across packages can be detected, refining the granularity of the relationships between packages by one level. The method and device can be used to detect function call relationships between packages and play an important role in program analysis, vulnerability detection, and dependency conflict repair.
Fig. 3 shows an exemplary diagram of an apparatus employing a hardware implementation of a processing system. The apparatus may include corresponding means for performing each or several of the steps of the flowcharts described above. Thus, each step or several steps in the above-described flow charts may be performed by a respective module, and the apparatus may comprise one or more of these modules. The modules may be one or more hardware modules specifically configured to perform the respective steps, or implemented by a processor configured to perform the respective steps, or stored within a computer-readable medium for implementation by a processor, or by some combination.
The hardware architecture may be implemented using a bus architecture. The bus architecture may include any number of interconnecting buses and bridges depending on the specific application of the hardware and the overall design constraints. The bus 1100 couples various circuits including the one or more processors 1200, the memory 1300, and/or the hardware modules together. The bus 1100 may also connect various other circuits 1400, such as peripherals, voltage regulators, power management circuits, external antennas, and the like.
The bus 1100 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one connection line is shown in the figure, but this does not mean that there is only one bus or only one type of bus.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present disclosure includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the implementations of the present disclosure. The processor performs the various methods and processes described above. For example, method embodiments in the present disclosure may be implemented as a software program tangibly embodied in a machine-readable medium, such as a memory. In some embodiments, some or all of the software program may be loaded and/or installed via memory and/or a communication interface. When the software program is loaded into memory and executed by a processor, one or more steps of the method described above may be performed. Alternatively, in other embodiments, the processor may be configured to perform one of the methods described above by any other suitable means (e.g., by means of firmware).
The logic and/or steps represented in the flowcharts or otherwise described herein may be embodied in any readable storage medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions.
For the purposes of this description, a "readable storage medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the readable storage medium may even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, for instance via optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in the memory.
It should be understood that portions of the present disclosure may be implemented in hardware, software, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, each functional unit in the embodiments of the present disclosure may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may also be stored in a readable storage medium. The storage medium may be a read-only memory, a magnetic or optical disk, or the like.
In the description herein, reference to the terms "one embodiment/implementation," "some embodiments/implementations," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment/implementation or example is included in at least one embodiment/implementation or example of the present application. In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment/implementation or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments/implementations or examples. In addition, those skilled in the art can combine the various embodiments/implementations or examples described in this specification, and the features thereof, provided they do not contradict one another.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
It will be understood by those skilled in the art that the foregoing embodiments are merely for clarity of illustration of the disclosure and are not intended to limit the scope of the disclosure. Other variations or modifications may occur to those skilled in the art, based on the foregoing disclosure, and are still within the scope of the present disclosure.

Claims (10)

1. A dependency conflict repairing method based on a call graph, characterized by comprising the following steps:
inputting a Python project path and a package with a dependency conflict, and obtaining a first set after processing by a call graph generator, wherein the first set comprises all methods of the conflict package that the project calls;
inputting all versions of the package with the dependency conflict, and obtaining a second set after processing by the call graph generator, wherein the second set comprises all methods contained in other versions of the conflict package; and
comparing, by a comparison module, the first set with the second set, acquiring the alternative versions of the package with the dependency conflict if the first set is a subset of the second set, and outputting a list of the alternative versions.
2. The call graph-based dependency conflict repairing method according to claim 1, wherein the processing procedure of the call graph generator comprises:
acquiring a package/project to be installed and the name of a package to be analyzed;
downloading the packages required by the project to a specified directory;
constructing a dependency tree between packages;
processing the package or the package name through Pycg to obtain a functional relationship call graph of a single package;
taking out the edge set of the functional relationship call graph of the single package and processing the namespace to obtain a global functional relationship call graph;
based on the global functional relationship call graph, obtaining reachability between packages through depth-first or breadth-first search; and
outputting a set containing the methods of the called package.
3. The call graph-based dependency conflict repairing method according to claim 2, wherein the representation structure of the functional relationship call graph comprises:
G = (V, E), where V denotes the set of vertices, whose size is denoted by |V|, and E denotes the set of edges, whose size is denoted by |E|; the set of vertices is the set of methods in each package, and the set of edges is the set of call relationships between the methods in each package.
4. The method according to claim 3, wherein the functional relationship call graph is represented by an adjacency matrix when the functional relationship call graph is a directed graph.
5. The call graph-based dependency conflict repairing method according to claim 2, wherein constructing a dependency tree between packages comprises:
acquiring an entry function for generating a dependency tree from pipdeptree; and
converting the data structure of pipdeptree and removing information including the version and the installation time, to obtain a simplified dependency tree.
6. The call graph-based dependency conflict repairing method according to claim 2, wherein the processing of the namespace comprises:
adding a namespace on the basis of the acquired functional relationship call graph of the single package.
7. A dependency conflict repairing device based on a call graph, comprising:
a first set acquisition module for acquiring a Python project path and a package with a dependency conflict, and obtaining a first set after processing by a call graph generator, wherein the first set comprises all methods of the conflict package that the project calls;
a second set acquisition module for acquiring all versions of the package with the dependency conflict, and obtaining a second set after processing by the call graph generator, wherein the second set comprises all methods contained in other versions of the conflict package; and
a comparison module for comparing the first set with the second set, acquiring the alternative versions of the package with the dependency conflict if the first set is a subset of the second set, and outputting a list of the alternative versions.
8. The call graph-based dependency conflict repairing device according to claim 7, wherein the call graph generator comprises:
an input module for acquiring the package/project to be installed and the name of the package to be analyzed;
a downloader for downloading the packages required by the project to a specified directory;
a dependency tree generator for constructing a dependency tree between packages;
an integrator for obtaining a functional relationship call graph of a single package after the package or the package name is processed by Pycg, taking out the edge set of the call graph, and processing the namespace to obtain a global functional relationship call graph;
a reachability analysis module for acquiring the reachability relation between packages through depth-first or breadth-first search based on the global functional relationship call graph; and
an output module for outputting a set containing the methods of the called package.
9. An electronic device, comprising:
a memory storing execution instructions; and
a processor executing execution instructions stored by the memory to cause the processor to perform the method of any of claims 1 to 6.
10. A readable storage medium having stored therein execution instructions, which when executed by a processor, are configured to implement the method of any one of claims 1 to 6.
CN202210768989.5A 2022-06-30 2022-06-30 Dependency conflict repairing method and device based on call graph Pending CN115016840A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210768989.5A CN115016840A (en) 2022-06-30 2022-06-30 Dependency conflict repairing method and device based on call graph

Publications (1)

Publication Number Publication Date
CN115016840A true CN115016840A (en) 2022-09-06

Family

ID=83079205

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210768989.5A Pending CN115016840A (en) 2022-06-30 2022-06-30 Dependency conflict repairing method and device based on call graph

Country Status (1)

Country Link
CN (1) CN115016840A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7254814B1 (en) * 2001-09-28 2007-08-07 Emc Corporation Methods and apparatus for managing plug-in services
WO2017212496A2 (en) * 2016-06-08 2017-12-14 Veriversion Labs Ltd. Methods and systems of software testing, distribution, installation and deployment
US20200364042A1 (en) * 2019-05-17 2020-11-19 Sap Se Static analysis of higher-order merge conflicts in large software development projects
CN112631607A (en) * 2020-12-31 2021-04-09 东北大学 Method for detecting dependency conflict in python environment
CN112965913A (en) * 2021-03-26 2021-06-15 东北大学 Method for automatically repairing dependency conflict problem of Java software
CN113515303A (en) * 2021-05-19 2021-10-19 中国工商银行股份有限公司 Project transformation method, device and equipment
CN114003273A (en) * 2021-11-02 2022-02-01 中国银行股份有限公司 Dependency management method and device based on graphic database

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
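Li Yang; Mao Shifeng; Ye Minyou: "Design and Implementation of a CFETR Document Management System Based on Dependency Management", Computer Applications and Software, no. 11, 12 November 2018 (2018-11-12) *
Wang Jingya; Zhu Huaihong; Hu Yanhua; Xu Jiepan: "Analyzing BPEL4WS Using a Control Dependency Algorithm", Computer Engineering & Science, no. 10, 15 October 2008 (2008-10-15) *
Wang Chao; Cao Junxing: "Research and Implementation of Automatic Network Element Integration in a Communication Network Management System", Intelligent Computer and Applications, no. 02, 1 April 2015 (2015-04-01) *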

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination