CN114675839A - Code warehouse Java conflict file sorting and grouping method based on directed graph - Google Patents

Publication number
CN114675839A
Authority
CN
China
Prior art keywords
files
graph
java
conflict
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210597420.7A
Other languages
Chinese (zh)
Other versions
CN114675839B (en)
Inventor
张婷婷
唐勇
刘世伟
张传忠
张卫丰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xuancai Interactive Network Science And Technology Co ltd
Nanjing University of Posts and Telecommunications
Original Assignee
Xuancai Interactive Network Science And Technology Co ltd
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-05-30
Filing date
2022-05-30
Publication date
2022-06-28
Application filed by Xuancai Interactive Network Science And Technology Co ltd and Nanjing University of Posts and Telecommunications
Priority to CN202210597420.7A
Publication of CN114675839A
Application granted
Publication of CN114675839B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/72 Code refactoring

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a directed-graph-based method for sorting and grouping conflicting Java files in a code repository, which comprises the following steps: first, a dependency graph is constructed for all Java files on each branch according to the dependencies among the files; second, the graphs of the different branches are merged to obtain a merged dependency graph; third, the merged dependency graph is traversed, the full path name of the file carried by each node is compared with the full path names of the conflicting Java files, and the full path names of the matching files are stored in descending order of node sequence number; finally, conflicting Java files that lie in the same dependency graph are placed in the same group. The method sorts the large number of conflicts that can exist in a code repository and groups the conflicting files, helping developers resolve the conflicts in an orderly way.

Description

Code warehouse Java conflict file sorting and grouping method based on directed graph
Technical Field
The invention relates to the technical field of computers, in particular to a code warehouse Java conflict file sorting and grouping method based on a directed graph.
Background
In modern software development, developers rely on version control systems such as Git to collaborate on branch-based development. One drawback of this way of working is that conflicts can occur when contributions from different developers are merged, which reduces the efficiency of collaboration and introduces potential vulnerabilities. The problem worsens because refactoring is prevalent in software development and gives rise to a large number of conflicts.
However, the conflict merging process has common problems. Unstructured merging treats code as plain text, so a conflict is reported whenever developers modify the same line of code, and this produces a large number of false positives. Structured merging reduces some of these false positives. With either approach, however, resolving a large number of conflicts all at once consumes a great deal of time, especially when there are many conflicts, many files, and many lines of code. This is a major challenge when merging branches of a code repository.
An effective solution to these problems is to place related conflict files in the same group, so that once the conflicts of one group have been resolved, the corresponding module can be compiled ahead of time, instead of compiling the whole project only after all conflicts have been resolved. Sorting the conflict files before resolving them requires ordering the conflict files accurately, by analyzing the full path names of the nodes in the graph and ordering the nodes by access sequence, and placing the conflict files that lie in the same graph in the same group. This helps developers resolve conflicts in an orderly way, reduces the waiting time for conflict handling, and improves development efficiency.
The published documents CN201910904927, an AST-based method for analyzing and presenting the relationships between SQL tables of a relational database, and CN201210219285, a random test case generation method for Java programs, include the concepts of dependency analysis and sorting. However, they do not sort the nodes when sorting is performed, so repeated modification easily occurs in subsequent processing, which reduces the efficiency of the resolution process.
Disclosure of Invention
To solve the above problems, the invention discloses a code warehouse Java conflict file sorting and grouping method based on a directed graph. The method can serve as a tool for accurately sorting the conflict files in a repository; in particular, when the dependencies among the conflict files are complex, it allows the conflicts to be resolved in order. It improves the efficiency of the conflict resolution process, makes up for the shortcomings of current tools, and helps developers resolve conflicts.
The technical scheme of the invention is as follows: in a code warehouse Java conflict file sorting and grouping method based on a directed graph, a dependency graph is constructed for all files of the Java project in a code repository according to the dependencies among the files; the dependency graph is then traversed to construct a new graph; the newly constructed graph is then traversed, and the conflict files are stored in the order in which their nodes are traversed; finally, the conflict files that lie in the same graph are assigned to the same group.
A code warehouse Java conflict file sorting and grouping method based on a directed graph comprises the following steps:
step 1: constructing a dependency graph for all files of the Java project in the code repository according to the dependencies among the files, wherein one graph is constructed on each branch of the code repository;
step 2: constructing a new graph from the dependency graph, wherein each node in the graph carries an out-degree, an in-degree, a full file path name, and a sequence number;
step 3: acquiring conflict information for a merge scenario in the code repository, and storing the information of all conflicting files in a file;
step 4: ordering the nodes in the graph, starting the traversal from a node with zero in-degree and, each time a node is reached, adding one to the in-degree of the node and adding one to the sequence number;
step 5: merging the graphs of the branches in the code repository to obtain a merged graph;
step 6: traversing the merged graph to sort, comparing the full path names of the conflicting files collected in step 3 with the full path names of the nodes, and collecting the full path names of the successfully matched files into a file in descending order of node sequence number;
step 7: grouping the conflicting files by traversing each graph and placing the conflicting files in each graph in the same group.
Further, in step 1, for a project whose conflicts need to be sorted and grouped, all Java files and the conflicting Java files in the project are extracted separately, and a dependency graph is constructed from the dependencies among all the Java files.
Further, in step 2, a dependency graph is constructed on each branch in the code repository, the sequence number is initially zero, the in-degree indicates the number of files that depend on the file, and the out-degree indicates the number of other files that the file depends on.
Further, the conflict information in step 3 includes the path and name of each file in which a conflict occurred.
Furthermore, in step 4, the traversal starts from a node with zero in-degree, and each time the traversal advances to a node, the sequence number of that node equals the sequence number of the previously traversed node plus one.
Further, in step 5, nodes are matched according to the full path names of the files they carry. When two nodes match but their sequence numbers differ, the larger sequence number is kept. A node that is not matched is inserted into the merged graph according to its relative position in its own graph, so that the merged graph contains all files of the whole project and the dependencies among them, and nothing is omitted.
Further, in step 6, the merged directed graph is traversed; when the full path name carried by a node is that of a conflicting Java file, the full path name is saved, and the saved names are ordered by node sequence number from large to small. The saved order is the order in which the conflict files are processed.
Further, in step 7, a Java project contains modules. When there is no dependency between modules, several independent graphs are constructed; when two modules depend on each other, the Java files of the two modules form one graph. Each graph is traversed, the full path names of the files carried by the nodes are compared with the full path names of the conflicting Java files, and the conflicting Java files that appear in the same graph are placed in the same group.
The advantages of the invention are as follows: 1. The invention builds the Java project in the code repository into a dependency graph so that the dependencies between files can be analyzed clearly, which prevents the files that depend on a file from producing errors after the depended-on file has been processed. The conflicting files are ordered by the sequence numbers of their nodes in the dependency graph, which determines the order in which they are resolved and avoids repeated modification.
2. The invention assigns the conflicting files to different groups according to whether they lie in the same graph. When a large project has many conflicts, a functional module can be compiled as soon as one group of conflicts has been resolved, which improves the efficiency of conflict handling.
3. The invention accurately sorts and groups the conflict files in the repository. In particular, when the dependencies among the conflict files are complex, the conflicts can be resolved in order, which improves the efficiency of the conflict resolution process, makes up for the shortcomings of current tools, and helps developers resolve conflicts.
Drawings
FIG. 1 is a flow chart of constructing the dependency graph from Java files according to the present invention;
FIG. 2 is a schematic diagram of the process of sorting conflict files according to the present invention;
FIG. 3 is a schematic diagram of generating the conflict file ordering result from the dependency graph according to the present invention.
Detailed Description
For the purpose of enhancing an understanding of the present invention, the following detailed description of the present invention is provided in conjunction with the accompanying drawings, which are provided for the purpose of illustration only and are not intended to limit the scope of the present invention.
As shown in FIGS. 1-3, in a code warehouse Java conflict file sorting and grouping method based on a directed graph, a dependency graph is constructed for all files of the Java project in a code repository according to the dependencies among the files; the dependency graph is then traversed to construct a new graph; the newly constructed graph is then traversed, and the conflict files are stored in the order in which their nodes are traversed; finally, the conflict files that lie in the same graph are assigned to the same group.
A code warehouse Java conflict file sorting and grouping method based on a directed graph comprises the following steps:
step 1: constructing a dependency graph for all files of the Java project in the code repository according to the dependencies among the files, wherein one graph is constructed on each branch of the code repository;
step 2: constructing a new graph from the dependency graph, wherein each node in the graph carries an out-degree, an in-degree, a full file path name, and a sequence number;
step 3: acquiring conflict information for a merge scenario in the code repository, and storing the information of all conflicting files in a file;
step 4: ordering the nodes in the graph, starting the traversal from a node with zero in-degree and, each time a node is reached, adding one to the in-degree of the node and adding one to the sequence number;
step 5: merging the graphs of the branches in the code repository to obtain a merged graph;
step 6: traversing the merged graph to sort, comparing the full path names of the conflicting files collected in step 3 with the full path names of the nodes, and collecting the full path names of the successfully matched files into a file in descending order of node sequence number;
step 7: grouping the conflicting files by traversing each graph and placing the conflicting files in each graph in the same group.
In step 1, each node of the dependency graph represents one Java file, and each edge indicates that a dependency exists between two file nodes; an edge points from a node to the node it depends on. Depending on the dependencies, a node can have several edges pointing to it and several edges pointing from it to other nodes. For a project whose conflicts need to be sorted and grouped, all Java files and the conflicting Java files in the project are extracted separately, and the dependency graph is constructed from the dependencies among all the Java files. In the prior art, only the conflicting files are retrieved for modification and the dependencies among files are ignored, so that after a depended-on file has been processed, the files that depend on it may produce errors. Constructing a dependency graph makes the dependencies among files explicit, and the resolution order of the conflicting files can be determined from these dependencies so as to avoid repeated modification.
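The graph construction in step 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the text does not say how dependencies between Java files are detected, so the sketch assumes they can be approximated from single-class `import` statements that resolve to files of the same project. The function name `build_dependency_graph` and the input shape (a mapping from full path to source text) are hypothetical.

```python
import posixpath
import re

# Assumption: dependencies are approximated by single-class import statements.
IMPORT_RE = re.compile(r"^\s*import\s+(?:static\s+)?([\w.]+)\s*;", re.MULTILINE)
PACKAGE_RE = re.compile(r"^\s*package\s+([\w.]+)\s*;", re.MULTILINE)


def build_dependency_graph(sources):
    """Build a directed graph from {full_path: java_source_text}.

    Returns {full_path: set of full paths it depends on}; each edge points
    from a file to the file it depends on, as described for step 1.
    """
    # Map every fully qualified class name in the project to its file path.
    class_to_path = {}
    for path, text in sources.items():
        match = PACKAGE_RE.search(text)
        package = match.group(1) if match else ""
        class_name = posixpath.splitext(posixpath.basename(path))[0]
        fq_name = f"{package}.{class_name}" if package else class_name
        class_to_path[fq_name] = path

    graph = {path: set() for path in sources}
    for path, text in sources.items():
        for imported in IMPORT_RE.findall(text):
            target = class_to_path.get(imported)
            # Imports of library classes (e.g. java.util.List) are ignored.
            if target is not None and target != path:
                graph[path].add(target)
    return graph
```

Same-package references, wildcard imports, and fully qualified names used inline are not detected by this approximation; a real tool would more likely resolve dependencies with a Java parser.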
In step 2, a dependency graph is constructed on each branch in the code repository, and the sequence number of every node is initially zero. The in-degree represents the number of files that depend on the file, and the out-degree represents the number of other files that the file depends on. The edges in the graph indicate that a dependency exists between two files. Because one file may be depended on by several files and may itself depend on several files, several directed edges may point to a node and several edges may point from it to different nodes. The sequence number is the basis for ordering the conflict files.
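The node record described in step 2 could look like the sketch below. The names `Node` and `build_nodes` are hypothetical; the four fields are the ones the text lists (out-degree, in-degree, full file path name, sequence number), and the input is assumed to be an adjacency mapping in which every dependency also appears as a key.

```python
from dataclasses import dataclass


@dataclass
class Node:
    full_path: str       # full path name of the Java file
    in_degree: int = 0   # number of files that depend on this file
    out_degree: int = 0  # number of other files this file depends on
    seq: int = 0         # sequence number, initially zero


def build_nodes(graph):
    """Create one Node per file from {path: set of paths it depends on}."""
    nodes = {path: Node(path) for path in graph}
    for path, deps in graph.items():
        nodes[path].out_degree = len(deps)
        for dep in deps:
            # Each edge path -> dep raises the in-degree of the dependency.
            nodes[dep].in_degree += 1
    return nodes
```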
In step 3, the conflict files are extracted: the conflicting files and their full path names are obtained from the git command line, and all conflict file information is stored in a file, which serves as input to the subsequent sorting step. The conflict information comprises the path and name of each conflicting file, and the path and name are the basis for node matching.
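One way to collect this information is sketched below. The text only says that the conflict files are read from the git command line; the specific command used here, `git diff --name-only --diff-filter=U` (which lists unmerged paths during a conflicted merge), is an assumption, as are the helper names `parse_conflict_paths` and `collect_conflict_files`.

```python
import subprocess


def parse_conflict_paths(git_output):
    """Keep only the .java paths from a newline-separated path listing."""
    paths = []
    for line in git_output.splitlines():
        line = line.strip()
        if line.endswith(".java"):
            paths.append(line)
    return paths


def collect_conflict_files(repo_dir, out_file):
    """Write the conflicting Java file paths of repo_dir to out_file.

    Assumption: the repository is in the middle of a conflicted merge.
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", "--diff-filter=U"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    paths = parse_conflict_paths(result.stdout)
    with open(out_file, "w", encoding="utf-8") as handle:
        handle.write("\n".join(paths))
    return paths
```

The paths git prints are relative to the repository root, so the node path names in the graph would need to use the same form for the later comparison to match.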
In step 4, the files in a Java project ultimately depend on some basic classes that do not depend on any other classes, so the dependency graph formed by all the files is certain to contain a node with zero in-degree. A node with zero in-degree indicates that no file depends on that node, and there can be more than one such node.
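The numbering in step 4 can be illustrated with the sketch below. The text states only that the traversal starts from the zero-in-degree nodes and that each newly reached node receives the previous sequence number plus one; the exact traversal order is not specified. The sketch assumes a Kahn-style topological walk, so that a file is numbered only after every file that depends on it, and it numbers any files left over in dependency cycles in path order as an arbitrary fallback. The name `assign_sequence_numbers` is hypothetical.

```python
from collections import deque


def assign_sequence_numbers(graph):
    """Number the nodes of {path: set of paths it depends on}, starting at 1.

    Assumption: Kahn-style order, so a depended-on file always receives a
    larger number than the files that depend on it.
    """
    remaining_in = {path: 0 for path in graph}
    for deps in graph.values():
        for dep in deps:
            remaining_in[dep] += 1

    # Start from the nodes that no other file depends on (in-degree zero).
    queue = deque(sorted(p for p, n in remaining_in.items() if n == 0))
    seq = {}
    counter = 0
    while queue:
        node = queue.popleft()
        counter += 1  # sequence number = previous sequence number + 1
        seq[node] = counter
        for dep in sorted(graph[node]):
            remaining_in[dep] -= 1
            if remaining_in[dep] == 0:
                queue.append(dep)

    # Fallback (assumption): files caught in dependency cycles are numbered last.
    for node in sorted(graph):
        if node not in seq:
            counter += 1
            seq[node] = counter
    return seq
```

With this order, sorting conflicts from the largest sequence number to the smallest processes a depended-on file before the files that depend on it.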
In step 5, the graphs formed by the files on different branches are merged. The comparison starts from the nodes with zero in-degree, and two nodes match when the full path names of the files they carry are the same. If the match succeeds, the node with the larger sequence number is copied as the new node. A node that is not matched is inserted into the merged graph according to its position relative to the other nodes in the graph of the branch it belongs to. The merged graph thus contains all files of the whole project and the dependencies among them, so that nothing is omitted.
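A minimal sketch of this merge for two branches follows. It assumes each branch is represented by an adjacency mapping plus a mapping from path to sequence number (both hypothetical representations), matches nodes by full path name, keeps the larger sequence number for matched nodes, and carries unmatched nodes over with the edges and sequence number they had on their own branch. Taking the union of the edges of matched nodes is an assumption; the text does not say how differing edges are reconciled.

```python
def merge_branch_graphs(graph_a, seq_a, graph_b, seq_b):
    """Merge two per-branch graphs keyed by full path name.

    graph_x: {path: set of paths it depends on}
    seq_x:   {path: sequence number on that branch}
    Returns (merged_graph, merged_seq).
    """
    merged_graph = {}
    merged_seq = {}
    for path in set(graph_a) | set(graph_b):
        # Assumption: a matched node keeps the dependencies from both branches.
        merged_graph[path] = set(graph_a.get(path, ())) | set(graph_b.get(path, ()))
        # Matched nodes keep the larger sequence number; a node found on only
        # one branch keeps the number it had there.
        merged_seq[path] = max(seq_a.get(path, 0), seq_b.get(path, 0))
    return merged_graph, merged_seq
```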
In step 6, the merged directed graph is traversed; when the full path name carried by a node is that of a conflicting Java file, the full path name is saved, and the saved names are ordered by node sequence number from large to small. The saved order is the order in which the conflict files are processed.
Specifically, the graph merged in step 5 is traversed starting from the nodes with zero in-degree, and the full path name of each node is compared with the full path names in the conflict file list collected in step 3. If the full path name of the node exists in the conflict file list, the node is stored in a set, and the stored nodes are ordered by sequence number from large to small. The larger the sequence number of a node, the earlier its method-level conflicts are processed; the smaller the sequence number, the later they are processed.
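The ordering in step 6 reduces to a filter and a descending sort, as in the sketch below. The function name `order_conflict_files` and the inputs (a mapping from full path name to sequence number for the merged graph, and a list of conflicting paths) are hypothetical; breaking ties by path name is an added assumption to keep the result deterministic.

```python
def order_conflict_files(merged_seq, conflict_paths):
    """Return the conflicting files in processing order.

    merged_seq:     {full_path: sequence number} for the merged graph
    conflict_paths: full path names of the conflicting Java files
    """
    conflicts = set(conflict_paths)
    # Keep only nodes whose full path name appears in the conflict list.
    matched = [path for path in merged_seq if path in conflicts]
    # Largest sequence number first; ties broken by path (assumption).
    return sorted(matched, key=lambda path: (-merged_seq[path], path))
```

For example, with sequence numbers `{"A": 1, "B": 3, "C": 2}` and conflicts in `A` and `B`, the processing order is `B` followed by `A`.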
In step 7, a Java project contains modules. When there is no dependency between modules, several independent graphs are constructed; when two modules depend on each other, the Java files of the two modules form one graph. Each graph is traversed, the full path names of the files carried by the nodes are compared with the full path names of the conflicting Java files, and the conflicting Java files that appear in the same graph are placed in the same group. In the prior art, the conflicting files are processed directly. With grouping, when a large project has many conflicts, a functional module can be compiled as soon as one group of conflicts has been resolved, which improves the efficiency of conflict handling.
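The grouping in step 7 can be sketched by treating each independent graph as a connected component of the dependency graph with edge directions ignored. That reading of "the same graph" is an assumption, and the name `group_conflict_files` is hypothetical; the input adjacency mapping is assumed to list every dependency as a key.

```python
def group_conflict_files(graph, conflict_paths):
    """Group conflicting files by the independent graph they belong to.

    graph: {path: set of paths it depends on}
    Returns a list of groups, each a sorted list of conflicting paths.
    """
    # Ignore edge direction so that dependent modules end up in one component.
    neighbours = {path: set() for path in graph}
    for path, deps in graph.items():
        for dep in deps:
            neighbours[path].add(dep)
            neighbours[dep].add(path)

    conflicts = set(conflict_paths)
    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first walk over one independent graph.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for other in neighbours[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        group = sorted(p for p in component if p in conflicts)
        if group:
            groups.append(group)
    return groups
```

Each returned group corresponds to one set of modules that can be compiled once its conflicts are resolved, independently of the other groups.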

Claims (8)

1. A code warehouse Java conflict file sorting and grouping method based on a directed graph is characterized by comprising the following steps:
step 1: constructing a dependency graph for all files of the Java project in the code repository according to the dependencies among the files, wherein one graph is constructed on each branch of the code repository;
step 2: constructing a new graph from the dependency graph, wherein each node in the graph carries an out-degree, an in-degree, a full file path name, and a sequence number;
step 3: acquiring conflict information for a merge scenario in the code repository, and storing the information of all conflicting files in a file;
step 4: ordering the nodes in the graph, starting the traversal from a node with zero in-degree and, each time a node is reached, adding one to the in-degree of the node and adding one to the sequence number;
step 5: merging the graphs of the branches in the code repository to obtain a merged graph;
step 6: traversing the merged graph to sort, comparing the full path names of the conflicting files collected in step 3 with the full path names of the nodes, and collecting the full path names of the successfully matched files into a file in descending order of node sequence number;
step 7: grouping the conflicting files by traversing each graph and placing the conflicting files in each graph in the same group.
2. The directed graph-based code repository Java conflict file ordering grouping method according to claim 1, wherein: in step 1, for a project whose conflicts need to be sorted and grouped, all Java files and the conflicting Java files in the project are extracted separately, and a dependency graph is constructed from the dependencies among all the Java files.
3. The directed graph-based code repository Java conflict file ordering grouping method according to claim 1, wherein: in step 2, a dependency graph is constructed on each branch in the code repository, the sequence number is initially zero, the in-degree indicates the number of files that depend on the file, and the out-degree indicates the number of other files that the file depends on.
4. The directed graph-based code repository Java conflict file ordering grouping method according to claim 1, wherein: the conflict information in step 3 includes the path and name of each file in which a conflict occurs.
5. The directed graph-based code repository Java conflict file ordering grouping method according to claim 1, wherein: in step 4, the traversal starts from a node with zero in-degree, and each time the traversal advances to a node, the sequence number of that node equals the sequence number of the previously traversed node plus one.
6. The directed graph-based code repository Java conflict file ordering grouping method according to claim 1, wherein: in step 5, nodes are matched according to the full path names of the files they carry; when two nodes match but their sequence numbers differ, the larger sequence number is kept; a node that is not matched is inserted into the merged graph according to its relative position in its own graph, so that the merged graph contains all files of the whole project and the dependencies among them, and nothing is omitted.
7. The directed graph-based code repository Java conflict file ordering grouping method according to claim 1, wherein: in step 6, the merged directed graph is traversed; when the full path name carried by a node is that of a conflicting Java file, the full path name is saved, the saved names are ordered by node sequence number from large to small, and the saved order is the order in which the conflict files are processed.
8. The directed graph-based code repository Java conflict file ordering grouping method according to claim 1, wherein: in step 7, the Java project contains modules; when there is no dependency between modules, several independent graphs are constructed, and when two modules depend on each other, the Java files of the two modules form one graph; each graph is traversed, the full path names of the files carried by the nodes are compared with the full path names of the conflicting Java files, and the conflicting Java files that appear in the same graph are placed in the same group.
CN202210597420.7A 2022-05-30 2022-05-30 Code warehouse Java conflict file sorting and grouping method based on directed graph Active CN114675839B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210597420.7A CN114675839B (en) 2022-05-30 2022-05-30 Code warehouse Java conflict file sorting and grouping method based on directed graph

Publications (2)

Publication Number Publication Date
CN114675839A true CN114675839A (en) 2022-06-28
CN114675839B CN114675839B (en) 2022-08-30

Family

ID=82079687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210597420.7A Active CN114675839B (en) 2022-05-30 2022-05-30 Code warehouse Java conflict file sorting and grouping method based on directed graph

Country Status (1)

Country Link
CN (1) CN114675839B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105468650A (en) * 2014-09-12 2016-04-06 阿里巴巴集团控股有限公司 Merge conflict processing method and device and conflicting file processing method and device
CN110990055A (en) * 2019-12-19 2020-04-10 南京邮电大学 Pull Request function classification method based on program analysis
CN111190583A (en) * 2019-12-31 2020-05-22 华为技术有限公司 Associated conflict block presenting method and equipment
CN112965913A (en) * 2021-03-26 2021-06-15 东北大学 Method for automatically repairing dependency conflict problem of Java software

Also Published As

Publication number Publication date
CN114675839B (en) 2022-08-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant