JP2017142712A

JP2017142712A - Call graph difference extraction method, call graph difference extraction program, and information processing device

Info

Publication number: JP2017142712A
Application number: JP2016024514A
Authority: JP
Inventors: 晃治山本; Koji Yamamoto; 竜一梅川; Ryuichi Umekawa
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2016-02-12
Filing date: 2016-02-12
Publication date: 2017-08-17

Abstract

PROBLEM TO BE SOLVED: To efficiently extract edge differences when two call graphs are compared to detect differences between their edges. SOLUTION: An information processing device 10 classifies a plurality of first edges 1e-1g in a first call graph 1 into per-distance categories according to their distance from a first node 1a that is a starting point, and likewise classifies a plurality of second edges 2f-2i in a second call graph 2 according to their distance from a second node 2a that is a starting point. The device identifies, as difference candidate first edges, first edges for which no second edge of the same category has the same call relation, and, as difference candidate second edges, second edges for which no first edge of the same category has the same call relation. It then extracts, as difference first edges, the difference candidate first edges for which no difference candidate second edge has the same call relation, and, as difference second edges, the difference candidate second edges for which no difference candidate first edge has the same call relation. SELECTED DRAWING: Figure 1

Description

The present invention relates to a call graph difference extraction method, a call graph difference extraction program, and an information processing apparatus.

A program executed by a computer may define calls to other programs as subroutines. In large-scale software, these call relationships between programs become intricately intertwined. The call relationships between the programs in the software are therefore sometimes represented as a graph so that they can be grasped visually. Such a graph is called a call graph.

A call graph is a directed graph composed of a set of nodes representing programs and a set of edges representing call relationships between programs. In a call graph, an edge is drawn as a directed line from the node corresponding to the calling program to the node corresponding to the called program.

Such a call graph can be used to determine whether a modification to some programs in the software affects other programs. For example, comparing a call graph showing the call relationships of the software before the modification with one showing the call relationships after the modification reveals changes such as a change in the caller of a subroutine. If the subroutine contains processing that depends on its caller, it follows that the subroutine's program also needs to be modified.

Techniques for grasping program structure using graphs include, for example, a hierarchical graph analysis method that displays the structure of a computer program concisely as a directed graph. A work support system for identifying the cause of application failures has also been proposed; it uses a call graph to locate the processing that is the root cause when a failure occurs in an application. In addition, a source code equivalence verification apparatus has been proposed that, when verifying the equivalence of source code, compares structures using structure graphs obtained by analyzing the source code.

JP-A-6-187138; JP 2007-241426 A; JP 2014-126985 A

However, when two call graphs are compared to detect edge differences and both contain a large number of edges, it is difficult to complete the comparison within a practical time. For example, if each edge in one call graph is compared against every edge in the other, the edge identity determination is performed as many times as the product of the two graphs' edge counts. In large-scale software such as an OS (Operating System), a call graph may contain as many as one million edges, so the number of identity determinations becomes enormous; two graphs of one million edges each would require up to 10^12 determinations. As a result, comparing the two call graphs takes a long time.

In one aspect, an object of the present disclosure is to enable edge differences to be extracted efficiently.

In one proposal, a call graph difference extraction method is provided in which a computer performs the following processing.
In this call graph difference extraction method, when a starting point program module is designated, the computer refers to a first call graph in which the call relationships among a plurality of first program modules in first software that includes the starting point program module are represented by a plurality of first edges connecting a plurality of first nodes indicating the first program modules. Based on the first call graph, the computer classifies the first edges into per-distance categories according to the distance traveled from the first node corresponding to the starting point program module, following the call relationships, to reach each first edge. Likewise, based on a second call graph in which the call relationships among a plurality of second program modules in second software that includes the starting point program module are represented by a plurality of second edges connecting a plurality of second nodes indicating the second program modules, the computer classifies the second edges into per-distance categories according to the distance traveled from the second node corresponding to the starting point program module, following the call relationships, to reach each second edge. Next, taking each first edge in turn as a target first edge, the computer identifies the target first edge as a difference candidate first edge when none of the second edges in the same category has the same pair of caller and callee program modules as the target first edge. Similarly, taking each second edge in turn as a target second edge, the computer identifies the target second edge as a difference candidate second edge when none of the first edges in the same category has the same pair of caller and callee program modules as the target second edge.
Further, the computer extracts, from among the difference candidate first edges, difference first edges for which no difference candidate second edge has the same pair of caller and callee program modules. The computer also extracts, from among the difference candidate second edges, difference second edges for which no difference candidate first edge has the same pair of caller and callee program modules.

According to one aspect, edge differences can be extracted efficiently.

A diagram showing a configuration example of the information processing apparatus according to the first embodiment.
A diagram showing an example of call graph generation.
A diagram showing an example of the data structure of a call graph edge.
A diagram showing examples of edge distances on a call graph.
A diagram showing an example of the call graph of updated source code.
A diagram showing a configuration example of the hardware of a computer used in the second embodiment.
A block diagram showing the functions of a computer for call graph comparison.
A diagram showing an example of data in the first and second call graph storage units.
A diagram showing an example of data in the first and second distance-specific edge list storage units.
A diagram showing an example of data in the first and second difference edge storage units.
A diagram showing an example of data in the first and second existing edge storage units.
A diagram showing an example of data in the difference extraction result storage unit.
A flowchart showing the procedure of the difference edge extraction process.
A flowchart showing an example of the procedure of the distance-specific edge classification process.
A flowchart showing an example of the procedure of the same-distance edge comparison process.
A flowchart showing an example of the procedure of the different-distance edge existence confirmation process.
A diagram showing a comparison between the first reference method and the method according to the second embodiment.
A diagram showing a comparison between the second reference method and the method according to the second embodiment.

Hereinafter, the embodiments will be described with reference to the drawings. The embodiments may be combined with one another to the extent that no contradiction arises.
[First Embodiment]
First, the first embodiment will be described.

FIG. 1 is a diagram illustrating a configuration example of an information processing apparatus according to the first embodiment. The information processing apparatus 10 includes a storage unit 11 and a calculation unit 12.
The storage unit 11 stores a first call graph 1 and a second call graph 2. The first call graph 1 is a directed graph in which the call relationships among a plurality of first program modules in first software are represented by a plurality of first edges 1e to 1g connecting a plurality of first nodes 1a to 1d that indicate the first program modules. The second call graph 2 is a directed graph in which the call relationships among a plurality of second program modules in second software are represented by a plurality of second edges 2f to 2i connecting a plurality of second nodes 2a to 2e that indicate the second program modules. The identifier of the first software is assumed to be "1", and the identifier of the second software is assumed to be "2".

The calculation unit 12 includes a classification unit 12a, a specifying unit 12b, and an extraction unit 12c.
When a starting point program module included in both the first software and the second software is designated, the classification unit 12a classifies the first edges 1e to 1g in the first call graph 1 and the second edges 2f to 2i in the second call graph 2 into a plurality of categories. For example, the classification unit 12a identifies, based on the first call graph 1, the first node 1a corresponding to the starting point program module. The classification unit 12a then classifies the first edges 1e to 1g into per-distance categories according to the distance traveled from the identified first node 1a, following the call relationships, to reach each of the first edges 1e to 1g. For example, the classification unit 12a stores the first edges 1e to 1g, divided by category, in a first distance-specific edge list 3. Each edge registered in the first distance-specific edge list 3 is represented by a pair (program name pair) consisting of the name of the calling program module and the name of the called program module in the corresponding call relationship.

The classification unit 12a also identifies, based on the second call graph 2, the second node 2a corresponding to the starting point program module. The classification unit 12a then classifies the second edges 2f to 2i into per-distance categories according to the distance traveled from the identified second node 2a, following the call relationships, to reach each of the second edges 2f to 2i. For example, the classification unit 12a stores the second edges 2f to 2i, divided by category, in a second distance-specific edge list 4. Each edge registered in the second distance-specific edge list 4 is represented by a pair (program name pair) consisting of the name of the calling program module and the name of the called program module in the corresponding call relationship.
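As a minimal sketch of the distance-based classification described above, the following Python function groups edges by breadth-first distance from a starting node. The function name `classify_edges_by_distance`, the convention that edges leaving the starting node have distance 1, and the use of the shortest distance when an edge can be reached along several paths are assumptions made for illustration; the text here does not fix these details.

```python
from collections import defaultdict, deque


def classify_edges_by_distance(edges, start):
    """Group (caller, callee) name pairs by distance from `start`.

    Assumptions (not stated in the text): edges leaving the starting node
    have distance 1, and each edge is filed under its shortest distance.
    Edges that cannot be reached from `start` are left out.
    """
    callees = defaultdict(list)
    for caller, callee in edges:
        callees[caller].append(callee)

    node_depth = {start: 0}          # breadth-first depth of each node
    by_distance = defaultdict(set)   # distance -> set of edges
    queue = deque([start])
    while queue:
        caller = queue.popleft()
        for callee in callees.get(caller, []):
            by_distance[node_depth[caller] + 1].add((caller, callee))
            if callee not in node_depth:
                node_depth[callee] = node_depth[caller] + 1
                queue.append(callee)
    return dict(by_distance)


# Hypothetical four-node graph with a cycle back to the starting node "a".
first_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("d", "a")]
first_distance_list = classify_edges_by_distance(first_edges, "a")
```

Because every node is dequeued once, each reachable edge lands in exactly one category, which keeps the later per-category comparison simple.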

The specifying unit 12b takes each of the first edges 1e to 1g in turn as a target first edge. When none of the second edges in the same category as the target first edge has the same pair of caller and callee program modules as the target first edge, the specifying unit 12b identifies the target first edge as a difference candidate first edge.

The specifying unit 12b also takes each of the second edges 2f to 2i in turn as a target second edge. When none of the first edges in the same category as the target second edge has the same pair of caller and callee program modules as the target second edge, the specifying unit 12b identifies the target second edge as a difference candidate second edge.

For example, the specifying unit 12b compares the first edges and second edges that belong to the same category in the first distance-specific edge list 3 and the second distance-specific edge list 4. The specifying unit 12b stores in a first difference edge list 5, as difference candidate first edges, the first edges for which no second edge with the same program name pair exists in the same category. Likewise, the specifying unit 12b stores in a second difference edge list 6, as difference candidate second edges, the second edges for which no first edge with the same program name pair exists in the same category.
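The per-category comparison can be sketched as follows. This is an illustrative sketch only: the function name `find_difference_candidates` and the dictionary-of-sets layout are assumptions, and Python set difference stands in for the pairwise comparison of program name pairs.

```python
def find_difference_candidates(first_by_distance, second_by_distance):
    """Return edges with no same-named partner in the same distance category.

    Both arguments map a distance to a set of (caller, callee) name pairs.
    Set difference replaces the pairwise comparison described in the text.
    """
    first_candidates, second_candidates = set(), set()
    for distance in set(first_by_distance) | set(second_by_distance):
        first = first_by_distance.get(distance, set())
        second = second_by_distance.get(distance, set())
        first_candidates |= first - second
        second_candidates |= second - first
    return first_candidates, second_candidates


# Hypothetical data: the second software inserts node "x" between "a" and "b",
# which pushes the unchanged edge ("b", "c") from distance 2 to distance 3.
first_list = {1: {("a", "b")}, 2: {("b", "c")}}
second_list = {1: {("a", "x")}, 2: {("x", "b")}, 3: {("b", "c")}}
cand1, cand2 = find_difference_candidates(first_list, second_list)
```

In this example the edge `("b", "c")` becomes a candidate on both sides even though it exists in both graphs, which is why a further check across categories is still needed.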

The extraction unit 12c extracts, from among the difference candidate first edges, difference first edges for which no difference candidate second edge has the same pair of caller and callee program modules. The extraction unit 12c also extracts, from among the difference candidate second edges, difference second edges for which no difference candidate first edge has the same pair of caller and callee program modules.

For example, the extraction unit 12c compares every difference candidate first edge in the first difference edge list 5 against every difference candidate second edge in the second difference edge list 6. The extraction unit 12c stores in a final difference edge list 7, in association with the identifier of the first software, each difference candidate first edge whose program name pair differs from that of every comparison partner. The extraction unit 12c likewise stores in the final difference edge list 7, in association with the identifier of the second software, each difference candidate second edge whose program name pair differs from that of every comparison partner.
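A minimal sketch of this final all-against-all comparison is shown below. The function name `extract_difference_edges` and the list-of-tuples result format are assumptions for illustration; the identifiers "1" and "2" follow the software identifiers used in the text.

```python
def extract_difference_edges(first_candidates, second_candidates):
    """Compare the two candidate lists all-against-all and keep unmatched edges.

    Each result entry pairs a software identifier ("1" or "2") with an edge.
    """
    final = []
    for edge in first_candidates:
        if all(edge != other for other in second_candidates):
            final.append(("1", edge))
    for edge in second_candidates:
        if all(edge != other for other in first_candidates):
            final.append(("2", edge))
    return final


# Hypothetical candidate lists; ("b", "c") appears on both sides and is dropped.
first_difference_list = [("a", "b"), ("b", "c")]
second_difference_list = [("a", "x"), ("x", "b"), ("b", "c")]
final_difference_list = extract_difference_edges(
    first_difference_list, second_difference_list
)
```

The nested comparison is quadratic, but it runs only over the candidate lists, which are expected to be far shorter than the full edge lists.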

According to such an information processing apparatus 10, when a starting point program module is designated, the edges of the first call graph 1 and the second call graph 2 are classified into categories according to their distance from the starting nodes 1a and 2a. Next, the specifying unit 12b identifies difference candidate first edges and difference candidate second edges by comparing first edges and second edges of the same category. The extraction unit 12c then extracts, from among the difference candidate first edges and difference candidate second edges, the difference first edges and difference second edges that exist in only one of the first call graph 1 and the second call graph 2.

In the first embodiment, the specifying unit 12b narrows down the candidate edges for extraction, and the difference first edges and difference second edges are extracted from those candidates, so difference edges can be extracted efficiently. As a result, the time required to extract the differences between call graphs is reduced.

The calculation unit 12 can also use hash values of the difference candidate first edges and difference candidate second edges to make the extraction by the extraction unit 12c more efficient. In that case, the specifying unit 12b generates a first existing edge list in which the hash values of the difference candidate first edges are registered and a second existing edge list in which the hash values of the difference candidate second edges are registered. A hash value is obtained by applying a predetermined hash function to the program name pair of a difference candidate first edge or a difference candidate second edge. The extraction unit 12c extracts, as difference first edges, the difference candidate first edges whose hash values differ from every hash value registered in the second existing edge list. The extraction unit 12c likewise extracts, as difference second edges, the difference candidate second edges whose hash values differ from every hash value registered in the first existing edge list.
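The hash-based variant might look like the following sketch. The text only says "a predetermined hash function"; the choice of BLAKE2b with an 8-byte digest, the NUL separator between the two names, and the function names `edge_hash` and `extract_with_hashes` are all assumptions for illustration.

```python
import hashlib


def edge_hash(edge):
    """Hash a (caller, callee) name pair to 8 bytes (assumed hash function)."""
    caller, callee = edge
    data = caller.encode("utf-8") + b"\0" + callee.encode("utf-8")
    return hashlib.blake2b(data, digest_size=8).digest()


def extract_with_hashes(first_candidates, second_candidates):
    """Keep candidates whose hash is absent from the other side's hash list."""
    first_existing = {edge_hash(e) for e in first_candidates}
    second_existing = {edge_hash(e) for e in second_candidates}
    diff_first = [e for e in first_candidates
                  if edge_hash(e) not in second_existing]
    diff_second = [e for e in second_candidates
                   if edge_hash(e) not in first_existing]
    return diff_first, diff_second


# Hypothetical candidate edges with made-up method names.
cand_first = [("A.f()", "B.g()"), ("A.f()", "C.h()")]
cand_second = [("A.f()", "C.h()"), ("D.i()", "E.j()")]
diff_first, diff_second = extract_with_hashes(cand_first, cand_second)
```

One caveat of this sketch: two different name pairs that happened to hash to the same value would be treated as matching, so the digest size trades memory against the risk of missing a difference.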

Using hash values in this way compresses very long program name pairs into short data, enabling efficient processing with a small amount of memory.
The calculation unit 12 can also make the identification of difference candidate first edges and difference candidate second edges more efficient by sorting the first edges and the second edges within each category. In that case, the specifying unit 12b sorts the first edges and the second edges by a predetermined criterion for each category. Next, the specifying unit 12b takes as comparison targets a first edge group and a second edge group classified into the same distance category. The specifying unit 12b selects target first edges in order from the top of the first edge group and target second edges in order from the top of the second edge group. It then determines whether the target first edge and the target second edge have the same pair of caller and callee program modules. If they do not, the specifying unit 12b identifies whichever of the target first edge and the target second edge ranks higher under the predetermined sort criterion as a difference candidate first edge or a difference candidate second edge.

When the specifying unit 12b has identified a difference candidate first edge, it changes the target first edge to the next first edge in the first edge group. When it has identified a difference candidate second edge, it changes the target second edge to the next second edge in the second edge group. When it determines that the pairs of caller and callee program modules are the same, the specifying unit 12b changes the target first edge to the next first edge in the first edge group and also changes the target second edge to the next second edge in the second edge group.

Comparing edges for identity after sorting the first edges and second edges within each category in this way reduces the number of comparisons relative to an all-against-all comparison. As a result, processing becomes more efficient.
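The sorted comparison within one category can be sketched as a two-pointer merge. The function name `merge_compare` is hypothetical, ascending tuple order is assumed as the "predetermined criterion", and treating the leftover edges of the longer group as candidates once the other group is exhausted is an assumption, since that case is not spelled out in the text above.

```python
def merge_compare(first_edges, second_edges):
    """Find difference candidates in one category with a sorted two-pointer scan.

    Assumes each group contains no duplicate edges.
    """
    first_sorted = sorted(first_edges)
    second_sorted = sorted(second_edges)
    i = j = 0
    first_candidates, second_candidates = [], []
    while i < len(first_sorted) and j < len(second_sorted):
        a, b = first_sorted[i], second_sorted[j]
        if a == b:                 # same name pair: advance both targets
            i += 1
            j += 1
        elif a < b:                # first edge ranks higher: it has no partner
            first_candidates.append(a)
            i += 1
        else:                      # second edge ranks higher: it has no partner
            second_candidates.append(b)
            j += 1
    # Assumption: whatever remains in either group has no partner.
    first_candidates.extend(first_sorted[i:])
    second_candidates.extend(second_sorted[j:])
    return first_candidates, second_candidates


# Hypothetical edge groups from the same distance category.
group1 = [("c", "e"), ("a", "b"), ("a", "d")]
group2 = [("a", "d"), ("f", "g"), ("a", "c")]
only1, only2 = merge_compare(group1, group2)
```

After sorting, the scan touches each edge at most once, so the number of comparisons grows with the sum of the two group sizes rather than their product.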

The calculation unit 12 can be realized by, for example, a processor included in the information processing apparatus 10. The storage unit 11 can be realized by, for example, a memory or storage device included in the information processing apparatus 10. The classification unit 12a, the specifying unit 12b, and the extraction unit 12c in the calculation unit 12 can be realized by, for example, causing the calculation unit 12 to execute a program. In that case, the first distance-specific edge list 3, the second distance-specific edge list 4, the first difference edge list 5, the second difference edge list 6, and the final difference edge list 7 are temporarily stored in, for example, the memory or storage device included in the information processing apparatus 10.

This concludes the description of the first embodiment.
[Second Embodiment]
Next, a second embodiment will be described. In the second embodiment, a computer is used to efficiently extract the edge differences between call graphs created from two pieces of software written using object-oriented programming. In the call graphs of such software, the methods constituting the software are represented by nodes, and the call relationships between methods are represented by edges. That is, a method is an example of the program module described in the first embodiment.

FIG. 2 is a diagram illustrating an example of call graph generation. As shown in FIG. 2, analyzing source code 31 yields a call graph 32 showing the call relationships between methods.
For example, analysis of the source code 31 shows that the methods "Tranceiver.transmit()", "Transformer.encode()", "Protocol.makeHeader()", and "Tranceiver.send()" exist. Nodes 32a to 32d corresponding to these methods are therefore generated in the call graph 32. The analysis also shows that the method "Tranceiver.transmit()" calls the other three methods "Transformer.encode()", "Protocol.makeHeader()", and "Tranceiver.send()". Edges 32e to 32g are therefore generated to connect the node 32a corresponding to the method "Tranceiver.transmit()" with the other three nodes 32b to 32d. The edges 32e to 32g are arrows pointing from the node 32a corresponding to the caller method to the nodes 32b to 32d corresponding to the callee methods.

In this way, a call graph 32 is generated in which the nodes 32a to 32d represent methods and the edges 32e to 32g represent the call relationships between methods. Each of the edges 32e to 32g of the call graph 32 can be expressed as a pair of character strings.

FIG. 3 is a diagram illustrating an example of the data structure of a call graph edge. Each of the edges 32e to 32g contains the fully qualified name string of the caller method and the fully qualified name string of the callee method. The string on the left is the fully qualified name of the caller method, and the string on the right is the fully qualified name of the callee method. An edge is identified by such a pair of strings. In other words, edges with the same pair of strings can be judged to represent the same call relationship.

The strings representing the edges 32e to 32g include the fully qualified names of the methods corresponding to the nodes 32a to 32d. Therefore, given the information on the edges 32e to 32g, the entire call graph 32, including the nodes 32a to 32d, can be generated.
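To illustrate the string-pair representation, the following sketch stores the three edges from the example above and recovers the node set from them. The `Edge` type and the `nodes_of` helper are hypothetical names introduced here; the method names are taken from the example as written.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """A call relationship as a pair of fully qualified method names."""
    caller: str
    callee: str


# The three edges of the example call graph 32.
edges_32 = [
    Edge("Tranceiver.transmit()", "Transformer.encode()"),
    Edge("Tranceiver.transmit()", "Protocol.makeHeader()"),
    Edge("Tranceiver.transmit()", "Tranceiver.send()"),
]


def nodes_of(edges):
    """Recover the node set: every name that appears as a caller or a callee."""
    return {name for edge in edges for name in edge}


nodes_32 = nodes_of(edges_32)
```

Because a named tuple compares field by field, two `Edge` values with the same pair of strings are equal, which matches the rule that identical string pairs denote the same call relationship.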

また、複数のエッジは、データ構造に含まれる文字列により、同一、および大小を比較することができる。例えば、図３に示すように呼び出し元メソッドを示す文字列と、呼び出し先メソッドを示す文字列との組で表される２つのエッジ（Ｎ１，Ｎ２），（Ｎ３，Ｎ４）があるものとする。ここで、Ｎ１，Ｎ２，Ｎ３，Ｎ４は、文字列である。このとき、各エッジそれらが同一であるかは以下の条件で判定できる。
（Ｎ１，Ｎ２）＝（Ｎ３，Ｎ４）⇔Ｎ１＝Ｎ３かつＮ２＝Ｎ４
さらに２つのエッジは次のように大小比較できる。
（Ｎ１，Ｎ２）＜（Ｎ３，Ｎ４）⇔Ｎ１＜Ｎ３または｛Ｎ１＝Ｎ３かつＮ２＜Ｎ４｝
（Ｎ１，Ｎ２）≦（Ｎ３，Ｎ４）⇔Ｎ１≦Ｎ３または｛Ｎ１＝Ｎ３かつＮ２≦Ｎ４｝
なお、文字列同士の大小関係は、例えばアルファベット順とする。その場合、「Ａ」が最も小さい値となり、「Ｚ」が最も大きな値となる。この比較方法を使って一意の順序にエッジをソートできる。 In addition, edges can be tested for equality and ordered relative to one another by means of the character strings in their data structure. For example, as shown in FIG. 3, suppose there are two edges (N1, N2) and (N3, N4), each represented by a pair consisting of a character string indicating the caller method and a character string indicating the callee method. Here, N1, N2, N3, and N4 are character strings. Whether the two edges are identical can be determined by the following condition.
(N1, N2) = (N3, N4) ⇔ N1 = N3 and N2 = N4
Furthermore, two edges can be compared in magnitude as follows.
(N1, N2) < (N3, N4) ⇔ N1 < N3 or {N1 = N3 and N2 < N4}
(N1, N2) ≦ (N3, N4) ⇔ N1 ≦ N3 or {N1 = N3 and N2 ≦ N4}
The ordering of character strings is, for example, alphabetical. In that case, "A" is the smallest value and "Z" is the largest value. With this comparison method, edges can be sorted into a unique order.
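As an illustration of the edge data structure and comparison rules described above, the following is a minimal Python sketch, not the implementation of the embodiment. The class name `Edge`, the helper `nodes_of`, and the method names in the sample data are hypothetical. The sketch relies on Python's built-in tuple comparison, which compares the caller string first and the callee string second, and on Python's code-point string ordering as a stand-in for alphabetical order.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """A call-graph edge: a pair of fully qualified method names."""
    caller: str
    callee: str


def nodes_of(edges):
    """Recover the node set from the edges alone (every node name appears in some edge)."""
    return {name for edge in edges for name in edge}


# Hypothetical sample data; the method names are made up for illustration.
edges = [
    Edge("pkg.B.run", "pkg.C.close"),
    Edge("pkg.A.main", "pkg.B.run"),
    Edge("pkg.A.main", "pkg.B.init"),
]

# Tuples compare element by element: caller first, then callee,
# so sorting yields one well-defined order for any set of edges.
ordered = sorted(edges)
```

Because two edges are equal exactly when both strings are equal, such pairs can also be stored in sets or used as dictionary keys without extra code.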

このようなコールグラフ３２は、ソフトウェア開発時に利用される。例えばソフトウェア開発過程での版数の異なる複数のソースコードそれぞれから生成した複数のコールグラフにおけるエッジの差分抽出を行うことがある。コールグラフ間のエッジの差分抽出は、大規模なソフトウェアの一部を修正した場合における、その修正の影響範囲の把握に有用である。例えば、ソフトウェアの開発にあたり、ＯＳＳ（Open Source Software）など第三者が作成したソースコードセットの呼び出しを含むソフトウェアが作成されることがある。または第三者が作成したソースコードセットを改変したり、機能を追加したりして、ソフトウェアを完成させることもある。 Such a call graph 32 is used during software development. For example, edge differences may be extracted from a plurality of call graphs generated from a plurality of source codes having different versions in the software development process. Extracting edge differences between call graphs is useful for grasping the range of influence of a modification when a part of a large-scale software is modified. For example, in developing software, software including calling a source code set created by a third party such as OSS (Open Source Software) may be created. Or the source code set created by a third party may be modified or functions may be added to complete the software.

このようにして作成されたソフトウェアにおいて、呼び出されるソースコードセットが更新された場合、開発者は過去の版で呼び出したり改変したりしたメソッド（開発者が着目しているメソッド）が更新の影響を受けているか確認する。以下、開発者が着目しているメソッドに対応するノードを、起点ノードと呼ぶ。２つのコールグラフそれぞれから、起点ノードを起点として呼び出し関係を追跡し、追跡された呼び出し関係を比較して差分を抽出することにより、例えば更新の影響が起点メソッドに及ぶか否かを判断できる。また、その他に精査するメソッドを特定することもできる。 In software created in this way, when the source code set being called is updated, the developer checks whether the methods that were called or modified in the previous version (the methods the developer is focusing on) are affected by the update. Hereinafter, a node corresponding to a method the developer is focusing on is referred to as a starting node. By tracing the call relationships from the starting node in each of the two call graphs, comparing the traced call relationships, and extracting the differences, it is possible to determine, for example, whether the update affects the method at the starting node. Other methods that need to be scrutinized can also be identified.

第２の実施の形態では、このような呼び出し関係の差分抽出をコンピュータ処理によって高速に実施できるようにする。そのため第２の実施の形態では、コールグラフ内のエッジが、起点ノードからの距離で分類される。 In the second embodiment, this extraction of differences in call relationships can be performed at high speed by computer processing. To that end, in the second embodiment, the edges in a call graph are classified by their distance from the starting node.

図４は、コールグラフ上のエッジの距離の例を示す図である。図４に示すコールグラフ４０は、ノード４１ａ〜４１ｋとエッジ４２ａ〜４２ｋを含んでいる。コールグラフ４０上の起点ノードからｄ個（ｄは０以上の整数）のエッジを同一方向に辿って別のあるノードに到達できるとき、２のノードの距離をｄと定義する。このときのエッジを辿る方向（探索方向）は、呼び出し元から呼び出し先へ向かう方向（呼び出し先方向）の場合と、逆に、呼び出し先から呼び出し元へ向かう方向（呼び出し元方向）の場合とがある。なお、起点ノードから、その起点ノードまでの距離は０と定義する。 FIG. 4 is a diagram illustrating an example of edge distances on a call graph. The call graph 40 shown in FIG. 4 includes nodes 41a to 41k and edges 42a to 42k. When another node can be reached from the starting node on the call graph 40 by following d edges (d is an integer of 0 or more) in the same direction, the distance between the two nodes is defined as d. The direction in which the edges are followed (the search direction) is either the direction from the caller to the callee (callee direction) or, conversely, the direction from the callee to the caller (caller direction). The distance from the starting node to itself is defined as 0.

そしてノード間の距離を用いて、起点ノードに対するエッジの距離を定義できる。例えば、起点ノードからみて距離（ｄ−１）の位置にあるノードと距離ｄの位置にあるノードを結ぶエッジは、起点ノードからの距離をｄと定義する。 The distance of an edge from the starting node can then be defined using the distances between nodes. For example, an edge connecting a node at distance (d−1) from the starting node to a node at distance d is defined as having distance d from the starting node.

図４の例では、「メソッドｇ」に対応するノード４１ｇが起点ノードである。起点ノードから「メソッドｄ」に対応するノード４１ｄまでの距離は「１」である。そして、起点ノードであるノード４１ｇとノード「メソッドｄ」に対応するノード４１ｄとを接続するエッジ４２ｈの距離は「１」である。 In the example of FIG. 4, the node 41g corresponding to “method g” is the starting node. The distance from the starting node to the node 41d corresponding to “method d” is “1”. Then, the distance of the edge 42h connecting the node 41g as the starting node and the node 41d corresponding to the node “method d” is “1”.

起点ノードから「メソッドｆ」に対応するノード４１ｆまでの距離は「２」である。そして、起点ノードから距離「１」にあるノード４１ｄとノード「メソッドｆ」に対応するノード４１ｆとを接続するエッジ４２ｄの距離は「２」である。 The distance from the starting node to the node 41f corresponding to “method f” is “2”. Then, the distance of the edge 42d connecting the node 41d at the distance “1” from the starting node and the node 41f corresponding to the node “method f” is “2”.
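The distance definitions above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the classification procedure of the embodiment: it assumes node distance means the smallest number of edges from the starting node (breadth-first search), and it assumes an edge that leaves a node at distance d−1 is assigned distance d even when its other end was already reached by a shorter route. The function name `classify_edges_by_distance` and the `direction` values are hypothetical.

```python
from collections import defaultdict, deque


def classify_edges_by_distance(edges, start, direction="callee"):
    """Group (caller, callee) edges by their distance from `start`.

    direction="callee" follows edges from caller to callee;
    direction="caller" follows them from callee back to caller.
    Returns a dict mapping distance -> sorted list of edges.
    """
    # Index each edge by the end from which it is followed.
    adjacency = defaultdict(list)
    for caller, callee in edges:
        if direction == "callee":
            adjacency[caller].append((callee, (caller, callee)))
        else:
            adjacency[callee].append((caller, (caller, callee)))

    node_distance = {start: 0}  # the starting node is at distance 0
    by_distance = defaultdict(list)
    seen_edges = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor, edge in adjacency[node]:
            if edge in seen_edges:
                continue
            seen_edges.add(edge)
            # Assumption: an edge leaving a node at distance d-1 has distance d.
            by_distance[node_distance[node] + 1].append(edge)
            if neighbor not in node_distance:
                node_distance[neighbor] = node_distance[node] + 1
                queue.append(neighbor)
    return {d: sorted(es) for d, es in by_distance.items()}


# Hypothetical toy graph (not the graph of FIG. 4).
toy_edges = [("s", "a"), ("s", "b"), ("a", "c"), ("b", "c"), ("c", "a")]
callee_view = classify_edges_by_distance(toy_edges, "s", "callee")
caller_view = classify_edges_by_distance(toy_edges, "c", "caller")
```

Edges that cannot be reached from the starting node in the chosen direction are simply absent from the result, which matches the idea of tracing call relationships outward from the method of interest.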

ここで、図４に示したコールグラフ４０の生成元となったソースコードが更新された場合を想定する。この場合、更新後のソースコードについてもコールグラフを生成することで、そのコールグラフにより、更新後のソースコードにおけるメソッド間の呼び出し関係が表される。 Here, it is assumed that the source code that is the generation source of the call graph 40 illustrated in FIG. 4 is updated. In this case, by generating a call graph for the updated source code, the call relationship between the methods in the updated source code is represented by the call graph.

図５は、更新後のソースコードのコールグラフの例を示す図である。図５の例では、更新後のソースコードのコールグラフ４０ａを、図４に示す更新前のコールグラフ４０と比較すると、ノード４１ｋが削除され、ノード４１ｌが追加されている。ノード４１ｄとノード４１ｆとを接続するエッジ４２ｄは削除され、ノード４１ｄとノード４１ｌを接続するエッジ４２ｌおよびノード４１ｌとノード４１ｆを接続するエッジ４２ｍが追加されている。 FIG. 5 is a diagram illustrating an example of a call graph of the updated source code. In the example of FIG. 5, when the call graph 40a of the updated source code is compared with the call graph 40 before the update shown in FIG. 4, the node 41k is deleted and the node 41l is added. An edge 42d connecting the node 41d and the node 41f is deleted, and an edge 42l connecting the node 41d and the node 41l and an edge 42m connecting the node 41l and the node 41f are added.

以下、図４、図５に示した２つのコールグラフ４０，４０ａ間のエッジの差分抽出処理を高速に行うコンピュータについて、詳細に説明する。
＜ハードウェア構成＞
図６は、第２の実施の形態に用いるコンピュータのハードウェアの一構成例を示す図である。コンピュータ１００は、プロセッサ１０１によって装置全体が制御されている。プロセッサ１０１には、バス１０９を介してメモリ１０２と複数の周辺機器が接続されている。プロセッサ１０１は、マルチプロセッサであってもよい。プロセッサ１０１は、例えばＣＰＵ（Central Processing Unit）、ＭＰＵ（Micro Processing Unit）、またはＤＳＰ（Digital Signal Processor）である。プロセッサ１０１がプログラムを実行することで実現する機能の少なくとも一部を、ＡＳＩＣ（Application Specific Integrated Circuit）、ＰＬＤ（Programmable Logic Device）などの電子回路で実現してもよい。 Hereinafter, a computer that performs high-speed edge difference extraction processing between the two call graphs 40 and 40a illustrated in FIGS. 4 and 5 will be described in detail.
<Hardware configuration>
FIG. 6 is a diagram illustrating a configuration example of hardware of a computer used in the second embodiment. The computer 100 is entirely controlled by a processor 101. A memory 102 and a plurality of peripheral devices are connected to the processor 101 via a bus 109. The processor 101 may be a multiprocessor. The processor 101 is, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or a DSP (Digital Signal Processor). At least a part of the functions realized by the processor 101 executing the program may be realized by an electronic circuit such as an ASIC (Application Specific Integrated Circuit) or a PLD (Programmable Logic Device).

メモリ１０２は、コンピュータ１００の主記憶装置として使用される。メモリ１０２には、プロセッサ１０１に実行させるＯＳのプログラムやアプリケーションプログラムの少なくとも一部が一時的に格納される。また、メモリ１０２には、プロセッサ１０１による処理に必要な各種データが格納される。メモリ１０２としては、例えばＲＡＭ（Random Access Memory）などの揮発性の半導体記憶装置が使用される。 The memory 102 is used as a main storage device of the computer 100. The memory 102 temporarily stores at least part of an OS program and application programs to be executed by the processor 101. The memory 102 stores various data necessary for processing by the processor 101. As the memory 102, for example, a volatile semiconductor storage device such as a RAM (Random Access Memory) is used.

バス１０９に接続されている周辺機器としては、ストレージ装置１０３、グラフィック処理装置１０４、入力インタフェース１０５、光学ドライブ装置１０６、機器接続インタフェース１０７およびネットワークインタフェース１０８がある。 Peripheral devices connected to the bus 109 include a storage device 103, a graphic processing device 104, an input interface 105, an optical drive device 106, a device connection interface 107, and a network interface 108.

ストレージ装置１０３は、内蔵した記憶媒体に対して、電気的または磁気的にデータの書き込みおよび読み出しを行う。ストレージ装置１０３は、コンピュータの補助記憶装置として使用される。ストレージ装置１０３には、ＯＳのプログラム、アプリケーションプログラム、および各種データが格納される。なお、ストレージ装置１０３としては、例えばＨＤＤ（Hard Disk Drive）やＳＳＤ（Solid State Drive）を使用することができる。 The storage device 103 writes and reads data electrically or magnetically with respect to a built-in storage medium. The storage device 103 is used as an auxiliary storage device of a computer. The storage device 103 stores an OS program, application programs, and various data. For example, an HDD (Hard Disk Drive) or an SSD (Solid State Drive) can be used as the storage device 103.

グラフィック処理装置１０４には、モニタ２１が接続されている。グラフィック処理装置１０４は、プロセッサ１０１からの命令に従って、画像をモニタ２１の画面に表示させる。モニタ２１としては、ＣＲＴ（Cathode Ray Tube）を用いた表示装置や液晶表示装置などがある。 A monitor 21 is connected to the graphic processing device 104. The graphic processing device 104 displays an image on the screen of the monitor 21 in accordance with an instruction from the processor 101. Examples of the monitor 21 include a display device using a CRT (Cathode Ray Tube) and a liquid crystal display device.

入力インタフェース１０５には、キーボード２２とマウス２３とが接続されている。入力インタフェース１０５は、キーボード２２やマウス２３から送られてくる信号をプロセッサ１０１に送信する。なお、マウス２３は、ポインティングデバイスの一例であり、他のポインティングデバイスを使用することもできる。他のポインティングデバイスとしては、タッチパネル、タブレット、タッチパッド、トラックボールなどがある。 A keyboard 22 and a mouse 23 are connected to the input interface 105. The input interface 105 transmits signals sent from the keyboard 22 and the mouse 23 to the processor 101. The mouse 23 is an example of a pointing device, and other pointing devices can also be used. Examples of other pointing devices include a touch panel, a tablet, a touch pad, and a trackball.

光学ドライブ装置１０６は、レーザ光などを利用して、光ディスク２４に記録されたデータの読み取りを行う。光ディスク２４は、光の反射によって読み取り可能なようにデータが記録された可搬型の記録媒体である。光ディスク２４には、ＤＶＤ（Digital Versatile Disc）、ＤＶＤ−ＲＡＭ、ＣＤ−ＲＯＭ（Compact Disc Read Only Memory）、ＣＤ−Ｒ（Recordable）／ＲＷ（ReWritable）などがある。 The optical drive device 106 reads data recorded on the optical disc 24 using laser light or the like. The optical disc 24 is a portable recording medium on which data is recorded so that it can be read by reflection of light. The optical disc 24 includes a DVD (Digital Versatile Disc), a DVD-RAM, a CD-ROM (Compact Disc Read Only Memory), a CD-R (Recordable) / RW (ReWritable), and the like.

機器接続インタフェース１０７は、コンピュータ１００に周辺機器を接続するための通信インタフェースである。例えば機器接続インタフェース１０７には、メモリ装置２５やメモリリーダライタ２６を接続することができる。メモリ装置２５は、機器接続インタフェース１０７との通信機能を搭載した記録媒体である。メモリリーダライタ２６は、メモリカード２７へのデータの書き込み、またはメモリカード２７からのデータの読み出しを行う装置である。メモリカード２７は、カード型の記録媒体である。 The device connection interface 107 is a communication interface for connecting peripheral devices to the computer 100. For example, the memory device 25 and the memory reader / writer 26 can be connected to the device connection interface 107. The memory device 25 is a recording medium equipped with a communication function with the device connection interface 107. The memory reader / writer 26 is a device that writes data to the memory card 27 or reads data from the memory card 27. The memory card 27 is a card type recording medium.

ネットワークインタフェース１０８は、ネットワーク２０に接続されている。ネットワークインタフェース１０８は、ネットワーク２０を介して、他のコンピュータまたは通信機器との間でデータの送受信を行う。 The network interface 108 is connected to the network 20. The network interface 108 transmits and receives data to and from other computers or communication devices via the network 20.

以上のようなハードウェア構成によって、第２の実施の形態の処理機能を実現することができる。なお、第１の実施の形態に示した情報処理装置１０も、図６に示したコンピュータ１００と同様のハードウェアにより実現することができる。 With the hardware configuration described above, the processing functions of the second embodiment can be realized. The information processing apparatus 10 shown in the first embodiment can also be realized by hardware similar to the computer 100 shown in FIG.

コンピュータ１００は、例えばコンピュータ読み取り可能な記録媒体に記録されたプログラムを実行することにより、第２の実施の形態の処理機能を実現する。コンピュータ１００に実行させる処理内容を記述したプログラムは、様々な記録媒体に記録しておくことができる。例えば、コンピュータ１００に実行させるプログラムをストレージ装置１０３に格納しておくことができる。プロセッサ１０１は、ストレージ装置１０３内のプログラムの少なくとも一部をメモリ１０２にロードし、プログラムを実行する。またコンピュータ１００に実行させるプログラムを、光ディスク２４、メモリ装置２５、メモリカード２７などの可搬型記録媒体に記録しておくこともできる。可搬型記録媒体に格納されたプログラムは、例えばプロセッサ１０１からの制御により、ストレージ装置１０３にインストールされた後、実行可能となる。またプロセッサ１０１が、可搬型記録媒体から直接プログラムを読み出して実行することもできる。 The computer 100 implements the processing functions of the second embodiment by executing a program recorded on a computer-readable recording medium, for example. A program describing the processing content to be executed by the computer 100 can be recorded in various recording media. For example, a program to be executed by the computer 100 can be stored in the storage device 103. The processor 101 loads at least a part of the program in the storage apparatus 103 into the memory 102 and executes the program. A program to be executed by the computer 100 can be recorded on a portable recording medium such as the optical disc 24, the memory device 25, and the memory card 27. The program stored in the portable recording medium becomes executable after being installed in the storage apparatus 103 under the control of the processor 101, for example. The processor 101 can also read and execute a program directly from a portable recording medium.

＜処理機能＞
コンピュータ１００は、２つのソースコードセットそれぞれから作成されたコールグラフを比較し、エッジの差分を検出する。コールグラフ比較処理のためにコンピュータ１００が有する機能を図７に示す。 <Processing function>
The computer 100 compares call graphs created from each of the two source code sets and detects an edge difference. The functions of the computer 100 for the call graph comparison process are shown in FIG.

図７は、コールグラフ比較のためのコンピュータの機能を示すブロック図である。コンピュータ１００は、第１コールグラフ記憶部１１１、第２コールグラフ記憶部１１２、要求受付部１２０、距離別エッジ分類部１３０、第１距離別エッジリスト記憶部１４１、第２距離別エッジリスト記憶部１４２、同一距離間エッジ比較部１５０、第１差分エッジ記憶部１６１、第２差分エッジ記憶部１６２、第１存在エッジ記憶部１６３、第２存在エッジ記憶部１６４、差分エッジ抽出部１７０、差分抽出結果記憶部１８０、および結果返答部１９０を有する。このうち距離別エッジ分類部１３０は、第１の実施の形態における分類部１２ａの一例である。また同一距離間エッジ比較部１５０は、第１の実施の形態における特定部１２ｂの一例である。さらに差分エッジ抽出部１７０は、第１の実施の形態における抽出部１２ｃの一例である。 FIG. 7 is a block diagram illustrating the functions of the computer for call graph comparison. The computer 100 includes a first call graph storage unit 111, a second call graph storage unit 112, a request reception unit 120, an edge-by-distance classification unit 130, a first distance-specific edge list storage unit 141, a second distance-specific edge list storage unit 142, a same distance edge comparison unit 150, a first difference edge storage unit 161, a second difference edge storage unit 162, a first existence edge storage unit 163, a second existence edge storage unit 164, a difference edge extraction unit 170, a difference extraction result storage unit 180, and a result response unit 190. Of these, the edge-by-distance classification unit 130 is an example of the classification unit 12a in the first embodiment. The same distance edge comparison unit 150 is an example of the specifying unit 12b in the first embodiment. Furthermore, the difference edge extraction unit 170 is an example of the extraction unit 12c in the first embodiment.

第１コールグラフ記憶部１１１は、比較対象の２つのコールグラフのうちの一つを記憶する。例えば第１コールグラフ記憶部１１１は、改変前の版数のソフトウェアにおけるメソッド間の呼び出し関係を示すコールグラフを記憶する。 The first call graph storage unit 111 stores one of the two call graphs to be compared. For example, the first call graph storage unit 111 stores a call graph indicating a call relationship between methods in the software of the version number before modification.

第２コールグラフ記憶部１１２は、比較対象の２つのコールグラフのうちの他の一つを記憶する。例えば第２コールグラフ記憶部１１２は、改変後の版数のソフトウェアにおけるメソッド間の呼び出し関係を示すコールグラフを記憶する。 The second call graph storage unit 112 stores another one of the two call graphs to be compared. For example, the second call graph storage unit 112 stores a call graph indicating a call relationship between methods in the modified version of software.

要求受付部１２０は、開発者による差分抽出要求の入力を受け付ける。差分抽出要求には、着目するメソッドを指定する情報と、探索方向を指定する情報とが含まれる。要求受付部１２０は、受け付けた差分抽出要求を、距離別エッジ分類部１３０に送信する。 The request reception unit 120 receives a difference extraction request input by a developer. The difference extraction request includes information specifying the method of interest and information specifying the search direction. The request reception unit 120 transmits the received difference extraction request to the edge-by-distance classification unit 130.

距離別エッジ分類部１３０は、第１コールグラフ記憶部１１１と第２コールグラフ記憶部１１２とに記憶されている２つのコールグラフそれぞれのエッジを、着目するメソッドに対応するノードからの距離により分類する。具体的には、距離別エッジ分類部１３０は、着目するメソッドからの距離ごとのカテゴリを生成する。そして距離別エッジ分類部１３０は、第１コールグラフ記憶部１１１に記憶されているコールグラフを読み出し、そのコールグラフ内のエッジが属するカテゴリを判断し、各エッジをカテゴリごとに第１距離別エッジリスト記憶部１４１に格納する。同様に、距離別エッジ分類部１３０は、第２コールグラフ記憶部１１２に記憶されているコールグラフを読み出し、そのコールグラフ内のエッジが属するカテゴリを判断し、各エッジをカテゴリごとに第２距離別エッジリスト記憶部１４２に格納する。 The edge-by-distance classification unit 130 classifies the edges of each of the two call graphs stored in the first call graph storage unit 111 and the second call graph storage unit 112 according to their distance from the node corresponding to the method of interest. Specifically, the edge-by-distance classification unit 130 generates a category for each distance from the method of interest. The edge-by-distance classification unit 130 then reads the call graph stored in the first call graph storage unit 111, determines the category to which each edge in that call graph belongs, and stores each edge, category by category, in the first distance-specific edge list storage unit 141. Similarly, the edge-by-distance classification unit 130 reads the call graph stored in the second call graph storage unit 112, determines the category to which each edge in that call graph belongs, and stores each edge, category by category, in the second distance-specific edge list storage unit 142.

第１距離別エッジリスト記憶部１４１は、第１コールグラフ記憶部１１１内のコールグラフに含まれるエッジを、着目するメソッドからの距離に応じたカテゴリに分類して記憶する。 The first distance-specific edge list storage unit 141 classifies and stores the edges included in the call graph in the first call graph storage unit 111 into categories according to the distance from the method of interest.

第２距離別エッジリスト記憶部１４２は、第２コールグラフ記憶部１１２内のコールグラフに含まれるエッジを、着目するメソッドからの距離に応じたカテゴリに分類して記憶する。 The second distance-specific edge list storage unit 142 classifies and stores the edges included in the call graph in the second call graph storage unit 112 into categories according to the distance from the method of interest.

同一距離間エッジ比較部１５０は、第１距離別エッジリスト記憶部１４１と第２距離別エッジリスト記憶部１４２との同じカテゴリに分類されたエッジ同士を比較し、同じエッジの有無を判断する。そして同一距離間エッジ比較部１５０は、第１距離別エッジリスト記憶部１４１内のあるカテゴリに記憶されているが、第２距離別エッジリスト記憶部１４２内の同一カテゴリに記憶されていないエッジを、第１差分エッジ記憶部１６１に格納する。また同一距離間エッジ比較部１５０は、第２距離別エッジリスト記憶部１４２内のあるカテゴリに記憶されているが、第１距離別エッジリスト記憶部１４１内の同一カテゴリに記憶されていないエッジを、第２差分エッジ記憶部１６２に格納する。 The same distance edge comparison unit 150 compares the edges classified into the same category in the first distance-specific edge list storage unit 141 and the second distance-specific edge list storage unit 142, and determines whether identical edges exist. The same distance edge comparison unit 150 stores, in the first difference edge storage unit 161, each edge that is stored in a certain category in the first distance-specific edge list storage unit 141 but is not stored in the same category in the second distance-specific edge list storage unit 142. Likewise, the same distance edge comparison unit 150 stores, in the second difference edge storage unit 162, each edge that is stored in a certain category in the second distance-specific edge list storage unit 142 but is not stored in the same category in the first distance-specific edge list storage unit 141.
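The same-distance comparison described above can be sketched as a per-category set difference. This is a minimal Python illustration, not the embodiment's procedure; the function name `compare_same_distance` and the sample data are hypothetical, and the distance-specific edge lists are assumed to be dictionaries that map a distance to a list of (caller, callee) pairs.

```python
def compare_same_distance(first_by_distance, second_by_distance):
    """Return (first_candidates, second_candidates).

    An edge becomes a candidate when it is registered at some distance in one
    list but is not registered at the same distance in the other list.
    """
    first_candidates, second_candidates = [], []
    all_distances = set(first_by_distance) | set(second_by_distance)
    for distance in sorted(all_distances):
        first = set(first_by_distance.get(distance, ()))
        second = set(second_by_distance.get(distance, ()))
        first_candidates.extend(sorted(first - second))
        second_candidates.extend(sorted(second - first))
    return first_candidates, second_candidates


# Hypothetical data: edge ("f", "x") exists in both graphs, but at distance 3
# in the first and distance 4 in the second.
first_list = {1: [("g", "d")], 2: [("d", "f")], 3: [("f", "x")]}
second_list = {1: [("g", "d")], 2: [("d", "l")], 3: [("l", "f")], 4: [("f", "x")]}
first_candidates, second_candidates = compare_same_distance(first_list, second_list)
```

Note that the edge ("f", "x") lands in both candidate lists even though it exists in both graphs, because its distance changed. Candidates are therefore not yet final differences at this stage.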

さらに同一距離間エッジ比較部１５０は、第１差分エッジ記憶部１６１に格納したエッジのハッシュ値を算出し、得られたハッシュ値を第１存在エッジ記憶部１６３に格納する。また同一距離間エッジ比較部１５０は、第２差分エッジ記憶部１６２に格納したエッジのハッシュ値を算出し、得られたハッシュ値を第２存在エッジ記憶部１６４に格納する。 Further, the same distance edge comparison unit 150 calculates the hash value of the edge stored in the first difference edge storage unit 161 and stores the obtained hash value in the first existence edge storage unit 163. Further, the same distance edge comparison unit 150 calculates the hash value of the edge stored in the second difference edge storage unit 162 and stores the obtained hash value in the second existence edge storage unit 164.

第１差分エッジ記憶部１６１は、第１距離別エッジリスト記憶部１４１内のあるカテゴリに記憶されているが、第２距離別エッジリスト記憶部１４２内の同一カテゴリに記憶されていないエッジを記憶する。 The first difference edge storage unit 161 stores edges that are stored in a certain category in the first distance-specific edge list storage unit 141 but are not stored in the same category in the second distance-specific edge list storage unit 142.

第２差分エッジ記憶部１６２は、第２距離別エッジリスト記憶部１４２内のあるカテゴリに記憶されているが、第１距離別エッジリスト記憶部１４１内の同一カテゴリに記憶されていないエッジを記憶する。 The second difference edge storage unit 162 stores edges that are stored in a certain category in the second distance-specific edge list storage unit 142 but are not stored in the same category in the first distance-specific edge list storage unit 141.

第１存在エッジ記憶部１６３は、第１差分エッジ記憶部１６１に格納されたエッジのハッシュ値を記憶する。
第２存在エッジ記憶部１６４は、第２差分エッジ記憶部１６２に格納されたエッジのハッシュ値を記憶する。 The first existence edge storage unit 163 stores the hash value of the edge stored in the first differential edge storage unit 161.
The second existence edge storage unit 164 stores the hash value of the edge stored in the second differential edge storage unit 162.

差分エッジ抽出部１７０は、第１差分エッジ記憶部１６１と第２差分エッジ記憶部１６２とから、比較対象の２つのコールグラフの一方にのみ含まれるエッジを抽出する。具体的には、差分エッジ抽出部１７０は、第１差分エッジ記憶部１６１に記憶されているエッジのハッシュ値と同一のハッシュ値が、第２存在エッジ記憶部１６４に格納されているか否かを判断する。差分エッジ抽出部１７０は、該当するハッシュ値が格納されていなければ、ハッシュ値の生成元であるエッジを第１差分エッジ記憶部１６１から抽出し、差分抽出結果記憶部１８０に格納する。また差分エッジ抽出部１７０は、第２差分エッジ記憶部１６２に記憶されているエッジのハッシュ値と同一のハッシュ値が、第１存在エッジ記憶部１６３に格納されているか否かを判断する。差分エッジ抽出部１７０は、該当するハッシュ値が格納されていなければ、ハッシュ値の生成元であるエッジを第２差分エッジ記憶部１６２から抽出し、差分抽出結果記憶部１８０に格納する。 The difference edge extraction unit 170 extracts, from the first difference edge storage unit 161 and the second difference edge storage unit 162, the edges that are included in only one of the two call graphs being compared. Specifically, the difference edge extraction unit 170 determines whether a hash value identical to the hash value of an edge stored in the first difference edge storage unit 161 is stored in the second existence edge storage unit 164. If no such hash value is stored, the difference edge extraction unit 170 extracts the edge from which the hash value was generated from the first difference edge storage unit 161 and stores it in the difference extraction result storage unit 180. The difference edge extraction unit 170 also determines whether a hash value identical to the hash value of an edge stored in the second difference edge storage unit 162 is stored in the first existence edge storage unit 163. If no such hash value is stored, the difference edge extraction unit 170 extracts the edge from which the hash value was generated from the second difference edge storage unit 162 and stores it in the difference extraction result storage unit 180.
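The hash-based cross-check described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the embodiment's implementation: Python `set` objects, which hash their elements internally, stand in for the first and second existence edge lists, and the function name `extract_final_differences` and the sample candidates are hypothetical.

```python
def extract_final_differences(first_candidates, second_candidates):
    """Keep only candidates that have no counterpart in the other candidate list.

    A candidate that appears in both lists exists in both call graphs at
    different distances, so it is not a difference between the graphs.
    """
    # Assumption: hash-based sets play the role of the existence edge lists.
    first_existing = set(first_candidates)
    second_existing = set(second_candidates)
    only_in_first = [e for e in first_candidates if e not in second_existing]
    only_in_second = [e for e in second_candidates if e not in first_existing]
    return only_in_first, only_in_second


# Hypothetical candidates: ("f", "x") is in both lists because only its
# distance from the starting node changed.
first_candidates = [("d", "f"), ("f", "x")]
second_candidates = [("d", "l"), ("l", "f"), ("f", "x")]
only_in_first, only_in_second = extract_final_differences(
    first_candidates, second_candidates
)
```

Each membership test takes roughly constant time on average, so the cross-check scales with the number of candidates rather than with the product of the two list sizes. A Python set also compares elements for equality after hashing, so two distinct edges whose hash values collide are not mistaken for each other; a design that stores bare hash values would need to consider that case separately.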

差分抽出結果記憶部１８０は、第１コールグラフ記憶部１１１と第２コールグラフ記憶部１１２とのいずれか一方に格納されているコールグラフにのみ含まれるエッジの抽出結果を記憶する。 The difference extraction result storage unit 180 stores an extraction result of edges included only in the call graph stored in one of the first call graph storage unit 111 and the second call graph storage unit 112.

結果返答部１９０は、差分抽出要求に対する応答として、差分抽出結果記憶部１８０に記憶されているエッジの情報を出力する。
なお、図７に示した各要素間を接続する線は通信経路の一部を示すものであり、図示した通信経路以外の通信経路も設定可能である。また、図７に示した各要素の機能は、例えば、その要素に対応するプログラムモジュールをコンピュータに実行させることで実現することができる。 The result response unit 190 outputs edge information stored in the difference extraction result storage unit 180 as a response to the difference extraction request.
In addition, the line which connects between each element shown in FIG. 7 shows a part of communication path, and communication paths other than the illustrated communication path can also be set. Moreover, the function of each element shown in FIG. 7 can be realized, for example, by causing a computer to execute a program module corresponding to the element.

＜データ構造＞
次に、図８〜図１２を参照し、コンピュータ１００内の各記憶部に格納されるデータについて具体的に説明する。 <Data structure>
Next, the data stored in each storage unit in the computer 100 will be described in detail with reference to FIGS. 8 to 12.

図８は、第１・第２コールグラフ記憶部内のデータの例を示す図である。第１コールグラフ記憶部１１１には、コールグラフ４０を示すコールグラフデータ１１１ａが格納されている。コールグラフデータ１１１ａには、コールグラフ４０に含まれる複数のエッジ４２ａ〜４２ｋそれぞれを示す文字列の２つ組が含まれる。第２コールグラフ記憶部１１２には、コールグラフ４０ａを示すコールグラフデータ１１２ａが格納されている。コールグラフデータ１１２ａには、コールグラフ４０ａに含まれる複数のエッジ４２ａ〜４２ｃ，４２ｅ〜４２ｋ，４２ｌ，４２ｍそれぞれを示す文字列の２つ組が含まれる。 FIG. 8 is a diagram illustrating an example of data in the first and second call graph storage units. The first call graph storage unit 111 stores call graph data 111a representing the call graph 40. The call graph data 111a includes a pair of character strings for each of the edges 42a to 42k included in the call graph 40. The second call graph storage unit 112 stores call graph data 112a representing the call graph 40a. The call graph data 112a includes a pair of character strings for each of the edges 42a to 42c, 42e to 42k, 42l, and 42m included in the call graph 40a.

図９は、第１・第２距離別エッジリスト記憶部内のデータの例を示す図である。第１距離別エッジリスト記憶部１４１には、起点ノードからの距離による、コールグラフ４０内のエッジの分類結果を示す第１距離別エッジリスト１４１ａが格納されている。第２距離別エッジリスト記憶部１４２には、起点ノードからの距離による、コールグラフ４０ａ内のエッジの分類結果を示す第２距離別エッジリスト１４２ａが格納されている。 FIG. 9 is a diagram illustrating an example of data in the first and second distance-specific edge list storage units. The first distance-specific edge list storage unit 141 stores a first distance-specific edge list 141a indicating the classification result of the edges in the call graph 40 based on the distance from the starting node. The second distance-specific edge list storage unit 142 stores a second distance-specific edge list 142a indicating the classification result of the edges in the call graph 40a according to the distance from the starting node.

第１距離別エッジリスト１４１ａおよび第２距離別エッジリスト１４２ａは、記憶領域が、距離によってカテゴライズされたブロックに分けられている。各ブロックには、起点ノードからの距離が、そのブロックに設定された距離と同じであるエッジが登録される。例えば「距離１」の欄には、起点ノードからの距離が「１」であるエッジが登録される。 In the first distance-specific edge list 141a and the second distance-specific edge list 142a, the storage area is divided into blocks categorized by distance. In each block, an edge whose distance from the starting node is the same as the distance set in the block is registered. For example, in the “distance 1” column, an edge whose distance from the starting node is “1” is registered.

図１０は、第１・第２差分エッジ記憶部内のデータの例を示す図である。第１差分エッジ記憶部１６１には、第１差分エッジリスト１６１ａが格納されている。第１差分エッジリスト１６１ａは、第１距離別エッジリスト１４１ａ内のある距離の欄には登録されているが、第２距離別エッジリスト１４２ａ内の同一距離の欄には登録されていないエッジの集合を示す。第２差分エッジ記憶部１６２には、第２差分エッジリスト１６２ａが格納されている。第２差分エッジリスト１６２ａは、第２距離別エッジリスト１４２ａ内のある距離の欄には登録されているが、第１距離別エッジリスト１４１ａ内の同一距離の欄には登録されていないエッジの集合を示す。 FIG. 10 is a diagram illustrating an example of data in the first and second differential edge storage units. The first differential edge storage unit 161 stores a first differential edge list 161a. The first difference edge list 161a is registered in a certain distance column in the first distance-specific edge list 141a, but is not registered in the same distance column in the second distance-specific edge list 142a. Indicates a set. The second differential edge storage unit 162 stores a second differential edge list 162a. The second difference edge list 162a is registered in a certain distance column in the second distance-specific edge list 142a, but is not registered in the same distance column in the first distance-specific edge list 141a. Indicates a set.

図１１は、第１・第２存在エッジ記憶部内のデータの例を示す図である。第１存在エッジ記憶部１６３には、第１存在エッジリスト１６３ａが格納されている。第１存在エッジリスト１６３ａは、第１差分エッジリスト１６１ａ内の複数のエッジそれぞれのハッシュ値を示すデータである。第２存在エッジ記憶部１６４には、第２存在エッジリスト１６４ａが格納されている。第２存在エッジリスト１６４ａは、第２差分エッジリスト１６２ａ内の複数のエッジそれぞれのハッシュ値を示すデータである。 FIG. 11 is a diagram illustrating an example of data in the first and second existence edge storage units. The first existing edge storage unit 163 stores a first existing edge list 163a. The first existence edge list 163a is data indicating hash values of a plurality of edges in the first differential edge list 161a. The second existing edge storage unit 164 stores a second existing edge list 164a. The second existence edge list 164a is data indicating hash values of a plurality of edges in the second differential edge list 162a.

図１２は、差分抽出結果記憶部内のデータの例を示す図である。差分抽出結果記憶部１８０には、２つのコールグラフ４０，４０ａの一方にのみ含まれるエッジの集合を示す最終差分エッジリスト１８１が格納されている。最終差分エッジリスト１８１には、エッジごとに、エッジを表す文字列の２つ組と、そのエッジが含まれていたコールグラフの識別番号とが設定されている。例えばコールグラフ４０の識別番号を「１」、コールグラフ４０ａの識別番号を「２」とする。この場合、コールグラフ４０に存在し、コールグラフ４０ａに存在しないエッジ４２ｋ（文字列の２つ組（“ｇ”，“ｋ”）には、コールグラフ４０の識別番号「１」が付与される。 FIG. 12 is a diagram illustrating an example of data in the difference extraction result storage unit. The difference extraction result storage unit 180 stores a final difference edge list 181 indicating the set of edges included in only one of the two call graphs 40 and 40a. In the final difference edge list 181, each edge is recorded as the pair of character strings representing the edge together with the identification number of the call graph that contained it. For example, let the identification number of the call graph 40 be "1" and that of the call graph 40a be "2". In this case, the edge 42k (the pair of character strings ("g", "k")), which exists in the call graph 40 but not in the call graph 40a, is given the identification number "1" of the call graph 40.

＜処理手順＞
コンピュータ１００は、以上のようなデータを用いて、２つのコールグラフ４０，４０ａの差分エッジ抽出処理を行う。以下、差分エッジ抽出処理について詳細に説明する。 <Processing procedure>
The computer 100 performs differential edge extraction processing of the two call graphs 40 and 40a using the above data. Hereinafter, the differential edge extraction process will be described in detail.

図１３は、差分エッジ抽出処理の手順を示すフローチャートである。以下、図１３に示す処理をステップ番号に沿って説明する。なお差分エッジ抽出処理は、差分抽出要求が入力されたときに実行される。 FIG. 13 is a flowchart illustrating a procedure of differential edge extraction processing. In the following, the process illustrated in FIG. 13 will be described in order of step number. The difference edge extraction process is executed when a difference extraction request is input.

［ステップＳ１０１］差分抽出要求を受け付けた要求受付部１２０は、その差分抽出要求を、距離別エッジ分類部１３０に送信する。その後、ステップＳ１０２とステップＳ１０３の処理が並列に実行される。 [Step S101] Upon receiving a difference extraction request, the request reception unit 120 transmits the request to the edge-by-distance classification unit 130. Thereafter, the processes of step S102 and step S103 are executed in parallel.

［ステップＳ１０２］距離別エッジ分類部１３０は、差分抽出要求を取得すると、２つのスレッドを立ち上げる。そして距離別エッジ分類部１３０は、１つ目のスレッドにより、第１コールグラフ記憶部１１１内のコールグラフデータ１１１ａに対する距離別エッジ分類処理を実行する。これにより、第１距離別エッジリスト記憶部１４１に、第１距離別エッジリスト１４１ａが記憶される。 [Step S102] Upon obtaining the difference extraction request, the edge-by-distance classification unit 130 starts two threads. The edge-by-distance classification unit 130 then uses the first thread to execute the edge-by-distance classification process on the call graph data 111a in the first call graph storage unit 111. As a result, the first distance-specific edge list 141a is stored in the first distance-specific edge list storage unit 141.

［ステップＳ１０３］距離別エッジ分類部１３０は、２つ目のスレッドにより、第２コールグラフ記憶部１１２内のコールグラフデータ１１２ａに対する距離別エッジ分類処理を実行する。これにより、第２距離別エッジリスト記憶部１４２に、第２距離別エッジリスト１４２ａが記憶される。 [Step S103] The edge-by-distance classification unit 130 executes edge-by-distance classification processing on the call graph data 112a in the second call graph storage unit 112 by the second thread. As a result, the second distance-specific edge list 142 a is stored in the second distance-specific edge list storage unit 142.

ステップＳ１０２，Ｓ１０３の距離別エッジ分類処理は、コールグラフ内のエッジを、起点ノードからの距離で分類する処理である。距離別エッジ分類処理の詳細は後述する（図１４参照）。ステップＳ１０２とステップＳ１０３の両方の処理が終了すると、処理がステップＳ１０４に進められる。 The distance-based edge classification processing in steps S102 and S103 is processing for classifying the edges in the call graph by the distance from the starting node. Details of the distance-based edge classification processing will be described later (see FIG. 14). When the processes in both step S102 and step S103 are completed, the process proceeds to step S104.

［ステップＳ１０４］距離別エッジ分類部１３０は、同一距離間エッジ比較処理を実行する。同一距離間エッジ比較処理では、第１距離別エッジリスト１４１ａと第２距離別エッジリスト１４２ａそれぞれにおける、起点ノードからの距離が同じカテゴリ内のエッジ集合同士が比較され、同一のエッジの有無が判断される。同一距離間エッジ比較処理により、第１差分エッジ記憶部１６１に第１差分エッジリスト１６１ａが格納され、第２差分エッジ記憶部１６２に第２差分エッジリスト１６２ａが格納される。また同一距離間エッジ比較処理により、第１存在エッジ記憶部１６３に第１存在エッジリスト１６３ａが格納され、第２存在エッジ記憶部１６４に第２存在エッジリスト１６４ａが格納される。同一距離間エッジ比較処理の詳細は後述する（図１５参照）。同一距離間エッジ比較処理が終了すると、ステップＳ１０５とステップＳ１０６との処理が並列で実行される。 [Step S104] The edge-by-distance classification unit 130 executes the same distance edge comparison process. In this process, the edge sets in the categories with the same distance from the starting node in the first distance-specific edge list 141a and the second distance-specific edge list 142a are compared with each other, and the presence or absence of identical edges is determined. As a result of the same distance edge comparison process, the first difference edge list 161a is stored in the first difference edge storage unit 161, and the second difference edge list 162a is stored in the second difference edge storage unit 162. The process also stores the first existence edge list 163a in the first existence edge storage unit 163 and the second existence edge list 164a in the second existence edge storage unit 164. Details of the same distance edge comparison process will be described later (see FIG. 15). When the same distance edge comparison process ends, the processes of step S105 and step S106 are executed in parallel.

[Step S105] The difference edge extraction unit 170 starts two threads. Using the first thread, it executes the different-distance edge existence confirmation processing on the first difference edge list 161a. This processing identifies the edges that are included in the first difference edge list 161a but not in the second difference edge list 162a.

[Step S106] Using the second thread, the difference edge extraction unit 170 executes the different-distance edge existence confirmation processing on the second difference edge list 162a. This processing identifies the edges that are included in the second difference edge list 162a but not in the first difference edge list 161a.

Details of the different-distance edge existence confirmation processing in steps S105 and S106 are described later (see FIG. 16). Through this processing, the final difference edge list 181 is stored in the difference extraction result storage unit 180. When both step S105 and step S106 have finished, the process proceeds to step S107.

[Step S107] The result reply unit 190 outputs the final difference edge list 181, which indicates the result of the difference extraction processing, as the response to the difference extraction request.
The difference edges between the two call graphs 40 and 40a can be extracted by the above procedure. The distance-based edge classification processing (steps S102 and S103), the same-distance edge comparison processing (step S104), and the different-distance edge existence confirmation processing (steps S105 and S106) are described in detail below with reference to FIGS. 14 to 16.

FIG. 14 is a flowchart illustrating an example of the procedure of the distance-based edge classification processing. The processing shown in FIG. 14 is described below in order of step number, taking as an example the case where the call graph data 111a in the first call graph storage unit 111 is the processing target.

[Step S111] The distance-based edge classification unit 130 takes the node corresponding to the method of interest specified in the difference extraction request as the starting node, and generates a current node list in which only the identifier of the starting node is registered as an element. The identifier of a node registered in the current node list is, for example, the method name of the method corresponding to that node. The generated current node list is stored, for example, in the memory 102.

[Step S112] The distance-based edge classification unit 130 sets the distance d to be compared to "1".
[Step S113] The distance-based edge classification unit 130 generates an empty next node list. The generated next node list is stored, for example, in the memory 102.

[Step S114] The distance-based edge classification unit 130 sets all edges included in the call graph data 111a to the unsearched state. For example, if a flag indicating "searched" is set on an edge in the call graph data 111a, the distance-based edge classification unit 130 deletes that flag.

[Step S115] The distance-based edge classification unit 130 selects one element registered in the current node list as the current node.
[Step S116] The distance-based edge classification unit 130 selects, as the search target, one edge that is connected to the current node in the search direction and has not been given the "searched" flag. The search direction is specified in the difference extraction request. If the search direction is the callee direction, the distance-based edge classification unit 130 selects from the call graph data 111a one edge whose caller is the current node. If the search direction is the caller direction, it selects from the call graph data 111a one edge whose callee is the current node.

[Step S117] The distance-based edge classification unit 130 stores the selected edge in the block for distance d in the first distance-based edge list 141a. It then sets the "searched" flag on the selected edge in the call graph data 111a.

[Step S118] Of the two nodes connected by the selected edge, the distance-based edge classification unit 130 adds the node other than the current node to the next node list. If the search direction is the callee direction, the callee node of the selected edge is added to the next node list. If the search direction is the caller direction, the caller node of the selected edge is added to the next node list.

[Step S119] The distance-based edge classification unit 130 determines whether all edges connected to the current node in the search direction have been taken as search targets. If so, the process proceeds to step S120. If any edge has not yet been taken as a search target, the process proceeds to step S116.

[Step S120] The distance-based edge classification unit 130 determines whether all nodes in the current node list have been processed. If the processing of all nodes is complete, the distance-based edge classification processing ends. If there is an unprocessed node, the process proceeds to step S121.

[Step S121] The distance-based edge classification unit 130 makes the contents of the current node list match the contents of the next node list. For example, it deletes all the contents of the current node list and then registers in the current node list all the nodes registered in the next node list.

[Step S122] The distance-based edge classification unit 130 deletes all nodes in the next node list, leaving the next node list empty. The process then proceeds to step S115.
In this way, the edges 42a to 42k included in the call graph 40 are categorized by distance and registered in the first distance-based edge list 141a. Similarly, executing the distance-based edge classification processing of FIG. 14 on the call graph data 112a in the second call graph storage unit 112 classifies the edges 42a to 42c, 42e to 42j, 42l, and 42m included in the call graph 40a by distance. These edges are registered in the second distance-based edge list 142a, divided into categories by distance. The same-distance edge comparison unit 150 then executes the same-distance edge comparison processing.
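
The classification of FIG. 14 amounts to a level-by-level (breadth-first) traversal from the starting node. The following Python sketch is an illustration only, not the implementation of the embodiment. The function name `classify_edges_by_distance`, the representation of an edge as a `(caller, callee)` tuple, the `searched` set standing in for the "searched" flags, the point at which `d` is incremented, and the stopping condition (an empty next node list) are all assumptions made for the example.

```python
from collections import defaultdict


def classify_edges_by_distance(edges, start, direction="callee"):
    """Group edges by their distance from `start`.

    edges     -- iterable of (caller, callee) tuples (assumed representation)
    direction -- "callee" follows calls outward, "caller" follows them backward
    """
    edges = list(edges)
    by_distance = defaultdict(list)  # distance d -> edges in that category
    searched = set()                 # stands in for the "searched" flags (S114)
    current_nodes = [start]          # current node list (S111)
    d = 1                            # S112

    while current_nodes:             # assumed stop: nothing left to expand
        next_nodes = []              # next node list (S113)
        for node in current_nodes:   # S115
            for caller, callee in edges:  # S116: look for unsearched edges
                if direction == "callee":
                    near, far = caller, callee
                else:
                    near, far = callee, caller
                edge = (caller, callee)
                if near != node or edge in searched:
                    continue
                by_distance[d].append(edge)  # S117
                searched.add(edge)
                if far not in next_nodes:
                    next_nodes.append(far)   # S118
        current_nodes = next_nodes   # S121 and S122
        d += 1                       # assumed: move on to the next distance
    return dict(by_distance)
```

Because every edge is marked as searched once it has been stored, the loop also terminates on call graphs that contain cycles. The sketch scans the whole edge list for each node to stay short; an index from node to adjacent edges would be the natural choice for large graphs.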

FIG. 15 is a flowchart illustrating an example of the procedure of the same-distance edge comparison processing. The processing shown in FIG. 15 is described below in order of step number.
[Step S131] The same-distance edge comparison unit 150 sets the distance d to 1.

[Step S132] The same-distance edge comparison unit 150 sorts the edges stored in the block for distance d in the first distance-based edge list 141a according to a predetermined criterion. For example, it sorts the edges categorized at the same distance in alphabetical order, based on the pair of character strings that represents each edge. The sorted sequence of edges is taken as the first list.

[Step S133] The same-distance edge comparison unit 150 sorts the edges stored in the block for distance d in the second distance-based edge list 142a according to the same criterion as in step S132. The sorted sequence of edges is taken as the second list.

[Step S134] The same-distance edge comparison unit 150 sets the point of interest of the first list to the first edge of the first list. Similarly, it sets the point of interest of the second list to the first edge of the second list.

[Step S135] The same-distance edge comparison unit 150 determines whether the edge at the point of interest of the first list is identical to the edge at the point of interest of the second list. If the edges are identical, the process proceeds to step S136. If they are not, the process proceeds to step S137.

[Step S136] The same-distance edge comparison unit 150 moves the point of interest of the first list to the edge following the current one. It likewise moves the point of interest of the second list to the edge following the current one. The process then proceeds to step S144.

[Step S137] The same-distance edge comparison unit 150 compares the edge at the point of interest of the first list with the edge at the point of interest of the second list in terms of the ordering defined by the sort condition of steps S132 and S133. For example, an edge that ranks higher under the sort condition is judged to be the smaller edge. If the edge at the point of interest of the first list is smaller, the process proceeds to step S138. If the edge at the point of interest of the second list is smaller, the process proceeds to step S141.

[Step S138] The same-distance edge comparison unit 150 adds the edge at the point of interest of the first list to the first difference edge list 161a.
[Step S139] The same-distance edge comparison unit 150 adds the hash value of the edge at the point of interest of the first list to the first existence edge list 163a.

[Step S140] The same-distance edge comparison unit 150 moves the point of interest of the first list to the edge following the current one. The process then proceeds to step S144.

[Step S141] The same-distance edge comparison unit 150 adds the edge at the point of interest of the second list to the second difference edge list 162a.
[Step S142] The same-distance edge comparison unit 150 adds the hash value of the edge at the point of interest of the second list to the second existence edge list 164a.

[Step S143] The same-distance edge comparison unit 150 moves the point of interest of the second list to the edge following the current one.
[Step S144] The same-distance edge comparison unit 150 determines whether all edges in both the first list and the second list have been compared. If all edges have been compared, the process proceeds to step S145. If at least one of the two lists still contains an edge that has not been compared (an edge that has not yet been the point of interest), the process proceeds to step S135.

When all edges in one of the two lists have already been compared, the edges remaining in the other list have no counterpart for either the identity comparison or the ordering comparison. In that case, the identity determination in step S135 yields "NO" for each remaining edge, and the ordering determination in step S137 treats the remaining edge as the smaller one.

[Step S145] The same-distance edge comparison unit 150 determines whether the processing has been performed for every distance d that exists as a category. If the processing is complete for all distances d, the same-distance edge comparison processing ends. If an unprocessed distance remains, the process proceeds to step S146.

[Step S146] The same-distance edge comparison unit 150 adds 1 to the value of the distance d and advances the process to step S132.
In this way, the first difference edge list 161a, the second difference edge list 162a, the first existence edge list 163a, and the second existence edge list 164a are generated. The difference edge extraction unit 170 then executes the different-distance edge existence confirmation processing (steps S105 and S106).
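
Steps S131 to S146 describe a two-pointer walk over two sorted lists, repeated for each distance category. The Python sketch below illustrates that walk under several assumptions that are not taken from the specification: edges are `(caller, callee)` tuples ordered by Python's tuple comparison, the distance-based edge lists are dictionaries keyed by `d`, the existence edge lists are modelled as sets of MD5 digests, and the names `edge_hash` and `compare_same_distance` are hypothetical.

```python
import hashlib


def edge_hash(edge):
    """Hash an edge; the NUL separator between the names is an assumption."""
    caller, callee = edge
    return hashlib.md5(f"{caller}\x00{callee}".encode("utf-8")).digest()


def compare_same_distance(first_by_distance, second_by_distance):
    """Return (diff1, diff2, exist1, exist2) for two distance-keyed edge dicts."""
    diff1, diff2 = [], []          # difference edge lists
    exist1, exist2 = set(), set()  # existence edge lists (hash values)

    for d in sorted(set(first_by_distance) | set(second_by_distance)):
        first = sorted(first_by_distance.get(d, []))    # S132
        second = sorted(second_by_distance.get(d, []))  # S133
        i = j = 0                                       # S134
        while i < len(first) or j < len(second):        # S144
            both_left = i < len(first) and j < len(second)
            if both_left and first[i] == second[j]:     # S135, S136
                i += 1
                j += 1
            elif j >= len(second) or (both_left and first[i] < second[j]):
                diff1.append(first[i])                  # S138
                exist1.add(edge_hash(first[i]))         # S139
                i += 1                                  # S140
            else:
                diff2.append(second[j])                 # S141
                exist2.add(edge_hash(second[j]))        # S142
                j += 1                                  # S143
    return diff1, diff2, exist1, exist2
```

An edge left over after the other list is exhausted falls into the matching difference list, which mirrors the rule that a remaining edge is treated as the smaller one.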

FIG. 16 is a flowchart illustrating an example of the procedure of the different-distance edge existence confirmation processing. The processing shown in FIG. 16 is described below in order of step number, taking as an example the case where the first difference edge list 161a in the first difference edge storage unit 161 is the source from which edges are extracted.

[Step S151] The difference edge extraction unit 170 selects one unprocessed edge from the first difference edge list 161a.
[Step S152] The difference edge extraction unit 170 calculates the hash value of the selected edge and determines whether that hash value exists in the second existence edge list 164a. If the hash value exists, the process proceeds to step S154. If it does not exist, the process proceeds to step S153.

[Step S153] The difference edge extraction unit 170 attaches to the selected edge the identifier of the source code from which the edge was extracted, and adds the edge to the final difference edge list 181 in the difference extraction result storage unit 180.

[Step S154] The difference edge extraction unit 170 determines whether all edges in the first difference edge list 161a have been processed. If all edges have been processed, the different-distance edge existence confirmation processing ends. If an unprocessed edge remains, the process proceeds to step S151.

In this way, for edges whose distances from the starting node differ, it is confirmed whether an identical edge exists in the other call graph, and only the edges that do not exist there are stored in the final difference edge list 181. The edges in the final difference edge list 181 are then output as the response to the difference extraction request.
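
The existence check of FIG. 16, run once per difference edge list on two threads as in steps S105 and S106, could look like the following Python sketch. It is an illustration under assumptions: the existence edge lists are modelled as sets of MD5 digests, the source-code identifiers `"first"` and `"second"` are placeholders, and the names `edge_hash`, `confirm_existence`, and `extract_final_differences` are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def edge_hash(edge):
    """Hash an edge; the NUL separator between the names is an assumption."""
    caller, callee = edge
    return hashlib.md5(f"{caller}\x00{callee}".encode("utf-8")).digest()


def confirm_existence(diff_edges, other_exist_hashes, source_id):
    """Keep the edges whose hash is absent from the other side's hash set."""
    final = []
    for edge in diff_edges:                            # S151, S154
        if edge_hash(edge) not in other_exist_hashes:  # S152
            final.append((source_id, edge))            # S153
    return final


def extract_final_differences(diff1, diff2):
    """Run the check for both difference edge lists on two threads."""
    exist1 = {edge_hash(e) for e in diff1}  # first existence edge list
    exist2 = {edge_hash(e) for e in diff2}  # second existence edge list
    with ThreadPoolExecutor(max_workers=2) as pool:
        f1 = pool.submit(confirm_existence, diff1, exist2, "first")   # S105
        f2 = pool.submit(confirm_existence, diff2, exist1, "second")  # S106
        return f1.result() + f2.result()    # final difference edge list
```

In standard CPython builds with the global interpreter lock, the two threads mainly illustrate the structure of steps S105 and S106 rather than providing a speedup for this CPU-bound loop.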

<Effects of the Second Embodiment>
As described above, in the second embodiment the edges are classified by distance from the starting node, and differences are first extracted between edges at the same distance. This makes it easy to check, for each edge, whether an identical edge exists. As a result, the processing is faster.

The processing time of the difference extraction processing of the second embodiment is now compared with that of other difference extraction processing. The following two reference methods are assumed as the difference extraction methods to be compared.

[First reference method] Each edge in one call graph is compared exhaustively with every edge in the other call graph. With this method, a matched edge is detected as being present in both call graphs, and an unmatched edge is detected as being present in only one call graph.

[Second reference method] The edges in each call graph are sorted, the sorted edge lists are compared with each other, and the differing edges are extracted. After the edges are sorted for each call graph, the two sorted edge lists are compared while the point of interest is moved in order from the head of each list. On a match, the point of interest moves to the next edge in both lists. On a mismatch, the point of interest moves to the next edge only in the list that holds the smaller edge. A matched edge is determined to be present in both call graphs, and on a mismatch the smaller edge is determined to be present in only one call graph. The second reference method is the method of the second embodiment with the distance-based edge classification processing removed.

The time required for difference extraction is compared below between each of the first and second reference methods and the method of the second embodiment, with reference to FIGS. 17 and 18. In the following description, the required time is calculated using the variables M, N, k, r, and n. Each variable is defined as follows.
M, N: the number of edges included in each call graph
k: the radius of the target range (the maximum distance from the starting method to a caller or callee method)
r: the proportion of differences between the two source code sets from which the two call graphs were generated (number of differing locations / (number of differing locations + number of identical locations))
n: the number of methods that call (or are called by) one method
<<Comparison with the First Reference Method>>
In the first reference method, edges are matched exhaustively. Taking the time required for one comparison of two edges as the unit time, the time needed to complete the difference extraction processing is calculated as a multiple of that unit time. For the first reference method, the time rt0 required for difference extraction is given by the following expression.
$$\mathrm{rt0} = MN \tag{1}$$
In contrast, the time rt2 required when the difference extraction processing is performed by the method of the second embodiment can be expressed as follows.
$$\mathrm{rt2} = M\log_2(M/k) + N\log_2(N/k) + M + N + rM + rN \tag{2}$$
The first term on the right side of expression (2), $M\log_2(M/k)$, represents the time required to sort the edges of one of the two call graphs, one category (distance) at a time. Sorting one category takes $(M/k)\log_2(M/k)$, and multiplying that time by the number of categories k gives $M\log_2(M/k)$. The second term, $N\log_2(N/k)$, represents the time required to sort the edges of the other call graph in the same way.

The third term, $M$, and the fourth term, $N$, are the time required for the comparison between edges of the same category. The comparison for one category takes $(M/k) + (N/k)$, and multiplying that time by the number of categories k gives $M + N$.

The fifth term, $rM$, and the sixth term, $rN$, are the time required to compare the difference edges, for which no identical edge was found in the same-category comparison, with the edges of other categories.

When there are only slight differences between the two source code sets from which the two call graphs were generated, the following approximation is possible using n, the number of methods that call (or are called by) one method (the symbol "∼" denotes approximation).
$$M \sim N \sim n^{k} \tag{3}$$

In this case, rt0 and rt2 become as follows.
$$\mathrm{rt0} \sim M^{2} \sim n^{2k} \tag{4}$$
$$\mathrm{rt2} \sim 2M\bigl(\log_2(M/k) + 1 + r\bigr) \sim 2n^{k}\bigl(k\log_2 n - \log_2 k + 1 + r\bigr) \tag{5}$$
In many cases n ∼ 5. For minor OS version upgrades of various systems, r is around 0.01. Calculating the improvement rate under these conditions gives the results shown in FIG. 17.
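
As a worked check of expression (5), the step from expression (2) can be written out as follows. This is a sketch that assumes M ∼ N ∼ n^k, the approximation implied by rt0 ∼ M² ∼ n^{2k} in expression (4).

```latex
\begin{align*}
\mathrm{rt2} &= M\log_2(M/k) + N\log_2(N/k) + M + N + rM + rN \\
  &\sim 2M\log_2(M/k) + 2M + 2rM && (N \sim M) \\
  &= 2M\bigl(\log_2(M/k) + 1 + r\bigr) \\
  &\sim 2n^{k}\bigl(k\log_2 n - \log_2 k + 1 + r\bigr) && (M \sim n^{k})
\end{align*}
```

The last line uses $\log_2(n^{k}/k) = k\log_2 n - \log_2 k$.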

FIG. 17 is a diagram illustrating an example comparison between the first reference method and the method of the second embodiment. FIG. 17 shows the improvement rate of the difference extraction time as a function of the maximum distance from the starting node. The improvement rate is calculated by the following expression.
$$\text{improvement rate} = (\mathrm{rt0} - \mathrm{rt2}) / \mathrm{rt0} \tag{6}$$
In FIG. 17, the improvement rate is expressed as a percentage (the value of expression (6) × 100). In this example, a speedup of 49% or more is obtained when k is 2 or greater.

<<Comparison with the Second Reference Method>>
For the second reference method, the time rt1 required for difference extraction can be expressed as follows.
$$\mathrm{rt1} = M\log_2 M + N\log_2 N + M + N \tag{7}$$
The first term on the right side of expression (7), $M\log_2 M$, represents the time required to sort the edges of one call graph. The second term, $N\log_2 N$, represents the time required to sort the edges of the other call graph. The third term, $M$, and the fourth term, $N$, are the time required to compare the edges in the sorted lists with each other.

Applying the approximation of expression (3), rt1 becomes as follows.
$$\mathrm{rt1} \sim 2M(\log_2 M + 1) \sim 2n^{k}(k\log_2 n + 1) \tag{8}$$
Calculating the improvement rate under the conditions n ∼ 5 and r ∼ 0.01 gives the results shown in FIG. 18.

FIG. 18 is a diagram illustrating an example comparison between the second reference method and the method of the second embodiment. FIG. 18 shows the improvement rate of the difference extraction time as a function of the maximum distance from the starting node. The improvement rate is calculated by the following expression.
$$\text{improvement rate} = (\mathrm{rt1} - \mathrm{rt2}) / \mathrm{rt1} \tag{9}$$
In FIG. 18, the improvement rate is expressed as a percentage (the value of expression (9) × 100). In this example, a speedup of about 20% is obtained when k is 2 or greater.

Thus, in the second embodiment, high-speed difference extraction is made possible by classifying the edges according to their distance from the starting node and comparing the edges belonging to the same distance after sorting them.
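
The cost expressions (1), (2), and (7) and the improvement rates (6) and (9) can be evaluated numerically. The Python sketch below is an illustration only: the function names are hypothetical, and it assumes M = N = n^k with n = 5 and r = 0.01, following the conditions stated in the text. It does not attempt to reproduce the exact values plotted in FIGS. 17 and 18.

```python
import math


def rt0(M, N):
    """Expression (1): exhaustive comparison."""
    return M * N


def rt1(M, N):
    """Expression (7): sort all edges, then one merge pass."""
    return M * math.log2(M) + N * math.log2(N) + M + N


def rt2(M, N, k, r):
    """Expression (2): per-distance sort, merge pass, cross-distance check."""
    return (M * math.log2(M / k) + N * math.log2(N / k)
            + M + N + r * M + r * N)


def improvement_rate(baseline, proposed):
    """Expressions (6) and (9): fraction of the baseline time saved."""
    return (baseline - proposed) / baseline


def improvement_table(n=5, r=0.01, radii=range(2, 7)):
    """Rows of (k, % gain over rt0, % gain over rt1), assuming M = N = n**k."""
    rows = []
    for k in radii:
        M = N = n ** k
        t2 = rt2(M, N, k, r)
        rows.append((k,
                     100 * improvement_rate(rt0(M, N), t2),
                     100 * improvement_rate(rt1(M, N), t2)))
    return rows


if __name__ == "__main__":
    for k, vs_first, vs_second in improvement_table():
        print(f"k={k}: vs first {vs_first:.1f}%, vs second {vs_second:.1f}%")
```

Varying `n` and `r` in `improvement_table` shows how sensitive the two improvement rates are to the fan-out and to the proportion of differences.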

<<Other Effects>>
In the second embodiment, the computer 100 calculates the hash values of the edges extracted as differences by the same-distance edge comparison, and generates the first existence edge list 163a and the second existence edge list 164a. The first existence edge list 163a and the second existence edge list 164a are then used as the search targets in the different-distance edge existence confirmation. Confirming the existence of edges by hash value in this way makes the processing more efficient. For example, if edges are compared as pairs of character strings that represent fully qualified method names, the strings being compared are long, so memory consumption increases and the comparison takes time. If the edges are converted to hash values such as MD5 (Message Digest 5) before comparison, the data being compared becomes shorter, which reduces the memory required and shortens the processing time.

For example, in an OS for portable information terminal devices such as smartphones and tablets, names such as the following are used as fully qualified method names. The characters used in the names below are half-width (single-byte) characters.
・google.protobuf.compiler.StaticDescriptorInitializer_google_2fprotobuf_2fcompiler_2fplugin_2eproto.StaticDescriptorInitializer_google_2fprotobuf_2fcompiler_2fplugin_2eproto (172 characters)
・__gnu_pbds.detail.left_child_next_sibling_heap_node_point_const_iterator_.left_child_next_sibling_heap_node_point_const_iterator_ (129 characters)
・extensions.FileBrowserPrivateLogoutUserForReauthenticationFunction.FileBrowserPrivateLogoutUserForReauthenticationFunction (122 characters)
・line_map.tree_base.tree_decl_common.tree_decl_with_vis.tree_function_decl.VEC_call_site_record_base_unordered_remove (116 characters)
The OS has about 600,000 methods. The longest fully qualified name is 279 characters, the shortest is 5 characters, the mean is 46.2 characters, and the standard deviation is 15.6 characters. The data structure of an edge contains a pair of such long method names.

Converting the character string that represents an edge to, for example, an MD5 hash value yields a value 128 bits long. 128 bits corresponds to about 18 characters encoded in ASCII (7 bits per character). The pair of average-length fully qualified names (about 93 characters) is therefore considerably longer than the hash value, even when the standard deviation is taken into account.
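
The size difference can be seen with Python's standard `hashlib` module. The sketch below is illustrative: pairing these two method names as one edge is hypothetical (the specification does not say that one calls the other), and the NUL separator and the name `edge_digest` are assumptions.

```python
import hashlib


def edge_digest(caller, callee):
    """Return the 16-byte MD5 digest of an edge given as two method names."""
    # The NUL separator (an assumption) keeps ("ab", "c") and ("a", "bc")
    # from producing the same input string.
    return hashlib.md5(f"{caller}\x00{callee}".encode("ascii")).digest()


# Two of the fully qualified names listed above, used as a hypothetical edge.
caller = ("extensions.FileBrowserPrivateLogoutUserForReauthenticationFunction"
          ".FileBrowserPrivateLogoutUserForReauthenticationFunction")
callee = ("line_map.tree_base.tree_decl_common.tree_decl_with_vis"
          ".tree_function_decl.VEC_call_site_record_base_unordered_remove")

digest = edge_digest(caller, callee)
raw_length = len(caller) + len(callee)  # characters in the string pair
digest_length = len(digest)             # bytes in the MD5 digest (128 bits)

if __name__ == "__main__":
    print(f"string pair: {raw_length} characters, digest: {digest_length} bytes")
```

The digest has a fixed length regardless of how long the method names are, which is what keeps the existence edge lists compact.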

Therefore, performing the same-distance edge comparison processing on the hash values of the edges enables efficient processing with a small amount of data.
The above is the description of the second embodiment.

[Other Embodiments]
In the second embodiment, the check for an identical edge at a different distance is performed using the first existence edge list 163a and the second existence edge list 164a, but this check may also be performed without using these two lists. In that case, of the edges in the first differential edge list 161a, those for which no identical edge exists in the second differential edge list 162a are extracted and registered in the final differential edge list 181. Similarly, of the edges in the second differential edge list 162a, those for which no identical edge exists in the first differential edge list 161a are extracted and registered in the final differential edge list 181.
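The flow of this variant can be sketched end to end as follows. This is an illustrative sketch under stated assumptions, not the implementation of the embodiments: the function names are hypothetical, an edge is a `(caller, callee)` tuple, and the distance of an edge is taken to be the breadth-first depth of its caller plus one.

```python
from collections import deque

def edges_by_distance(edges, start):
    """Group edges by distance from the starting point.
    Assumption: distance of an edge = BFS depth of its caller + 1."""
    callees = {}
    for caller, callee in edges:
        callees.setdefault(caller, []).append(callee)
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in callees.get(node, []):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    groups = {}
    for caller, callee in edges:
        if caller in depth:  # edges unreachable from the start are skipped
            groups.setdefault(depth[caller] + 1, set()).add((caller, callee))
    return groups

def difference_candidates(groups_a, groups_b):
    """Edges of A with no identical edge at the same distance in B."""
    result = set()
    for dist, edges in groups_a.items():
        result |= edges - groups_b.get(dist, set())
    return result

def final_difference(edges1, edges2, start):
    """Compare the two differential edge lists directly with each other."""
    g1 = edges_by_distance(edges1, start)
    g2 = edges_by_distance(edges2, start)
    cand1 = difference_candidates(g1, g2)
    cand2 = difference_candidates(g2, g1)
    return cand1 - cand2, cand2 - cand1

# Hypothetical graphs: the second adds a direct call main -> b,
# which moves the edge (b, c) from distance 3 to distance 2.
graph1 = [("main", "a"), ("a", "b"), ("b", "c")]
graph2 = [("main", "a"), ("main", "b"), ("a", "b"), ("b", "c")]
diff1, diff2 = final_difference(graph1, graph2, "main")
print(diff1, diff2)
```

In this example the edge `("b", "c")` is a difference candidate on both sides because its distance changed, so it is dropped from the final result, and only the genuinely added edge `("main", "b")` remains.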

When the first differential edge list 161a and the second differential edge list 162a are the search targets in this way, the edges may be held in an associative array whose key (array subscript) is the pair of character strings of each edge. The value of the associative array for each key may be blank or a predetermined value such as "0". That is, when the associative array is read using the pair of character strings of the edge being processed as the key, an identical edge exists if a corresponding value is read, and no identical edge exists if the read results in an error. Storing the pairs of character strings indicating edges in associative arrays in this way makes it possible to search efficiently for whether an edge present in one of the first differential edge list 161a and the second differential edge list 162a is also present in the other.
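As one possible illustration of this lookup, a Python `dict` keyed by the `(caller, callee)` pair behaves as the associative array described above: a successful read means the edge exists, and a `KeyError` means it does not. The helper names and the sample edges are hypothetical.

```python
def to_assoc(edges):
    """Associative array keyed by the (caller, callee) pair; the value is a dummy 0."""
    return {(caller, callee): 0 for caller, callee in edges}

def has_same_edge(assoc, edge):
    """A successful read means an identical edge exists; an error means it does not."""
    try:
        assoc[edge]
        return True
    except KeyError:
        return False

def only_in_first(first, second):
    """Edges held in `first` for which no identical edge is held in `second`."""
    return [edge for edge in first if not has_same_edge(second, edge)]

# Hypothetical contents of the two differential edge lists.
first_diff = to_assoc([("b", "c")])
second_diff = to_assoc([("main", "b"), ("b", "c")])

final_diff = only_in_first(first_diff, second_diff) + only_in_first(second_diff, first_diff)
print(final_diff)
```

Each lookup is a hash table access, so the check does not require scanning the other list for every edge.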

It is also possible to hold the first existence edge list 163a and the second existence edge list 164a with the hash value of each edge as the array key. This allows the existence check of edges based on the first existence edge list 163a and the second existence edge list 164a to be performed at high speed.
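A hash-keyed existence list could look like the sketch below. It is only an illustration: the helper names, the `->` separator, and the sample edges are assumptions, and which edges are registered in each existence edge list follows the second embodiment rather than this example, which simply registers whatever collection is passed in.

```python
import hashlib

def edge_hash(caller, callee):
    """16-byte MD5 digest of the edge text; the separator is an assumption."""
    return hashlib.md5(f"{caller}->{callee}".encode("utf-8")).digest()

def existence_list(edges):
    """Existence edge list keyed by hash value; the stored value is a dummy 0."""
    return {edge_hash(caller, callee): 0 for caller, callee in edges}

def edges_absent_from(candidates, existence):
    """Candidates whose hash value is not registered in the existence list."""
    return [edge for edge in candidates if edge_hash(*edge) not in existence]

# Hypothetical candidate edges on each side.
cand1 = [("b", "c")]
cand2 = [("main", "b"), ("b", "c")]
existence1 = existence_list(cand1)
existence2 = existence_list(cand2)

diff_first = edges_absent_from(cand1, existence2)
diff_second = edges_absent_from(cand2, existence1)
print(diff_first, diff_second)
```

Because two different edge strings can in principle produce the same hash value, a matching key is strong evidence of an identical edge rather than a proof; a design that needs certainty could store the original pair as the value and compare it on a hit.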

Although embodiments have been illustrated above, the configuration of each unit shown in the embodiments can be replaced with another configuration having the same function. Other arbitrary components and steps may also be added. Furthermore, any two or more configurations (features) of the embodiments described above may be combined.

DESCRIPTION OF SYMBOLS
1 First call graph
1a-1d First nodes
1e-1g First edges
2 Second call graph
2a-2e Second nodes
2f-2i Second edges
3 First distance-specific edge list
4 Second distance-specific edge list
5 First differential edge list
6 Second differential edge list
7 Final differential edge list
10 Information processing apparatus
11 Storage unit
12 Calculation unit
12a Classification unit
12b Identification unit
12c Extraction unit

Claims

A call graph difference extraction method in which a computer:
when a starting point program module is specified, classifies a plurality of first edges into categories for respective distances, based on a first call graph in which calling relationships among a plurality of first program modules in first software including the starting point program module are represented by the plurality of first edges connecting a plurality of first nodes indicating the plurality of first program modules, according to the distance from the first node corresponding to the starting point program module to each of the first edges when the calling relationships are followed;
classifies a plurality of second edges into categories for respective distances, based on a second call graph in which calling relationships among a plurality of second program modules in second software including the starting point program module are represented by the plurality of second edges connecting a plurality of second nodes indicating the plurality of second program modules, according to the distance from the second node corresponding to the starting point program module to each of the second edges when the calling relationships are followed;
takes each of the plurality of first edges as a target and, when none of the second edges belonging to the same category as the target first edge has a pair of caller and callee program modules in common with the target first edge, identifies the target first edge as a difference candidate first edge;
takes each of the plurality of second edges as a target and, when none of the first edges belonging to the same category as the target second edge has a pair of caller and callee program modules in common with the target second edge, identifies the target second edge as a difference candidate second edge;
extracts, from among the difference candidate first edges, a difference first edge for which no difference candidate second edge having a common pair of caller and callee program modules exists; and
extracts, from among the difference candidate second edges, a difference second edge for which no difference candidate first edge having a common pair of caller and callee program modules exists.

The call graph difference extraction method according to claim 1, in which the computer further:
generates a first existence edge list in which hash values of the difference candidate first edges are registered and a second existence edge list in which hash values of the difference candidate second edges are registered;
in the extraction of the difference first edge, extracts, as the difference first edge, a difference candidate first edge from which a hash value different from the hash values registered in the second existence edge list is obtained; and
in the extraction of the difference second edge, extracts, as the difference second edge, a difference candidate second edge from which a hash value different from the hash values registered in the first existence edge list is obtained.

The call graph difference extraction method according to claim 1 or 2, in which, in the identification of the difference candidate first edge and the difference candidate second edge, the first edges and the second edges are sorted according to a predetermined criterion within each classified category; when a first edge group and a second edge group classified into the category of the same distance are compared, the target first edge is selected in order from the top of the first edge group and the target second edge is selected in order from the top of the second edge group; and when the pair of caller and callee program modules is not common between the target first edge and the target second edge, whichever of the target first edge and the target second edge ranks higher when sorted according to the predetermined criterion is identified as the difference candidate first edge or the difference candidate second edge.

A call graph difference extraction program that causes a computer to execute a process comprising:
when a starting point program module is specified, classifying a plurality of first edges into categories for respective distances, based on a first call graph in which calling relationships among a plurality of first program modules in first software including the starting point program module are represented by the plurality of first edges connecting a plurality of first nodes indicating the plurality of first program modules, according to the distance from the first node corresponding to the starting point program module to each of the first edges when the calling relationships are followed;
classifying a plurality of second edges into categories for respective distances, based on a second call graph in which calling relationships among a plurality of second program modules in second software including the starting point program module are represented by the plurality of second edges connecting a plurality of second nodes indicating the plurality of second program modules, according to the distance from the second node corresponding to the starting point program module to each of the second edges when the calling relationships are followed;
taking each of the plurality of first edges as a target and, when none of the second edges belonging to the same category as the target first edge has a pair of caller and callee program modules in common with the target first edge, identifying the target first edge as a difference candidate first edge;
taking each of the plurality of second edges as a target and, when none of the first edges belonging to the same category as the target second edge has a pair of caller and callee program modules in common with the target second edge, identifying the target second edge as a difference candidate second edge;
extracting, from among the difference candidate first edges, a difference first edge for which no difference candidate second edge having a common pair of caller and callee program modules exists; and
extracting, from among the difference candidate second edges, a difference second edge for which no difference candidate first edge having a common pair of caller and callee program modules exists.

An information processing apparatus comprising:
a storage unit that stores a first call graph representing calling relationships among a plurality of first program modules in first software by a plurality of first edges connecting a plurality of first nodes indicating the plurality of first program modules, and a second call graph representing calling relationships among a plurality of second program modules in second software by a plurality of second edges connecting a plurality of second nodes indicating the plurality of second program modules;
a classification unit that, when a starting point program module included in both the first software and the second software is specified, classifies the plurality of first edges into categories for respective distances, based on the first call graph, according to the distance from the first node corresponding to the starting point program module to each of the plurality of first edges when the calling relationships are followed, and classifies the plurality of second edges into categories for respective distances, based on the second call graph, according to the distance from the second node corresponding to the starting point program module to each of the plurality of second edges when the calling relationships are followed;
an identification unit that takes each of the plurality of first edges as a target and, when none of the second edges belonging to the same category as the target first edge has a pair of caller and callee program modules in common with the target first edge, identifies the target first edge as a difference candidate first edge, and that takes each of the plurality of second edges as a target and, when none of the first edges belonging to the same category as the target second edge has a pair of caller and callee program modules in common with the target second edge, identifies the target second edge as a difference candidate second edge; and
an extraction unit that extracts, from among the difference candidate first edges, a difference first edge for which no difference candidate second edge having a common pair of caller and callee program modules exists, and extracts, from among the difference candidate second edges, a difference second edge for which no difference candidate first edge having a common pair of caller and callee program modules exists.