KR19990047339A

KR19990047339A - Automatic object extraction method

Info

Publication number: KR19990047339A
Application number: KR1019970065700A
Authority: KR
Inventors: 마평수; 진윤숙; 신규상
Original assignee: 정선종; 한국전자통신연구원
Priority date: 1997-12-03
Filing date: 1997-12-03
Publication date: 1999-07-05

Abstract

The present invention relates to an automatic object extraction method for converting a C program into a C++ program. Its purpose is to provide a method that reflects the procedural characteristics of the C program and automatically extracts, without user effort, the optimal objects for conversion into a C++ program functionally identical to the input C program. The method comprises the steps of: reading a C program and converting it into an internal representation; analyzing the internal representation to extract information about the relationships between syntax elements; forming a variable-type-function graph based on those relationships and their weights; clustering the variable-type-function graph to generate clusters; and determining classes and object instances from the syntax element information or the clusters generated in the clustering step. As a result, the procedural characteristics of the C program can be reflected in the extracted objects, the tendency of existing methods to extract overly large or duplicate objects is overcome, and objects are extracted more accurately, which aids program understanding and eases maintenance.

Description

Automatic object extraction method

The present invention relates to an automatic object extraction method for converting a C program into a C++ program, and more particularly to a method that automatically extracts objects from source code in order to aid program understanding and ease maintenance.

In general, C is a procedural language and C++ is an object-oriented language, so the two differ considerably in syntax and features. Because of these differences, a C program cannot be converted into a C++ program simply by mapping the syntax of one onto the other. The prevailing approach is instead to understand the structure and function of the C program, extract objects from it, and then express the extracted objects in C++ syntax. This process is shown in FIG. 1, a flowchart for converting a C program into a C++ program. In S1, the C program is analyzed. In S2, classes and objects are extracted based on the information obtained in S1. In S3, the code is converted according to the classes and objects extracted in S2, completing the C++ program.

Objects can be extracted manually or automatically. In manual object extraction, the user analyzes the C program directly, or analyzes documents produced during the analysis and modeling phases or generated automatically by a program comprehension tool, in order to understand the structure and function of the program, and then extracts objects by hand according to his or her own judgment. In automatic object extraction, by contrast, S1 and S2 are performed automatically by the system: in S1, the syntax elements (global variables, data types, and functions) are identified from the source code of the input program, and the relationships between global variables and functions, or between data types and functions, are analyzed automatically; in S2, related syntax elements are grouped into a single object.

The conventional manual method therefore requires the user to understand the structure and function of the C program directly, which demands knowledge of both C syntax and the program's application domain and consumes considerable manpower, time, and cost.

The conventional automatic method, for its part, considers only a limited set of relationships between syntax elements. As a result, it cannot reflect the procedural characteristics of the C program in the extracted objects; it may extract almost no objects when many syntax elements fall outside the relationships it considers; and, because it groups elements all at once without distinguishing the relative importance of their relationships, it may extract objects that are too large.

The present invention, devised to solve these problems, aims to provide an automatic object extraction method that automatically extracts, without user effort, the optimal objects for conversion into a C++ program functionally identical to the input C program.

To achieve this object, the present invention comprises the steps of: reading a C program and converting it into an internal representation of the program; analyzing the internal representation to extract information about the relationships between syntax elements; forming a variable-type-function graph based on the relationships between the syntax elements and their weights; performing clustering on the variable-type-function graph resulting from the graph forming step to generate clusters; and determining classes and object instances from the syntax element information or the clusters generated in the clustering step.

This is achieved by generating the variable-type-function graph from function-function relationships in addition to global variable-function and data type-function relationships, clustering the graph so that relationships which, according to their weights, only weakly connect different clusters are ignored, and then extracting objects and classes separately. In other words, the present invention improves S1 and S2 of FIG. 1.

FIG. 1 is a flowchart of a system for converting a C program into a C++ program.

FIG. 2 is a flowchart showing the object extraction process for a C program.

FIG. 3 is a flowchart showing the clustering process used for object extraction.

Hereinafter, a preferred embodiment of the present invention is described in detail with reference to the accompanying drawings.

FIG. 2 is a flowchart showing the object extraction process for a C program. Referring to FIG. 2, the present invention comprises a step of reading a C program and converting it into an internal representation (S10), a step of analyzing the internal representation to extract information about the relationships between the program's syntax elements (S20), a step of forming a variable-type-function graph based on the weights of those relationships (S30), a clustering step of partitioning the variable-type-function graph (S40), and a step of extracting classes and objects (S50).

In S10, the C program is read and converted into an internal representation. The internal representation expresses the information about the syntax elements that make up the C program in the form of a tree.

In S20, the internal representation of the C program is analyzed to extract the information about the relationships between syntax elements that the variable-type-function graph requires, along with information about the syntax elements themselves. In addition to global variable-function and data type-function relationships, the relationships considered include function-function relationships, which capture the structure and function of a procedural program and thus the procedural characteristics of the C program. A variable-function relationship exists when a global variable is used or modified within a function, when it is passed to a function as an actual argument, or when it is assigned the return value of a function. The data type-function relationship considers local variables and formal parameters as well as global variables; it is the relationship between a function and the data types of its arguments, its return value, and the global variables used within it. A function-function relationship exists when the output of one function is used as the input of another, when two functions have the same input or output, when a calling function passes an argument that determines the control flow of the called function, or when two functions are in a calling relationship.

In S30, the variable-type-function graph is formed from the syntax elements of the input program and the relationships between them. The resulting graph depends on which relationships the user selects and on the weights assigned to them. The graph consists of nodes and links: each node is one of the three kinds of syntax element (a variable, a data type, or a function), and each link represents a relationship between syntax elements extracted in S20 and carries a weight indicating the degree of that relationship.

As shown in Table 1, weights are assigned according to the importance of each relationship: more important relationships receive larger weights, so that syntax elements joined by an important relationship can end up in the same object. Unlike conventional object extraction methods, which apply each relationship separately and leave the user to combine the results, the present invention applies all three kinds of relationship to the variable-type-function graph at once and extracts objects automatically.

Table 1. Weights of the relationships used for object extraction

Relationship         | Case                                                                                          | Weight
Function-function    | The output of one function becomes the input of another                                       | 5
Function-function    | The functions have the same input or output                                                   | 4
Function-function    | The calling function passes an argument that determines the control of the called function    | 3
Function-function    | The two functions are in a calling relationship                                               | 2
Variable-function    | A global variable is modified or used within the function                                     | 4 (3)
Variable-function    | A global variable is passed as an actual argument of the function                             | 2 (1)
Variable-function    | A global variable is assigned the return value of the function                                | 2
Data type-function   | Data type of an argument of the function                                                      | 3 (2)
Data type-function   | Data type of a global variable within the function                                            | 0
Data type-function   | Data type of the return value of the function                                                 | 1
Data type-function   | Data type of a variable assigned the return value of the function                             | 0
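The graph formation of S30 with the Table 1 weights can be sketched as follows. This is a minimal illustration, not the patented implementation: the relation keys, the `build_graph` function, the node naming scheme (`v:`, `t:`, `f:` prefixes), and the example program are all hypothetical. Two further assumptions are made: only the primary weight of each row is used (the source does not explain the parenthesized alternates), and when several relationships join the same pair of nodes their weights are summed.

```python
from collections import defaultdict

# Primary weights taken from Table 1. The keys are hypothetical names.
# Parenthesized alternates such as "4 (3)" are not explained in the
# source and are left out of this sketch.
WEIGHTS = {
    "func_output_is_input": 5,
    "func_same_io": 4,
    "func_control_arg": 3,
    "func_call": 2,
    "var_used_or_modified": 4,
    "var_passed_as_arg": 2,
    "var_assigned_from_return": 2,
    "type_of_argument": 3,
    "type_of_global_in_func": 0,
    "type_of_return": 1,
    "type_of_var_assigned_from_return": 0,
}


def build_graph(relations, weights=WEIGHTS):
    """Build {frozenset({a, b}): weight} from (a, b, kind) triples.

    Assumption: weights of multiple relations on one pair are summed,
    and relations with weight 0 produce no link.
    """
    graph = defaultdict(int)
    for a, b, kind in relations:
        weight = weights[kind]
        if weight > 0 and a != b:
            graph[frozenset((a, b))] += weight
    return dict(graph)


# Hypothetical relations extracted in S20 from a small stack program.
relations = [
    ("v:stack", "f:push", "var_used_or_modified"),
    ("v:stack", "f:pop", "var_used_or_modified"),
    ("f:push", "f:pop", "func_same_io"),
    ("t:Item", "f:push", "type_of_argument"),
    ("t:Item", "f:pop", "type_of_return"),
    ("f:main", "f:push", "func_call"),
    ("t:Stack", "f:push", "type_of_global_in_func"),  # weight 0, dropped
]
graph = build_graph(relations)
for pair, weight in sorted(graph.items(), key=lambda item: sorted(item[0])):
    print(sorted(pair), weight)
```

Keying links by an unordered pair keeps the graph undirected, which matches the way the text treats links as relationships rather than directed edges.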

In S40, clusters are generated by dividing the variable-type-function graph produced in S30 into subgraphs on the basis of internal connectivity. The internal connectivity of a subgraph indicates how tightly nodes with important relationships are grouped within it, and is defined as the sum of the weights of the internal links divided by the sum of the weights of the internal and external links. Its value lies between 0 and 1: a subgraph with a single node has an internal connectivity of 0, and an isolated subgraph with no external links has an internal connectivity of 1.
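The internal connectivity measure can be illustrated with a short sketch. The function name, the link representation (a dictionary keyed by unordered node pairs), and the example graph are assumptions made for illustration. The source does not say what value an isolated single node should take, so this sketch returns 0 when a subgraph has no links at all.

```python
def internal_connectivity(links, cluster):
    """Sum of internal link weights over internal plus external weights.

    links:   {frozenset({a, b}): weight}
    cluster: set of node names forming the subgraph
    """
    internal = external = 0
    for pair, weight in links.items():
        inside = len(pair & cluster)
        if inside == 2:        # both endpoints in the subgraph
            internal += weight
        elif inside == 1:      # link crosses the subgraph boundary
            external += weight
    total = internal + external
    # Assumption: a subgraph with no links at all scores 0.
    return internal / total if total else 0.0


# Hypothetical weighted graph: a stack group plus one caller.
links = {
    frozenset(("v:stack", "f:push")): 4,
    frozenset(("v:stack", "f:pop")): 4,
    frozenset(("f:push", "f:pop")): 4,
    frozenset(("f:main", "f:push")): 2,
}

stack_group = {"v:stack", "f:push", "f:pop"}
print(internal_connectivity(links, stack_group))   # 12 / 14
print(internal_connectivity(links, {"f:main"}))    # single node: 0.0
print(internal_connectivity(links, stack_group | {"f:main"}))  # no external links: 1.0
```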

In S50, object instances and classes are extracted from the syntax element information obtained in S20 and the clusters (partitioned subgraphs) generated in S40. The step also outputs information linking them to the corresponding source code, together with a list of the syntax elements that could not be extracted as objects. Each cluster corresponds to one class, and one or more objects instantiated from that class are extracted. If no variables in the cluster are declared with the same data type, a single object is extracted from the class and the variables in the cluster become attributes of that object. If several variables in the cluster are declared with the same data type, several objects are extracted from the class and those variables become the object instances. To localize the class, the visibility of the extracted attributes and operations is determined: members used only inside the class are made private, and members also used outside the class are made public. Finally, link information is generated that records which variables and functions of the C program correspond to the attributes and operations of each class.
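The class and instance rules of S50 might be sketched as below. Everything here is illustrative: `extract_class`, its parameters, the returned dictionary layout, and the example clusters are hypothetical. The source does not name the single object extracted in the first case, so the sketch uses the lowercased class name, and it assumes that variables whose data type is not shared remain attributes even when other variables become instances.

```python
from collections import Counter


def extract_class(class_name, variables, functions, used_outside):
    """Derive one class description from one cluster.

    variables:    {variable name: declared data type} in the cluster
    functions:    function names in the cluster (the operations)
    used_outside: names of members also used outside the cluster
    """
    type_counts = Counter(variables.values())
    shared_types = {t for t, count in type_counts.items() if count > 1}

    # Variables sharing a data type become object instances;
    # the rest are treated as attributes (assumption for mixed cases).
    instances = sorted(v for v, t in variables.items() if t in shared_types)
    attributes = sorted(v for v, t in variables.items() if t not in shared_types)
    if not instances:
        # No shared data type: exactly one object (hypothetical name).
        instances = [class_name.lower()]

    operations = sorted(functions)
    visibility = {
        member: "public" if member in used_outside else "private"
        for member in attributes + operations
    }
    return {
        "class": class_name,
        "instances": instances,
        "attributes": attributes,
        "operations": operations,
        "visibility": visibility,
    }


# Cluster with no repeated data type: one object, variables are attributes.
stack = extract_class(
    "Stack",
    {"top": "int", "items": "Item *"},
    ["push", "pop", "grow"],
    used_outside={"push", "pop"},
)

# Cluster with two variables of the same type: two object instances.
account = extract_class(
    "Account",
    {"savings": "struct account", "checking": "struct account"},
    ["deposit", "withdraw"],
    used_outside={"deposit", "withdraw"},
)
print(stack)
print(account)
```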

FIG. 3 is a flowchart showing the clustering process used for object extraction. The clustering performed in step S40 is described below with reference to FIG. 3.

In S41, clustering is initialized. In S42, it is determined whether any node remains that has not yet been considered. If so, an external node is selected in S43, the internal connectivity is calculated in S44, and S45 determines whether the internal connectivity has decreased. If it has not decreased, the cluster is expanded in S46 and the process repeats from S42. A cluster of the variable-type-function graph is thus a subgraph with maximal internal connectivity, obtained by expanding the subgraph across its external links for as long as the internal connectivity does not decrease. If S45 determines that the internal connectivity has decreased, or S42 determines that no unconsidered node remains, the cluster is output in S47. In S48, it is again determined whether any unconsidered node remains. If so, clustering is re-initialized in S41; if not, the process ends. In other words, if the internal connectivity of the expanded subgraph is lower than that of the existing subgraph, the existing subgraph is selected as a cluster, and clustering continues on the remaining nodes until every node in the graph has been considered.
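The loop of FIG. 3 can be sketched as a greedy expansion. This is one possible reading, not the definitive algorithm: the source does not say how the external node is chosen in S43, so the sketch assumes the unconsidered neighbor that yields the highest internal connectivity, and it treats a cluster with no unconsidered neighbors as complete. The function names and the example graph are hypothetical.

```python
def internal_connectivity(links, cluster):
    """Internal link weight divided by internal plus external link weight."""
    internal = external = 0
    for pair, weight in links.items():
        inside = len(pair & cluster)
        if inside == 2:
            internal += weight
        elif inside == 1:
            external += weight
    total = internal + external
    return internal / total if total else 0.0


def cluster_graph(nodes, links):
    """Partition nodes into clusters by greedy expansion (S41 to S48)."""
    remaining = list(nodes)              # nodes not yet placed in a cluster
    clusters = []
    while remaining:                     # S48: any unconsidered node left?
        cluster = {remaining.pop(0)}     # S41: start from a seed node
        score = internal_connectivity(links, cluster)
        while True:
            # S42/S43: unconsidered nodes linked to the current cluster
            candidates = [
                n for n in remaining
                if any(n in pair and pair & cluster for pair in links)
            ]
            if not candidates:
                break
            # S44: score each expansion and keep the best (assumption)
            best = max(
                candidates,
                key=lambda n: internal_connectivity(links, cluster | {n}),
            )
            best_score = internal_connectivity(links, cluster | {best})
            if best_score < score:       # S45: connectivity decreased
                break
            cluster.add(best)            # S46: expand the cluster
            remaining.remove(best)
            score = best_score
        clusters.append(cluster)         # S47: output the cluster
    return clusters


# Two tightly linked groups joined by one weak call link (weight 2).
nodes = ["v:stack", "f:push", "f:pop", "v:queue", "f:enqueue", "f:dequeue"]
links = {
    frozenset(("v:stack", "f:push")): 4,
    frozenset(("v:stack", "f:pop")): 4,
    frozenset(("f:push", "f:pop")): 4,
    frozenset(("v:queue", "f:enqueue")): 4,
    frozenset(("v:queue", "f:dequeue")): 4,
    frozenset(("f:enqueue", "f:dequeue")): 4,
    frozenset(("f:pop", "f:enqueue")): 2,
}
clusters = cluster_graph(nodes, links)
print(clusters)
```

In this example the weak link between `f:pop` and `f:enqueue` is left as an external link, so the stack and queue groups come out as separate clusters. Under this selection rule a weakly linked node with few other links can still be absorbed into a cluster, so the result depends on the weights and on the seed order.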

The present invention allows the procedural characteristics of a C program to be reflected in the extracted objects, overcomes the tendency of existing methods to extract overly large or duplicate objects, and extracts objects more accurately, thereby aiding program understanding and easing maintenance.

Claims

An automatic object extraction method comprising the steps of:

reading a C program and converting it into an internal representation of the program;

analyzing the internal representation of the program to extract information about the relationships between syntax elements;

forming a variable-type-function graph based on the relationships between the syntax elements and their weights;

performing clustering on the variable-type-function graph resulting from the graph forming step to generate clusters; and

determining classes and object instances from the syntax element information or the clusters generated in the clustering step.

The method of claim 1,

wherein the relationships between the syntax elements include:

a variable-function relationship, which exists when a global variable is used or modified within a function, is passed to a function as an actual argument, or is assigned the return value of a function;

a data type-function relationship, which considers global variables, local variables, and formal parameters, and is the relationship between a function and the data types of its arguments, its return value, and the global variables used within it; and

a function-function relationship, which exists when the output of one function is used as the input of another, when two functions have the same input or output, when a calling function passes an argument that determines the control flow of the called function, or when two functions are in a calling relationship, and which represents the procedural characteristics of the program.

The method of claim 1,

wherein, in the graph forming step,

the variable-type-function graph consists of nodes and links;

each node is one of three kinds of syntax element: a variable, a data type, or a function; and

each link represents a relationship between syntax elements and has a weight indicating the degree of that relationship.

The method of claim 1,

wherein, in the clustering step, the variable-type-function graph is divided into subgraphs based on internal connectivity.

The method of claim 4,

wherein the internal connectivity of a subgraph of the variable-type-function graph is the sum of the weights of its internal links divided by the sum of the weights of its internal and external links.

The method of claim 1 or 4,

wherein the clustering step comprises:

initializing clustering;

creating a cluster of the variable-type-function graph by expanding the subgraph to include external links for as long as the internal connectivity does not decrease; and

repeating the initialization and cluster creation until no unconsidered nodes remain in the variable-type-function graph.

The method of claim 1,

wherein, in the extraction step:

if no variables in the cluster are declared with the same data type, one object is extracted from the class and the variables in the cluster become attributes of the object;

if several variables in the cluster are declared with the same data type, several objects are extracted from the class and those variables become object instances;

the visibility of the extracted attributes and operations is determined in order to localize the class; and

link information is generated that records which variables and functions of the program correspond to the attributes and operations constituting the class.