KR102113470B1

KR102113470B1 - Method and Apparatus for Determining of Equivalence Between SQL

Info

Publication number: KR102113470B1
Application number: KR1020180168639A
Authority: KR
Inventors: 한욱신; 소병훈; 김현지
Original assignee: 포항공과대학교 산학협력단
Priority date: 2018-12-24
Filing date: 2018-12-24
Publication date: 2020-05-21

Abstract

The present invention relates to a method and apparatus for determining equivalence between structured query language (SQL) queries. The method includes: receiving a first query and a second query; applying the first query and the second query to the same database and checking the application results; rewriting the first query and the second query if the application results are the same; determining equivalence between the first query and the second query through string comparison of the rewritten first query and second query; and storing the equivalence determination result. Other embodiments are possible.

Description

Method and Apparatus for Determining of Equivalence Between SQL

The present invention relates to a method and apparatus for determining equivalence between structured query language (SQL) queries, and more particularly to a method and apparatus that verify equivalence by comparing the results of executing the queries on test data generated from schema information extracted from the queries, performing semantic verification, and rewriting the queries so that they can be compared for identity.

The query equivalence problem asks whether a given first query and second query are semantically equivalent, that is, whether they produce the same result when executed on any database instance. Here, a query means a query written in the structured query language (SQL). For example, every query optimizer enumerates many query execution plans during the optimization phase, and all of these plans must be semantically equivalent to the original query.

Determining equivalence between queries also matters in applications such as generating test cases for database implementations, developing educational tools for developers, and automatically grading students' homework. The problem has received considerable attention from the database theory research community, but prior work has focused on the theoretical limits of solving it, so there is no practical program for determining equivalence between queries. As a result, it is difficult to verify equivalence between queries that involve the various constraints and operations of a relational database.

To solve this problem, various embodiments of the present invention provide a method and apparatus for determining equivalence between SQL queries that verify equivalence by comparing the results of executing the queries on test data generated from schema information extracted from the queries, performing semantic verification, and rewriting the queries so that they can be compared for identity.

A method for determining equivalence between SQL queries according to an embodiment of the present invention includes: receiving a first query and a second query; applying the first query and the second query to the same database and checking the application results; rewriting the first query and the second query if the application results are the same; determining equivalence between the first query and the second query through string comparison of the rewritten first query and second query; and storing the equivalence determination result.

In addition, checking the application results includes: if the application results are the same, generating test data using the schema of the database; and applying the first query and the second query to the generated test data and checking the application results.

In addition, the method further includes, before rewriting the second query, checking the semantic equivalence of the first query and the second query if the results of applying them to the generated test data are the same.

In addition, rewriting the first query and the second query is performed using a database management system (DBMS).

In addition, determining equivalence between the first query and the second query includes: unifying the syntactic structure of the first query and the second query by sorting the items included in the rewritten first query and second query in alphabetical order and assigning aliases in a uniform manner; and comparing the syntactic structures of the first query and the second query.

In addition, the method further includes comparing the first query and the second query with pre-stored pairs of queries.

An apparatus for determining equivalence between SQL queries according to an embodiment of the present invention includes: a control unit that rewrites a first query and a second query if the results of applying them to the same database are the same, and determines equivalence between the first query and the second query through string comparison of the rewritten first query and second query; and a memory that stores the equivalence determination result.

In addition, if the results of applying the first query and the second query to the same database are the same, the control unit generates test data using the schema of the database, applies the first query and the second query to the test data, and checks the application results.

In addition, if the results of applying the first query and the second query to the generated test data are the same, the control unit checks the semantic equivalence of the first query and the second query.

In addition, the control unit rewrites the first query and the second query using a database management system (DBMS).

In addition, the control unit unifies the syntactic structure of the first query and the second query by sorting the items included in the rewritten first query and second query in alphabetical order and assigning aliases in a uniform manner.

In addition, the control unit compares the syntactic structures of the first query and the second query after they have been unified.

In addition, the control unit compares the first query and the second query with pre-stored pairs of queries.

As described above, the method and apparatus for determining equivalence between SQL queries of the present invention can verify equivalence by comparing the results of executing the queries on test data generated from schema information extracted from the queries, performing semantic verification, and rewriting the queries so that they can be compared for identity.

FIG. 1 is a diagram illustrating an apparatus for determining equivalence between SQL queries according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a method for determining equivalence between SQL queries according to an embodiment of the present invention.
FIG. 3 is a detailed flowchart illustrating a method for comparing the application results of a query pair according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in more detail with reference to the accompanying drawings. Note that identical components are denoted by the same reference numerals wherever possible. Detailed descriptions of well-known functions and configurations that could obscure the subject matter of the present invention are omitted.

FIG. 1 is a diagram illustrating an apparatus for determining equivalence between SQL queries according to an embodiment of the present invention.

Referring to FIG. 1, an equivalence determination apparatus 100 according to the present invention (hereinafter, electronic device 100) includes a communication unit 110, an input unit 120, a display unit 130, a memory 140, and a control unit 150. The electronic device 100 may be an electronic device such as a computer or a laptop for determining equivalence between SQL queries.

The communication unit 110 communicates with an external device (not shown). To this end, the communication unit 110 may perform wireless communication such as 5G (fifth generation mobile communications), LTE-A (long term evolution-advanced), LTE (long term evolution), and Wi-Fi (wireless fidelity), and may perform wired communication such as over a USB cable. The external device may be a device that stores the database used to determine the equivalence of the first query and the second query.

The input unit 120 generates input data in response to user input to the electronic device 100. The input unit 120 includes at least one input means, namely at least one of a key pad, a dome switch, a touch panel, a jog shuttle, a touch key, a stylus pen, a mouse, and a keyboard.

The display unit 130 outputs output data according to the operation of the electronic device 100. To this end, the display unit 130 includes a liquid crystal display (LCD), a light emitting diode (LED) display, an organic LED (OLED) display, a micro electro mechanical systems (MEMS) display, and an electronic paper display. The display unit 130 may be combined with the input unit 120 and implemented as a touch screen.

The memory 140 stores the operating programs of the electronic device 100. The memory 140 may store query pairs. A query pair stored in the memory 140 is a pair for which the electronic device 100 has already performed an equivalence determination, regardless of whether the two queries were found to be equivalent or non-equivalent. The memory 140 may store an equivalence verifier for checking the semantic equivalence of the first query and the second query. The memory 140 may also store a database in table form.

The control unit 150 checks the results of applying the first query and the second query to the same database. If the application results are the same, it rewrites the first query and the second query and determines equivalence between them through string comparison of the rewritten queries.

More specifically, the control unit 150 receives a query pair consisting of the first query and the second query from the input unit 120. Here, a query means a query written in the structured query language (SQL). The control unit 150 compares the received first query and second query with the query pairs previously stored in the memory 140. If the memory 140 contains a query pair identical to the received pair, the control unit 150 ends the equivalence determination process. In that case, if the stored pair was found to be equivalent, the control unit 150 may conclude that the first query and the second query are equivalent. Conversely, if the stored pair was found to be non-equivalent, it may conclude that the first query and the second query are non-equivalent.
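The lookup against previously judged pairs can be pictured as a small in-memory store. This is only an illustrative sketch: `PairStore`, `lookup`, and `record` are hypothetical names, and storing a pair without regard to order is an assumption that the patent does not state.

```python
from typing import Optional


class PairStore:
    """Hypothetical stand-in for the memory 140: remembers past verdicts."""

    def __init__(self) -> None:
        self._verdicts: dict[frozenset, bool] = {}

    @staticmethod
    def _key(q1: str, q2: str) -> frozenset:
        # Assumption: equivalence is symmetric, so the pair is stored unordered.
        return frozenset((q1, q2))

    def lookup(self, q1: str, q2: str) -> Optional[bool]:
        """Return the stored verdict, or None if the pair was never judged."""
        return self._verdicts.get(self._key(q1, q2))

    def record(self, q1: str, q2: str, equivalent: bool) -> None:
        self._verdicts[self._key(q1, q2)] = equivalent


store = PairStore()
store.record("SELECT a FROM t", "SELECT t.a FROM t", True)
print(store.lookup("SELECT t.a FROM t", "SELECT a FROM t"))  # True
print(store.lookup("SELECT a FROM t", "SELECT b FROM t"))    # None
```

A `None` result corresponds to the case where no identical pair is stored and the later comparison stages have to run.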

Conversely, if the memory 140 contains no query pair identical to the received pair, the control unit 150 compares the results of applying each query of the pair. More specifically, the control unit 150 applies the received query pair to a database (not shown). The database may be a relational database and may be stored in table form in part of the memory 140. The control unit 150 compares the result obtained by applying the first query to the database with the result obtained by applying the second query. If the two application results are not the same, the control unit 150 determines that the first query and the second query are non-equivalent and stores them as a non-equivalent pair in the memory 140.
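As a rough illustration of applying both queries to the same database and comparing the results, the sketch below uses SQLite through Python's standard `sqlite3` module. The patent does not name a DBMS; the `emp` table, the sample rows, and the helper names `run` and `same_result` are all hypothetical, and comparing results as multisets (ignoring row order) is an assumption.

```python
import sqlite3
from collections import Counter


def run(conn: sqlite3.Connection, sql: str) -> Counter:
    """Execute a query and return its rows as a multiset (row order ignored)."""
    return Counter(conn.execute(sql).fetchall())


def same_result(conn: sqlite3.Connection, q1: str, q2: str) -> bool:
    """Compare the results of two queries on the same database."""
    return run(conn, q1) == run(conn, q2)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE emp (id INTEGER PRIMARY KEY, dept TEXT, salary INTEGER);
    INSERT INTO emp VALUES (1, 'A', 100), (2, 'B', 200), (3, 'A', 300);
    """
)

q1 = "SELECT id FROM emp WHERE salary > 150"
q2 = "SELECT id FROM emp WHERE NOT (salary <= 150)"
q3 = "SELECT id FROM emp WHERE dept = 'A'"

print(same_result(conn, q1, q2))  # True
print(same_result(conn, q1, q3))  # False
```

A `False` result is enough to call the pair non-equivalent, while a `True` result only means that this one database instance did not separate the two queries.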

Conversely, if the two application results are the same, the control unit 150 generates test data and compares the results obtained by applying the first query and the second query to the generated test data. To this end, the control unit 150 checks the schema of the database and applies each of the first query and the second query to test data generated according to that schema. The control unit 150 compares the two resulting application results to check whether they are the same. In doing so, the control unit 150 fixes the column order of one of the two application results and randomly changes the column order of the other. For example, because the order of the attributes in the SELECT clauses of the two queries may differ, the control unit 150 fixes one attribute and reorders the remaining columns. If the results differ in every case, the control unit 150 determines that the first query and the second query are different queries. If the results are the same in at least one case, the control unit 150 determines that the first query and the second query are the same.
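The column-order handling can be sketched as follows. The text describes changing the column order randomly; this sketch instead tries every permutation, which is a substitution made for determinism and costs factorial time in the number of columns. `match_up_to_column_order` is a hypothetical name, and rows are compared as multisets by assumption.

```python
from collections import Counter
from itertools import permutations


def match_up_to_column_order(rows1, rows2) -> bool:
    """True if some column permutation of rows2 equals rows1 as a multiset."""
    if not rows1 or not rows2:
        return not rows1 and not rows2
    width = len(rows1[0])
    if width != len(rows2[0]):
        return False
    target = Counter(rows1)  # the column order of the first result stays fixed
    for order in permutations(range(width)):
        permuted = Counter(tuple(row[i] for i in order) for row in rows2)
        if permuted == target:
            return True  # at least one ordering gives the same result
    return False  # the results differ for every ordering


rows_a = [(1, "x"), (2, "y")]
rows_b = [("x", 1), ("y", 2)]   # same data, columns swapped
rows_c = [("x", 2), ("y", 1)]   # different pairing of values

print(match_up_to_column_order(rows_a, rows_b))  # True
print(match_up_to_column_order(rows_a, rows_c))  # False
```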

When the first query and the second query are found to be the same, the control unit 150 checks, through an equivalence verifier (not shown) stored in the memory 140, whether they are semantically equivalent. That is, even if the check on the test data shows the two queries to be the same, not every database instance has been verified, so it is still necessary to check whether the first query and the second query are semantically equivalent. If the equivalence verifier finds that the two queries are semantically equivalent, or that their semantic equivalence cannot be decided, the control unit 150 rewrites the first query and the second query. The equivalence verifier may be, for example, COSETTE, which verifies the equivalence of SQL queries over a relational database. However, COSETTE cannot interpret operations such as CASE, UNION with set semantics, and PARTITION BY, nor NULL, and because the attribute types it supports are limited, it cannot distinguish operator semantics, for example integer arithmetic from string concatenation. To address this, the control unit 150 rewrites the first query and the second query and compares the strings of the rewritten queries.

If the strings of the first query and the second query are identical, the control unit 150 determines that the two queries are equivalent and stores them as an equivalent pair in the memory 140. More specifically, the control unit 150 uses a database management system (DBMS) to rewrite each of the first query and the second query into a canonical form. The control unit 150 sorts the attributes in the SELECT, FROM, and WHERE clauses of the rewritten queries in alphabetical order, clause by clause. The control unit 150 also assigns the aliases used in the SQL queries in a uniform manner, that is, so that among the tables produced by the first query and the second query, aliases for the same table have the same string as their name, thereby unifying the way the two queries are expressed. In addition, the control unit 150 unifies the way variables, constants, and the like connected by operators are expressed. After unifying the expression, the control unit 150 compares the characters that make up the first query and the second query to check whether the two queries are equivalent.
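The clause sorting and alias unification can be illustrated on a deliberately simplified query representation. This sketch does not parse SQL and does not perform the DBMS rewrite that precedes this step in the described method; `SimpleQuery`, `canonical_text`, and the `t_<table>` alias scheme are all hypothetical, and the sketch assumes each table appears at most once in the FROM clause.

```python
import re
from dataclasses import dataclass

_QUALIFIER = re.compile(r"\b(\w+)\.")  # matches the "alias." part of "alias.column"


@dataclass
class SimpleQuery:
    """Hypothetical, heavily simplified query form used only for this sketch."""
    select: list[str]               # e.g. ["e.name", "e.id"]
    tables: list[tuple[str, str]]   # (table, alias) pairs from the FROM clause
    where: list[str]                # conjuncts joined by AND


def canonical_text(q: SimpleQuery) -> str:
    # Assumption: a table occurs once, so its alias can be derived from its name.
    rename = {alias: f"t_{table}" for table, alias in q.tables}

    def fix(expr: str) -> str:
        # Replace every known alias qualifier in a single pass.
        return _QUALIFIER.sub(
            lambda m: rename.get(m.group(1), m.group(1)) + ".", expr
        )

    select = sorted(fix(c) for c in q.select)
    tables = sorted(f"{table} AS t_{table}" for table, _ in q.tables)
    where = sorted(fix(p) for p in q.where)
    text = f"SELECT {', '.join(select)} FROM {', '.join(tables)}"
    if where:
        text += f" WHERE {' AND '.join(where)}"
    return text


a = SimpleQuery(["e.name", "e.id"], [("emp", "e")],
                ["e.salary > 100", "e.dept = 'A'"])
b = SimpleQuery(["x.id", "x.name"], [("emp", "x")],
                ["x.dept = 'A'", "x.salary > 100"])

print(canonical_text(a))
print(canonical_text(a) == canonical_text(b))  # True
```

Once both queries are in the same textual form, the final check reduces to plain string equality. That is why differences in alias names or in the order of items within a clause have to be removed first.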

In this way, the electronic device 100 implements a multi-level framework that uses the test data generation and query rewriting techniques provided by the DBMS, and can therefore determine the equivalence of queries more accurately.

FIG. 2 is a flowchart illustrating a method for determining equivalence between SQL queries according to an embodiment of the present invention.

Referring to FIG. 2, in step 201 the control unit 150 determines whether a query pair consisting of a first query and a second query has been received. Here, a query means a query written in the structured query language (SQL). If the first query and the second query have been received, the control unit 150 performs step 203. In step 203, the control unit 150 compares the received first query and second query with the query pairs previously stored in the memory 140. In step 205, if the memory 140 contains a query pair identical to the received pair, the control unit 150 ends the process; if no identical pair exists, it performs step 207. More specifically, in step 205, if a pair identical to the first query and the second query exists in the memory 140 and that pair was found to be equivalent, the control unit 150 may conclude that the first query and the second query are equivalent. Conversely, if the stored pair was found to be non-equivalent, it may conclude that the first query and the second query are non-equivalent.

In step 207, the control unit 150 compares the results of applying the query pair. This is described in detail with reference to FIG. 3, which is a detailed flowchart illustrating a method for comparing the application results of a query pair according to an embodiment of the present invention.

Referring to FIG. 3, in step 301 the control unit 150 applies the query pair received in step 201 to a database (not shown). The database may be a relational database stored in table form in part of the memory 140. In step 303, the control unit 150 compares the result obtained by applying the first query to the database with the result obtained by applying the second query. In step 305, if the two application results are the same, the control unit 150 performs step 307; if they are not the same, it performs step 313. In step 313, the control unit 150 determines that the first query and the second query are non-equivalent and performs step 315. In step 315, the control unit 150 may store the first query and the second query as a non-equivalent pair in the memory 140 and end the process.

In step 307, the control unit 150 generates test data, and in step 309 it compares the results obtained by applying the first query and the second query to the test data. More specifically, the control unit 150 checks the schema of the database and applies each of the first query and the second query to test data generated according to that schema. The control unit 150 compares the two resulting application results to check whether they are the same. In doing so, the control unit 150 fixes the column order of one of the two application results and randomly changes the column order of the other. For example, because the order of the attributes in the SELECT clauses of the two queries may differ, the control unit 150 fixes one attribute and reorders the remaining columns. If the results differ in every case, the control unit 150 determines that the first query and the second query are different queries.
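The schema-driven test data generation of step 307 might look like the following sketch, which assumes SQLite through Python's `sqlite3` module; the patent names neither a DBMS nor a generation strategy. `generate_test_db`, the row count, and the value ranges are hypothetical, and constraints other than primary keys are ignored.

```python
import random
import sqlite3


def generate_test_db(source: sqlite3.Connection, rows_per_table: int = 5,
                     seed: int = 0) -> sqlite3.Connection:
    """Build a fresh in-memory database with the same tables but random rows."""
    rng = random.Random(seed)
    target = sqlite3.connect(":memory:")
    tables = source.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for name, ddl in tables:
        target.execute(ddl)  # recreate the table from its stored definition
        columns = source.execute(f"PRAGMA table_info({name})").fetchall()
        for i in range(rows_per_table):
            values = []
            for _cid, _col, decl_type, _notnull, _default, pk in columns:
                if pk:
                    values.append(i + 1)              # keep primary keys unique
                elif "INT" in decl_type.upper():
                    values.append(rng.randint(0, 100))
                else:
                    values.append(rng.choice(["a", "b", "c"]))
            marks = ", ".join("?" for _ in values)
            target.execute(f"INSERT INTO {name} VALUES ({marks})", values)
    return target


src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, dept TEXT, salary INTEGER)")
test_db = generate_test_db(src)
print(test_db.execute("SELECT COUNT(*) FROM emp").fetchone()[0])  # 5
```

Both queries would then be run against `test_db` and their results compared. Random data of this kind can expose a difference between two queries, but it cannot prove that they are equivalent.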

In step 311, if the two application results are not the same, the control unit 150 performs step 313; if they are the same, it returns to step 209 of FIG. 2. In step 209, the control unit 150 checks semantic equivalence and then performs step 211. If the equivalence verifier (not shown) stored in the memory 140 finds that the first query and the second query are not semantically equivalent, the control unit 150 performs step 223. Conversely, if the two queries are found to be semantically equivalent, or their semantic equivalence cannot be decided, it performs step 213. The equivalence verifier may be, for example, COSETTE, which verifies the equivalence of SQL queries over a relational database. However, COSETTE cannot interpret operations such as CASE, UNION with set semantics, and PARTITION BY, nor NULL, and because the attribute types it supports are limited, it cannot distinguish operator semantics, for example integer arithmetic from string concatenation.

To address this, in step 213 the control unit 150 rewrites the first query word and the second query word and then performs step 215. In step 215, the control unit 150 compares the strings of the rewritten first query word and second query word. In step 217, if the strings of the first query word and the second query word are found to be the same, the control unit 150 performs step 219; if the strings are found to differ, it performs step 223. In step 223, the control unit 150 determines that the first query word and the second query word are not equivalent and performs step 221, in which it may store the non-equivalent first query word and second query word in the memory 140 as a pair. Conversely, in step 219, the control unit 150 determines that the first query word and the second query word are equivalent and performs step 221, in which it stores the equivalent first query word and second query word as a pair.

More specifically, the control unit 150 uses a database management system (DBMS) to rewrite each of the first query word and the second query word into a canonical form. For the rewritten first query word and second query word, the control unit 150 sorts the attributes of the SELECT, FROM, and WHERE clauses of the SQL query alphabetically, clause by clause. The control unit 150 also sets the aliases used in the SQL queries in a uniform manner; that is, among the tables produced by the first query word and the second query word, aliases for the same table are given the same string as their name, so that the two query words are expressed in the same way. In addition, the control unit 150 unifies the representation of variables, constants, and the like that are connected by operators. After unifying the representation, the control unit 150 compares the characters that make up the first query word and the second query word. If the comparison shows that the first query word and the second query word are not the same query, the control unit 150 may perform step 223; although not shown, the control unit 150 may also receive, as user input, the result of a check in which the user directly judged the equivalence of the first query word and the second query word. If the comparison shows that the first query word and the second query word are the same, the control unit 150 may store the first query word and the second query word in the memory 140 as a pair.
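The sorting and alias unification described above can be illustrated with a toy normalizer. This is a sketch under strong assumptions, not the DBMS-based rewriting of the invention: `canonicalize` is a hypothetical function that handles only flat queries of the form SELECT ... FROM ... [WHERE ... AND ...], uses each table's own name as its unified alias, does not support the same table appearing twice, and does not protect string literals.

```python
import re

def canonicalize(sql):
    """Toy canonical form for flat 'SELECT ... FROM ... [WHERE ... AND ...]' queries."""
    text = " ".join(sql.strip().rstrip(";").split())
    m = re.fullmatch(r"SELECT (.+?) FROM (.+?)(?: WHERE (.+))?", text,
                     flags=re.IGNORECASE)
    if m is None:
        raise ValueError("unsupported query shape")
    select, from_, where = m.group(1), m.group(2), m.group(3) or ""

    # Map every alias to its table name so both queries use the same alias.
    renames, tables = {}, []
    for item in from_.split(","):
        parts = item.strip().split()      # "emp", "emp e", or "emp AS e"
        table = parts[0].lower()
        renames[parts[-1].lower()] = table
        tables.append(table)

    def qualify(fragment):
        # Replace "alias." prefixes with the unified "table." prefix.
        return re.sub(r"\b(\w+)\.",
                      lambda mm: renames.get(mm.group(1).lower(), mm.group(1)) + ".",
                      fragment)

    # Sort the items of each clause alphabetically.
    cols = sorted(qualify(c.strip()) for c in select.split(","))
    preds = []
    if where:
        parts = re.split(r"\s+AND\s+", where, flags=re.IGNORECASE)
        preds = sorted(qualify(p.strip()) for p in parts)

    out = "SELECT " + ", ".join(cols) + " FROM " + ", ".join(sorted(tables))
    if preds:
        out += " WHERE " + " AND ".join(preds)
    return out

a = "SELECT e.name, e.id FROM emp AS e WHERE e.dept = 'db' AND e.id > 1"
b = "select x.id, x.name from emp x where x.id > 1 and x.dept = 'db';"
c = "SELECT e.id FROM emp e WHERE e.dept = 'os'"
same_ab = canonicalize(a) == canonicalize(b)
same_ac = canonicalize(a) == canonicalize(c)
```

Here queries `a` and `b` differ in alias, keyword case, and the order of columns and predicates, yet normalize to the same string, while `c` does not. A production rewriter would operate on a parsed query rather than on regular expressions.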

As described above, because the electronic device 100 implements a multi-level framework and thereby makes use of the test data generation and query rewriting techniques provided by the DBMS, it can determine the equivalence of query words more accurately.

Meanwhile, the embodiments of the present invention disclosed in this specification and the drawings merely present specific examples to explain the technical content of the present invention easily and to aid understanding of it, and are not intended to limit the scope of the present invention. That is, it is obvious to those of ordinary skill in the art to which the present invention pertains that other modifications based on the technical spirit of the present invention can be implemented.

Claims

A method for determining equivalence between structured query words, the method comprising:
receiving, by an electronic device, a first query word and a second query word;
confirming, by the electronic device, application results by applying the first query word and the second query word to the same database;
when the application results are the same, confirming application results by applying the first query word and the second query word to test data generated using a schema of the database;
confirming, by the electronic device, semantic equivalence of the first query word and the second query word when the application results are the same;
rewriting, by the electronic device, the first query word and the second query word and comparing the character strings of the rewritten first query word and second query word if the query words are semantically equivalent;
confirming, by the electronic device, equivalence between the rewritten first query word and second query word if the strings of the rewritten first query word and second query word are the same; and
storing, by the electronic device, a result of the equivalence determination.

delete

The method of claim 1,
wherein comparing the strings of the rewritten first query word and second query word comprises
rewriting, by the electronic device, the first query word and the second query word using a database management system (DBMS).

The method of claim 4,
wherein confirming the equivalence between the rewritten first query word and second query word comprises:
sorting, by the electronic device, the items included in the rewritten first query word and second query word in alphabetical order and setting aliases in a uniform manner, to unify the grammatical structures of the rewritten first query word and second query word; and
comparing, by the electronic device, the grammatical structures of the rewritten first query word and second query word.

The method of claim 1, further comprising
comparing, by the electronic device, the first query word and the second query word with a pair of pre-stored query words.

An apparatus for determining equivalence between structured query words, the apparatus comprising:
a control unit configured to: if the results of applying a first query word and a second query word to the same database are the same, check application results by applying the first query word and the second query word to test data generated using a schema of the database; if those application results are the same, confirm semantic equivalence between the first query word and the second query word; if the query words are semantically equivalent, rewrite the first query word and the second query word and compare the strings of the rewritten first query word and second query word; and if the strings of the rewritten first query word and second query word are the same, confirm equivalence between the rewritten first query word and second query word; and
a memory configured to store a result of the equivalence determination.

delete

The apparatus of claim 7,
wherein the control unit
rewrites the first query word and the second query word using a database management system (DBMS).

The apparatus of claim 10,
wherein the control unit
sorts the items included in the rewritten first query word and second query word in alphabetical order and sets aliases in a uniform manner, to unify the grammatical structures of the rewritten first query word and second query word.

The apparatus of claim 11,
wherein the control unit
compares the unified grammatical structures of the rewritten first query word and second query word.

The apparatus of claim 7,
wherein the control unit
compares the first query word and the second query word with a pair of pre-stored query words.