KR100666942B1

KR100666942B1 - Method for Handling XML Data Using Relational Database Management System

Info

Publication number: KR100666942B1
Application number: KR1020050001873A
Authority: KR
Inventors: 나연주
Original assignee: Samsung Electronics Co., Ltd. (삼성전자주식회사)
Priority date: 2005-01-07
Filing date: 2005-01-07
Publication date: 2007-01-11
Also published as: KR20060081255A

Abstract

An XML data management method using a relational DBMS according to one aspect of the present invention includes: DOM parsing an XML file to construct a DOM object tree and registering object information selected from the DOM object tree in an RDBMS table; a DTD registration step of registering XML DTD information in a DTD table and creating a document table and a node table to hold the XML data of the registered DTD; an XML file storage step of DOM parsing an XML file and storing the DOM object information in the document table and the node table; and an XML file generation step of generating an XML file from the DOM object information stored in the DBMS.

Description

Method for managing XML data using a relational DBMS {Method for Handling XML Data Using Relational Database Management System}

Brief Description of the Drawings

FIG. 1 is a diagram showing the XML file table used when a file is kept by storing the entire XML file as a CLOB in an RDBMS table.

FIG. 2 is a diagram showing the XML node table used when tags of the XML data that are to serve as search keys are defined in advance as columns of a table.

FIG. 3 is a diagram showing the XML data RDBMS table mapping algorithm according to the present invention.

FIG. 4 is a diagram showing the configuration of an XML library that uses the XML data RDBMS table mapping algorithm according to the present invention.

FIG. 5 is a diagram showing the DOM object tree that results from DOM parsing an XML file.

FIG. 6 is a diagram showing the DB table decomposition process using the DTD of the XML.

FIG. 7 is a diagram showing an entity relationship diagram according to the present invention.

FIG. 8 is a diagram showing a preferred embodiment of a DOM object tree according to the present invention.

FIG. 9 is a diagram showing a preferred embodiment of a DTD table according to the present invention.

FIG. 10 is a diagram showing a preferred embodiment of a node table according to the present invention.

FIG. 11 is a diagram showing a preferred embodiment of data distribution using table partitioning according to the present invention.

* Description of reference numerals for the main parts of the drawings *

101: XML file name
102: registration time (regi_date)
103: XML file
410: setDTD
420: importXML
430: exportXML
440: XML data search
500: RDBMS table
710: DTD table
720: document table
730: node table

The present invention relates to a method for managing XML data using an RDBMS, and more particularly to an XML data management method using a relational DBMS in which object-oriented XML data is managed with an RDBMS, which manages a database on the basis of tables.

Early computers simply stored and read data through the file system provided by the operating system. To manage or access that data more efficiently, application programmers had to develop the necessary applications directly on top of the file system. Whenever user requirements changed, the application had to be modified or even redeveloped, and its capabilities were far too limited for many users to work with it at the same time.

In the early 1960s the term "database" was first defined as "a collection of operational data, integrated and stored so that it can be shared by the application systems of an organization," and the DBMS (Database Management System), a system for managing databases, was introduced soon afterwards. DBMSs such as IMS and Total, based on the hierarchical and network data models, were subsequently developed and performed well in business domains with hierarchical structures, such as sales network information and departmental organization information. These DBMSs had one decisive drawback, however: because the data was linked by pointers and applications were written to exploit those pointers as efficiently as possible, any change to the database had a large impact on application performance. Moreover, keeping such databases consistent was extremely difficult.

The relational data model, which could resolve these problems, was proposed by E. F. Codd in the early 1970s. Conceptually, the relational data model proposes three representative relational algebra operations: project, restrict and join. Relational databases built on these operations removed the drawbacks of the earlier data models, namely application-specific optimization, exposure of the data storage structure, and the need to write code that follows pointers one by one. As a result, database design became independent of the applications, and multiple applications could be developed for the same database.

The basic building block of a relational database is the two-dimensional table (or relation). A table consists of a set of data rows (tuples), and each tuple in turn consists of several data elements (called attributes). Each attribute is fixed to a basic data type such as STRING, NUMBER or DATE.

RDBMS products began with System R at IBM's San Jose research laboratory in the mid-1970s and continued with Ingres at U.C. Berkeley; today many systems such as Oracle, Informix, Sybase and DB2 have been developed and are widely used.

In the 1980s relational databases made great technical progress and transformed the DBMS field, showing outstanding performance and stability in managing large volumes of data of basic types such as characters and numbers. To this day, the core data of most companies consists of basic types such as numbers, dates and short strings.

With the advent of the Internet, however, the world is changing rapidly and the need for more complex data analysis is growing. General users are also demanding support for multimedia data such as images, audio and video. For these reasons, the limitations of relational databases began to surface.

Specifically, the data types in SQL are limited and cannot be extended; it is difficult to represent complex objects with tables; and because SQL expresses relationships between data by value, related objects are hard to find even when a complex object has been represented. A further major problem is that the impedance mismatch makes applications difficult to develop and maintain.

As one way of solving these problems, research began in the mid-1980s on bringing object-oriented technology, which was then starting to attract attention, into databases. In the 1990s object-oriented technology established itself as one of the most important concepts in the software industry, and its adoption in the database field is a natural trend.

Whereas relational databases excel at efficiently storing, managing and retrieving large volumes of basic-type data, the object-oriented model underlying object-oriented programming languages (OOPLs) models the real world very well and therefore has the advantage of representing and manipulating real-world data efficiently.

Because the relational model and the object-oriented model are fundamentally different, however, integrating them is not simple. A relational database is based on two-dimensional tables, and relationships between data are expressed by comparing the values stored in each table; such relationships are put to use by joining the tables with SQL. The object model, by contrast, is based on tight integration of code and data, flexible data types, hierarchical relationships between data types (inheritance), and object references.

Representing these basic structures in the two-dimensional tables of a relational database system is very difficult, and many other problems remain, such as implementing an interface between the two systems that handles data manipulation and retrieval correctly. In the end, this semantic mismatch between OOPLs and relational databases led to the appearance of object-oriented database systems, which provide the object-oriented model directly rather than through a relational database.

An OODBMS (Object Oriented DBMS) is based on objects that combine data with the related code. The definition of an object is contained in a class, and each object is created as an instance of that class. An OODBMS based on this object-oriented model provides object features such as encapsulation, inheritance, polymorphism, object identifiers, and references between objects.

In an OODBMS, many of the considerations needed to maintain correct relationships disappear. Beginning with G-Base in 1987, OODBMS products such as GemStone, Orion, O2, ObjectStore, Versant and Objectivity have been researched and developed up to the present day.

Meanwhile, XML (eXtensible Markup Language), proposed by the W3C (World Wide Web Consortium) in 1996, has recently been adopted as a new standard for information exchange on the Internet, and its fields of application are growing rapidly. Because the information in XML is self-describing, it has the characteristics of irregular data, and it also has a hierarchical, object-oriented nature. These characteristics place many constraints on managing XML data with RDBMS tables.

Conventional methods of managing XML data with an RDBMS fall broadly into two categories.

The first is to store the entire XML file as a CLOB (Character Large Object) in an RDBMS table; CLOB is a data type that RDBMSs provide for storing large volumes of document-type data. The RDBMS table schema is shown below.

create table xml_file_tbl (
    xml_file_name VARCHAR2(100) NOT NULL,
    regi_date DATE,
    xml_file CLOB
);

FIG. 1 shows the XML file table used when a file is kept by storing the entire XML file as a CLOB in an RDBMS table.

In the XML file table of FIG. 1, a file named sample1.XML is stored in xml_file_name (101), and the registration time (regi_date) (102) is 12:00 on August 31, 2004. The important point is that the XML file (103) is stored as-is in a CLOB column of the table.
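The first conventional method can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module in place of Oracle (SQLite accepts the Oracle type names in the DDL but does not enforce them), and `SAMPLE_XML` and `store_whole_file` are hypothetical, since the content of the file in FIG. 1 is not reproduced in the text.

```python
import sqlite3

# Hypothetical sample document; the real content of FIG. 1 is not given in the text.
SAMPLE_XML = '<family><father name="John"><son name="Tom"/></father></family>'


def store_whole_file(conn, file_name, regi_date, xml_text):
    """Store the complete XML text in a single CLOB-style column."""
    conn.execute(
        "INSERT INTO xml_file_tbl (xml_file_name, regi_date, xml_file) "
        "VALUES (?, ?, ?)",
        (file_name, regi_date, xml_text),
    )


conn = sqlite3.connect(":memory:")
# Same columns as the schema above; SQLite stands in for Oracle here.
conn.execute(
    "CREATE TABLE xml_file_tbl ("
    " xml_file_name VARCHAR2(100) NOT NULL,"
    " regi_date DATE,"
    " xml_file CLOB)"
)
store_whole_file(conn, "sample1.XML", "2004-08-31 12:00", SAMPLE_XML)

# The file can be fetched by name, but the database sees its content
# only as one opaque block of text.
stored = conn.execute(
    "SELECT xml_file FROM xml_file_tbl WHERE xml_file_name = ?",
    ("sample1.XML",),
).fetchone()[0]
```

Because the tags live inside one text value, finding a particular tag would require reading the file back and parsing it outside the database.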

The second method of managing XML data with an RDBMS is to define in advance, as columns of a table, the tags of the XML data that are to serve as search keys, and to store their values when the XML file is stored.

The second method is the same as the first in that the entire XML file is stored as a CLOB in an RDBMS table, but differs in that specific tag and value information from the XML document is mapped to and stored in table columns. With the first method the XML file is merely kept in a CLOB, so specific data within the XML cannot be searched; the second method compensates for this by deciding in advance which information to extract and storing it.

The table schema below uses the same example as the first method and defines the name attribute of the 'father' tag and the name attribute of the 'son' tag as search keys of the table.

/* Oracle Table Schema */
create table xml_file_tbl (
    xml_file_name VARCHAR2(100) NOT NULL,
    regi_date DATE,
    xml_file CLOB
);

/* Oracle Table Schema */
create table xml_node_tbl (
    xml_file_name VARCHAR2(100) NOT NULL,
    father_name VARCHAR2(100) NOT NULL,
    son_name VARCHAR2(100) NOT NULL,
    regi_date DATE
);

Here, the XML file table (xml_file_tbl) holds the XML file information, and the XML node table (xml_node_tbl) stores the values of specific tags (the father tag and the son tag) obtained by parsing the XML file. The XML file table and the XML node table are in a one-to-N relationship.

FIG. 2 shows the XML node table used when tags of the XML data that are to serve as search keys are defined in advance as columns of a table.

FIG. 2 illustrates the second method. As in FIG. 1, it assumes that an XML file named sample.XML is stored: the values of the specific tags (father, son) are obtained by XML parsing and stored in the corresponding columns of the XML node table, and the XML file itself is stored in the CLOB column of the XML file table. In this case the XML file table has the same form as in FIG. 1, and the XML node table has the form shown in FIG. 2.
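A minimal sketch of the second conventional method follows, again assuming SQLite in place of Oracle. The sample document and the helper `extract_search_keys` are hypothetical; only the table and column names come from the schemas above.

```python
import sqlite3
from xml.dom.minidom import parseString

# Hypothetical sample with one father and two sons (the 1-to-N case).
SAMPLE_XML = (
    "<family>"
    '<father name="John"><son name="Tom"/><son name="Sam"/></father>'
    "</family>"
)


def extract_search_keys(xml_text):
    """Return (father_name, son_name) pairs for the predefined tags only."""
    doc = parseString(xml_text)
    pairs = []
    for father in doc.getElementsByTagName("father"):
        for son in father.getElementsByTagName("son"):
            pairs.append((father.getAttribute("name"), son.getAttribute("name")))
    return pairs


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xml_file_tbl (xml_file_name VARCHAR2(100) NOT NULL,"
    " regi_date DATE, xml_file CLOB)"
)
conn.execute(
    "CREATE TABLE xml_node_tbl (xml_file_name VARCHAR2(100) NOT NULL,"
    " father_name VARCHAR2(100) NOT NULL, son_name VARCHAR2(100) NOT NULL,"
    " regi_date DATE)"
)

# The whole file still goes into the CLOB column ...
conn.execute(
    "INSERT INTO xml_file_tbl VALUES (?, ?, ?)",
    ("sample.XML", "2004-08-31", SAMPLE_XML),
)
# ... and the predefined search keys go into the node table, one row per pair.
conn.executemany(
    "INSERT INTO xml_node_tbl VALUES (?, ?, ?, ?)",
    [("sample.XML", f, s, "2004-08-31") for f, s in extract_search_keys(SAMPLE_XML)],
)

sons_of_john = [
    row[0]
    for row in conn.execute(
        "SELECT son_name FROM xml_node_tbl WHERE father_name = ? ORDER BY son_name",
        ("John",),
    )
]
```

Only `father_name` and `son_name` are searchable here; any other tag in the document is still reachable only through the CLOB.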

Of the methods described above for managing XML data with an RDBMS, the first merely stores the XML file as-is in a CLOB column of a table, so information about a specific tag of the XML data cannot be searched in the RDBMS. The second parses the values of specific tags and stores them as search keys, so predefined tags can be searched; but, as with the first method, information about any tag that was not defined in advance cannot be searched.

Recent RDBMSs (for example, Oracle and Informix) do apply object-oriented concepts to provide XML data processing, but using these features requires purchasing the latest RDBMS. Upgrading a DBMS that is already in operation is costly and is also undesirable from the standpoint of data management safety, so there is a need for a way to manage XML data effectively with the DBMS already in operation.

To solve the above problems, an object of the present invention is to provide an XML data management method using a relational DBMS that, by using DOM parsing and a DTD-based DB table decomposition technique, allows object-oriented XML data to be processed with a general-purpose RDBMS rather than an XML-dedicated DBMS.

An XML data management method using a relational DBMS according to one aspect of the present invention includes: constructing a DOM object tree by DOM (Document Object Model) parsing an XML file; selecting an object from the DOM object tree; and registering the selected DOM object information in an RDBMS table and storing the DOM object in a DB table.

In the object selection step, the information of each DOM object is selected in order, proceeding from left to right and from top to bottom.

In the step of storing the DOM object in a DB table, the DOM objects of the XML data are stored in different DB tables, separated by DTD (Document Type Definition).

The DB tables include at least one of a DTD table holding DTD information, a document table holding XML file information, and a node table holding XML DOM object information, wherein one DTD table is linked to one or more document tables and one document table is linked to one or more node tables.

The node table can be distributed by table partitioning, which divides the data of a table logically or physically.

An XML data management method using a relational DBMS according to another aspect of the present invention includes: DOM parsing an XML file to construct a DOM object tree and registering object information selected from the DOM object tree in an RDBMS table; a DTD registration step of registering XML DTD information in a DTD table and creating a document table and a node table to hold the XML data of the registered DTD; an XML file storage step of DOM parsing an XML file and storing the DOM object information in the document table and the node table; and an XML file generation step of generating an XML file from the DOM object information stored in the DBMS.

The method may further include searching the RDBMS tables for desired XML data using an SQL query.

Preferred embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 3 shows the XML data RDBMS table mapping algorithm according to the present invention.

Unlike the prior art, which stores the XML file itself in a table, the present invention stores in RDBMS tables the DOM object information obtained by DOM (Document Object Model) parsing the XML file.

The XML data RDBMS table mapping algorithm according to the present invention is broadly divided into an XML file DOM parsing step (S310), a step of selecting objects from the DOM tree (S320), and a step of registering the DOM object information in the RDBMS (S330). Each step is described in detail with reference to FIGS. 5 to 7.

Here, the DOM is a programming interface for HTML (HyperText Markup Language) and XML documents; it defines how an XML document can be opened and how its data can be processed. With the DOM, developers can create XML documents, navigate the structure of an XML document, and add, modify or delete its elements. An important goal of the DOM is to provide a standard programming interface that can be used in a wide variety of environments and applications.

The W3C DOM is designed so that it can be implemented in any programming language. The information contained in an XML document is self-describing: an XML document contains not only data but also information about that data, in the form of tags. Extracting tag information or data directly from the text of an XML document is therefore laborious. To avoid the effort of analyzing the text, XML data is processed with an XML parser, which converts the text-form XML document into a tree-based hierarchical structure.

Because the elements of an XML document form a hierarchy, the document can be represented as a tree that contains all of its information. A sub-element appears as a sub-node of its parent element, and an attribute is likewise treated as a sub-node of the element on which it is defined. During processing, the parser creates a tree of nodes representing the elements of the document.
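The tree described above can be made visible with a short sketch. It assumes Python's standard `xml.dom.minidom` parser as the DOM implementation; the `outline` helper and the sample document are hypothetical. Note that minidom keeps attributes in a separate map rather than among the child nodes, so the sketch lists them under their element explicitly to match the description.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def outline(node, depth=0, lines=None):
    """Collect an indented outline of elements, attributes and text nodes."""
    if lines is None:
        lines = []
    indent = "  " * depth
    if node.nodeType == Node.ELEMENT_NODE:
        lines.append(f"{indent}element {node.tagName}")
        # Attributes are shown as sub-nodes of the element that defines them.
        for name, value in node.attributes.items():
            lines.append(f"{indent}  attribute {name}={value}")
    elif node.nodeType == Node.TEXT_NODE and node.data.strip():
        lines.append(f"{indent}text {node.data.strip()}")
    # Sub-elements and text are the child nodes of the current node.
    for child in node.childNodes:
        outline(child, depth + 1, lines)
    return lines


doc = parseString('<father name="John"><son name="Tom">hello</son></father>')
tree_lines = outline(doc.documentElement)
print("\n".join(tree_lines))
```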

There are two parsing models: the tree-based DOM and the event-based SAX (Simple API for XML). The DOM provides a standard interface that makes the structure and information of XML easy to access and manipulate, and it represents an XML document as a hierarchy of nodes.
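The difference between the two parsing models can be illustrated with Python's standard `xml.sax` and `xml.dom.minidom` modules. This is a sketch for orientation only; the `TagCollector` class and the sample text are hypothetical and are not part of the method described here.

```python
import xml.sax
from xml.dom.minidom import parseString

XML_TEXT = '<father name="John"><son name="Tom"/></father>'


class TagCollector(xml.sax.ContentHandler):
    """Event-based model: the parser calls back once per start tag."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


# SAX: tags are seen one at a time, in document order, and no tree is kept.
collector = TagCollector()
xml.sax.parseString(XML_TEXT.encode("utf-8"), collector)
sax_tags = collector.tags

# DOM: the whole document becomes a tree that can be navigated afterwards,
# for example from a child node back up to its parent.
doc = parseString(XML_TEXT)
son = doc.documentElement.getElementsByTagName("son")[0]
dom_parent_of_son = son.parentNode.tagName
```

The tree-based model is the one that fits the method described here, because parent and child relationships remain available after parsing.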

FIG. 4 shows the configuration of an XML library that uses the XML data RDBMS table mapping algorithm according to the present invention.

FIG. 4 shows that an XML library can be built on the XML data RDBMS table mapping algorithm presented in FIG. 3. The XML library provides three main functions; with it, an operator can store XML data in the RDBMS (importXML) or generate desired XML file information from the RDBMS (exportXML). The operator can also search (440) the RDBMS tables for desired XML data using SQL queries.

The three main functions of the XML library are DTD registration (setDTD) (410), XML file storage (importXML) (420), and XML file generation (exportXML) (430).

The DTD registration (410) function registers XML DTD information in the DTD table and creates the document table and node table that will hold the XML data of the registered DTD.

Here, a DTD is a specification that accompanies a document so that the markup separating its paragraphs, identifying its topic headings, and indicating how each part is to be processed can be recognized. If the DTD is sent together with the document, the document can be processed wherever a DTD reader (or SGML compiler) is available and displayed or printed as originally intended. This means that a single standard SGML compiler can serve many different kinds of documents, each with its own markup codes and associated meanings. The compiler refers to the DTD to display or print the document appropriately.

The XML file storage (420) function DOM parses an XML file and stores the DOM object information in the document table and the node table. The XML file generation (430) function generates an XML file from the DOM object information stored in the DBMS.
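The storage and generation functions can be sketched as a round trip. Everything below is an assumption made for illustration: the functions `import_xml` and `export_xml` only mirror the names importXML and exportXML, the single `node_tbl` with its columns is hypothetical, SQLite stands in for the RDBMS, and the document table is omitted for brevity.

```python
import sqlite3
from xml.dom import Node
from xml.dom.minidom import Document, parseString


def import_xml(conn, doc_id, xml_text):
    """Store every element, attribute and text node of one file as a row."""
    counter = [0]

    def add(parent_id, kind, name, value):
        counter[0] += 1
        conn.execute(
            "INSERT INTO node_tbl VALUES (?, ?, ?, ?, ?, ?)",
            (doc_id, counter[0], parent_id, kind, name, value),
        )
        return counter[0]

    def visit(node, parent_id):
        if node.nodeType == Node.ELEMENT_NODE:
            node_id = add(parent_id, "element", node.tagName, None)
            for name, value in node.attributes.items():
                add(node_id, "attribute", name, value)
            for child in node.childNodes:
                visit(child, node_id)
        elif node.nodeType == Node.TEXT_NODE and node.data.strip():
            add(parent_id, "text", "#text", node.data)

    visit(parseString(xml_text).documentElement, None)


def export_xml(conn, doc_id):
    """Rebuild the XML text from the stored rows, in stored (document) order."""
    out = Document()
    built = {}  # node_id -> element already recreated
    rows = conn.execute(
        "SELECT node_id, parent_id, kind, name, value FROM node_tbl "
        "WHERE doc_id = ? ORDER BY node_id",
        (doc_id,),
    )
    for node_id, parent_id, kind, name, value in rows:
        parent = built.get(parent_id, out)  # the root hangs off the document
        if kind == "element":
            built[node_id] = parent.appendChild(out.createElement(name))
        elif kind == "attribute":
            parent.setAttribute(name, value)
        else:
            parent.appendChild(out.createTextNode(value))
    return out.documentElement.toxml()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node_tbl (doc_id INTEGER, node_id INTEGER, parent_id INTEGER,"
    " kind TEXT, name TEXT, value TEXT)"
)
SAMPLE_XML = '<family><father name="John"><son name="Tom">hi</son></father></family>'
import_xml(conn, 1, SAMPLE_XML)
restored = export_xml(conn, 1)
```

The sketch drops whitespace-only text and ignores comments and processing instructions, so it reproduces the structure of the file rather than its exact bytes.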

The present invention has been outlined above with reference to FIGS. 3 and 4; its details are examined more closely below with reference to FIGS. 5 to 7.

FIG. 5 shows the DOM object tree that results from DOM parsing an XML file.

DOM parsing an XML file produces a DOM object tree such as the one in FIG. 5. The information of each DOM object in the tree is then mapped and entered into the RDBMS table (500) in order, proceeding from left to right and from top to bottom.
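One way to picture this mapping is the sketch below. It interprets "left to right, top to bottom" as document order (a pre-order walk), which is an assumption; the row layout `(node_id, parent_id, node_type, name, value)`, the table `node_rows` and the helper `dom_to_rows` are likewise hypothetical and are not the node table of the invention.

```python
import sqlite3
from xml.dom import Node
from xml.dom.minidom import parseString


def dom_to_rows(xml_text):
    """Flatten a DOM tree into (node_id, parent_id, node_type, name, value) rows."""
    rows = []

    def visit(node, parent_id):
        if node.nodeType == Node.ELEMENT_NODE:
            node_id = len(rows) + 1
            rows.append((node_id, parent_id, "element", node.tagName, None))
            for name, value in node.attributes.items():
                rows.append((len(rows) + 1, node_id, "attribute", name, value))
            for child in node.childNodes:  # children are visited left to right
                visit(child, node_id)
        elif node.nodeType == Node.TEXT_NODE and node.data.strip():
            rows.append((len(rows) + 1, parent_id, "text", "#text", node.data.strip()))

    visit(parseString(xml_text).documentElement, None)
    return rows


SAMPLE_XML = '<family><father name="John"><son name="Tom">hi</son></father></family>'
rows = dom_to_rows(SAMPLE_XML)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node_rows (node_id INTEGER PRIMARY KEY, parent_id INTEGER,"
    " node_type TEXT, name TEXT, value TEXT)"
)
conn.executemany("INSERT INTO node_rows VALUES (?, ?, ?, ?, ?)", rows)

# Every tag is now a row, so any tag can be searched with plain SQL,
# not only tags that were chosen in advance.
son_attrs = conn.execute(
    "SELECT a.name, a.value FROM node_rows a "
    "JOIN node_rows e ON a.parent_id = e.node_id "
    "WHERE e.name = 'son' AND a.node_type = 'attribute'"
).fetchall()
```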

FIG. 6 shows the DB table decomposition process using the DTD of the XML.

As shown in FIG. 6, DTD decomposition physically separates a single node table, dividing it into multiple per-node sets of XML data.

A DTD (Document Type Definition) defines the format of XML data, which means that XML data using the same DTD is defined in the same format. In 'dtd_# xml data' in FIG. 6, the leading part "dtd" is what indicates the format of the data. Each node table split out in FIG. 6 (dtd_1 xml data, dtd_2 xml data, ..., dtd_n xml data) is partitioned using one DTD, and therefore holds data of a single, uniform format.

In the DOM object information RDBMS registration step of the XML-data-to-RDBMS-table mapping algorithm of FIG. 3, the DOM objects of XML data are not stored in one DB table. They are stored in separate DB tables divided by DTD. Each time a new DTD is registered, the DTD registration process dynamically creates new DB tables for it.
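The dynamic table creation described above can be sketched as follows. This is a hedged illustration, not the implementation of the invention: SQLite (Python's standard `sqlite3` module) stands in for Oracle, the function name `register_dtd` is hypothetical, only a subset of the DTD table columns is kept, and the serial number is simply taken as the next unused `dtd_id`. The table and column names follow the schemas given later in this description.

```python
import sqlite3


def register_dtd(conn, dtd_name, dtd_root):
    """Register a DTD and create its document and node tables.

    Returns (dtd_id, document table name, node table name).
    SQLite types are used here in place of the Oracle types.
    """
    cur = conn.cursor()
    cur.execute("""CREATE TABLE IF NOT EXISTS dtd_tbl (
        dtd_id       INTEGER NOT NULL,
        dtd_name     TEXT    NOT NULL,
        dtd_root     TEXT    NOT NULL,
        doctbl_name  TEXT    NOT NULL,
        nodetbl_name TEXT    NOT NULL)""")

    # Assumption: the serial number is the next unused dtd_id.
    cur.execute("SELECT COALESCE(MAX(dtd_id), 0) + 1 FROM dtd_tbl")
    dtd_id = cur.fetchone()[0]
    doc_tbl = f"doc_{dtd_id}_tbl"
    node_tbl = f"node_{dtd_id}_tbl"

    # The table names are built only from the integer dtd_id, never from
    # user input, so interpolating them into the DDL is safe here.
    cur.execute(f"""CREATE TABLE {doc_tbl} (
        doc_id   INTEGER NOT NULL,
        dtd_id   INTEGER NOT NULL,
        doc_name TEXT    NOT NULL,
        regi_date TEXT)""")
    cur.execute(f"""CREATE TABLE {node_tbl} (
        doc_id     INTEGER NOT NULL,
        node_id    INTEGER NOT NULL,
        pnode_id   INTEGER NOT NULL,
        node_type  INTEGER NOT NULL,
        node_name  TEXT,
        node_value TEXT,
        node_ord   INTEGER)""")

    cur.execute("INSERT INTO dtd_tbl VALUES (?, ?, ?, ?, ?)",
                (dtd_id, dtd_name, dtd_root, doc_tbl, node_tbl))
    conn.commit()
    return dtd_id, doc_tbl, node_tbl
```

Registering a first DTD on an empty database yields `doc_1_tbl` and `node_1_tbl`, and a second registration yields `doc_2_tbl` and `node_2_tbl`. The DTD table row records which pair of tables belongs to which DTD.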

Dividing the DB tables by DTD in this way has the following advantages: XML data is distributed across different DB tables, which shortens search time, and keeping separate DB tables for each DTD makes management more efficient.

FIG. 7 shows an entity relationship diagram (ERD) according to the present invention.

Referring to FIG. 7, one DTD table (710) is connected to a plurality of document tables (720), and each document table (720) is in turn connected to a plurality of node tables (730).

The DTD table (dtd_tbl) (710) stores DTD information and can be expressed as an Oracle table schema of the following form.

/* Oracle Table Schema */
create table dtd_tbl (
    dtd_id       NUMBER(5)     NOT NULL,  -- DTD ID
    dtd_name     VARCHAR2(100) NOT NULL,  -- DTD file name
    dtd_root     VARCHAR2(100) NOT NULL,  -- root tag name of the document
    version      VARCHAR2(100) NOT NULL,  -- document version
    encoding     VARCHAR2(100) NOT NULL,  -- document encoding name
    doctbl_name  VARCHAR2(100) NOT NULL,  -- name of the doc_tbl mapped to this DTD
    nodetbl_name VARCHAR2(100) NOT NULL,  -- name of the node_tbl mapped to this DTD
    regi_date    DATE,                    -- document registration date
    dtd          CLOB                     -- document DTD information
);

The document table (doc_#_tbl) (720) stores XML file (document) information. The # is a serial number that is assigned automatically each time a DTD is registered, so the document tables are named doc_1_tbl, doc_2_tbl, and so on. The table can be expressed as a schema of the following form.

/* Oracle Table Schema */
create table doc_#_tbl (
    doc_id    NUMBER(5)     NOT NULL,  -- XML document ID
    dtd_id    NUMBER(5)     NOT NULL,  -- DTD ID (parent ID)
    doc_name  VARCHAR2(100) NOT NULL,  -- XML file (document) name
    regi_date DATE                     -- registration date
);

The node table (node_#_tbl) (730) stores XML DOM object information. Here too, the # is a serial number whose value is assigned automatically each time a DTD is registered, so the node tables are named node_1_tbl, node_2_tbl, and so on, and have the following schema.

/* Oracle Table Schema */
create table node_#_tbl (
    doc_id     NUMBER(5)     NOT NULL,  -- XML document ID (parent ID)
    node_id    NUMBER(5)     NOT NULL,  -- DOM object ID
    pnode_id   NUMBER(5)     NOT NULL,  -- ID of the parent node in the DOM tree
    node_type  NUMBER(2)     NOT NULL,  -- object type ID as defined by DOM
    node_name  VARCHAR2(100) NULL,      -- DOM object name
    node_value VARCHAR2(100) NULL,      -- DOM object value
    node_ord   NUMBER(2)     NULL       -- order of the DOM object among its siblings
);
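Because each node row records its parent (pnode_id) and its position among siblings (node_ord), the rows of one document are enough to rebuild the XML text, which is the job of the XML file generation function (430). The following Python sketch illustrates the idea under stated assumptions and is not the implementation of the invention: the function name `rows_to_xml` is hypothetical, rows are tuples of (node_id, pnode_id, node_type, node_name, node_value, node_ord), the root has pnode_id 0, node_type uses the W3C DOM codes (1 for an element, 3 for text), and attributes and other node types are ignored.

```python
from xml.sax.saxutils import escape

ELEMENT_NODE, TEXT_NODE = 1, 3  # W3C DOM nodeType codes (assumed for node_type)


def rows_to_xml(rows):
    """Rebuild XML text from node rows of
    (node_id, pnode_id, node_type, node_name, node_value, node_ord).
    """
    # Group rows by parent ID, then order each sibling group by node_ord.
    children = {}
    for row in rows:
        children.setdefault(row[1], []).append(row)
    for siblings in children.values():
        siblings.sort(key=lambda r: r[5])

    def render(row):
        node_id, _, node_type, name, value, _ = row
        if node_type == TEXT_NODE:
            return escape(value or "")
        if node_type != ELEMENT_NODE:
            return ""  # other node types are not handled in this sketch
        inner = "".join(render(child) for child in children.get(node_id, []))
        return f"<{name}>{inner}</{name}>"

    # Assumption: rows whose parent ID is 0 are document roots.
    return "".join(render(root) for root in children.get(0, []))
```

The input order of the rows does not matter, since the sibling order is restored from node_ord rather than from the order in which the rows were fetched.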

The configuration and operation of the present invention have been described above, both in general and in detail. Preferred embodiments of the present invention are described below.

FIG. 8 shows a preferred embodiment of a DOM object tree according to the present invention.

FIG. 8 takes the same sample.xml file that was used in FIG. 1 and shows the shape of the DOM object tree obtained with the present invention rather than with the prior art. As FIG. 8 shows, the objects are linked to one another.

FIG. 9 shows a preferred embodiment of a DTD table according to the present invention.

FIG. 9 shows the result of storing DTD information in the DTD table. It is assumed here that the DTD of sample.xml is the first to be registered, so that doc_1_tbl and node_1_tbl are created. The result stored in doc_1_tbl is shown in Table 1 below.

Table 1
doc_id   dtd_id   doc_name     regi_date
1        1        sample.xml   2004-08-31 12:00:00

FIG. 10 shows a preferred embodiment of a node table according to the present invention.

FIG. 10 shows the result of storing the object information of the DOM tree in node_1_tbl. Any XML tag and value the user wants can be searched for freely in the node table that stores the object information of the DOM tree.

Some example searches based on the node table of FIG. 10 follow. For example, the name of 'father' can be retrieved with a query such as:

select node_value from node_1_tbl where node_name = 'name' and pnode_id = (select node_id from node_1_tbl where node_name = 'father')
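The 'father' name lookup can be tried out with Python's standard `sqlite3` module. This is only a sketch: SQLite stands in for Oracle, the rows are hypothetical stand-ins for the contents of FIG. 10 (which is not reproduced in this text), and, as that query implies, the value of a tag is assumed to be held in the node_value of the row carrying the tag name. The table name node_1_tbl is the node table of FIG. 10.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE node_1_tbl (
    doc_id     INTEGER NOT NULL,
    node_id    INTEGER NOT NULL,
    pnode_id   INTEGER NOT NULL,
    node_type  INTEGER NOT NULL,
    node_name  TEXT,
    node_value TEXT,
    node_ord   INTEGER)""")

# Hypothetical rows: (doc_id, node_id, pnode_id, node_type,
#                     node_name, node_value, node_ord)
conn.executemany(
    "INSERT INTO node_1_tbl VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 0, 1, "family", None, 1),
        (1, 2, 1, 1, "father", None, 1),
        (1, 3, 2, 1, "name", "Kim", 1),   # "Kim" is an invented value
        (1, 4, 2, 1, "age", "50", 2),     # "50" is an invented value
        (1, 5, 1, 1, "son", None, 2),
        (1, 6, 5, 1, "name", "son1", 1),
    ],
)


def father_name(connection):
    """Return the node_value of every 'name' node directly under 'father'."""
    sql = """SELECT node_value
               FROM node_1_tbl
              WHERE node_name = 'name'
                AND pnode_id = (SELECT node_id
                                  FROM node_1_tbl
                                 WHERE node_name = 'father')"""
    return [row[0] for row in connection.execute(sql)]
```

The subquery resolves the node_id of the 'father' element, and the outer query then picks the 'name' row whose pnode_id matches it, so the 'name' row under 'son' is not returned.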

Similarly, the 'father age' for 'son1' can be retrieved with a query such as the following:

select node_value from node_1_tbl where node_name = 'age' and node_id = (select pnode_id from node_1_tbl where node_name = 'name' and node_value = 'son1')

FIG. 11 shows a preferred embodiment of data distribution using table partitioning according to the present invention.

In the present invention, monthly table partitioning is applied to the node table in which DOM object information is stored. This distributes the data and improves search performance. Table partitioning is a feature provided by the DBMS that divides the data of a table into partitions that are logically or physically separate.

The table partitioning of FIG. 11 is distinct from the table decomposition of FIG. 6. Whereas the table decomposition of FIG. 6 is physical, the monthly partitioning used in FIG. 11 divides the node table along logical lines.

The schema of a node table to which monthly partitions are applied by the method of FIG. 11 can be expressed as follows.

/* Oracle Table Schema */
create table node_#_tbl (
    node_ord NUMBER(2) NULL,      -- order of the DOM object among its siblings
    month    NUMBER(2) NOT NULL   -- table partition key
) partition by range (month)
(
      partition XML_01 values less than (2)
    , partition XML_02 values less than (3)
    , partition XML_03 values less than (4)
    , partition XML_04 values less than (5)
    , partition XML_05 values less than (6)
    , partition XML_06 values less than (7)
    , partition XML_07 values less than (8)
    , partition XML_08 values less than (9)
    , partition XML_09 values less than (10)
    , partition XML_10 values less than (11)
    , partition XML_11 values less than (12)
    , partition XML_12 values less than (13)
);
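How a range-partitioned table routes a row can be illustrated with a short Python sketch. It is an illustration of the "values less than" rule only, not of how the DBMS implements it. The names `UPPER_BOUNDS`, `PARTITIONS` and `partition_for` are hypothetical, while the bounds and partition names are taken from the schema above.

```python
from bisect import bisect_right

# Exclusive upper bounds from "values less than (2)" ... "values less than (13)".
UPPER_BOUNDS = list(range(2, 14))
PARTITIONS = [f"XML_{month:02d}" for month in range(1, 13)]


def partition_for(month):
    """Return the partition whose range contains the given month value.

    A row goes to the first partition whose upper bound is strictly
    greater than the key, which is what bisect_right locates.
    """
    index = bisect_right(UPPER_BOUNDS, month)
    if index >= len(PARTITIONS):
        # No partition covers this key; a DBMS would reject such a row.
        raise ValueError(f"month {month} does not map to any partition")
    return PARTITIONS[index]
```

With these bounds each calendar month lands in its own partition, so a search restricted to one month only needs to touch one twelfth of the partitions.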

The present invention manages XML data flexibly using an RDBMS that is already in operation. This increases the utilization of the existing RDBMS and removes problems, such as duplicated operation and data management, that arise from running an XML-dedicated DBMS alongside an RDBMS. As a result, XML data and non-XML data can be integrated more easily.

Claims

A method for managing XML (eXtensible Markup Language) data using a relational DBMS (RDBMS), the method comprising:

DOM (Document Object Model) parsing an XML file to construct a DOM object tree;

selecting an object from the DOM object tree; and

registering information of the selected DOM object in an RDBMS table, and storing the DOM object in a DB table.

The method of claim 1,

wherein, in the object selecting step,

the information of each DOM object is selected in 'left to right' or 'top to down' order.

The method of claim 1,

wherein the step of storing the DOM object in a DB table

stores the DOM objects of the XML data in different DB tables divided by document type definition (DTD).

The method of claim 3,

wherein the DB table

comprises at least one of a DTD table storing DTD information, a document table storing XML file information, and a node table storing XML DOM object information, and

wherein one DTD table is connected to at least one document table, and one document table is connected to at least one node table.

The method of claim 4,

wherein the node table

allows its data to be distributed by table partitioning, which divides the data of the table logically or physically.

A method for managing XML data using a relational DBMS, the method comprising:

DOM parsing an XML file to construct a DOM object tree, and registering object information selected from the DOM object tree in an RDBMS table;

a DTD registration step of registering XML DTD information in a DTD table and creating a document table and a node table for storing the XML data of the registered DTD;

an XML file storage step of DOM parsing an XML file and storing the DOM object information in the document table and the node table; and

an XML file generation step of generating an XML file using the DOM object information stored in the DBMS.

The method of claim 6,

further comprising retrieving desired XML data from the RDBMS table using an SQL query.

The method of claim 6,

The RDBMS table registration step,

The method of claim 8,

The DB table,