JP2004348485A

JP2004348485A - Structured document processing method and device, structured document processing program, and storage medium storing structured document processing program

Info

Publication number: JP2004348485A
Application number: JP2003145414A
Authority: JP
Inventors: Takashi Hayashi; 孝志林; Shiro Kasuga; 史朗春日; Kiyoutaro Horiguchi; 恭太郎堀口; Mitsuaki Tsunakawa; 光明綱川
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2003-05-22
Filing date: 2003-05-22
Publication date: 2004-12-09
Anticipated expiration: 2023-05-22
Also published as: JP4289022B2

Abstract

PROBLEM TO BE SOLVED: To achieve both fast database search and fast database addition and update.
SOLUTION: In this method, an XML document is divided, on the basis of its hierarchical structure, into partial XML documents that are allowed to overlap, and a standard query on the input XML is converted into SQL on the basis of a mapping definition that associates the partial XML documents with an RDB.
COPYRIGHT: (C)2005,JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、構造化文書処理方法及び装置及び構造化文書処理プログラム及び構造化文書処理プログラムを格納した記憶媒体に係り、特に、データーベースを利用した構造化文書処理方法及び装置及び構造化文書処理プログラム及び構造化文書処理プログラムを格納した記憶媒体に関する。
【０００２】
【従来の技術】
ＸＭＬ（ｅＸｔｅｎｓｉｂｌｅＭａｒｋｕｐＬａｎｇｕａｇｅ）は、ネットワーク上で交換される文書やデータの記述形式を規定するために、Ｗ３Ｃ（ＷｏｒｌｄＷｉｄｅＷｅｂＣｏｎｓｏｒｔｉｕｍ）が制定した規準規格である。ＸＭＬは、構造化文書の国際標準であるＳＧＭＬ（ＳｔａｎｄａｒｄＧｅｎｅｒａｌｉｚｅｄＭａｒｋｕｐＬａｎｇｕａｇｅ）の拡張可能性とＨＴＭＬ（ＨｙｐｅｒＴｅｘｔＬａｎｇｕａｇｅ）のインターネット利用性を併せ持った形式として期待されている。例えば、ＥＣ（ＥｌｅｃｔｒｏｎｉｃＣｏｍｍｅｒｃｅ）やＫＭ（ＫｎｏｗｌｅｄｇｅＭａｎａｇｅｍｅｎｔ）で交換されるデータの記述や電子図書館の蔵書カタログの記述にもＸＭＬは利用できる。そして、このような利用例では、ＸＭＬの交換性という要件を超えて、大量のＸＭＬ文書を格納し、検索、更新できるという要件が重要になってくる。この要件を満たすためには、データーベース技術の適用・開発が必要となる。これには、大きく２つの方式がある。
【０００３】
（１）まず、１つの方式として、ネイティブＸＭＬデーターベースの開発がある。当該方式は、ＸＭＬ文書をそのまま格納し、検索、更新も可能なデーターベースを新たに開発する。Ｔａｍｉｎｏ（ソフトウェアＡＧ），ｅＸｃｅｌｏｎ（エクセロン）、Ｙｇｇｄｒａｓｉｌｌ（メディアフュージョン）などが挙げられる。
【０００４】
（２）２つ目の方式として、既存データーベースの機能拡張がある。既存のＲＤＢ（リレーショナルデーターベース）等に対し、ＸＭＬ文書を格納したり、取得したデータをＸＭＬ化する機能を拡張する。Ｏｒａｃｌｅ９ｉ（オラクル）、ＳＱＬＳｅｒｖｅｒ２０００（マイクロソフト）、ＤＢ２（ＩＢＭ）など主要なデーターベースはＸＭＬ対応となっている。
【０００５】
上記の（１）は、ＸＭＬ文書をそのまま扱えるのが大きな利点である。しかし、現在、ほとんどのデータがＲＤＢに格納されており、これら既存データとの連携、活用には（２）が適している。（２）でＸＭＬ文書の格納、検索、更新を行う場合、ＸＭＬ文書とＲＤＢとのマッピング方法が重要となる。ＸＭＬをＲＤＢにマッピングし、格納するには、大きく２つの方法がある。
【０００６】
ａ）ＸＭＬ文書全体を１カラムに格納する方法：
ＲＤＢのＣＬＯＢ（ＣｈａｒａｃｔｅｒＬａｒｇｅＯｂｊｅｃｔ）やＶａｒｃｈａｒ等のデータ型を利用することで、ＸＭＬ文書全体をそのまま１カラムに格納する。ＸＭＬ文書内のデータだけでなく、ＸＭＬ文書自体の文書構造を保持する場合に有効な方法である。例えば、新聞や雑誌の記事をアーカイブとして残す場合が当てはまる。この場合、指定された要素（日時や記者名等）のみ別カラムとし、インデックスを作成することで検索可能となる。
【０００７】
ｂ）ＸＭＬ文書をデータ項目に分解して複数カラムに格納する方法：
元となるＸＭＬ文書の要素や属性をデータ項目として分解し、ＲＤＢの複数のカラムに格納する。元のＸＭＬ文書自体は保存されないため、ＲＤＢ上に文書構造は保持されない。しかし、ＲＤＢ上のデータとして扱うため、当該データを複数のアプリケーションで共用する場合に有効な格納方法である。また、ＲＤＢからＸＭＬへのマッピング方法を決めておくことで、既存のＲＤＢに蓄積したデータをＸＭＬ文書として取得（出版）する（このような技術をＸＭＬＰｕｂｌｉｓｈｉｎｇと呼ぶ）こともできる。
【０００８】
これらの格納方法をＸＭＬ−ＲＤＢ間のマッピング定義情報により実現する技術も開発されている（例えば、非特許文献１参照）。
【０００９】
【非特許文献１】
ＩＢＭ，ＤＢ２ＸＭＬＥｘｔｅｎｄｅｒ：ｈｔｔｐ：／／ｗｗｗ−６．ｉｂｍ．ｃｏｍ／ｊｐ／ｓｏｆｕｗａｒｅ／ｄａｔａ／ｄｅｖｅｌｏｐｅｒ／ｃｏｌｕｍ／ｋａｎｔａｎｅｘｔｅｎｄ／０５ｘｍｌｅｘｔｅｎｄｅｒ／０１／ｈｔｍｌ
【００１０】
【発明が解決しようとする課題】
しかし、格納方法をＸＭＬ−ＲＤＢ間のマッピング定義情報により実現する技術では、不十分な点も多い。
【００１１】
（１）ＸＭＬに対する標準的な問い合わせで検索，追加、更新することが困難であるという問題（機能的要件）：Ｗ３Ｃによって開発が進められている標準的な検索言語ＸＱｕｅｒｙ（ｈｔｔｐ：／／／ｗｗｗ．ｗ３．ｏｒｇ／ＴＲ／ｘｑｕｅｒｙ／）では、検索結果ＸＭＬの構造を再構成、変換するといった柔軟な記述が可能となっている。また、追加、更新に関してもＸｕｐｄａｔｅ（ｈｔｔｐ：／／ｗｗｗ．ｘｍｌｄｂ．ｏｒｇ／ｘｕｐｄａｔｅ／）というＸＭＬに対応した言語が提案されている。しかし、従来の方法では、格納されたデータは、ＲＤＢ上のデータとして扱われる。そして、検索、追加、更新に用いられるクエリは、一般的にＸＭＬ拡張したＳＱＬが用いられる。また、ＸＭＬＰｕｂｌｉｓｈｉｎｇにおける、ＲＤＢからＸＭＬへのマッピングも固定的である。つまり、従来の方法では、格納されたデータをＲＤＢ上のデータとして扱うため、ＸＭＬに対する標準的なクエリで検索、追加、更新することが困難である。
【００１２】
（２）検索の高速性と追加、更新の高速性を両立することが困難（性能的要件）：一般的に、ＸＭＬ文書全体を検索、取得するだけなら、上記従来のａ）がｂ）に比べて高速である。一方、ＸＭＬ文書中の要素の追加、更新頻度が高い場合、ｂ）がａ）に比べて高速である。そして、ＸＭＬ文書の利用形態によっては、検索の高速性と追加、更新の高速性の両方が求められる場合がある。つまり、応用によってはこれらの格納方法を組み合わせることが必要となる。しかし、従来のＸＭＬ−ＲＤＢ間のマッピング定義情報では、ａ）とｂ）の２つの格納方法を柔軟に組み合わせて、検索の高速性と追加、更新の高速性を両立させることが困難である。
【００１３】
本発明は、上記の点に鑑みなされたもので、ＸＭＬ文書を、当該ＸＭＬ文書の階層構造に基づき、重複を許した部分ＸＭＬ文書に分割してＲＤＢと対応付けるマッピング定義情報及び当該定義に則って、入力されたＸＭＬに対する標準的な問い合わせをＳＱＬへ変換することにより、検索の高速性と追加、更新の高速性の両立を実現するデーターベースを利用した構造化文書処理方法及び装置及び構造化文書処理プログラム及び構造化文書処理プログラムを格納した記憶媒体を提供することを目的とする。
【００１４】
【課題を解決するための手段】
図１は、本発明の原理を説明するための図である。
【００１５】
本発明は、データーベースを利用して構造化文書の格納、検索、更新及び削除の操作を行う構造化文書処理方法において、
入力された構造化文書に対する問い合わせを解析し、該問い合わせの要求している操作が挿入か、更新か、削除か、検索かを判定する操作判定ステップ（ステップ１）と、
記憶手段に格納されている、構造化文書の階層構造に基づいて、重複を許した構造化文書を分割した部分文書に分割してデーターベースと対応付けるマッピング定義情報を参照して、操作判定ステップの判定に基づいて、入力された問い合わせを、該データーベースに対し、挿入／更新／削除／検索のいずれかの操作を行う命令文に変換する命令文生成ステップ（ステップ２）と、
変換された命令文をデーターベースに送信し、操作を実行させる操作指示ステップ（ステップ３）と、
命令文が挿入／更新／削除操作のいずれかであれば、該命令文を実行した文書件数を取得する文書件数取得ステップ（ステップ４）と、
命令文が検索操作であれば、マッピング定義情報を参照して、データーベースの検索結果を構造化文書として出力する構造化文書出力ステップ（ステップ５）とからなる。
【００１６】
また、命令文生成ステップは、マッピング定義情報を参照して、構造化文書を挿入／更新／削除／検索のいずれかの操作を行うためのデーターベースの領域を確保する命令文へと変換するステップを含む。
【００１７】
また、命令文生成ステップは、
挿入／更新操作用の構造化文書の構造が正当かを判定するステップと、
構造化文書全体を挿入する問い合わせか、既に格納されている構造化文書に対し、構造化文書の一部／更新を挿入する問い合わせかを判定するステップと、
構造化文書全体を挿入する問い合わせであれば、同一の構造を持った複数の構造化文書を識別するための識別子を払い出し、マッピング定義情報を参照して、データーベースに対し挿入を行う命令文へと変換するステップを含む。
【００１８】
また、命令文生成ステップは、
挿入／更新操作用の構造化文書の構造が正当かを判定するステップと、
構造化文書全体を挿入する問い合わせか、既に格納されている構造化文書に対し、構造化文書の一部を挿入／更新する問い合わせかを判定するステップと、
構造化文書の一部を挿入／更新する問い合わせであれば、マッピング定義情報を参照して、該構造化文書の一部に対応する要素の先祖要素がデーターベースに格納されているかを判定するステップと、
先祖要素が、データーベースに格納されていれば、該先祖要素の更新を行うための命令文を生成する更新命令生成ステップと、
マッピング定義情報を参照して、構造化文書の一部に対応する要素の子孫要素がデーターベースに格納されているかを判定するステップと、
子孫要素がデーターベースに格納されていれば、該子孫要素の更新を行うための命令文を生成する子孫要素更新命令生成ステップと、を含む。
【００１９】
また、命令文生成ステップは、
マッピング定義情報を参照して、構造化文書の一部に対応する要素の先祖要素がデーターベースに格納されているかを判定するステップと、
先祖要素がデーターベースに格納されていれば、該先祖要素から対応要素までの更新を行うための命令文を生成する更新命令生成ステップと、
マッピング定義情報を参照して、構造化文書の一部に対応する要素の子孫要素がデーターベースに格納されているかを判定するステップと、
子孫要素がデーターベースに格納されていれば、該子孫要素の更新を行うための命令文を生成する子孫要素更新命令生成ステップと、を含む。
【００２０】
また、更新命令生成ステップは、
更新対象のデータを取得するための命令文を生成、実行するステップと、
取得した更新対象のデータを更新するステップと、
更新したデータを再度、挿入するための命令文を生成するステップと、
上記の各ステップをデーターベースに格納されている先祖要素について問い合わせで指定した要素まで繰り返すステップと、を含む。
【００２１】
また、子孫要素更新命令生成ステップは、
更新対象の同一要素のデータがデーターベース上で複数行に別々に格納されているのか、一行にまとめて格納されているのかを判定するステップと、
複数行に別々に格納されている場合、データを一旦削除する命令文を生成、実行し、既に更新済みの先祖要素を格納しているデータを元に、該データを再度挿入するための命令文を該データを格納している全子孫要素について生成するステップと、
一行にまとめて格納されている場合に、
更新対象のデータを取得するための命令文を生成、実行するステップと、
取得した更新対象のデータを更新するステップと、
更新したデータを再度、挿入するための命令文を生成するステップと、
上記の各ステップをデーターベースに格納されている先祖要素について問い合わせで指定した要素まで繰り返えさせるステップにより、データ再挿入用の命令文を生成するステップと、を含む。
【００２２】
また、命令文生成ステップは、
問い合わせの要求をしている検索結果が、そのままデーターベースに格納されているか、子孫要素として分割して格納されているかを判定するステップと、
そのままデーターベースに格納されていれば、その所在を取得するステップと、
子孫要素として分割して格納されていれば、その全ての所在を取得するステップと、
取得した所在と、入力された問い合わせから検索のための命令文を生成するステップと、を含む。
【００２３】
図２は、本発明の原理構成図である。
【００２４】
本発明は、データーベースを利用して構造化文書の格納、検索、更新及び削除の操作を行う構造化文書処理装置であって、
入力された構造化文書に対する問い合わせを解析し、該問い合わせの要求している操作が挿入か、更新か、削除か、検索かを判定する操作判定手段１２０と、
記憶手段に格納されている、構造化文書の階層構造に基づいて、重複を許した構造化文書を分割した部分文書に分割してデーターベースと対応付けるマッピング定義情報１７０と、
マッピング定義情報１７０を参照して、操作判定手段の判定に基づいて、入力された問い合わせを、データーベース３００に対し、挿入／更新／削除／検索操作のいずれかを行う命令文に変換する命令文生成手段１３０と、
変換された命令文をデーターベース３００に送信し、操作を実行させる操作指示手段１４０と、
命令文が挿入／更新／削除操作のいずれかであれば、該命令文を実行した文書件数を取得する文書件数取得手段１５０と、
命令文が検索操作であれば、マッピング定義情報を参照して、データーベースの検索結果を構造化文書として出力する構造化文書出力手段１６０と、を有する。
【００２５】
また、命令文生成手段１３０は、
マッピング定義情報１７０を参照して、構造化文書を挿入／更新／削除／検索のいずれかの操作を行うためのデーターベースの領域を確保する命令文へと変換する手段を含む。
【００２６】
また、命令文生成手段１３０は、
挿入／更新操作用の構造化文書の構造が正当かを判定する手段と、
構造化文書全体を挿入する問い合わせか、既に格納されている構造化文書に対し、構造化文書の一部／更新を挿入する問い合わせかを判定する手段と、
構造化文書全体を挿入する問い合わせであれば、同一の構造を持った複数の構造化文書を識別するための識別子を払い出し、マッピング定義情報１７０を参照して、データーベースに対し挿入を行う命令文へと変換する手段を含む。
【００２７】
また、命令文生成手段１３０は、
挿入／更新操作用の構造化文書の構造が正当かを判定する手段と、
構造化文書全体を挿入する問い合わせか、既に格納されている構造化文書に対し、構造化文書の一部を挿入／更新する問い合わせかを判定する手段と、
構造化文書の一部を挿入／更新する問い合わせであれば、マッピング定義情報１７０を参照して、該構造化文書の一部に対応する要素の先祖要素がデーターベース３００に格納されているかを判定する手段と、
先祖要素が、データーベース３００に格納されていれば、該先祖要素の更新を行うための命令文を生成する更新命令生成手段と、
マッピング定義情報１７０を参照して、構造化文書の一部に対応する要素の子孫要素がデーターベースに格納されているかを判定する手段と、
子孫要素がデーターベース３００に格納されていれば、該子孫要素の更新を行うための命令文を生成する子孫要素更新命令生成手段と、を含む。
【００２８】
また、命令文生成手段１３０は、
マッピング定義情報１７０を参照して、構造化文書の一部に対応する要素の先祖要素がデーターベース３００に格納されているかを判定する手段と、
先祖要素がデーターベース３００に格納されていれば、該先祖要素から対応要素までの更新を行うための命令文を生成する更新命令生成手段と、
マッピング定義情報１７０を参照して、構造化文書の一部に対応する要素の子孫要素がデーターベースに格納されているかを判定する手段と、
子孫要素がデーターベース３００に格納されていれば、該子孫要素の更新を行うための命令文を生成する子孫要素更新命令生成手段と、を含む。
【００２９】
また、更新命令生成手段は、
更新対象のデータを取得するための命令文を生成、実行する手段と、
取得した更新対象のデータを更新する手段と、
更新したデータを再度、挿入するための命令文を生成する手段と、
上記の各手段をデーターベース３００に格納されている先祖要素について問い合わせで指定した要素まで繰り返す手段と、を含む。
【００３０】
また、子孫要素更新命令生成手段は、
更新対象の同一要素のデータがデーターベース３００上で複数行に別々に格納されているのか、一行にまとめて格納されているのかを判定する手段と、
複数行に別々に格納されている場合、データを一旦削除する命令文を生成、実行し、既に更新済みの先祖要素を格納しているデータを元に、該データを再度挿入するための命令文を該データを格納している全子孫要素について生成する手段と、
一行にまとめて格納されている場合に、
更新対象のデータを取得するための命令文を生成、実行する手段と、
取得した更新対象のデータを更新する手段と、
更新したデータを再度、挿入するための命令文を生成する手段と、
上記の各手段をデーターベースに格納されている先祖要素について問い合わせで指定した要素まで繰り返させる手段により、データ再挿入用の命令文を生成する手段と、を含む。
【００３１】
また、命令文生成手段１３０は、
問い合わせの要求をしている検索結果が、そのままデーターベース３００に格納されているか、子孫要素として分割して格納されているかを判定する手段と、
そのままデーターベース３００に格納されていれば、その所在を取得する手段と、
子孫要素として分割して格納されていれば、その全ての所在を取得する手段と、
取得した所在と、入力された問い合わせから検索のための命令文を生成する手段と、を含む。
【００３２】
本発明は、データーベースを利用して構造化文書の格納、検索、更新及び削除の操作をコンピュータに実行させる構造化文書処理プログラムであって、
入力された構造化文書に対する問い合わせを解析し、該問い合わせの要求している操作が挿入か、更新か、削除か、検索かを判定する操作判定ステップと、
記憶手段に格納されている、構造化文書の階層構造に基づいて、重複を許した構造化文書を分割した部分文書に分割してデーターベースと対応付けるマッピング定義情報を参照して、操作判定ステップの判定に基づいて、入力された問い合わせを、該データーベースに対し、挿入／更新／削除／検索のいずれかの操作を行う命令文に変換する命令文生成ステップと、
変換された命令文をデーターベースに送信し、操作を実行させる操作指示ステップと、
命令文が挿入／更新／削除操作であれば、該命令文を実行した文書件数を取得する文書件数取得ステップと、
命令文が検索処理であれば、マッピング定義情報を参照して、データーベースの検索結果を構造化文書として出力する構造化文書出力ステップと、からなる。
【００３３】
本発明は、データーベースを利用して構造化文書の格納、検索、更新及び削除の操作をコンピュータに実行させる構造化文書処理プログラムを格納した記憶媒体であって、
入力された構造化文書に対する問い合わせを解析し、該問い合わせの要求している操作が挿入か、更新か、削除か、検索かを判定する操作判定ステップと、
記憶手段に格納されている、構造化文書の階層構造に基づいて、重複を許した構造化文書を分割した部分文書に分割してデーターベースと対応付けるマッピング定義情報を参照して、操作判定ステップの判定に基づいて、入力された問い合わせを、該データーベースに対し、挿入／更新／削除／検索のいずれかの操作を行う命令文に変換する命令文生成ステップと、
変換された命令文をデーターベースに送信し、操作を実行させる操作指示ステップと、
命令文が挿入／更新／削除操作であれば、該命令文を実行した文書件数を取得する文書件数取得ステップと、
命令文が検索操作であれば、マッピング定義情報を参照して、データーベースの検索結果を構造化文書として出力する構造化文書出力ステップと、からなるプログラムを格納する。
【００３４】
上記のように、本発明は、入力されたＸＭＬに対する標準的な問い合わせを解析し、当該問い合わせの要求している操作が挿入か、更新か、削除か、検索かを判定し、ＸＭＬ−ＲＤＢ間のマッピング定義情報（ＸＭＬ文書の階層構造に基づき、重複を許したＸＭＬ文書に分割して、ＲＤＢと対応付けるマッピング定義情報）に則って、入力された問い合わせをＳＱＬをＲＤＢ上で実行し、挿入／更新／削除であれば、当該操作を実行した文書件数を取得し、検索であれば、ＸＭＬ−ＲＤＢ間のマッピング定義情報を参照し、ＲＤＢの検索結果をＸＭＬ文書として出力するものである。ＸＭＬに対する標準的な問い合わせを解析し、ＸＭＬ−ＲＤＢ間のマッピング定義情報を参照することで、当該問い合わせを用いたＲＤＢへのＸＭＬ文書の挿入、更新、削除、検索が可能となる。また、当該マッピング定義情報は、ＸＭＬ文書の階層構造に基づき、重複を許して部分ＸＭＬ文書に分割して、ＲＤＢと対応付けることで、検索の高速性と追加、更新の高速性を両立することができる。
【００３５】
【発明の実施の形態】
以下、図面と共に本発明の実施の形態を説明する。
【００３６】
図３は、本発明の一実施の形態における構造化文書処理装置の構成を示す。
【００３７】
構造化文書処理装置１００は、アプリケーションプログラムインタフェース部１１０と、ＸＭＬ問い合わせ文解析部１２０、ＲＤＢ−ＸＭＬマッピング定義情報１７０、ＳＱＬ文生成部１３０、ＲＤＢ管理システムインタフェース部１４０、ＤＯＭ操作部１５０、問い合わせ結果生成部１６０から構成される。
【００３８】
アプリケーションプログラムインタフェース部１１０は、アプリケーションプログラムと通信網４００とを介してユーザから入力されたＸＭＬ問い合わせ文を受け付ける。ＸＭＬ問い合わせ文は、変数バインド部分、問い合わせ条件指定部分、問い合わせ結果指定部分から構成される。
【００３９】
ＸＭＬ問い合わせ解析部１２０は、アプリケーションプログラムインタフェース部１１０が受け付けたＸＭＬ問い合わせ文の構文を解析し、ＸＭＬ問い合わせ文解析情報を出力する。
【００４０】
図４は、本発明の一実施の形態におけるＲＤＢ−ＸＭＬマッピング定義情報に含まれる情報の概要を示す。ＲＤＢ−ＸＭＬマッピング定義情報１７０には、ＲＤＢにアクセスするための情報１７１と、ＳＱＬ−ＤＤＬを生成するために必要なテーブル生成情報１７２と、ＸＭＬ文書の情報とＲＤＢに格納された情報との対応関係をマッピングするためのルート情報１７３が含まれる。なお、当該ＲＤＢ−ＸＭＬマッピング定義情報１７０は、ハードディスク装置等の記憶媒体に格納される。
【００４１】
ＳＱＬ文生成部１３０では、ＸＭＬ問い合わせ文解析情報とＲＤＢ−ＸＭＬマッピング定義情報からＳＱＬ文を生成し、ＲＤＢ管理システムインタフェース部１４０と通信網４００を介してＲＤＢ管理システム３００に送信する。そして、ＳＱＬ結果をＲＤＢ管理システム３００から取得する。ＲＤＢ管理システムインタフェース部１４０は、生成したＳＱＬ文の受け付け、ＲＤＢ管理システム３００への接続、ＳＱＬ結果の取得を行う。
【００４２】
ＤＯＭ操作部１５０は、問い合わせ結果指定部分の先祖／子孫要素がＲＤＢに格納されている際、当該要素を取得し、ＤＯＭ（ＤｏｃｕｍｅｎｔＯｂｊｅｃｔＭｏｄｅｌ：ＸＭＬ操作のための標準ＡＰＩの１つ）に展開した後、ＤＯＭ操作により、問い合わせ結果指定部分に対し更新を行う。
【００４３】
問い合わせ結果生成部１６０では、取得したＳＱＬ結果と読み出したＲＤＢ−ＸＭＬマッピング定義情報から問い合わせ結果を生成する。問い合わせ結果とは、挿入・更新・削除の場合は操作を実行したＸＭＬ文書件数であり、検索の場合は、返却されるＸＭＬ文書を指す。
【００４４】
次に、本発明の構造化文書処理装置１００の処理手順の概要について説明する。
【００４５】
図５は、本発明の一実施の形態における構造化文書処理装置の全体の概要動作のフローチャートである。
【００４６】
構造化文書処理装置１００は、準備フェーズ（ステップ１０００）と操作フェーズ（ステップ２０００）の順で処理が行われる。準備フェーズ（ステップ１０００）では、ＲＤＢ−ＸＭＬマッピング定義情報１７０中のテーブル生成情報を元にＳＱＬ−ＤＤＬを出力し、ＲＤＢ上にテーブルを生成する。操作フェーズ（ステップ２０００）では、生成されたテーブルに対し、ＸＭＬ文書の挿入・更新・削除・検索を行う。
【００４７】
図６は、本発明の一実施の形態における準備フェーズのフローチャートである。
【００４８】
準備フェーズでは、まず、ＲＤＢ−ＸＭＬマッピング定義情報を作成し、記憶手段に格納する（ステップ１０１０）。ＳＱＬ文生成部１３０では、当該ＲＤＢ−ＸＭＬマッピング定義情報１７０中のテーブル生成情報を元に、ＳＱＬ−ＤＤＬを出力する（ステップ１０２０）。ＲＤＢ管理システムインタフェース部１４０では、当該ＲＤＢ−ＸＭＬマッピング定義情報１７０中のＲＤＢアクセス情報を元に、ＲＤＢ管理システム３００へ接続し、ＳＱＬ−ＤＤＬをＲＤＢ管理システム３００へ送信する（ステップ１０３０）。ＲＤＢ管理システム３００では、送信されたＳＱＬ−ＤＤＬに基づいてテーブルを生成する（ステップ１０４０）。
【００４９】
図７〜図９は、本発明の一実施の形態における操作フェーズのフローチャートである。
【００５０】
操作フェーズでは、まず、アプリケーションプログラムインタフェース部１１０が、アプリケーションプログラム２００を介して入力されたＸＭＬ問い合わせ文を受理する（ステップ２０１０）。問い合わせは、図３に示すように、変数バインド部分、問い合わせ条件指定部分、問い合わせ結果指定部分から構成される。
【００５１】
ＸＭＬ問い合わせ文解析部１２０は、ＸＭＬ問い合わせ文を解析し、ＸＭＬ問い合わせ文解析情報を出力する（ステップ２０２０）。このＸＭＬ問い合わせ文解析情報から、受理したＸＭＬ問い合わせ文が検索操作を行うため構文か否かを判定する（ステップ２０３０）。同様に、挿入／更新操作を行うための構文か削除操作を行う構文かを判定する（ステップ２０３１）。
【００５２】
挿入／更新操作を行うための構文であれば、挿入／更新用ＸＭＬ文書の構造が正当かを判定する（ステップ２０４０）。正当でなければエラーを出力して操作フェーズを終了する。正当であれば、全ＸＭＬ文書の挿入を行うための構文かを判定する（ステップ２０４１）。全ＸＭＬ文書の挿入を行うための構文であれば、１つ１つのＸＭＬ文書を識別するための文書キーを払い出し（ステップ２０４２）、ＳＱＬ文生成部１３０は、ＲＤＢ−ＸＭＬマッピング定義情報１７０とＸＭＬ問い合わせ文解析情報を参照し、全ＸＭＬ文書挿入用ＳＱＬを生成する（ステップ２０４３）。生成したＳＱＬ文をＲＤＢ管理システム３００で実行し（ステップ２０４４）、挿入を実行した文書件数を取得し（ステップ２０７０）、これを問い合わせ結果として出力する（ステップ２０８７）。
【００５３】
全ＸＭＬ文書の挿入を行うための構文でなければ（ステップ２０４１）、部分ＸＭＬ文書の挿入あるいは、全／部分ＸＭＬ文書の更新を行う構文である。以降の処理は、部分ＸＭＬ文書挿入／更新／削除で同一となる。まず、問い合わせ結果指定部分に対応する要素の先祖要素がＲＤＢに格納されているかを判定する（ステップ２０５０）。格納されていれば、ＳＱＬ文生成部１３０は、ＲＤＢ−ＸＭＬマッピング定義情報１７０とＸＭＬ問い合わせ文解析情報を参照し、テーブル名とカラム名を取得し、更新対象データ取得用ＳＱＬを生成する（ステップ２０５１）。生成したＳＱＬを実行し、更新対象データを取得する（ステップ２０５２）。
【００５４】
ＤＯＭ操作部１５０は、取得したデータをＤＯＭ（ＤｏｃｕｍｅｎｔＯｂｊｅｃｔＭｏｄｅｌ：ＸＭＬ操作のための標準ＡＰＩの一つ）に展開した後、ＤＯＭ操作により、問い合わせ結果指定部分に対し更新を行い、再び、ＸＭＬ化する（ステップ２０５３）。ＳＱＬ文生成部１３０は、挿入用ＳＱＬを生成し、更新を行ったＸＭＬをデーターベースに再挿入する（ステップ２０５４）。ステップ２０５１からステップ２０５４までを問い合わせ結果指定部分で指定された要素まで繰り返す。
【００５５】
次に、問い合わせ結果指定部分に対応する要素の子孫要素がＲＤＢに格納されているかを判定する（ステップ２０６０）。格納されていれば、同一パスのＮ個の要素がＮ行に別々に格納されているのか、１行にまとめて格納されているのかを判定する（ステップ２０６１）。１行にまとめて格納されている場合、先程のステップ２０５１からステップ２０５４と同様に、更新対象データを取得し、ＤＯＭ展開し、更新を行い、ＸＭＬ化してデーターベースに再挿入する（ステップ２０６４からステップ２０６７）。Ｎ行に別々に格納されている場合、先祖要素の更新時に取得した文書キーでデータ削除用ＳＱＬを生成し、実行（ステップ２０６２）した後、既に更新済みの先祖要素を格納している列のデータを元にデータ挿入用ＳＱＬを生成し、実行する（ステップ２０６３）。
【００５６】
ステップ２０６０からステップ２０６６までをＲＤＢに格納されているすべての子孫要素に対して行う。部分ＸＭＬ文書挿入／更新／削除を実行した文書件数を取得し（ステップ２０７０）、これを問い合わせ結果として出力する（ステップ２０８７）。
【００５７】
検索操作を行うための構文であれば（ステップ２０３０）、ＲＤＢ−ＸＭＬマッピング定義情報１７０を参照して、問い合わせ結果指定部分に対応する要素がＲＤＢに格納されているかを判定する（ステップ２０８０）。格納されていれば、当該要素が格納されているＲＤＢのテーブル名、カラム名を取得する（ステップ２０８３）。格納されていない場合、格納されている要素が出現するまで、当該要素の子孫要素を辿り、格納されているＲＤＢのテーブル名、カラム名を取得する（ステップ２０８２）。これを全ての子孫要素に対して繰り返し、全てのテーブル名、カラム名を取得する（ステップ２０８１）。
【００５８】
取得したテーブル名、カラム名と問い合わせ条件指定部分から得られた解析情報を元にＳＱＬを生成／実行する（ステップ２０８４）。ＲＤＢ−ＸＭＬマッピング定義情報１７０を参照して、問い合わせ結果指定部分に対応する要素がＲＤＢに格納されているかを判定することで、ＸＭＬ再構築が必要かを判定する（ステップ２０８５）。必要であればＲＤＢ−ＸＭＬマッピング定義情報１７０を参照して、ＳＱＬの検索結果からＸＭＬ文書を問い合わせ結果として生成し（ステップ２０８６）、出力する（ステップ２０８７）。再構築が必要でない場合は、該当する要素がＲＤＢに格納されているので、それを問い合わせ結果として出力する（ステップ２０８７）。
【００５９】
【実施例】
以下、図面と共に本発明の実施例を説明する。
【００６０】
最初に、準備フェーズについて説明する。
【００６１】
（１）テーブル生成（準備フェーズ）：
図１０は、本発明の一実施例のＲＤＢ−ＸＭＬマッピング定義情報の詳細を示す図である。図１１は、本発明の一実施例のＲＤＢアクセス情報と記述例を示す。図１２は、本発明の一実施例のテーブル生成情報と記述例を示す。図１３〜図１５は、本発明の一実施例のルート情報と記述例を示す。
ＲＤＢ−ＸＭＬマッピング定義情報１７０は、ＸＭＬ形式で規定された定義であり、図４に示したように、ＲＤＢアクセス情報１７１（図１１）、テーブル生成情報１７２（図１２），ルート情報１７３（図１３〜図１５）が含まれる。
【００６２】
（２）ＲＤＢ−ＸＭＬマッピング定義情報（図１０）：
ＲＤＢ−ＸＭＬマッピング定義情報１７０のルート要素である。Ｔａｒｇｅｔ属性により対象のＲＤＢ管理システム種別を定義する。ｖｅｒｓｉｏｎ属性の値により、ＲＤＢ−ＸＭＬマッピング定義情報１７０の識別を行う。
【００６３】
（３）ＲＤＢアクセス情報（図１１）：
ＲＤＢの接続先毎の情報を定義する要素である。
【００６４】
・ａｃｃｅｓｓ要素
ＲＤＢ管理システムインタフェース部１４０を経由してＲＤＢに接続するための情報を定義する要素である。接続時に必要な接続先ｕｒｌをｕｒｌ要素へ、接続時に必要なユーザ名やパスワードをｐｒｏｐｅｒｔｙ要素に定義する。
【００６５】
（ａ）ｕｒｌ要素
ＲＤＢ管理システムインタフェース部１４０を経由してＲＤＢに接続する際に指定するＵＲＬを定義する要素である。
【００６６】
（ｂ）ｐｒｏｐｅｒｔｙ要素
ＲＤＢ管理システムインタフェース部１４０を経由してＲＤＢに接続する際に指定するプロパティ情報（ユーザ名、パスワード等）を指定する要素である。ｎａｍｅ属性によりプロパティ名を定義する。
【００６７】
（２）テーブル生成情報（図１１）
ＳＱＬ文生成部１３０にて、ＲＤＢのテーブル生成用ＳＱＬ−ＤＤＬを生成するために必要な情報を定義する。
【００６８】
（ａ）ｔａｂｌｅ要素
個々のテーブルに対する情報を定義する。ｎａｍｅ属性に対し、テーブル名を記述する。
【００６９】
（ｂ）ｃｏｌｕｍｎ要素
親要素のｔａｂｌｅ要素のｎａｍｅ属性に指定したテーブルに持たせるカラムの情報を定義する。ｎａｍｅ属性に対し、カラム名を記述し、ｔｙｐｅ属性に対し、ＲＤＢで定義されるデータ型を記述する。
【００７０】
（ｃ）ｃｏｎｓｔｒａｉｎｔ要素
親要素がｔａｂｌｅ要素の場合、表定義としての制約をＳＱＬと同じ形式で記述する。また、親要素がｃｏｌｕｍｎ要素の場合は行定義としての制約をＳＱＬと同じ形式で記述する。
【００７１】
（ｄ）ｃｈａｒａｃｔｏｒｉｓｔｉｃｓ要素
親要素のｔａｂｌｅ要素で指定されるテーブルに対しての物理特性やテーブル特性を定義する。ＳＱＬのＣＲＥＡＴＥ文におけるｐｈｉｓｉｃａｌ＿ｐｒｏｐｅｒｔｉｓ及びｔａｂｌｅ＿ｐｒｏｐｅｒｔｉｅｓに相当する部分をＳＱＬと同じ形式で記述する。
【００７２】
図１２にテーブル生成の記述例を示す。
【００７３】
（３）ルート情報（図１３）
入出力されるＸＭＬ文書とＲＤＢ内に格納されているデータとの関連をマッピングするための情報を定義する。ｅｌｅｍｅｎｔ情報、ａｔｔｒｉｂｕｔｅ要素、ｔｅｘｔ要素は、それぞれ元ＸＭＬ文書の要素，属性，テキストに相当し、元ＸＭＬと同じ階層構造について記述する。
【００７４】
ルート情報に定義する項目を図１５に示す。
【００７５】
（ａ）ｅｌｅｍｅｎｔ要素
ＸＭＬ文書の要素に対応する要素である。各種属性にてＲＤＢとのマッチング情報を定義する。各属性の内容を以下に示す。
【００７６】
・ｎａｍｅ属性
ＸＭＬ文書の要素名を定義する。
【００７７】
・ｔａｂｌｅ属性
格納先のテーブル名を定義する。なお、すべての要素に定義されるわけではなく、定義は子孫の要素に次のｔａｂｌｅ属性が定義されるまでを有効範囲とする。
【００７８】
・ｃｏｌｕｍｎ属性
格納先のカラム名を定義する。対象のテーブルは、上記のテーブル属性で定義された対象の（ａｎｃｅｓｔｏｒ−ｏｒ−ｓｅｌｆの内で直近のｔａｂｌｅ属性に定義された）テーブルとなり、対応するＸＭＬのデータの格納先が特定される。
【００７９】
・ｒａｔｉｏ属性
本属性は、ｔａｂｌｅ属性に付随して定義され、ｎ個の要素が存在する場合に、複数の格納先テーブル間の関係を定義する。
【００８０】
“１：１”の場合、本属性を持つ１もしくはｎ個の要素を１タプルに格納する。
【００８１】
“１：ｎ”もしくは“１：Ｎ”の場合、本属性を持つ１つもしくは、Ｎ個の要素をＮタプルに格納する。
【００８２】
なお、本属性が省略された場合“１：１”として扱う。
【００８３】
・ｍｉｎＯｃｃｕｒｓ属性
本属性の定義された要素が持つことのできる要素の最小数を“０”か“１”で定義する。
【００８４】
・ｍａｘＯｃｃｕｒｓ属性
本属性の定義された要素が持つことのできる要素の最大数を“１”か“ｕｎｂｏｕｎｄｅｄ”（制限なし）で定義する。
【００８５】
・ｄｏｃＩＤ属性
ＸＭＬ文書を識別するための文書ＩＤに該当する要素に対し、本属性に“ｙｅｓ”を定義する。
【００８６】
・ｔｙｐｅ属性
本属性は、ｃｏｌｕｍｎ属性に付随して定義され、格納先のカラムのデータ型を定義する。ＸＭＬデータ格納の際に参照され、入力のデータとデータ型の整合性を判断する。
【００８７】
・ｓｉｚｅ属性
本属性は、ｃｏｌｕｍｎ属性に付随して定義され、格納先のカラムのサイズを定義する。ＸＭＬデータ格納の際に参照され、入力のデータのサイズが格納可能かを判断する。
【００８８】
・ｐｋｅｙ属性
本属性は、ｔａｂｌｅ属性に付随して定義され、ｔａｂｌｅ属性に定義されたテーブルの主キーとなるカラム名を定義する。省略された場合は、ＤｏｃＫｅｙ（文書キー）を主キーと見做す。
【００８９】
・ｐａｒｅｎｔ−ｆｋｅｙ属性
本属性は、ｔａｂｌｅ属性に付随して定義され、該当のテーブルの主キーに対しての外部キーとなるカラム名を定義する。本属性で指定した列は、直近の先祖要素に定義されているテーブルの列として定義されるものとする。
【００９０】
・ｅｘｔｓ属性
拡張機能であるユーザ定義関数名を記述する。本属性を定義することによりユーザ定義関数と本属性が定義されたパスとの関連付けを定義する。
【００９１】
（ｂ）ａｔｔｒｉｂｕｔｅ要素
ＸＭＬ文書の属性に対応する要素である。各種属性にてデーターベースとのマッピング情報を定義する。属性は、ｅｌｅｍｅｎｔ要素と同じ属性をもつが、その他にａｔｔｒｉｂｕｔｅ要素固有の属性としてｍｅｔａ属性を持つ。
【００９２】
・ｍｅｔａ属性
追加属性を識別するための属性である。本属性に“ｙｅｓ”が指定された要素を追加属性と見做す。ＸＭＬ問い合わせ文で操作する場合は、直接本属性が定義された要素を問い合わせ結果指定部分（ｄｏ節／ｒｅｔｕｒｎ節）に指定しない限り、操作を行えないものとする。直接指定せずに操作を行おうとした場合、更新時には例外、検索時には、検索対象外とする。
【００９３】
（ｃ）ｔｅｘｔ属性
ＸＭＬ文書の属性に対応する要素である。各種属性にてＲＤＢとのマッピング情報を定義する。属性は、ｅｌｅｍｅｎｔ要素と同じ属性を持つ。
【００９４】
上記の項目を用いたＸＭＬ文書例、ルート情報の記述例を図１５に示す。本発明のＸＭＬ−ＲＤＢ間のマッピング定義情報１７０（ルート情報）は、ＸＭＬ文書を部分ＸＭＬ文書に分割してＲＤＢへ格納する。分割は、階層的に行い、重複も許している。図１６には、この格納の重複関係を階層構造（格納構造木）で示している。
【００９５】
準備フェーズでは、まず、ＲＤＢアクセス情報１７１（図１１）、テーブル生成情報１７２（図１２）、ルート情報１７３（図１３〜図１５）を含むＲＤＢ−ＸＭＬマッピング定義情報１７０（図１０）を作成する（ステップ１０１０）。ＳＱＬ文生成部１３０では、当該ＲＤＢ−ＸＭＬマッピング定義情報１７０中のテーブル生成情報（図１２）を元に、ＳＱＬ−ＤＤＬを出力する（ステップ１０２０）。ＲＤＢ管理システムインタフェース部１４０では、当該ＲＤＢ−ＸＭＬマッピング定義情報１７０中のＲＤＢアクセス情報（図１１）を元に、ＲＤＢ管理システム３００へ接続し、ＳＱＬ−ＤＤＬをＲＤＢ管理システム３００へ送信する（ステップ１０３０）。
【００９６】
ＲＤＢ管理システム３００では、送信されたＳＱＬ−ＤＤＬに基づいてテーブルを生成する（ステップ１０４０）。ＳＱＬ−ＤＤＬの出力例及びテーブルの生成例を図１６に示す。
【００９７】
次に、操作フェーズについて説明する。
【００９８】
（１）ＸＭＬ文書挿入（操作フェーズ）
入力されたＸＭＬ問い合わせ文を元にＲＤＢ−ＸＭＬマッピング定義情報１７０を参照して、ＲＤＢに対して全ＸＭＬ文書または、部分ＸＭＬ文書の挿入を行う。図１７（ａ），（ｂ）に示す問い合わせがアプリケーションプログラム２００から発行された場合を例に具体的に説明する。
【００９９】
まず、問い合わせを受理し（図７、ステップ２０１０）、ＸＭＬ問い合わせ文解析部１２０は、ＸＭＬ問い合わせ文を解析し、ＸＭＬ問い合わせ文解析情報を出力する（図７、ステップ２０２０）。挿入／更新／削除のＸＭＬ問い合わせ文は、ＸＭＬＵｐｄａｔｅ、検索のＸＭＬ問い合わせ文は、ＸＱｕｅｒｙの構文に基づいている。
【０１００】
ＸＭＬ問い合わせ解析情報から問い合わせ結果指定部分がｄｏ節なので検索操作を行う構文ではなく（図７、ステップ２０３０）、また、ａｐｐｅｎｄ関数が指定されているので挿入操作を行う構文と判定される（図７、ステップ２０３１）。ＸＭＬ−ＲＤＢマッピング定義情報１７０のルート情報を元に挿入用ＸＭＬ文書構造のチェックを行う（図８、ステップ２０４０）。変数バインド部分のｆｏｒ節からｒｏｏｔ関数の引数であるｒｏｏｔ名を取得し、ＸＭＬ−ＲＤＢマッピング定義情報１７０の該当するｒｏｏｔ名を持つｒｏｏｔ要素を特定する。また、ｆｏｒ節に指定するバインド位置で、全ＸＭＬ文書挿入であるか部分ＸＭＬ文書挿入であるかを判定する。バインド指定でパスが指定されていない場合は、全ＸＭＬ文書挿入とみなし、ルート要素が指定されていた場合、部分ＸＭＬ文挿入とみなす（図８、ステップ２０４１）。これは、本実施例でぇあ、図１７（ｃ）に示す格納モデルを想定しているためである。
【０１０１】
全ＸＭＬ文書挿入であれば（図１７（ｂ））、文書キーを払い出し（図８、ステップ２０４２）、ＳＱＬ文生成部１３０は、ＲＤＢ−ＸＭＬマッピング定義情報１７０とＸＭＬ問い合わせ文解析情報を参照し、全ＸＭＬ文書挿入用ＳＱＬを生成する（図８、ステップ２０４３）。
【０１０２】
具体的には、ＲＤＢ−ＸＭＬマッピング定義情報１７０のルート情報１７３を参照し、最上位階層の要素から順次階層を辿り、ｔａｂｌｅ属性が存在した場合、定義されたテーブル名と、その有効範囲に存在するすべてのｃｏｌｕｍｎ属性に定義された列名を取得し、ＳＱＬ文を生成する。なお、ｔａｂｌｅ属性に付随してｒａｔｉｏ属性が定義されていた場合、以下のルールに従いＳＱＬ文を生成する。
【０１０３】
・“１：１”の場合：
本属性を持つテーブルに格納する１もしくはＮ個の要素を１行に格納する。
【０１０４】
・“１：Ｎ”の場合：
本属性を持つテーブルに格納する１もしくはＮ個の要素をＮ行に格納する。
【０１０５】
ＳＱＬ生成イメージを図１８に示す。生成したＳＱＬ文をＲＤＢ管理システム３００で実行し（図８、ステップ２０４４）、挿入を実行した文書件数を取得し（図８、ステップ２０７０）、これを問い合わせ結果として出力する（図９、ステップ２０８７）。
【０１０６】
部分ＸＭＬ文書挿入であれば（図１７（ａ））、ｄｏ節に指定されたパスに対応する要素の先祖要素にｃｏｌｕｍｎ属性を持っている要素が存在する場合は、更新対象とみなす（図８、ステップ２０５０）。ＲＤＢ−ＸＭＬマッピング定義情報１７０を参照し、テーブル名とカラム名を取得し、問い合わせ条件指定部分（ｗｈｅｒｅ節）の解析で取得したデータと合わせてＳＱＬを生成する（図８、ステップ２０５１）。ＳＱＬ生成イメージを図１９に示す。生成したＳＱＬを実行し、更新対象のデータを取得する（図８、ステップ２０５２）。取得したデータは、ＤＯＭ木に展開した後、ＤＯＭ操作によりｄｏ節で指定されたパスに対し挿入を行い、再度ＸＭＬ化する（図８、ステップ２０５３）。
【０１０７】
ＳＱＬ文生成部１３０は、挿入用ＳＱＬを生成し、更新を行ったＸＭＬをデーターベースに再格納する（ＯｒａｃｌｅＪＤＢＣドライバの更新可能ＲｅｓｕｌｔＳｅｔ使用）。上記操作をｄｏ節で指定されたパスの要素まで繰り返す（図８、ステップ２０５０〜ステップ２０５４）。
【０１０８】
次に、ｄｏ節に指定されたパスに対応する要素が子孫要素を持っており、その子孫要素にｃｏｌｕｍｎ属性が定義されていた場合、ＲＤＢに格納されていると判断する（図８、ステップ２０６０）。更新対象のｔａｂｌｅ属性に対し、ｒａｔｉｏ属性が“１：１”で定義してある場合（図８、ステップ２０６１の判定）は、先祖要素と同じ方法で更新を行う（図８、ステップ２０６４〜ステップ２０６７）。また、ｒａｔｉｏ属性が“１：ｎ”もしくは、“１：Ｎ”である場合（図８、ステップ２０６１の判定）は該当するカラムに対して対象のデータを一旦削除した後に挿入を行う。先祖要素の更新時の取得した文書キーでＤＥＬＥＴＥ文を生成し、実行した（図８、ステップ２０６２）後、既に更新済みの先祖要素を格納しているカラムのデータを元にＩＮＳＥＲＴ文を生成し、実行する（図８、ステップ２０６３）。ＳＱＬの生成イメージを図２０に示す。ステップ２０６０からステップ２０６６までをＲＤＢに格納されている全ての子孫要素に対して行う。部分ＸＭＬ文書挿入／更新／削除を実行した文書件数を取得し（図８、ステップ２０７０）、これを問い合わせ結果として出力する（図９、ステップ２０８７）。
【０１０９】
（２）ＸＭＬ文書更新（操作フェーズ）
入力されたＸＭＬ問い合わせ文を元にＲＤＢ−ＸＭＬマッピング定義情報１７０を参照してＲＤＢに対してＸＭＬ文書の更新を行う。図２１（ａ）に示す問い合わせがアプリケーションプログラム２００から発行された場合を考える。
【０１１０】
問い合わせの受理（図７、ステップ２０１０）、ＸＭＬ問い合わせ文解析（図７、ステップ２０２０）は、前述の（１）ＸＭＬ文書挿入と同様である。問い合わせ結果指定部分がｄｏ節なので検索操作を行う構文ではなく（図７、ステップ２０３０）、ｕｐｄａｔｅ関数が指定されているので更新操作を行う構文と判断される（図７、ステップ２０３１）。ＸＭＬ−ＲＤＢマッピング定義情報１７０のルート情報１７３を元に挿入用ＸＭＬ文書構造のチェックを行う（図７、ステップ２０４０）。全ＸＭＬ文書挿入でなく、更新なので（図８、ステップ２０４１）、以降の処理は、前述の（１）文書挿入で説明したステップ２０５０からステップ２０８７と同様であるため、その説明は省略する。
【０１１１】
（３）ＸＭＬ文書削除（操作フェーズ）
入力されたＸＭＬ問い合わせ文を元にＲＤＢ−ＸＭＬマッピング定義情報１７０を参照して、ＲＤＢに対してＸＭＬ文書の更新を行う。図２２（ａ）（ｂ）に示す問い合わせがアプリケーションプログラム２００から発行された場合を考える。
【０１１２】
問い合わせの受理（図７、ステップ２０１０）、ＸＭＬ問い合わせ文の解析（図７、ステップ２０２０）は、前述の（１）ＸＭＬ文書挿入、（２）ＸＭＬ文書更新と同様である。問い合わせ結果指定部分がｄｏ節なので検索操作を行う構文ではなく（図７、ステップ２０３０）、ｒｅｍｏｖｅ関数が指定されているので削除操作を行う構文と判断される（図７、ステップ２０３１）。以降の処理は（１）ＸＭＬ文書挿入、（２）ＸＭＬ文書更新したステップ２０５０からステップ２０８７と同様であるため、説明を省略する。
【０１１３】
なお、更新済みの先祖要素を格納しているカラムのデータ内にＭＤのルート情報で更新対象と見做す要素が存在しなかった場合は、ＩＮＳＥＲＴ文生成時に値を指定しない（ＮＵＬＬを挿入する）。本実施例では、図２２（ａ）に示すＸＭＬ問い合わせ文による削除処理（図２３（ａ））が対応する。また、更新対象の要素がタグのみでテキストノードが存在しなかった場合は、空値（“”）を指定してＩＮＳＥＲＴ文を生成する。本実施例では、図２２（ｂ）に示すＸＭＬ問い合わせ文による削除処理（図２３（ｂ））が対応する。
【０１１４】
（４）ＸＭＬ文書検索（操作フェーズ）
入力されたＸＭＬ問い合わせ文を元にＲＤＢ−ＸＭＬマッピング定義情報１７０を参照してＲＤＢに対してＸＭＬ文書の検索を行う。図２４（ａ）に示す問い合わせがアプリケーションプログラム２００から発行された場合を考える。
【０１１５】
問い合わせの受理（図７、ステップ２０１０）、ＸＭＬ問い合わせ文の解析（図７、ステップ２０２０）は、前述の（１）ＸＭＬ文書挿入、（２）ＸＭＬ文書更新、（３）ＸＭＬ文書削除と同様であるので、その説明を省略する。
【０１１６】
問い合わせ結果指定部分が、ｒｅｔｕｒｎ節なので、検索操作を行う構文と判断される（図７、ステップ２０３０）。ＲＤＢ−ＸＭＬマッピング定義情報１７０を参照して、問い合わせ結果指定部分のパスに対応する要素にｃｏｌｕｍｎ属性が存在するので、当該要素はＲＤＢに格納されていると判断される（図９、ステップ２０８０）。当該要素が格納されているＲＤＢのテーブル名と、カラム名を取得する（図９、ステップ２０８３）。取得したテーブル名、カラム名と問い合わせ条件指定部分から得られた解析情報を元にＳＱＬを生成／実行する（図９、ステップ２０８４）。問い合わせ結果指定部分に対応する要素がＲＤＢに格納されているため、ＸＭＬ再構築は不要と判断される（図９、ステップ２０８５）。該当する要素がＲＤＢに格納されているので、それを問い合わせ結果として出力する（図９、ステップ２０８７）。
【０１１７】
図２５（ａ）に示す問い合わせがアプリケーションプログラム２００から発行された場合を考える。
【０１１８】
問い合わせの受理（図７、ステップ２０１０）から検索操作を行う構文との判定（図７、ステップ２０３０）までは先程の例と同様である。ＲＤＢ−ＸＭＬマッピング定義情報１７０を参照して、問い合わせ結果指定部分のパスに対応する要素にｃｏｌｕｍｎ属性が存在しないので、当該要素は、ＲＤＢに格納されていないと判断される（図９、ステップ２０８０）。当該要素の子孫要素を辿り、ｃｏｌｕｍｎ属性が存在する要素が出現するまで、子孫要素を辿ってテーブル名、カラム名を取得する（図９、ステップ２０８２）。これを全ての子孫要素に対して繰り返し、全てのテーブル名、カラム名を取得する（図９、ステップ２０８１）。取得したテーブル名、カラム名と問い合わせ条件指定部分から得られた解析情報を元に、ＳＱＬを生成／実行する（ステップ２０８４）。問い合わせ結果指定部分に対応する要素がＲＤＢに格納されていないため、ＸＭＬ再構築は必要と判定される（ステップ２０８５）。図２６に示すように、ＲＤＢ−ＸＭＬマッピング定義情報１７０を参照して、ＳＱＬの検索結果からＸＭＬ文書を問い合わせ結果として生成し（図９、ステップ２０８６）、出力する（ステップ２０８７）。
【０１１９】
なお、上記の準備フェーズ及び操作フェーズをプログラムとして構築し、構造化文書処理装置として利用されるコンピュータにインストールしておき、ＣＰＵ等の制御手段により実行する、または、ネットワークを介して流通させることも可能である。
【０１２０】
また、構築されたプログラムを、構造化文書処理装置として利用されるコンピュータに接続される、ハードディスク装置や、フレキシブルディスク、ＣＤ−ＲＯＭ等の可搬記憶媒体に格納しておき、実行時に、コンピュータにインストールして実行させることも可能である。
【０１２１】
なお、本発明は、上記の実施の形態及び実施例に限定されることなく、特許請求の範囲内において、種々変更・応用が可能である。
【０１２２】
【発明の効果】
上述のように、本発明によれば、ＸＭＬ文書の階層構造に基づいて、重複を許した部分ＸＭＬ文書に分割してＲＤＢと対応付けるマッピング定義情報及び当該定義に則って、入力されたＸＭＬに対する標準的な問い合わせをＳＱＬへ変換することにより、検索の頻度が高いのか、挿入／更新の頻度が高いのかという個々のＸＭＬ文書の利用形態に応じて、
（１）ＸＭＬ文書全体／部分をＲＤＢの１カラムに格納する。
（２）データ項目毎に分解して複数カラムに格納する。
という２つの格納方法を組み合わせることが可能となる。
【０１２３】
即ち、ＲＤＢを利用してＸＭＬ文書を扱う際に、検索の高速性と、追加、更新の高速性を両立することが可能となる。
【図面の簡単な説明】
【図１】本発明の原理を説明するための図である。
【図２】本発明の原理構成図である。
【図３】本発明の一実施の形態における構造化文書処理装置の構成図である。
【図４】本発明の一実施の形態におけるＲＤＢ−ＸＭＬマッピング定義情報に含まれる情報の概要を示す図である。
【図５】本発明の一実施の形態における構造化文書処理装置の全体の概要動作フローチャートである。
【図６】本発明の一実施の形態における準備フェーズのフローチャートである。
【図７】本発明の一実施の形態における操作フェーズのフローチャート（その１）である。
【図８】本発明の一実施の形態における操作フェーズのフローチャート（その２）である。
【図９】本発明の一実施の形態における操作フェーズのフローチャート（その３）である。
【図１０】本発明の一実施例のＲＤＢ−ＸＭＬマッピング定義情報の詳細を示す図である。
【図１１】本発明の一実施例のＲＤＢアクセス情報と記述例を示す図である。
【図１２】本発明の一実施例のテーブル生成情報と記述例である。
【図１３】本発明の一実施例のルート情報と記述例（その１）である。
【図１４】本発明の一実施例のルート情報と記述例（その２）である。
【図１５】本発明の一実施例のルート情報と記述例（その３）である。
【図１６】本発明の一実施例のＳＱＬ−ＤＤＬの出力例及びテーブルの生成例である。
【図１７】本発明の一実施例の操作フェーズにおいて使用する問い合わせと格納モデルである。
【図１８】本発明の一実施例の操作フェーズにおけるＳＱＬ生成イメージ（その１）である。
【図１９】本発明の一実施例の操作フェーズにおけるＳＱＬ生成イメージ（その２）である。
【図２０】本発明の一実施例の操作フェーズにおけるＳＱＬ生成イメージ（その３）である。
【図２１】本発明の一実施例の操作フェーズにおいて使用する問い合わせとＲＤＢ−ＸＭＬマッピング定義情報を示す図（その１）である。
【図２２】本発明の一実施例の操作フェーズにおいて使用する問い合わせとＲＤＢ−ＸＭＬマッピング定義情報を示す図（その２）である。
【図２３】本発明の一実施例の操作フェーズのＸＭＬ文書削除処理において生成するＳＱＬを示す図である。
【図２４】本発明の一実施例の操作フェーズにおけるＸＭＬ文書検索において使用する問い合わせとＲＤＢ−ＸＭＬマッピング定義情報と生成するＳＱＬを示す図である。
【図２５】本発明の一実施例の操作フェーズにおけるＸＭＬ文書検索において使用する問い合わせとＲＤＢ−ＸＭＬマッピング定義を生成するＳＱＬを示す図である。
【図２６】本発明の一実施例の操作フェーズのＸＭＬ文書検索において使用する問い合わせとＲＤＢ−ＸＭＬマッピング定義情報とＳＱＬ結果と問い合わせ結果を示す図である。
【符号の説明】
１００構造化文書処理装置
１１０アプリケーションプログラムインタフェース部
１２０操作判定手段、ＸＭＬ問い合わせ文解析部
１３０命令文生成手段、ＳＱＬ文生成部
１４０操作指示手段、ＲＤＢ管理システムインタフェース部
１５０文書件数取得手段、ＤＯＭ操作部
１６０構造化文書出力手段、問い合わせ結果生成部
１７０マッピング定義情報、ＲＤＢ−ＸＭＬマッピング定義情報
１７１ＲＤＢアクセス情報
１７２テーブル生成情報
１７３ルート情報
２００アプリケーションプログラム
３００データーベース、ＲＤＢ管理システム
[0001]
[Technical field of the invention]
The present invention relates to a structured document processing method and apparatus, a structured document processing program, and a storage medium storing the structured document processing program, and more particularly to a structured document processing method and apparatus, a structured document processing program, and a storage medium storing the structured document processing program that use a database.
[0002]
[Prior art]
XML (eXtensible Markup Language) is a standard established by the World Wide Web Consortium (W3C) to define the description format of documents and data exchanged over networks. XML is expected to combine the extensibility of SGML (Standard Generalized Markup Language), the international standard for structured documents, with the Internet usability of HTML (HyperText Markup Language). For example, XML can be used to describe data exchanged in EC (Electronic Commerce) or KM (Knowledge Management), or to describe the holdings catalog of an electronic library. In such use cases, beyond the requirement that XML be interchangeable, the ability to store, search, and update large numbers of XML documents becomes important. Meeting this requirement calls for the application and development of database technology. There are two broad approaches.
[0003]
(1) The first approach is to develop a native XML database: a new database that stores XML documents as they are and can also search and update them. Examples include Tamino (Software AG), eXcelon (eXcelon), and Yggdrasill (Media Fusion).
[0004]
(2) The second approach is to extend the functionality of an existing database. An existing RDB (relational database) or similar system is extended with functions for storing XML documents and for converting retrieved data into XML. Major databases such as Oracle9i (Oracle), SQL Server 2000 (Microsoft), and DB2 (IBM) support XML.
[0005]
The great advantage of approach (1) is that XML documents can be handled as they are. However, most data is currently stored in RDBs, and approach (2) is better suited to integrating with and making use of this existing data. When approach (2) is used to store, search, and update XML documents, the method of mapping between the XML document and the RDB becomes important. There are two broad methods for mapping XML to an RDB and storing it.
[0006]
a) Method of storing the entire XML document in one column:
By using an RDB data type such as CLOB (Character Large Object) or VARCHAR, the entire XML document is stored as it is in one column. This method is effective when the document structure of the XML document itself must be preserved, not just the data in it, for example when newspaper or magazine articles are kept as an archive. In that case, only designated elements (date and time, reporter name, and so on) are placed in separate columns and indexed so that the documents can be searched.
[0007]
b) Method of decomposing an XML document into data items and storing in multiple columns:
The elements and attributes of the original XML document are decomposed into data items and stored in multiple columns of the RDB. Because the original XML document itself is not saved, the document structure is not preserved in the RDB. However, because the content is handled as RDB data, this storage method is effective when the data is shared by multiple applications. In addition, by fixing a mapping from the RDB to XML, data accumulated in an existing RDB can be retrieved (published) as an XML document; this technique is called XML Publishing.
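The two storage methods a) and b) can be contrasted with a minimal sketch. It assumes an in-memory SQLite database and a made-up `article` document; the table names, column names, and sample values are illustrative only and do not come from the patent.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document (not taken from the patent).
ARTICLE = "<article><date>2003-05-22</date><author>A. Reporter</author><body>XML and RDB</body></article>"

conn = sqlite3.connect(":memory:")

# Method a): keep the whole document in one column, plus one extracted,
# indexed column (here the date) so that the archive can still be searched.
conn.execute("CREATE TABLE article_whole (doc_key INTEGER PRIMARY KEY, pub_date TEXT, doc TEXT)")
conn.execute("CREATE INDEX idx_pub_date ON article_whole (pub_date)")

# Method b): decompose the document into data items, one column each.
conn.execute("CREATE TABLE article_items (doc_key INTEGER PRIMARY KEY, pub_date TEXT, author TEXT, body TEXT)")

root = ET.fromstring(ARTICLE)
conn.execute("INSERT INTO article_whole VALUES (?, ?, ?)", (1, root.findtext("date"), ARTICLE))
conn.execute(
    "INSERT INTO article_items VALUES (?, ?, ?, ?)",
    (1, root.findtext("date"), root.findtext("author"), root.findtext("body")),
)

# a) returns the original document unchanged from a single column.
whole = conn.execute(
    "SELECT doc FROM article_whole WHERE pub_date = ?", ("2003-05-22",)
).fetchone()[0]

# b) updates one data item without rewriting the rest of the document.
conn.execute("UPDATE article_items SET author = ? WHERE doc_key = ?", ("B. Editor", 1))
author = conn.execute("SELECT author FROM article_items WHERE doc_key = 1").fetchone()[0]
```

In this sketch, method a) preserves the document text exactly, while method b) makes a single-field update a one-column `UPDATE`; neither table alone offers both properties.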
[0008]
Techniques for realizing these storage methods using mapping definition information between XML and RDB have also been developed (for example, see Non-Patent Document 1).
[0009]
[Non-patent document 1]
IBM, DB2 XML Extender: http://www-6.ibm.com/jp/software/data/developer/column/kantanextend/05xmlextender/01/html
[0010]
[Problems to be solved by the invention]
However, the technique of realizing these storage methods through XML-RDB mapping definition information has many shortcomings.
[0011]
(1) It is difficult to search, add, and update with a standard XML query (functional requirement): XQuery (http://www.w3.org/TR/xquery/), the standard query language being developed by the W3C, allows flexible descriptions such as restructuring and transforming the structure of the result XML. For addition and update, an XML-oriented language called XUpdate (http://www.xmldb.org/xupdate/) has also been proposed. In the conventional methods, however, the stored data is handled as RDB data, and the queries used for search, addition, and update are generally written in SQL with XML extensions. The mapping from RDB to XML in XML Publishing is also fixed. In short, because the conventional methods treat the stored data as RDB data, it is difficult to search, add, and update with a standard XML query.
[0012]
(2) It is difficult to achieve both fast search and fast addition and update (performance requirement): In general, if the only need is to search for and retrieve entire XML documents, conventional method a) above is faster than method b). On the other hand, when elements in the XML documents are added or updated frequently, method b) is faster than method a). Depending on how the XML documents are used, both fast search and fast addition and update may be required, so some applications need to combine these storage methods. With conventional XML-RDB mapping definition information, however, it is difficult to combine the two storage methods a) and b) flexibly so as to achieve both fast search and fast addition and update.
[0013]
The present invention has been made in view of the above points. Its object is to provide a structured document processing method and apparatus, a structured document processing program, and a storage medium storing the structured document processing program, all using a database, that achieve both fast search and fast addition and update. This is done by means of mapping definition information that divides an XML document, based on its hierarchical structure, into partial XML documents that are allowed to overlap and associates them with an RDB, and by converting an input standard XML query into SQL in accordance with that definition.
[0014]
[Means for Solving the Problems]
FIG. 1 is a diagram for explaining the principle of the present invention.
[0015]
The present invention relates to a structured document processing method for performing operations of storing, searching, updating and deleting structured documents using a database,
An operation determining step (step 1) of analyzing an input query for the structured document and determining whether the operation requested by the query is insertion, update, deletion, or search;
Based on the hierarchical structure of the structured document stored in the storage unit, referring to the mapping definition information that divides the structured document allowing duplication into divided partial documents and associates the divided partial documents with the database, A statement generation step (step 2) for converting the input inquiry into a statement for performing any one of insert / update / delete / search operations on the database based on the determination;
An operation instruction step (step 3) for transmitting the converted statement to the database and executing the operation;
If the statement is one of insert / update / delete operations, a document number acquisition step (step 4) for acquiring the number of documents that executed the statement;
If the statement is a search operation, a structured document output step (step 5) of outputting a database search result as a structured document with reference to the mapping definition information.
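The five steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the specification: the parsed-query dictionary, the `MAPPING` table, the `book` table, and the `doc_key` column are hypothetical, SQLite stands in for the RDB, and the affected row count stands in for the number of documents.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical mapping definition: XML element path -> (table, column).
MAPPING = {"/book/title": ("book", "title")}

def determine_operation(query):
    """Step 1: decide whether the query is insert, update, delete, or search."""
    op = query["op"]
    if op not in ("insert", "update", "delete", "search"):
        raise ValueError(f"unsupported operation: {op}")
    return op

def generate_statement(query, op):
    """Step 2: convert the query into an SQL statement via the mapping."""
    table, column = MAPPING[query["path"]]
    key = query["doc_key"]
    if op == "insert":
        return (f"INSERT INTO {table} (doc_key, {column}) VALUES (?, ?)",
                (key, query["value"]))
    if op == "update":
        return (f"UPDATE {table} SET {column} = ? WHERE doc_key = ?",
                (query["value"], key))
    if op == "delete":
        return f"DELETE FROM {table} WHERE doc_key = ?", (key,)
    return f"SELECT {column} FROM {table} WHERE doc_key = ?", (key,)

def process(conn, query):
    op = determine_operation(query)               # step 1
    sql, params = generate_statement(query, op)   # step 2
    cursor = conn.execute(sql, params)            # step 3
    if op != "search":
        # Step 4: the affected row count approximates the document count here.
        return cursor.rowcount
    # Step 5: output the search result as XML, using the mapped element name.
    tag = query["path"].rsplit("/", 1)[-1]
    results = []
    for (value,) in cursor.fetchall():
        element = ET.Element(tag)
        element.text = value
        results.append(ET.tostring(element, encoding="unicode"))
    return results

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (doc_key INTEGER PRIMARY KEY, title TEXT)")
inserted = process(conn, {"op": "insert", "path": "/book/title",
                          "doc_key": 1, "value": "XML and RDB"})
found = process(conn, {"op": "search", "path": "/book/title", "doc_key": 1})
```

In this sketch only the statement generation step consults the mapping when building SQL; the operation determination and execution steps are independent of how the document is divided.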
[0016]
Further, the statement generation step includes a step of referring to the mapping definition information and generating a statement that secures a database area in which the insert / update / delete / search operations on the structured document are performed.
[0017]
In addition, the statement generation step includes:
Determining whether the structure of the structured document for the insert / update operation is valid;
Determining whether the query is to insert the entire structured document or to insert / update a part of the structured document with respect to an already stored structured document;
In the case of a query that inserts the entire structured document, issuing an identifier for distinguishing among a plurality of structured documents having the same structure, and converting the query, with reference to the mapping definition information, into a statement for insertion into the database.
[0018]
In addition, the statement generation step includes:
Determining whether the structure of the structured document for the insert / update operation is valid;
Determining whether the query is to insert the entire structured document or to insert / update a part of the structured document with respect to the already stored structured document;
If the query is to insert / update a part of the structured document, referring to the mapping definition information to determine whether an ancestor element of the element corresponding to the part of the structured document is stored in the database;
An update instruction generating step of generating a statement for updating the ancestor element if the ancestor element is stored in the database;
Referring to the mapping definition information to determine whether descendant elements of the element corresponding to a part of the structured document are stored in the database;
If the descendant element is stored in the database, a descendant element update instruction generating step of generating a statement for updating the descendant element is included.
[0019]
In addition, the statement generation step includes:
Referring to the mapping definition information to determine whether the ancestor element of the element corresponding to a part of the structured document is stored in the database;
If the ancestor element is stored in the database, an update instruction generating step of generating a statement for updating from the ancestor element to the corresponding element;
Referring to the mapping definition information to determine whether descendant elements of the element corresponding to a part of the structured document are stored in the database;
If the descendant element is stored in the database, a descendant element update instruction generating step of generating a statement for updating the descendant element is included.
[0020]
In addition, the update instruction generation step includes:
Generating and executing a statement for acquiring data to be updated;
Updating the acquired update target data;
Generating a statement for inserting the updated data again,
Repeating the above steps up to the element specified in the inquiry about the ancestor element stored in the database.
[0021]
Further, the descendant element update instruction generation step includes:
A step of determining whether the data of the same element to be updated is stored separately in a plurality of rows on the database or stored together in one row;
When stored separately in a plurality of rows, generating and executing a statement for temporarily deleting the data, and generating, for all descendant elements storing the data, a statement for re-inserting the data based on the data containing the updated ancestor element;
If they are stored together in one row,
Generating and executing a statement for acquiring data to be updated;
Updating the acquired update target data;
Generating a statement for inserting the updated data again,
Generating a command for data re-insertion by repeating the above steps up to the element specified by the inquiry about the ancestor element stored in the database.
[0022]
In addition, the statement generation step includes:
Determining whether the search result requested by the query is stored in the database as it is or is divided and stored as descendant elements;
If it is stored in the database as it is, a step of acquiring its location,
If divided and stored as descendant elements, obtaining the location of all of them;
Generating a command sentence for retrieval from the acquired location and the input inquiry.
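The location lookup described above can be illustrated with a small sketch. The `Node` tree, the table and column names, and the `doc_key` column are hypothetical stand-ins for the mapping definition information; the sketch emits one SELECT per table rather than a joined statement, which is a simplification.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Node:
    """One element of a hypothetical storage-structure tree."""
    name: str
    location: Optional[Tuple[str, str]] = None   # (table, column) if stored as is
    children: List["Node"] = field(default_factory=list)

def find_locations(node):
    """Return every (table, column) needed to answer a search for `node`."""
    if node.location is not None:
        # The requested element is stored as is: one location suffices.
        return [node.location]
    # Otherwise it is divided: collect the locations of stored descendants.
    locations = []
    for child in node.children:
        locations.extend(find_locations(child))
    return locations

def build_selects(locations, key_column="doc_key"):
    """Generate one retrieval statement per table from the locations."""
    columns_by_table = {}
    for table, column in locations:
        columns_by_table.setdefault(table, []).append(column)
    return [f"SELECT {', '.join(cols)} FROM {table} WHERE {key_column} = ?"
            for table, cols in columns_by_table.items()]

# <book> is not stored itself; its parts live in two tables.
book = Node("book", children=[
    Node("title", location=("book", "title")),
    Node("authors", children=[Node("author", location=("author", "name"))]),
])
# <chapter> is stored as is in a single column.
chapter = Node("chapter", location=("chapter", "body"))

book_selects = build_selects(find_locations(book))
chapter_selects = build_selects(find_locations(chapter))
```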
[0023]
FIG. 2 is a diagram illustrating the principle of the present invention.
[0024]
The present invention is a structured document processing apparatus that performs operations of storing, searching, updating and deleting structured documents using a database,
An operation determining unit 120 that analyzes a query for the input structured document and determines whether the operation requested by the query is insertion, update, deletion, or search;
Mapping definition information 170, stored in a storage unit, that divides the structured document into partial documents, allowing duplication, based on the hierarchical structure of the structured document and associates the divided partial documents with a database;
Statement generating means 130 for referring to the mapping definition information 170 and converting the input query, based on the determination by the operation determining unit 120, into a statement that performs one of the insert / update / delete / search operations on the database 300;
An operation instructing means 140 for transmitting the converted statement to the database 300 and executing an operation;
If the statement is one of an insert / update / delete operation, a document number acquisition unit 150 for acquiring the number of documents that executed the statement;
If the statement is a search operation, a structured document output unit 160 for outputting a database search result as a structured document with reference to the mapping definition information 170.
[0025]
In addition, the statement generating means 130 includes
Means for referring to the mapping definition information 170 and generating a statement that secures a database area in which the insert / update / delete / search operations on the structured document are performed.
[0026]
In addition, the statement generating means 130 includes
Means for determining whether the structure of the structured document for the insert / update operation is valid;
Means for determining whether the query is to insert the entire structured document or to insert / update a part of the structured document with respect to an already stored structured document;
Means for, in the case of a query for inserting the entire structured document, issuing an identifier for distinguishing among a plurality of structured documents having the same structure, and converting the query, with reference to the mapping definition information 170, into a statement for insertion into the database.
[0027]
In addition, the statement generating means 130 includes
Means for determining whether the structure of the structured document for the insert / update operation is valid;
Means for determining whether the query is to insert the entire structured document or to insert / update a part of the structured document with respect to an already stored structured document;
Means for, if the query is to insert / update a part of the structured document, referring to the mapping definition information 170 to determine whether an ancestor element of the element corresponding to the part of the structured document is stored in the database 300;
Update instruction generating means for generating a statement for updating the ancestor element if the ancestor element is stored in the database 300;
Means for referring to the mapping definition information 170 to determine whether descendant elements of the element corresponding to the part of the structured document are stored in the database 300;
Descendant element update instruction generating means for generating a statement for updating the descendant element if the descendant element is stored in the database 300.
[0028]
In addition, the statement generating means 130 includes
Means for referring to the mapping definition information 170 to determine whether an ancestor element of an element corresponding to a part of the structured document is stored in the database 300;
If the ancestor element is stored in the database 300, an update instruction generating means for generating a statement for updating from the ancestor element to the corresponding element;
Means for referring to the mapping definition information 170 to determine whether descendant elements of elements corresponding to a part of the structured document are stored in the database;
If the descendant element is stored in the database 300, a descendant element update instruction generating means for generating a statement for updating the descendant element is included.
[0029]
Further, the update instruction generating means includes:
Means for generating and executing a statement for acquiring data to be updated;
Means for updating the acquired update target data;
Means for generating a statement for inserting the updated data again,
Means for repeating the above means up to the element specified by the inquiry about the ancestor element stored in the database 300.
[0030]
Further, the descendant element update instruction generating means includes:
Means for determining whether the data of the same element to be updated is stored separately in a plurality of rows or stored collectively in one row on the database 300;
Means for, when the data is stored separately in a plurality of rows, generating and executing a statement for temporarily deleting the data, and generating, for all descendant elements storing the data, a statement for re-inserting the data based on the data containing the updated ancestor element;
If they are stored together in one row,
Means for generating and executing a statement for acquiring data to be updated;
Means for updating the acquired update target data;
Means for generating a statement for inserting the updated data again,
Means for generating a statement for data re-insertion by repeating the above means, for the ancestor elements stored in the database, up to the element specified by the query.
[0031]
In addition, the statement generating means 130 includes
Means for determining whether the search result requested by the query is stored in the database 300 as it is or is divided and stored as descendant elements;
A means for acquiring the location if stored directly in the database 300;
A means for acquiring all the locations of the elements if they are stored separately as descendant elements,
Means for generating a command sentence for retrieval from the acquired location and the input inquiry.
[0032]
The present invention is a structured document processing program that causes a computer to execute operations of storing, searching, updating, and deleting a structured document using a database,
An operation determining step of analyzing a query for the input structured document and determining whether the operation requested by the query is insertion, update, deletion, or search;
Based on the hierarchical structure of the structured document stored in the storage unit, referring to the mapping definition information that divides the structured document allowing duplication into divided partial documents and associates the divided partial documents with the database, A statement generation step for converting the input query into a statement for performing any one of insert / update / delete / search operations on the database based on the determination;
An operation instruction step of transmitting the converted statement to the database and executing the operation;
If the statement is an insert / update / delete operation, a document number acquisition step of acquiring the number of documents that executed the statement;
If the statement is a search operation, a structured document output step of outputting a database search result as a structured document with reference to the mapping definition information.
[0033]
The present invention is a storage medium storing a structured document processing program for causing a computer to execute operations of storing, searching, updating and deleting a structured document using a database,
An operation determining step of analyzing a query for the input structured document and determining whether the operation requested by the query is insertion, update, deletion, or search;
Based on the hierarchical structure of the structured document stored in the storage unit, referring to the mapping definition information that divides the structured document allowing duplication into divided partial documents and associates the divided partial documents with the database, A statement generation step for converting the input query into a statement for performing any one of insert / update / delete / search operations on the database based on the determination;
An operation instruction step of transmitting the converted statement to the database and executing the operation;
If the statement is an insert / update / delete operation, a document number acquisition step of acquiring the number of documents that executed the statement;
If the statement is a search operation, a structured document output step of outputting a database search result as a structured document with reference to the mapping definition information.
[0034]
As described above, the present invention analyzes a standard query for input XML and determines whether the operation requested by the query is insertion, update, deletion, or search. The input query is converted into SQL and executed on the RDB in accordance with the mapping definition information (mapping definition information that divides an XML document into partial XML documents, allowing duplication, based on the hierarchical structure of the XML document and associates them with the RDB). In the case of insertion / update / deletion, the number of documents on which the operation was executed is acquired. In the case of search, the XML-RDB mapping definition information is referred to and the RDB search result is output as an XML document. By analyzing a standard query for XML and referring to the mapping definition information between XML and the RDB, it becomes possible to insert, update, delete, and search XML documents in the RDB using the query. In addition, because the mapping definition information divides an XML document into partial XML documents, allowing duplication, based on the hierarchical structure of the XML document and associates them with the RDB, both high-speed search and high-speed addition and update can be achieved.
[0035]
[Best Mode for Carrying Out the Invention]
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0036]
FIG. 3 shows a configuration of the structured document processing apparatus according to the embodiment of the present invention.
[0037]
The structured document processing apparatus 100 comprises an application program interface unit 110, an XML query sentence analysis unit 120, RDB-XML mapping definition information 170, an SQL sentence generation unit 130, an RDB management system interface unit 140, a DOM operation unit 150, and a query result generation unit 160.
[0038]
The application program interface unit 110 receives an XML query sent from a user via the application program and the communication network 400. The XML query statement includes a variable binding part, a query condition specification part, and a query result specification part.
[0039]
The XML query analysis unit 120 analyzes the syntax of the XML query sentence received by the application program interface unit 110, and outputs XML query sentence analysis information.
[0040]
FIG. 4 shows an outline of information included in the RDB-XML mapping definition information in one embodiment of the present invention. The RDB-XML mapping definition information 170 includes RDB access information 171 for accessing the RDB, table generation information 172 necessary for generating the SQL-DDL, and route information 173 for mapping the correspondence between the information of the XML document and the information stored in the RDB. The RDB-XML mapping definition information 170 is stored in a storage medium such as a hard disk device.
[0041]
The SQL sentence generation unit 130 generates an SQL sentence from the XML query sentence analysis information and the RDB-XML mapping definition information, and transmits the generated SQL sentence to the RDB management system 300 via the RDB management system interface unit 140 and the communication network 400. Then, the SQL result is obtained from the RDB management system 300. The RDB management system interface unit 140 receives the generated SQL statement, connects to the RDB management system 300, and acquires the SQL result.
[0042]
When the ancestor / descendant element of the query result designation part is stored in the RDB, the DOM operation unit 150 acquires the element and develops it into a DOM (Document Object Model: one of standard APIs for XML operations). Thereafter, the inquiry result specified portion is updated by a DOM operation.
[0043]
The inquiry result generation unit 160 generates an inquiry result from the acquired SQL result and the read RDB-XML mapping definition information. The query result is the number of executed XML documents in the case of insertion / update / deletion, and indicates the returned XML document in the case of search.
[0044]
Next, an outline of a processing procedure of the structured document processing apparatus 100 of the present invention will be described.
[0045]
FIG. 5 is a flowchart of an overall operation of the structured document processing apparatus according to the embodiment of the present invention.
[0046]
The structured document processing apparatus 100 performs processing in the preparation phase (step 1000) and the operation phase (step 2000) in this order. In the preparation phase (step 1000), SQL-DDL is output based on the table generation information in the RDB-XML mapping definition information 170, and a table is generated on the RDB. In the operation phase (step 2000), XML documents are inserted into, updated in, deleted from, and searched for in the generated table.
[0047]
FIG. 6 is a flowchart of the preparation phase in one embodiment of the present invention.
[0048]
In the preparation phase, first, RDB-XML mapping definition information is created and stored in the storage unit (step 1010). The SQL statement generation unit 130 outputs SQL-DDL based on the table generation information in the RDB-XML mapping definition information 170 (step 1020). The RDB management system interface unit 140 connects to the RDB management system 300 based on the RDB access information in the RDB-XML mapping definition information 170, and transmits the SQL-DDL to the RDB management system 300 (Step 1030). The RDB management system 300 generates a table based on the transmitted SQL-DDL (Step 1040).
[0049]
FIGS. 7 to 9 are flowcharts of the operation phase according to the embodiment of the present invention.
[0050]
In the operation phase, first, the application program interface unit 110 receives an XML inquiry sentence input via the application program 200 (Step 2010). The query is composed of a variable binding part, a query condition specification part, and a query result specification part, as shown in FIG.
[0051]
The XML query sentence analysis unit 120 analyzes the XML query sentence and outputs XML query sentence analysis information (step 2020). From the XML query sentence analysis information, it is determined whether or not the received XML query sentence has a syntax for performing a search operation (step 2030). Similarly, it is determined whether the syntax is for performing an insert / update operation or a syntax for performing a delete operation (step 2031).
[0052]
If the syntax is for performing an insert / update operation, it is determined whether the structure of the XML document to be inserted / updated is valid (step 2040). If it is not valid, an error is output and the operation phase ends. If it is valid, it is determined whether the syntax is for inserting an entire XML document (step 2041). If so, a document key for identifying each XML document is issued (step 2042), and the SQL sentence generation unit 130 refers to the RDB-XML mapping definition information 170 and the XML query sentence analysis information and generates an SQL for inserting the entire XML document (step 2043). The generated SQL statement is executed by the RDB management system 300 (step 2044), the number of inserted documents is acquired (step 2070), and this is output as a query result (step 2087).
[0053]
If the syntax is not for inserting an entire XML document (step 2041), it is a syntax for inserting a partial XML document or for updating an entire / partial XML document. The subsequent processing is the same for partial XML document insertion / update / deletion. First, it is determined whether an ancestor element of the element corresponding to the query result designation part is stored in the RDB (step 2050). If it is stored, the SQL statement generation unit 130 refers to the RDB-XML mapping definition information 170 and the XML query statement analysis information, acquires a table name and a column name, and generates an SQL for acquiring the data to be updated (step 2051). The generated SQL is executed to acquire the data to be updated (step 2052).
[0054]
The DOM operation unit 150 expands the acquired data into a DOM (Document Object Model: one of the standard APIs for XML operations), updates the query result designation part by a DOM operation, and converts the data into XML again (step 2053). The SQL sentence generation unit 130 generates the insertion SQL and re-inserts the updated XML into the database (step 2054). Steps 2051 to 2054 are repeated up to the element specified in the query result specification part.
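The acquire, update, and re-insert cycle of steps 2051 to 2054 might look roughly like the following sketch. It assumes a hypothetical `book` table whose `content` column holds a stored ancestor element, uses Python's `xml.dom.minidom` as the DOM API and SQLite as the RDB, and writes the result back with an UPDATE statement rather than a separate insertion statement.

```python
import sqlite3
from xml.dom import minidom

def update_stored_fragment(conn, table, column, doc_key, tag, new_text):
    """Fetch a stored XML fragment, change one element via DOM, store it again."""
    # Steps 2051-2052: generate and execute the SQL that acquires the data.
    row = conn.execute(
        f"SELECT {column} FROM {table} WHERE doc_key = ?", (doc_key,)
    ).fetchone()
    if row is None:
        return 0
    # Step 2053: expand into a DOM, update the designated part, serialize.
    dom = minidom.parseString(row[0])
    for node in dom.getElementsByTagName(tag):
        while node.firstChild is not None:
            node.removeChild(node.firstChild)
        node.appendChild(dom.createTextNode(new_text))
    updated_xml = dom.documentElement.toxml()
    # Step 2054: write the updated XML back (an UPDATE is assumed here).
    cursor = conn.execute(
        f"UPDATE {table} SET {column} = ? WHERE doc_key = ?",
        (updated_xml, doc_key),
    )
    return cursor.rowcount

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (doc_key INTEGER PRIMARY KEY, content TEXT)")
conn.execute(
    "INSERT INTO book VALUES (1, '<book><title>Old</title><price>10</price></book>')"
)
updated = update_stored_fragment(conn, "book", "content", 1, "title", "New")
stored = conn.execute("SELECT content FROM book WHERE doc_key = 1").fetchone()[0]
```

Repeating this function for each stored ancestor, from the outermost one down to the element named in the query, corresponds to the repetition described in the text.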
[0055]
Next, it is determined whether or not the descendant elements of the element corresponding to the query result designation part are stored in the RDB (step 2060). If they are stored, it is determined whether the N elements of the same path are stored separately in N rows or stored together in one row (step 2061). If they are stored together in one row, the data to be updated is acquired, expanded into a DOM, updated, converted to XML, and re-inserted into the database in the same manner as in steps 2051 to 2054 (steps 2064 to 2067). If they are stored separately in N rows, an SQL for data deletion is generated using the document key acquired at the time of updating the ancestor element and is executed (step 2062); then an SQL for data insertion is generated based on the data of the column storing the updated ancestor element and is executed (step 2063).
[0056]
Steps 2060 to 2066 are performed for all descendant elements stored in the RDB. The number of documents for which partial XML document insertion / update / deletion has been executed is acquired (step 2070), and this is output as an inquiry result (step 2087).
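For the case where the N elements are stored in N rows, the delete-then-reinsert handling of steps 2062 and 2063 can be sketched as below. The `author` table, its columns, and the sample XML are hypothetical, and SQLite stands in for the RDB; the one-row case instead follows the acquire, update, and re-insert cycle of steps 2051 to 2054.

```python
import sqlite3
import xml.etree.ElementTree as ET

def resync_descendant_rows(conn, doc_key, ancestor_xml, tag, table, column):
    """Rebuild the per-element rows of one document from its updated ancestor."""
    # Step 2062: delete the existing rows, identified by the document key.
    conn.execute(f"DELETE FROM {table} WHERE doc_key = ?", (doc_key,))
    # Step 2063: re-insert one row per matching element of the updated ancestor.
    values = [element.text for element in ET.fromstring(ancestor_xml).iter(tag)]
    conn.executemany(
        f"INSERT INTO {table} (doc_key, {column}) VALUES (?, ?)",
        [(doc_key, value) for value in values],
    )
    return len(values)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (doc_key INTEGER, name TEXT)")
conn.executemany("INSERT INTO author VALUES (?, ?)", [(1, "A"), (1, "B"), (2, "Z")])

# The ancestor <book> of document 1 was updated: author B was replaced by C.
updated_ancestor = "<book><author>A</author><author>C</author></book>"
count = resync_descendant_rows(conn, 1, updated_ancestor, "author", "author", "name")
names = [row[0] for row in conn.execute(
    "SELECT name FROM author WHERE doc_key = 1 ORDER BY name")]
```

Deleting and re-inserting avoids having to match old rows to new elements one by one, at the cost of rewriting all N rows of the document.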
[0057]
If it is a syntax for performing a search operation (step 2030), it is determined whether an element corresponding to the query result designation part is stored in the RDB with reference to the RDB-XML mapping definition information 170 (step 2080). If it is stored, the table name and column name of the RDB in which the element is stored are obtained (step 2083). If it is not stored, the descendant elements of the element are traced until the stored element appears, and the table name and column name of the stored RDB are obtained (step 2082). This is repeated for all descendant elements, and all table names and column names are obtained (step 2081).
[0058]
SQL is generated and executed based on the obtained table names, column names, and the analysis information obtained from the query condition specification part (step 2084). Whether XML reconstruction is necessary is determined by referring to the RDB-XML mapping definition information 170 and checking whether the element corresponding to the query result designation part is itself stored in the RDB (step 2085). If reconstruction is necessary, an XML document is generated as the query result from the SQL search result with reference to the RDB-XML mapping definition information 170 (step 2086) and output (step 2087). If reconstruction is not necessary, the corresponding element stored in the RDB is output as it is as the query result (step 2087).
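The reconstruction of step 2086 can be illustrated as follows, assuming the requested element is not stored as is and must be rebuilt from its stored descendants. The `ROUTE` list, table names, and sample data are hypothetical, and SQLite stands in for the RDB.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical excerpt of route information: (child tag, table, column).
ROUTE = [("title", "book", "title"), ("author", "author", "name")]

def reconstruct(conn, root_tag, route, doc_key):
    """Rebuild an XML document from the rows stored for its descendants."""
    root = ET.Element(root_tag)
    for tag, table, column in route:
        rows = conn.execute(
            f"SELECT {column} FROM {table} WHERE doc_key = ? ORDER BY rowid",
            (doc_key,),
        )
        # One child element per row: one row for title, N rows for author.
        for (value,) in rows:
            ET.SubElement(root, tag).text = value
    return ET.tostring(root, encoding="unicode")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (doc_key INTEGER, title TEXT)")
conn.execute("CREATE TABLE author (doc_key INTEGER, name TEXT)")
conn.execute("INSERT INTO book VALUES (1, 'XML and RDB')")
conn.executemany("INSERT INTO author VALUES (?, ?)",
                 [(1, "Author A"), (1, "Author B")])
document = reconstruct(conn, "book", ROUTE, 1)
```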
[0059]
[Example]
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0060]
First, the preparation phase will be described.
[0061]
(1) Table generation (preparation phase):
FIG. 10 is a diagram illustrating details of the RDB-XML mapping definition information according to an embodiment of the present invention. FIG. 11 shows RDB access information and a description example according to an embodiment of the present invention. FIG. 12 shows table generation information and a description example according to an embodiment of the present invention. FIGS. 13 to 15 show route information and description examples according to an embodiment of the present invention.
The RDB-XML mapping definition information 170 is described in XML format. As shown in FIG. 4, it includes the RDB access information 171 (FIG. 11), the table generation information 172 (FIG. 12), and the route information 173 (FIGS. 13 to 15).
[0062]
(2) RDB-XML mapping definition information (FIG. 10):
This is the root element of the RDB-XML mapping definition information 170. The target RDB management system type is defined by the Target attribute. The RDB-XML mapping definition information 170 is identified based on the value of the version attribute.
[0063]
(3) RDB access information (FIG. 11):
This element defines information for each connection destination of the RDB.
[0064]
Access element
This is an element that defines information for connecting to the RDB via the RDB management system interface unit 140. The connection destination url required for connection is defined in the url element, and the user name and password required for connection are defined in the property element.
[0065]
(A) url element
This element defines a URL specified when connecting to the RDB via the RDB management system interface unit 140.
[0066]
(B) property element
This element specifies property information (user name, password, and the like) to be specified when connecting to the RDB via the RDB management system interface unit 140. The property name is defined by the name attribute.
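As an illustration of how such access information might be read, the following sketch parses an Access element with Python's `xml.etree.ElementTree`. The sample URL, the property names `user` and `password`, and the convention that each value is carried as element text are assumptions, since the description example of FIG. 11 is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical RDB access information; all values are placeholders.
ACCESS_INFO = """
<Access>
  <url>jdbc:example://dbhost/xmlstore</url>
  <property name="user">scott</property>
  <property name="password">secret</property>
</Access>
"""

def read_access_info(xml_text):
    """Return the connection URL and a dict of connection properties."""
    access = ET.fromstring(xml_text)
    url = access.findtext("url")
    # The name attribute gives the property name; the text gives its value.
    properties = {
        prop.get("name"): (prop.text or "")
        for prop in access.findall("property")
    }
    return url, properties

url, properties = read_access_info(ACCESS_INFO)
```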
[0067]
(2) Table generation information (FIG. 12)
This defines the information that the SQL sentence generation unit 130 needs in order to generate the SQL-DDL for creating RDB tables.
[0068]
(A) table element
Define information for individual tables. A table name is described for the name attribute.
[0069]
(B) column element
The column information defined in the table specified in the name attribute of the table element of the parent element is defined. The column name is described for the name attribute, and the data type defined in the RDB is described for the type attribute.
[0070]
(C) constraint element
When the parent element is a table element, a constraint on the table definition is described in the same format as SQL. When the parent element is a column element, a constraint on the column definition is described in the same format as SQL.
[0071]
(D) Characteristics element
Physical characteristics and table characteristics for the table specified by the table element of the parent element are defined. A part corresponding to physical_properties and table_properties in the CREATE statement of SQL is described in the same format as SQL.
[0072]
FIG. 12 shows a description example of table generation.
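The table generation information described above could drive SQL-DDL generation roughly as in the sketch below. The `tables` root element, the sample table, and the use of SQLite to check the generated statement are assumptions; only the table, column, and constraint elements with their name and type attributes come from the description, and the Characteristics element is omitted.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical table generation information (the root element is assumed).
TABLE_INFO = """
<tables>
  <table name="book">
    <column name="doc_key" type="INTEGER">
      <constraint>NOT NULL</constraint>
    </column>
    <column name="title" type="VARCHAR(200)"/>
    <constraint>PRIMARY KEY (doc_key)</constraint>
  </table>
</tables>
"""

def generate_ddl(table_info_xml):
    """Produce one CREATE TABLE statement per table element."""
    statements = []
    for table in ET.fromstring(table_info_xml).iter("table"):
        parts = []
        for column in table.findall("column"):
            definition = f"{column.get('name')} {column.get('type')}"
            # A constraint under a column element applies to that column.
            for constraint in column.findall("constraint"):
                definition += " " + constraint.text.strip()
            parts.append(definition)
        # A constraint directly under the table element applies to the table.
        for constraint in table.findall("constraint"):
            parts.append(constraint.text.strip())
        statements.append(
            f"CREATE TABLE {table.get('name')} ({', '.join(parts)})"
        )
    return statements

ddl = generate_ddl(TABLE_INFO)

# Send the generated DDL to the database to create the tables.
conn = sqlite3.connect(":memory:")
for statement in ddl:
    conn.execute(statement)
```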
[0073]
(3) Route information (Fig. 13)
Information for mapping the relationship between the input / output XML document and the data stored in the RDB is defined. The element element, attribute element, and text element correspond to the elements, attributes, and text of the original XML document, respectively, and are described in the same hierarchical structure as the original XML.
[0074]
FIG. 15 shows items defined in the route information.
[0075]
(A) element element
This is an element corresponding to the element of the XML document. RDB matching information is defined by various attributes. The contents of each attribute are shown below.
[0076]
・ Name attribute
Define the element name of the XML document.
[0077]
・ Table attribute
Defines the storage destination table name. The attribute need not be defined on every element; once defined, it remains in effect for descendant elements until the next table attribute is defined.
[0078]
・ Column attribute
Defines the storage destination column name. The target table is the one given by the table attribute described above (the nearest table attribute on ancestor-or-self), and the column specifies where the corresponding XML data is stored.
[0079]
・ Ratio attribute
This attribute is defined in association with the table attribute, and defines a relationship between a plurality of storage destination tables when there are n elements.
[0080]
In the case of “1: 1”, one or n elements having this attribute are stored in one tuple.
[0081]
In the case of “1: n” or “1: N”, one or N elements having this attribute are stored in N tuples.
[0082]
If this attribute is omitted, it is treated as "1: 1".
[0083]
-MinOccurs attribute
The minimum number of elements that the element defined with this attribute can have is defined by “0” or “1”.
[0084]
-MaxOccurs attribute
The maximum number of elements that the element defined with this attribute can have is defined as “1” or “unbounded” (no limit).
[0085]
・ DocID attribute
“Yes” is defined in this attribute for the element corresponding to the document ID for identifying the XML document.
[0086]
・ Type attribute
This attribute is defined in association with the column attribute, and defines the data type of the column at the storage destination. It is referred to when XML data is stored, to check that the input data is consistent with the data type.
[0087]
-Size attribute
This attribute is defined in association with the column attribute, and defines the size of the column at the storage destination. It is referred to when storing XML data, and determines whether the size of the input data can be stored.
[0088]
・ Pkey attribute
This attribute is defined in association with the table attribute, and defines a column name serving as a primary key of the table defined in the table attribute. If omitted, DocKey (document key) is regarded as the primary key.
[0089]
・ Parent-fkey attribute
This attribute is defined in association with the table attribute, and defines a column name serving as a foreign key to the primary key of the corresponding table. The column specified by this attribute shall be defined as the column of the table defined in the nearest ancestor element.
[0090]
・ Exts attribute
Describe the user-defined function name that is an extended function. By defining this attribute, the association between the user-defined function and the path in which this attribute is defined is defined.
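As a rough illustration of how the ratio, type, and size attributes described above might be applied when data is stored, the following Python sketch normalizes a ratio value and checks an input value against a column's type and size. The class name, the function names, and the type vocabulary ("string", "integer") are assumptions made for this example; the source does not specify them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnMapping:
    """Hypothetical in-memory form of one element's mapping attributes."""
    column: str
    type: str = "string"         # assumed vocabulary: "string" or "integer"
    size: Optional[int] = None   # maximum length; None means "not checked"


def normalize_ratio(raw):
    """Return "1:1" or "1:N"; an omitted ratio is treated as "1:1"."""
    if raw is None:
        return "1:1"
    compact = raw.replace(" ", "").upper()
    if compact not in ("1:1", "1:N"):
        raise ValueError(f"unsupported ratio: {raw!r}")
    return compact


def check_value(mapping, value):
    """Check input text against the type and size attributes."""
    if mapping.type == "integer":
        try:
            int(value)
        except ValueError:
            return False
    if mapping.size is not None and len(value) > mapping.size:
        return False
    return True
```

A caller would run `check_value` for each value before generating the INSERT statement and reject the document if any check fails.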
[0091]
(B) attribute element
This is an element corresponding to an attribute of the XML document. Mapping information with the database is defined using various attributes. The attribute element has the same attributes as the element element, and additionally has a meta attribute that is unique to the attribute element.
[0092]
-Meta attribute
This is an attribute for identifying an additional attribute. An element for which “yes” is specified in this attribute is regarded as an additional attribute. When an operation is performed using an XML query, the operation cannot be performed unless the element on which this attribute is defined is specified directly in the query result specification portion (do clause / return clause). If an operation is attempted without specifying it directly, an exception is raised in the case of an update, and the element is not retrieved in the case of a search.
[0093]
(C) text element
This is an element corresponding to the text of the XML document. The RDB mapping information is defined by various attributes. It has the same attributes as the element element.
[0094]
FIG. 15 shows an example of an XML document using the above items and a description example of route information. The RDB-XML mapping definition information 170 (route information) of the present invention divides an XML document into partial XML documents and stores them in the RDB. The division follows the hierarchy of the document, and overlap between partial documents is allowed. FIG. 16 shows this overlapping relationship of storage in a hierarchical structure (storage structure tree).
[0095]
In the preparation phase, first, RDB-XML mapping definition information 170 (FIG. 10) including RDB access information 171 (FIG. 11), table generation information 172 (FIG. 12), and route information 173 (FIGS. 13 to 15) is created (step 1010). The SQL statement generation unit 130 outputs the SQL-DDL based on the table generation information (FIG. 12) in the RDB-XML mapping definition information 170 (step 1020). The RDB management system interface unit 140 connects to the RDB management system 300 based on the RDB access information (FIG. 11) in the RDB-XML mapping definition information 170, and transmits the SQL-DDL to the RDB management system 300 (step 1030).
[0096]
The RDB management system 300 generates a table based on the transmitted SQL-DDL (Step 1040). FIG. 16 shows an example of SQL-DDL output and an example of table generation.
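The preparation phase can be pictured with a small Python sketch that turns table generation information into SQL-DDL and sends it to a database. The dictionary layout, the table and column names, and the use of SQLite in place of the RDB management system 300 are all assumptions for illustration; the actual format is the one shown in FIG. 12.

```python
import sqlite3

# Hypothetical table generation information:
# table name -> list of (column name, SQL type, size or None).
TABLE_INFO = {
    "BOOK": [("DOCKEY", "INTEGER", None), ("TITLE", "VARCHAR", 100)],
    "AUTHOR": [("DOCKEY", "INTEGER", None), ("NAME", "VARCHAR", 50)],
}


def generate_ddl(table_info):
    """Return one CREATE TABLE statement per table definition (cf. step 1020)."""
    statements = []
    for table, columns in table_info.items():
        parts = []
        for name, sql_type, size in columns:
            if size:
                parts.append(f"{name} {sql_type}({size})")
            else:
                parts.append(f"{name} {sql_type}")
        statements.append(f"CREATE TABLE {table} ({', '.join(parts)})")
    return statements


def create_tables(conn, table_info):
    """Send each DDL statement to the database (cf. steps 1030 and 1040)."""
    for ddl in generate_ddl(table_info):
        conn.execute(ddl)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    create_tables(connection, TABLE_INFO)
    for statement in generate_ddl(TABLE_INFO):
        print(statement)
```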
[0097]
Next, the operation phase will be described.
[0098]
(1) Insert XML document (operation phase)
With reference to the RDB-XML mapping definition information 170 based on the input XML query sentence, the entire XML document or the partial XML document is inserted into the RDB. The case where the inquiry shown in FIGS. 17A and 17B is issued from the application program 200 will be specifically described as an example.
[0099]
First, an inquiry is received (FIG. 7, step 2010), and the XML query sentence analysis unit 120 analyzes the XML query sentence and outputs XML query sentence analysis information (FIG. 7, step 2020). The XML query for insert / update / delete is based on the syntax of XMLUpdate, and the XML query for search is based on the syntax of XQuery.
[0100]
The XML query analysis information shows that the query result designation part is a do clause, so the query is determined not to be a search operation (FIG. 7, step 2030), and because an append function is specified, it is determined to be an insert operation (FIG. 7, step 2031). The structure of the XML document to be inserted is checked against the root information of the RDB-XML mapping definition information 170 (FIG. 8, step 2040). The root name, which is the argument of the root function, is acquired from the for clause of the variable binding part, and the root element having that root name is identified in the RDB-XML mapping definition information 170. The bind position specified in the for clause determines whether the insertion is of an entire XML document or of a partial XML document. If no path is specified in the bind specification, the query is regarded as insertion of an entire XML document, and if the root element is specified, it is regarded as insertion of a partial XML document (FIG. 8, step 2041). This is because the storage model shown in FIG. 17C is assumed in this embodiment.
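The branching in steps 2030, 2031, and 2041 can be sketched in Python as follows. The sketch assumes the query has already been parsed into a clause name, a function name, and an optional bind path; these inputs and the function names are hypothetical and stand in for the XML query sentence analysis information.

```python
def determine_operation(result_clause, function=None):
    """Map the query result designation part to an operation.

    A return clause means search; a do clause means insert, update, or
    delete depending on the function it specifies.
    """
    if result_clause == "return":
        return "search"
    if result_clause == "do":
        operations = {"append": "insert", "update": "update", "remove": "delete"}
        if function in operations:
            return operations[function]
        raise ValueError(f"unknown function in do clause: {function!r}")
    raise ValueError(f"unknown result clause: {result_clause!r}")


def insertion_scope(bind_path):
    """Return "entire" when no path is bound in the for clause, else "partial"."""
    return "entire" if not bind_path else "partial"
```

For example, `determine_operation("do", "append")` yields `"insert"`, after which `insertion_scope` decides between the entire-document and partial-document paths.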
[0101]
If the entire XML document is inserted (FIG. 17B), a document key is issued (FIG. 8, step 2042), and the SQL statement generation unit 130 refers to the RDB-XML mapping definition information 170 and the XML query sentence analysis information and generates SQL for inserting the entire XML document (FIG. 8, step 2043).
[0102]
Specifically, the root information 173 of the RDB-XML mapping definition information 170 is referenced and the hierarchy is traced in order from the top-level element. Where a table attribute exists, the table name it defines and the column names defined in all the column attributes present are acquired, and an SQL statement is generated. If the ratio attribute is defined together with the table attribute, the SQL statement is generated according to the following rules.
[0103]
-In the case of "1: 1":
One or N elements to be stored in a table having this attribute are stored in one row.
[0104]
-For "1: N":
One or N elements stored in a table having this attribute are stored in N rows.
[0105]
FIG. 18 shows an SQL generation image. The generated SQL statement is executed by the RDB management system 300 (FIG. 8, step 2044), the number of inserted documents is obtained (FIG. 8, step 2070), and this is output as the query result (FIG. 9, step 2087).
[0106]
If a partial XML document is inserted (FIG. 17A) and there is an element having a column attribute among the ancestor elements of the element corresponding to the path specified in the do clause, that ancestor element is regarded as an update target (FIG. 8, step 2050). With reference to the RDB-XML mapping definition information 170, the table name and the column name are obtained, and SQL is generated together with the data obtained by analyzing the query condition specification part (where clause) (FIG. 8, step 2051). FIG. 19 shows an SQL generation image. The generated SQL is executed to acquire the data to be updated (FIG. 8, step 2052). The acquired data is expanded into a DOM tree, the new content is inserted at the path specified by the do clause by a DOM operation, and the tree is converted back into XML (FIG. 8, step 2053).
[0107]
The SQL statement generation unit 130 generates the SQL for insertion and stores the updated XML back in the database (using an updatable ResultSet of the Oracle JDBC driver). The above operation is repeated down to the element of the path specified by the do clause (FIG. 8, steps 2050 to 2054).
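The fetch, modify, and store-back cycle of steps 2051 to 2054 might look like the following Python sketch. It uses `xml.etree.ElementTree` in place of a DOM implementation, SQLite in place of the RDB, and a plain UPDATE statement in place of the updatable ResultSet mentioned above; the function name, the DOCKEY column, and the table layout are assumptions for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def append_into_stored_xml(conn, table, column, doc_key, path, new_xml):
    """Insert `new_xml` under `path` inside the XML stored in one column.

    Returns the number of rows written back (0 if the document is missing).
    """
    row = conn.execute(
        f"SELECT {column} FROM {table} WHERE DOCKEY = ?", (doc_key,)
    ).fetchone()
    if row is None:
        return 0
    root = ET.fromstring(row[0])               # expand stored text into a tree
    target = root if path in ("", ".") else root.find(path)
    if target is None:
        raise LookupError(f"path not found: {path}")
    target.append(ET.fromstring(new_xml))      # tree operation at the do-clause path
    updated = ET.tostring(root, encoding="unicode")  # convert back into XML text
    cursor = conn.execute(
        f"UPDATE {table} SET {column} = ? WHERE DOCKEY = ?", (updated, doc_key)
    )
    return cursor.rowcount
```

The design point is that the column holds a whole partial document, so the database only sees a read followed by a write of the same column, while the structural edit happens in memory.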
[0108]
Next, when the element corresponding to the path specified in the do clause has descendant elements and a column attribute is defined on a descendant element, that descendant element is determined to be stored in the RDB (FIG. 8, step 2060). If the ratio attribute is defined as “1:1” for the table attribute to be updated (FIG. 8, determination in step 2061), the update is performed in the same manner as for the ancestor element (FIG. 8, steps 2064 to 2067). If the ratio attribute is “1:n” or “1:N” (FIG. 8, determination in step 2061), the target data is first deleted and then inserted again: a DELETE statement is generated and executed using the document key obtained when the ancestor element was updated (FIG. 8, step 2062), and then an INSERT statement is generated based on the data of the column storing the updated ancestor element (FIG. 8, step 2063). FIG. 20 shows an image of SQL generation. Steps 2060 to 2066 are performed for all descendant elements stored in the RDB. The number of documents for which partial XML document insertion / update / deletion has been executed is acquired (FIG. 8, step 2070), and this is output as the query result (FIG. 9, step 2087).
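The difference between the “1:1” and “1:N” branches for descendant elements can be sketched in Python as below. SQLite stands in for the RDB, the DOCKEY column name and the concatenation used in the single-row case are assumptions, and the values are taken to have been extracted already from the updated ancestor data.

```python
import sqlite3


def update_descendant(conn, table, column, ratio, doc_key, values):
    """Refresh the rows of one descendant element for a given document key."""
    if ratio == "1:N":
        # Delete all rows for the document, then insert one row per value.
        conn.execute(f"DELETE FROM {table} WHERE DOCKEY = ?", (doc_key,))
        conn.executemany(
            f"INSERT INTO {table} (DOCKEY, {column}) VALUES (?, ?)",
            [(doc_key, value) for value in values],
        )
    else:
        # "1:1": the single row is updated in place.
        conn.execute(
            f"UPDATE {table} SET {column} = ? WHERE DOCKEY = ?",
            ("".join(values), doc_key),
        )
```

Deleting and re-inserting avoids having to match old rows to new values when the number of repeated elements changes.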
[0109]
(2) XML document update (operation phase)
The XML document is updated for the RDB with reference to the RDB-XML mapping definition information 170 based on the input XML query sentence. It is assumed that the inquiry shown in FIG. 21A is issued from the application program 200.
[0110]
The reception of the query (FIG. 7, step 2010) and the analysis of the XML query sentence (FIG. 7, step 2020) are the same as in (1) XML document insertion above. Since the query result designation part is a do clause, the query is determined not to be a search operation (FIG. 7, step 2030), and because an update function is specified, it is determined to be an update operation (FIG. 7, step 2031). The structure of the XML document is checked based on the route information 173 of the RDB-XML mapping definition information 170 (FIG. 7, step 2040). Since the operation is an update rather than insertion of an entire XML document (FIG. 8, step 2041), the subsequent processing is the same as steps 2050 to 2087 described in (1) XML document insertion, and its description is omitted.
[0111]
(3) XML document deletion (operation phase)
The XML document is deleted from the RDB with reference to the RDB-XML mapping definition information 170 based on the input XML query sentence. It is assumed that the inquiries shown in FIGS. 22A and 22B are issued from the application program 200.
[0112]
The reception of the inquiry (FIG. 7, step 2010) and the analysis of the XML inquiry sentence (FIG. 7, step 2020) are the same as in (1) XML document insertion and (2) XML document update. Since the query result designation part is a do clause, the query is determined not to be a search operation (FIG. 7, step 2030), and because a remove function is specified, it is determined to be a delete operation (FIG. 7, step 2031). The subsequent processing is the same as steps 2050 to 2087 of (1) XML document insertion and (2) XML document update, and its description is omitted.
[0113]
If, in the data of the column storing the updated ancestor element, there is no element regarded as an update target in the root information of the MD, no value is specified when the INSERT statement is generated (NULL is inserted). In the present embodiment, the deletion process (FIG. 23A) using the XML query sentence shown in FIG. If the element to be updated consists of a tag only and has no text node, the INSERT statement is generated with an empty value ("") specified. In the present embodiment, a deletion process (FIG. 23B) using an XML query sentence shown in FIG.
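The distinction between inserting NULL and inserting an empty value can be made concrete with a small Python sketch. The function name and the use of `xml.etree.ElementTree` are assumptions for this example; `None` stands for SQL NULL, as it does in Python DB-API parameter binding.

```python
import xml.etree.ElementTree as ET


def column_value(ancestor_xml, path):
    """Return the value to bind in the INSERT statement for `path`.

    None  -> the element is absent from the ancestor data (NULL is inserted).
    ""    -> the element exists as a tag only, with no text node.
    other -> the text content of the element.
    """
    root = ET.fromstring(ancestor_xml)
    node = root.find(path)
    if node is None:
        return None
    return node.text or ""
```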
[0114]
(4) XML document search (operation phase)
An XML document is searched for the RDB with reference to the RDB-XML mapping definition information 170 based on the input XML query sentence. It is assumed that the inquiry shown in FIG. 24A is issued from the application program 200.
[0115]
The reception of the inquiry (FIG. 7, step 2010) and the analysis of the XML inquiry sentence (FIG. 7, step 2020) are the same as the above-described (1) XML document insertion, (2) XML document update, and (3) XML document deletion. The description is omitted.
[0116]
Since the query result designation part is the return clause, the query is determined to be a search operation (FIG. 7, step 2030). Referring to the RDB-XML mapping definition information 170, since the column attribute exists in the element corresponding to the path of the query result designation portion, it is determined that the element is stored in the RDB (FIG. 9, step 2080). The table name and the column name of the RDB in which the element is stored are acquired (FIG. 9, step 2083). SQL is generated and executed based on the obtained table name, column name, and analysis information obtained from the query condition specification part (FIG. 9, step 2084). Since the element corresponding to the query result designation part is stored in the RDB, it is determined that XML reconstruction is unnecessary (FIG. 9, step 2085), and the retrieved element is output as the query result (FIG. 9, step 2087).
[0117]
It is assumed that the inquiry shown in FIG. 25A is issued from the application program 200.
[0118]
The steps from the reception of the inquiry (FIG. 7, step 2010) to the determination that the query is a search operation (FIG. 7, step 2030) are the same as in the previous example. Referring to the RDB-XML mapping definition information 170, since the column attribute does not exist in the element corresponding to the path of the query result designation portion, it is determined that the element is not stored in the RDB (FIG. 9, step 2080). The descendant elements of the element are traced until an element having a column attribute appears, and its table name and column name are acquired (FIG. 9, step 2082). This is repeated for all descendant elements, and all table names and column names are obtained (FIG. 9, step 2081). SQL is generated and executed based on the obtained table names, column names, and analysis information obtained from the query condition specification part (step 2084). Since the element corresponding to the query result designation part is not stored in the RDB, it is determined that XML reconstruction is necessary (step 2085). As shown in FIG. 26, an XML document is generated as the query result from the SQL search result with reference to the RDB-XML mapping definition information 170 (FIG. 9, step 2086) and output (step 2087).
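The descendant tracing of steps 2081 and 2082 and the XML reconstruction of step 2086 can be sketched in Python as follows. The nested-dictionary form of the mapping, the table and column names, the DOCKEY column, and the restriction to a single table are simplifying assumptions for this example; SQLite and `xml.etree.ElementTree` stand in for the RDB and the query result generation unit.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical mapping tree: "store" is (table, column) when the element
# carries a column attribute, otherwise None.
MAPPING = {
    "name": "book", "store": None,
    "children": [
        {"name": "title", "store": ("BOOK", "TITLE"), "children": []},
        {"name": "info", "store": None, "children": [
            {"name": "year", "store": ("BOOK", "YEAR"), "children": []},
        ]},
    ],
}


def collect_storage(node):
    """Trace descendants until elements with a column attribute appear."""
    if node["store"] is not None:
        return [node["store"]]
    found = []
    for child in node["children"]:
        found.extend(collect_storage(child))
    return found


def rebuild(node, row):
    """Rebuild the element for `node` from one result row keyed by (table, column)."""
    element = ET.Element(node["name"])
    if node["store"] is not None:
        element.text = row[node["store"]]
        return element
    for child in node["children"]:
        element.append(rebuild(child, row))
    return element


def search(conn, node, doc_key):
    """Generate and run the SELECT, then reconstruct XML from each row."""
    targets = collect_storage(node)
    tables = {table for table, _ in targets}
    if len(tables) != 1:
        raise NotImplementedError("this sketch handles a single table only")
    columns = ", ".join(column for _, column in targets)
    sql = f"SELECT {columns} FROM {tables.pop()} WHERE DOCKEY = ?"
    results = []
    for values in conn.execute(sql, (doc_key,)):
        row = dict(zip(targets, values))
        results.append(ET.tostring(rebuild(node, row), encoding="unicode"))
    return results
```

When the requested element itself carries a column attribute, `collect_storage` returns a single location and no reconstruction beyond wrapping the value is needed, which mirrors the simpler case described before this one.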
[0119]
The preparation phase and operation phase described above can be built as a program, which can be installed in a computer used as a structured document processing apparatus and executed by a control unit such as a CPU, or distributed via a network.
[0120]
In addition, the program can be stored in a hard disk device connected to a computer used as a structured document processing apparatus, or in a portable storage medium such as a flexible disk or a CD-ROM, and then installed and executed.
[0121]
Note that the present invention is not limited to the above-described embodiments and examples, and various modifications and applications are possible within the scope of the claims.
[0122]
[Effect of the Invention]
As described above, the present invention provides mapping definition information that divides an XML document, based on its hierarchical structure, into partial XML documents that may overlap and associates them with an RDB, and converts a standard query on the input XML into SQL based on that definition. As a result, depending on how each XML document is used, such as whether searches or insertions / updates are more frequent, the following two storage methods can be combined:
(1) Store all or part of the XML document in one column of the RDB.
(2) Decompose the document into individual data items and store them in a plurality of columns.
[0123]
That is, when handling an XML document using the RDB, it is possible to achieve both high-speed search and high-speed addition and update.
[Brief description of the drawings]
FIG. 1 is a diagram for explaining the principle of the present invention.
FIG. 2 is a principle configuration diagram of the present invention.
FIG. 3 is a configuration diagram of a structured document processing apparatus according to an embodiment of the present invention.
FIG. 4 is a diagram showing an outline of information included in RDB-XML mapping definition information in one embodiment of the present invention.
FIG. 5 is a general operation flowchart of the entire structured document processing apparatus according to the embodiment of the present invention.
FIG. 6 is a flowchart of a preparation phase in one embodiment of the present invention.
FIG. 7 is a flowchart (part 1) of an operation phase according to the embodiment of the present invention.
FIG. 8 is a flowchart (part 2) of an operation phase according to the embodiment of the present invention.
FIG. 9 is a flowchart (part 3) of an operation phase in the embodiment of the present invention.
FIG. 10 is a diagram illustrating details of RDB-XML mapping definition information according to an embodiment of the present invention.
FIG. 11 is a diagram showing RDB access information and a description example according to an embodiment of the present invention.
FIG. 12 shows table generation information and a description example according to an embodiment of the present invention.
FIG. 13 shows route information and a description example (part 1) according to an embodiment of the present invention.
FIG. 14 shows route information and a description example (part 2) according to an embodiment of the present invention.
FIG. 15 shows route information and a description example (part 3) according to an embodiment of the present invention.
FIG. 16 shows an example of SQL-DDL output and an example of table generation according to an embodiment of the present invention.
FIG. 17 is a diagram showing a query and a storage model used in the operation phase of one embodiment of the present invention.
FIG. 18 is an SQL generation image (part 1) in the operation phase according to an embodiment of the present invention.
FIG. 19 is an SQL generation image (part 2) in the operation phase according to an embodiment of the present invention.
FIG. 20 is an SQL generation image (part 3) in the operation phase according to an embodiment of the present invention.
FIG. 21 is a diagram (part 1) illustrating a query and RDB-XML mapping definition information used in the operation phase of one embodiment of the present invention.
FIG. 22 is a diagram illustrating a query and RDB-XML mapping definition information used in an operation phase according to an embodiment of the present invention (part 2).
FIG. 23 is a diagram illustrating an SQL generated in an XML document deletion process in an operation phase according to an embodiment of the present invention.
FIG. 24 is a diagram showing a query used in an XML document search in the operation phase of one embodiment of the present invention, RDB-XML mapping definition information, and SQL to be generated.
FIG. 25 is a diagram showing a query used in an XML document search in the operation phase of one embodiment of the present invention, RDB-XML mapping definition information, and SQL to be generated.
FIG. 26 is a diagram showing a query, RDB-XML mapping definition information, an SQL result, and a query result used in an XML document search in an operation phase according to an embodiment of the present invention.
[Explanation of symbols]
100 Structured document processing device
110 Application program interface
120 Operation determination means, XML query sentence analysis unit
130 Statement generation means, SQL statement generation unit
140 Operation instruction means, RDB management system interface unit
150 Document count acquisition unit, DOM operation unit
160 Structured document output means, query result generation unit
170 Mapping definition information, RDB-XML mapping definition information
171 RDB access information
172 Table generation information
173 Route information
200 Application program
300 Database, RDB management system

Claims

In a structured document processing method for performing operations of storing, searching, updating and deleting structured documents using a database,
An operation determining step of analyzing a query for the input structured document and determining whether the operation requested by the query is insertion, update, deletion, or search;
A statement generation step of referring to mapping definition information, stored in storage means, that divides the structured document into partial documents, with overlap allowed, based on the hierarchical structure of the structured document and associates the partial documents with a database, and converting the input query, based on the determination of the operation determining step, into a statement for performing any one of the insert / update / delete / search operations on the database;
An operation instruction step of transmitting the converted statement to the database and executing the operation;
A document number obtaining step of obtaining the number of documents on which the statement was executed, if the statement is any of an insert / update / delete operation;
If the command statement is a search operation, a structured document output step of referring to the mapping definition information and outputting a search result of the database as a structured document;
A structured document processing method comprising:

The structured document processing method, wherein the statement generation step includes:
A step of referring to the mapping definition information and converting the structured document into a statement for securing an area of the database in which any one of the insert / update / delete / search operations is performed.

The statement generation step includes:
Determining whether the structure of the structured document for the insert / update operation is valid;
Determining whether the query is to insert the entire structured document or to insert a part / update of the structured document with respect to the already stored structured document;
The structured document processing method according to claim 1, further comprising a step of, if the query is to insert the entire structured document, issuing an identifier for distinguishing among a plurality of structured documents having the same structure, and converting the query into a statement for insertion into the database with reference to the mapping definition information.

The statement generation step includes:
Determining whether the structure of the structured document for the insert / update operation is valid;
Determining whether the query is to insert the entire structured document or to insert / update a part of the structured document with respect to the already stored structured document;
If the query is to insert / update a part of the structured document, refer to the mapping definition information to determine whether an ancestor element of an element corresponding to the part of the structured document is stored in the database. Determining;
An update instruction generating step of generating a statement for updating the ancestor element if the ancestor element is stored in the database;
Referring to the mapping definition information, determining whether descendant elements of elements corresponding to a part of the structured document are stored in the database,
The structured document processing method according to claim 1, further comprising a descendant element update instruction generating step of generating a statement for updating the descendant element if the descendant element is stored in the database.

The statement generation step includes:
Referring to the mapping definition information to determine whether an ancestor element of an element corresponding to a part of the structured document is stored in the database;
If the ancestor element is stored in the database, an update instruction generation step of generating a statement for updating the ancestor element to the corresponding element;
Referring to the mapping definition information, determining whether descendant elements of elements corresponding to a part of the structured document are stored in the database,
The structured document processing method according to claim 1, further comprising a descendant element update instruction generating step of generating a statement for updating the descendant element if the descendant element is stored in the database.

The update instruction generating step includes:
Generating and executing a statement for acquiring data to be updated;
Updating the acquired update target data;
Generating a statement for inserting the updated data again,
The structured document processing method according to claim 4, further comprising a step of repeating the above steps, for the ancestor elements stored in the database, down to the element specified by the query.

The descendant element update instruction generating step includes:
A step of determining whether the data of the same element to be updated is stored separately in a plurality of rows on the database or stored together in one row;
If stored separately in a plurality of rows, a step of generating and executing a statement for first deleting the data, and generating a statement for re-inserting the data based on the data in which the updated ancestor element is stored, for all descendant elements storing the data;
If stored together in one row,
Generating and executing a statement for acquiring data to be updated;
Updating the acquired update target data;
Generating a statement for inserting the updated data again,
The structured document processing method according to claim 4, further comprising a step of generating a statement for data reinsertion by repeating the above steps, for the ancestor elements stored in the database, down to the element specified by the query.

The statement generation step includes:
A step of determining whether the search result requesting the inquiry is stored in the database as it is or stored as a descendant element,
If it is stored in the database as it is, obtaining its location;
If divided and stored as descendant elements, obtaining the location of all of them;
The structured document processing method according to claim 1, further comprising: generating a command sentence for a search from the acquired location and an input query.

A structured document processing device that performs operations of storing, searching, updating, and deleting structured documents using a database,
An operation determination unit that analyzes an inquiry for the input structured document and determines whether the operation requested by the inquiry is insertion, update, deletion, or search;
Mapping definition information, stored in storage means, that divides the structured document into partial documents, with overlap allowed, based on the hierarchical structure of the structured document and associates the partial documents with a database;
Statement generating means for referring to the mapping definition information and converting the input query, based on the determination of the operation determination means, into a statement for performing any one of the insert / update / delete / search operations on the database;
Operation instructing means for transmitting the converted statement to the database and executing the operation;
Document number acquisition means for acquiring the number of documents on which the statement was executed, if the statement is any one of an insert / update / delete operation;
If the command statement is a search operation, structured document output means for outputting the database search result as a structured document by referring to the mapping definition information,
And a structured document processing apparatus.

The structured document processing device, wherein the statement generating means includes:
Means for referring to the mapping definition information and converting the structured document into a statement for securing an area of the database in which any one of the insert / update / delete / search operations is performed.

The statement generating means,
Means for determining whether the structure of the structured document for the insert / update operation is valid,
Means for determining whether the inquiry is to insert the entire structured document or to insert a part / update of the structured document with respect to the already stored structured document;
The structured document processing apparatus according to claim 9, further comprising means for, if the query is to insert the entire structured document, issuing an identifier for distinguishing among a plurality of structured documents having the same structure, and converting the query into a statement for insertion into the database with reference to the mapping definition information.

The statement generating means,
Means for determining whether the structure of the structured document for the insert / update operation is valid,
Means for determining whether the inquiry is to insert the entire structured document or to insert / update a part of the structured document with respect to the already stored structured document;
If the query is to insert / update a part of the structured document, refer to the mapping definition information to determine whether an ancestor element of an element corresponding to the part of the structured document is stored in the database. Means for determining;
Update instruction generating means for generating a statement for updating the ancestor element if the ancestor element is stored in the database;
Means for referring to the mapping definition information to determine whether descendant elements of elements corresponding to a part of the structured document are stored in the database,
The structured document processing apparatus according to claim 9, further comprising descendant element update instruction generating means for generating a statement for updating the descendant element if the descendant element is stored in the database.

The statement generating means,
Means for referring to the mapping definition information to determine whether an ancestor element of an element corresponding to a part of the structured document is stored in the database,
Update instruction generating means for generating a statement for updating from the ancestor element to the corresponding element if the ancestor element is stored in the database;
Means for referring to the mapping definition information to determine whether descendant elements of elements corresponding to a part of the structured document are stored in the database,
The structured document processing apparatus according to claim 9, further comprising descendant element update instruction generating means for generating a statement for updating the descendant element if the descendant element is stored in the database.

The update instruction generating means includes:
Means for generating and executing a statement for acquiring data to be updated;
Means for updating the acquired update target data;
Means for generating a statement for inserting the updated data again,
The structured document processing apparatus according to claim 12, further comprising means for repeating the operation of each of the above means, for the ancestor elements stored in the database, down to the element specified by the query.

The descendant element update instruction generating means,
Means for determining whether the data of the same element to be updated is stored separately in a plurality of rows on the database or stored together in one row;
If stored separately in a plurality of rows, means for generating and executing a statement for first deleting the data, and generating a statement for re-inserting the data based on the data in which the updated ancestor element is stored, for all descendant elements storing the data;
If stored together in one row,
Means for generating and executing a statement for acquiring data to be updated;
Means for updating the acquired update target data;
Means for generating a statement for inserting the updated data again,
The structured document processing device, further comprising means for generating a statement for data reinsertion by repeating the operation of each of the above means, for the ancestor elements stored in the database, down to the element specified by the query.

The structured document processing apparatus according to claim 9, wherein the statement generating means includes:
Means for determining whether the search result requested by the query is stored in the database as is, or stored separately as descendant elements;
Means for acquiring its location if it is stored in the database as is;
Means for acquiring the locations of all of the elements if it is stored separately as descendant elements; and
Means for generating a search statement from the acquired locations and the input query.
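
As a rough illustration of the location lookup described above, the sketch below treats the mapping definition as a dictionary from element paths to table and column names. The paths, table names, and function names are hypothetical; an actual mapping definition would carry more information than this.

```python
# Hypothetical mapping definition: element path -> (table, column).
MAPPING = {
    "/book/title": ("book_title", "xml"),
    "/book/author/name": ("author_name", "xml"),
    "/book/author/email": ("author_email", "xml"),
}


def locate(path, mapping=MAPPING):
    """Return the storage locations needed to answer a search for `path`."""
    if path in mapping:
        # The requested element is stored as is: a single location suffices.
        return [mapping[path]]
    # Otherwise it is stored separately as descendant elements:
    # collect the locations of all of them.
    prefix = path.rstrip("/") + "/"
    return [loc for p, loc in sorted(mapping.items()) if p.startswith(prefix)]


def build_search(path, mapping=MAPPING):
    """Generate one parameterized SELECT per acquired location."""
    return [
        f"SELECT {column} FROM {table} WHERE doc_id = ?"
        for table, column in locate(path, mapping)
    ]
```

A search for `/book/title` yields a single statement, whereas a search for `/book/author` yields one statement per stored descendant, whose results would then be reassembled into the requested element.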

A structured document processing program for causing a computer to execute operations of storing, searching, updating, and deleting a structured document using a database, the program comprising:
An operation determining step of analyzing an input query on the structured document and determining whether the operation requested by the query is insertion, update, deletion, or search;
A statement generation step of referring to mapping definition information stored in a storage unit, the mapping definition information dividing the structured document into partial documents, with overlap permitted, based on the hierarchical structure of the structured document and associating the partial documents with the database, and converting the input query into a statement for performing the insert, update, delete, or search operation on the database based on the determination of the operation determining step;
An operation instruction step of transmitting the converted statement to the database and executing the operation;
A document number obtaining step of obtaining, if the statement is an insert, update, or delete operation, the number of documents on which the statement was executed; and
A structured document output step of referring, if the statement is a search operation, to the mapping definition information and outputting the search result of the database as a structured document.
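
The sequence of steps in this claim can be sketched end to end as follows. Everything here is an assumption made for illustration: the query is a plain dictionary rather than a standard XML query language, the mapping covers a single element, the affected row count serves as a proxy for the number of documents, and SQLite stands in for the relational database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical mapping definition: element path -> (table, element name).
MAPPING = {"/book/title": ("book_title", "title")}
OPERATIONS = {"insert", "update", "delete", "search"}


def determine_operation(query):
    """Operation determining step."""
    if query["op"] not in OPERATIONS:
        raise ValueError(f"unsupported operation: {query['op']}")
    return query["op"]


def generate_statement(query, op, mapping):
    """Statement generation step: convert the query into SQL."""
    table, _ = mapping[query["path"]]
    if op == "insert":
        return (f"INSERT INTO {table} (doc_id, value) VALUES (?, ?)",
                (query["doc_id"], query["value"]))
    if op == "update":
        return (f"UPDATE {table} SET value = ? WHERE doc_id = ?",
                (query["value"], query["doc_id"]))
    if op == "delete":
        return f"DELETE FROM {table} WHERE doc_id = ?", (query["doc_id"],)
    return f"SELECT value FROM {table} WHERE doc_id = ?", (query["doc_id"],)


def process(conn, query, mapping=MAPPING):
    op = determine_operation(query)
    sql, params = generate_statement(query, op, mapping)
    cursor = conn.execute(sql, params)  # operation instruction step
    if op != "search":
        return cursor.rowcount  # document number obtaining step
    # Structured document output step: rebuild XML from the rows.
    _, tag = mapping[query["path"]]
    results = []
    for (value,) in cursor:
        element = ET.Element(tag)
        element.text = value
        results.append(ET.tostring(element, encoding="unicode"))
    return results
```

The caller sees only queries and structured documents; the SQL statements and table layout stay behind the mapping definition.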

A storage medium storing a structured document processing program for causing a computer to execute operations of storing, searching, updating, and deleting a structured document using a database, the stored program comprising:
An operation determining step of analyzing an input query on the structured document and determining whether the operation requested by the query is insertion, update, deletion, or search;
A statement generation step of referring to mapping definition information stored in a storage unit, the mapping definition information dividing the structured document into partial documents, with overlap permitted, based on the hierarchical structure of the structured document and associating the partial documents with the database, and converting the input query into a statement for performing the insert, update, delete, or search operation on the database based on the determination of the operation determining step;
An operation instruction step of transmitting the converted statement to the database and executing the operation;
A document number obtaining step of obtaining, if the statement is an insert, update, or delete operation, the number of documents on which the statement was executed; and
A structured document output step of referring, if the statement is a search operation, to the mapping definition information and outputting the search result of the database as a structured document.