JP3719125B2

JP3719125B2 - Data storage apparatus and method

Info

Publication number: JP3719125B2
Application number: JP2000311114A
Authority: JP
Inventors: 中村　　秀男
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2000-10-11
Filing date: 2000-10-11
Publication date: 2005-11-24
Anticipated expiration: 2020-10-11
Also published as: JP2002117020A

Description

【０００１】
【発明の属する技術分野】
本発明は、要素（タグ）の入れ子構造、繰り返し、再帰的構造を含むＸＭＬ（Extensible Markup Language）文書をテーブル形式のデータベースに格納する際に用いて好適なデータ格納装置及び方法に関する。
【０００２】
【従来の技術】
ＸＭＬ(eXtensible Markup Language；拡張可能なマーク付け言語)は、Ｗ３Ｃ(World Wide Web Consortium)で標準化が進められているＷｅｂ上で構造化文書をやりとりするためのデータフォーマットである。ＸＭＬは、ＳＧＭＬ(Standard Generalized Markup Language)[ISO 8879]の利用を前提とし、そのサブセットとして設計されている。ＸＭＬ文書は、ルートと呼ばれる文書実体から始まり、それぞれマーク付けされ、かつ入れ子構造を有している宣言、要素、コメント、文字参照および処理命令を含んでいる。各文書は一つ以上の要素を含み、各文書にはルートまたは文書要素という要素が一つだけ存在し、これは他の要素の内容に含まれない。各要素は、開始タグと終了タグで区切られ、入れ子構造をなしている。すべての要素は、その開始タグが他の要素の内容に含まれれば、対応する終了タグも同じ要素の内容に含まれる。また、要素Ｂが要素Ａの内容に含まれ、かつ要素Ａの内容に含まれる他の要素に含まれないとき、要素Ａを要素Ｂの親といい、要素Ｂを要素Ａの子という。
【０００３】
図１８を参照して、ＸＭＬ文書を構成するデータ（以下、ＸＭＬデータ）をテーブル形式のデータベースに格納するためのＸＭＬデータ格納方式について説明する。図１８において、データ格納手段４０１は、ＸＭＬデータ４２１の内容を解析し、ＸＭＬデータ４２１内に含まれる各要素を抽出してデータベース４０２内のテーブル４２２に格納する。ＸＭＬデータ４２１が図１９に示すような内容を有している場合を例にして、図１８に示すデータ格納手段４０１の動作について説明する。この場合、ＸＭＬデータ４２１は１つのルート要素４２１ａを持ち、要素４２１ａの子要素として、次の１つ下位の階層に要素４２１ｂ（４２１ｂ１，４２１ｂ２）の繰り返しを持っている。各要素４２１ｂ１，４２１ｂ２は、要素４２１ｃ，ｄ，ｅからなる同じ子要素の列をそれぞれ含んでいる。データ格納手段４０１はＸＭＬデータ４２１を入力として受け取り、要素４１２ａの各子要素４１２ｂ１，４１２ｂ２をテーブル４２２の行Ｂ０１，Ｂ０２に対応させて格納する。また各子要素４１２ｂ１，４１２ｂ２の子要素４１２ｃ〜ｅを各行の列Ｃ，Ｄ，Ｅに対応させて格納する。テーブル４２２の例を図２０に示す。
【０００４】
【発明が解決しようとする課題】
上述したような従来技術には、次のような問題点があった。第１の問題点は、要素４２１ａの子要素４２１ｂ１，２の子要素４２１ｃ〜ｅがさらに子要素を持つ場合にテーブルの列に対応させて格納することができないということである。その理由は、テーブルの列が要素４２１ａの子要素４２１ｂ１，２の子要素４２１ｃ〜ｅに対応しているためである。第２の問題点は、要素４２１ａの子要素４２１ｂ１，２の内部に繰り返しを持つ場合にテーブルに格納することができないということである。その理由は、テーブルの行が表現する繰り返しが要素４２１ａの子要素にだけ対応しているためである。第３の問題点は、再帰的なタグ構造をテーブルに格納できないということである。その理由は、テーブルが再帰的なデータ構造をそのまま格納できないためである。
【０００５】
本発明は、従来の構成では対応できなかった、３階層以上の要素（タグ）の入れ子構造、繰り返し、または再帰的構造を含むＸＭＬ文書を、テーブル形式のデータベースに格納することができるようにするデータ格納装置及び方法を提供することを目的とする。
【０００６】
【課題を解決するための手段】
上記課題を解決するため、請求項１記載の発明は、ＸＭＬ(eXtensible Markup Language）データを表形式のデータに変換して所定の記憶手段に格納するデータ格納装置において、入力された第１のＸＭＬデータ（ 221 ）の内部の繰り返し要素であってルート要素Ａの子要素Ｂでない要素Ｆを分離し、要素Ｆが取り除かれた第２のＸＭＬデータ（ 222 ）と、新たなＸＭＬデータに要素Ｆを追加した第３のＸＭＬデータ（ 223 ）とを作成する第１のデータ変換手段（ 202 ）と、第２のＸＭＬデータ（ 222 ）の内部の再帰になっている要素Ｄを分離し、要素Ｄが取り除かれた第４のＸＭＬデータ（ 224 ）と、新たなＸＭＬデータに要素Ｄを追加した第５のＸＭＬデータ（ 225 ）とを作成する第２のデータ変換手段（ 203 ）と、第４のＸＭＬデータ（ 224 ）の内部の３レベル以上入れ子になっている要素Ｅをルート要素Ａの子要素Ｂの子要素として移動して第６のＸＭＬデータ（ 226 ）を作成する第３のデータ変換手段（ 204 ）と、第６のＸＭＬデータ（ 226 ）を第１のテーブルに格納し、第５のＸＭＬデータ（ 225 ）を第２のテーブルに格納し、第３のＸＭＬデータ（ 223 ）を第３のテーブルに格納するデータ格納手段（ 205 ）とを備えることを特徴とする。
【０００７】
請求項２記載の発明は、ＸＭＬ (eXtensible Markup Language ）データを表形式のデータに変換して所定の記憶手段に格納するデータ格納装置において、入力されたＸＭＬデータ（ 321 ）の内部の繰り返し要素である要素Ｃと文字列を分離し、新たなＸＭＬデータにルート要素Ａの子要素Ｂとその要素としての要素Ｃと文字列を、要素Ｂの順序位置を示す要素ＩＤとともに追加したＸＭＬデータ（ 323 ）を作成するデータ変換手段（ 302 ）と、ＸＭＬデータ（ 323 ）をテーブルに格納し、かつその際、要素Ｃと文字列を要素ＩＤともに格納するデータ格納手段（ 303 ）とを備えることを特徴とする。
【０００８】
請求項３記載の発明は、ＸＭＬ (eXtensible Markup Language ）データを表形式のデータに変換して所定の記憶手段に格納するデータ格納方法において、入力された第１のＸＭＬデータ（ 221 ）の内部の繰り返し要素であってルート要素Ａの子要素Ｂでない要素Ｆを分離し、要素Ｆが取り除かれた第２のＸＭＬデータ（ 222 ）と、新規のＸＭＬデータに要素Ｆを追加した第３のＸＭＬデータ（ 223 ）とを作成する第１のデータ変換過程と、第２のＸＭＬデータ（ 222 ）の内部の再帰になっている要素Ｄを分離し、要素Ｄが取り除かれた第４のＸＭＬデータ（ 224 ）と、新規のＸＭＬデータに要素Ｄを追加した第５のＸＭＬデータ（ 225 ）とを作成する第２のデータ変換過程と、第４のＸＭＬデータ（ 224 ）の内部の３レベル以上入れ子になっている要素Ｅをルート要素Ａの子要素Ｂの子要素として移動して第６のＸＭＬデータ（ 226 ）を作成する第３のデータ変換過程と、第６のＸＭＬデータ（ 226 ）を第１のテーブルに格納し、第５のＸＭＬデータ（ 225 ）を第２のテーブルに格納し、第３のＸＭＬデータ（ 223 ）を第３のテーブルに格納するデータ格納過程とを含んでいることを特徴とする。
【０００９】
【発明の実施の形態】
以下、図面を参照して本発明によるデータ格納装置の実施形態について説明する。
【００１０】
図１は、本発明によるデータ格納装置の一実施形態を説明するためのブロック図である。図１において、データ変換手段１１１はＸＭＬデータ１０１から文書型定義上ルート要素の子要素以外に繰り返しのある部分を分離し、分離した部分を除いたＸＭＬデータ１０２と、分離した部分を新規のルート要素の下に同じ要素が繰り返しになるように配置することで作成されたＸＭＬデータ１０３およびその他のＸＭＬデータとする。データ変換手段１１２はＸＭＬデータ１０２から文書定義上再帰的な構造になっている部分を分離し、分離した部分を再帰的に現れる要素がルート要素の子要素の繰り返し形式となるように変換することで作成したＸＭＬデータ１０５およびその他のＸＭＬデータと、分離した部分を除いたＸＭＬデータ１０４とする。ＸＭＬデータ１０３等、データ変換手段１１１のその他の出力ＸＭＬデータもＸＭＬデータ１０２と同様の変換を行う。データ変換手段１１３はＸＭＬデータ１０４の文書定義上３レベル以上の要素の入れ子になっている要素をルート要素の子要素の子要素とする形式に変換しＸＭＬデータ１０６とする。ＸＭＬデータ１０５等、データ変換手段１１２のその他の出力ＸＭＬデータについても同様に変換を行う。データ変換手段１１１〜１１３の変換によりＸＭＬデータ１０１はルート要素の子要素としてそれぞれ同じ要素の繰り返しを持ち子要素の内部に同じ要素の列を持つＸＭＬデータ１０６およびその他のＸＭＬデータに変換される。データ更新手段１１４はこれらのＸＭＬデータのルート要素の子要素の子要素をテーブルの列に対応付けることによってＸＭＬデータをデータベース１２１のテーブルに格納する。このようにして、３レベル以上のタグの入れ子構造、繰り返し、再帰的タグ構造を含むＸＭＬ文書をテーブル形式のデータベースに格納することを可能にする。
【００１１】
次に、図２を参照して本発明によるデータ格納装置の他の実施形態について説明する。図２を参照すると、本発明のＸＭＬデータ格納装置の一実施形態は、ＸＭＬデータ入力手段２０１と、データ変換手段２０２と、データ変換手段２０３と、データ変換手段２０４と、データ格納手段２０５と、データベース２１１から構成されている。ＸＭＬデータ２２１の例を図３に示す。ＸＭＬデータ２２１はルート要素Ａ（開始タグ：＜Ａ＞、終了タグ：＜／Ａ＞；以下同様。ただし図面では開始タグに参照符号の引き出し線を付けている。）から構成されている。要素Ａは要素Ｂ（Ｂ０１，Ｂ０２）の繰り返しで構成されている。要素Ｂ０１は要素Ｃ（Ｃ０１）と、要素Ｄ（Ｄ０１，Ｄ０２）から構成されている。要素Ｂ０２は要素Ｃ（Ｃ０２）と、要素Ｄ（Ｄ０３）から構成されている。要素Ｃ０１は１つの要素Ｅ（Ｅ０１）と、要素Ｆ（Ｆ０１，Ｆ０２）の繰り返しから構成されている。要素Ｃ０２は１つの要素Ｅ（Ｅ０２）と、要素Ｆ（Ｆ０３）から構成されている。要素Ｄ０１は１つの要素Ｇ（Ｇ０１）と１個の要素Ｄ（Ｄ０２）から構成されている。要素Ｄ０３は１つの要素Ｇ（Ｇ０３）から構成されている。要素Ｆ（Ｆ０１，Ｆ０２，Ｆ０３）はそれぞれ要素Ｉと要素Ｊから構成されている。要素Ｅ、要素Ｇ、要素Ｉ、要素Ｊは文字列から構成されている。
【００１２】
上記各手段はそれぞれ概略つぎのように動作する。データ入力手段２０１は、ＸＭＬデータ２２１を入力しデータ変換手段２０２へ渡す。データ変換手段２０２は、ＸＭＬデータ２２１の内部の繰り返し要素である要素Ｆ（Ｆ０１，Ｆ０２）を分離しＸＭＬデータ２２２とＸＭＬデータ２２３とする。要素Ｂ（Ｂ０１，Ｂ０２）も繰り返しになっているが、ルート要素Ａの子要素なので変換しない。データ変換手段２０３は、ＸＭＬデータ２２２の内部の再帰になっている要素Ｄ（Ｄ０１，Ｄ０２）を分離しＸＭＬデータ２２４とＸＭＬデータ２２５とする。データ変換手段２０４は、ＸＭＬデータ２２４の内部の３レベル以上入れ子になっている要素Ｅをルート要素Ａの子要素Ｂの子要素として移動しＸＭＬデータ２２６とする。データ格納手段２０５は、ＸＭＬデータ２２６をテーブルＡに格納する。このとき要素Ｅの内容を列Ｅに格納する。ＸＭＬデータ２２５をテーブルＹに格納する。このとき要素Ｇの内容を列Ｇに格納する。ＸＭＬデータ２２３をテーブルＸに格納する。このとき要素Ｉの内容を列Ｉに、要素Ｊの内容を列Ｊに格納する。
【００１３】
［実施形態の動作の説明］
次に、図４、図５、図６、図７のフローチャートを参照して本実施形態の全体の動作について詳細に説明する。
【００１４】
ＸＭＬデータ２２１をデータ変換手段２０２へ入力する。要素Ｘをルート要素とする新しいＸＭＬデータ２２３を生成する（図４のステップＡ１０１）。ここでＸは変数である。ＸＭＬデータ２２１に要素Ｂがなければ終わり（ステップＡ１０２）、要素Ｂがあれば最初の要素Ｂを要素Ｂ１とする（ステップＡ１０３）。ここで、ＢおよびＢ１は変数を示す。次に、要素Ｂ１に要素Ｆがなければ終わり（ステップＡ１０４）、要素Ｂ１に要素Ｆがあれば要素Ｂ１の最初の要素ＦをＦ１とする（ステップＡ１０５）。ここでＦおよびＦ１は変数である。要素Ｆ１を要素Ｂ１から取り除き要素Ｘに追加する（ステップＡ１０６）。要素Ｂ１に次の要素ＦがあればステップＡ１０８へ、なければステップＡ１０９へ進む（ステップＡ１０７）。ステップＡ１０８では次の要素ＦをＦ１として、ステップＡ１０６へ戻る。ステップＡ１０９では、次の要素ＢがあればステップＡ１１０へ、なければ終わる。ステップＡ１１０では、次の要素ＢをＢ１として、ステップＡ１０４へ戻る。終了時には総ての要素Ｆが取り除かれたＸＭＬデータ２２１をＸＭＬデータ２２２とする。
【００１５】
データ変換手段２０３はデータ変換手段２０２からＸＭＬデータ２２２を入力する。要素Ｙをルート要素とする新しいＸＭＬデータ２２５を生成する（図５のステップＡ２０１）。ここでＹは変数である。ＸＭＬデータ２２２に要素Ｂがなければ終わり（ステップＡ２０２）、要素Ｂがあれば最初の要素Ｂを要素Ｂ１とする（ステップＡ２０３）。次に要素Ｂ１の子要素の要素ＤをＤ１とする（ステップＡ２０４）。ここでＤおよびＤ１は変数である。次に要素Ｄ１を親要素から取り除き要素Ｙの子要素として追加する（ステップＡ２０５）。要素Ｄ１の子要素に要素ＤがあればステップＡ２０７へ、なければステップＡ２０８へ進む（ステップＡ２０６）。ステップＡ２０７では子要素ＤをＤ１としてステップＡ２０５へ戻る。ステップＡ２０８では次の要素ＢがあればステップＡ２０９へ、なければ終わる。ステップＡ２０９では次の要素ＢをＢ１としてステップＡ２０４へ戻る。終了時には総ての要素Ｄが取り除かれたＸＭＬデータ２２２をＸＭＬデータ２２４とする。
【００１６】
データ変換手段２０４はデータ変換手段２０３からＸＭＬデータ２２４を入力する。ＸＭＬデータ２２４に要素Ｂがなければ終わり（図６のステップＡ３０１）、要素Ｂがあれば最初の要素Ｂを要素Ｂ１とする（ステップＡ３０２）。要素Ｂ１の子要素Ｃの子要素Ｅを要素Ｃから取り除き、要素Ｂ１の子要素として追加する（ステップＡ３０３）。ここでＣ，Ｅは変数である。要素Ｂ１から要素Ｃを削除する（ステップＡ３０４）。次の要素ＢがあればステップＡ３０６へ、なければ終わる（ステップＡ３０５）。ステップＡ３０６では次の要素ＢをＢ１としてステップＡ３０３へ戻る。終了時には総ての要素Ｃ、要素Ｅの変換が行われたＸＭＬデータ２２４をＸＭＬデータ２２６とする。
【００１７】
データ格納手段２０５は、データ変換手段２０４からＸＭＬデータ２２６を、データ変換手段２０３からＸＭＬデータ２２５を、データ変換手段２０３からＸＭＬデータ２２３を入力する。ＸＭＬデータ２２６に要素ＢがなければステップＡ４０７へ進む（図７のステップＡ４０１）。ＸＭＬデータ２２６に要素ＢがあればＸＭＬデータ２２６の最初の要素Ｂを要素Ｂ１とする（ステップＡ４０２）。次にテーブルＡに行Ａ１を追加する（ステップＡ４０３）。ここでＡ１は変数である。次に要素Ｂ１の子要素Ｅの内容を行Ａ１の列Ｅに格納する（ステップＡ４０４）。次の要素ＢがあればステップＡ４０６へ、なければステップＡ４０７へ進む（ステップＡ４０５）。ステップＡ４０６では次の要素ＢをＢ１とし、ステップＡ４０３へ戻る。ステップＡ４０７では、ＸＭＬデータ２２５に要素ＤがなければステップＡ４１３へ進み、要素ＤがあればステップＡ４０８へ進む。ステップＡ４０８ではＸＭＬデータ２２５の最初の要素Ｄを要素Ｄ１とする。テーブルＹに行Ｙ１を追加する（ステップＡ４０９）。ここでＹ１は変数である。次に要素Ｄ１の子要素Ｇの内容を行Ｙ１の列Ｇに格納する（ステップＡ４１０）。ここでＧは変数である。次の要素ＤがあればステップＡ４１２へ、なければステップＡ４１３へ進む（ステップＡ４１１）。ステップＡ４１２では次の要素ＤをＤ１としてステップＡ４０９へ戻る。ステップＡ４１３ではＸＭＬデータ２２３に要素Ｆがなければ終わり、要素ＦがあればステップＡ４１４へ進む。ステップＡ４１４では、ＸＭＬデータ２２３の最初の要素Ｆを要素Ｆ１とする。テーブルＸに行Ｘ１を追加する（ステップＡ４１５）。次に要素Ｆ１の子要素Ｉの内容を行Ｘ１の列Ｉに格納する（ステップＡ４１６）。ここでＸ１，Ｉは変数である。次に要素Ｆ１の子要素Ｊの内容を行Ｘ１の列Ｊに格納する（ステップＡ４１７）。ここでＪは変数である。次に、次の要素ＦがあればステップＡ４１９へ、なければ終了する（ステップＡ４１８）。ステップＡ４１９では次の要素ＦをＦ１としてステップＡ４１５へ戻る。以上で処理が終了する。
【００１８】
次に、図３に示すＸＭＬデータ２２１を入力する場合の具体例を用いて説明する。ＸＭＬデータ２２１をデータ変換手段２０２へ入力する。データ変換手段２０２では、要素Ｘをルート要素とする新しいＸＭＬデータ２２３を生成する（図４のステップＡ１０１）（図８（ａ）参照）。図３に示すＸＭＬデータ２２１の最初の要素Ｂ、要素Ｂ０１を取り出す（ステップＡ１０３）。要素Ｂ０１の最初の要素Ｆ、要素Ｆ０１取り出す（ステップＡ１０５）。Ｆ０１をＢ０１から取り除き要素Ｘに追加する（ステップＡ１０６）（図８（ｂ）（図８の右上））。要素Ｂ０１の次の要素Ｆ、要素Ｆ０２を取り出す（ステップＡ１０８）。Ｆ０２をＢ０１から取り除き要素Ｘに追加する（ステップＡ１０６）（図８（ｃ））。次の要素Ｂ、要素Ｂ０２を取り出す（ステップＡ１１０）。要素Ｂ０２の最初の要素Ｆ、要素Ｆ０３取り出す（ステップＡ１０５）。Ｆ０３をＢ０２から取り除き要素Ｘに追加する（ステップＡ１０６）（図８（ｄ））。総ての要素Ｆが取り除かれたＸＭＬデータ２２１をＸＭＬデータ２２２（図８（ｅ））とする。
【００１９】
データ変換手段２０３はデータ変換手段２０２から図８（ｅ）に示すＸＭＬデータ２２２を入力する。要素Ｙをルート要素とする新しいＸＭＬデータ２２５を生成する（図５のステップＡ２０１）（図９（ａ））。図８（ｅ）のＸＭＬデータ２２２の最初の要素Ｂ、要素Ｂ０１を取り出す（ステップＡ２０３）。要素Ｂ０１の子要素の要素Ｄ、要素Ｄ０１を取り出す（ステップＡ２０４）。Ｄ０１をＢ０１から取り除き要素Ｙの子要素として追加する（ステップＡ２０５）（図９（ｂ）（図９の右上））。要素Ｄ０１の子要素の要素Ｄ、要素Ｄ０２を取り出す（ステップＡ２０７）。Ｄ０２をＤ０１から取り除き要素Ｙの子要素として追加する（ステップＡ２０５）（図９（ｃ））。次の要素Ｂ、要素Ｂ０２を取り出す（ステップＡ２０９）。要素Ｂ０２の子要素の要素Ｄ、要素Ｄ０３をＢ０２から取り除き要素Ｙの子要素として追加する（ステップＡ２０５）（図９（ｄ））。総ての要素Ｄが取り除かれたＸＭＬデータ２２２をＸＭＬデータ２２４とする（図９（ｅ））。
【００２０】
データ変換手段２０４はデータ変換手段２０３から図９（ｅ）に示すＸＭＬデータ２２４を入力する。ＸＭＬデータ２２４の最初の要素Ｂ、要素Ｂ０１を取り出す（ステップＡ３０２）。要素Ｂ０１の子要素Ｃ（Ｃ０１）の子要素Ｅ（Ｅ０１）を要素Ｃから取り除き要素Ｂ０１の子要素として追加する（ステップＡ３０３）（図１０（ａ））。要素Ｂ０１から要素Ｃを削除する（ステップＡ３０４）（図１０（ｂ））。次の要素Ｂ、要素Ｂ０２を取り出す（ステップＡ３０６）。要素Ｂ０２の子要素Ｃ（Ｃ０２）の子要素Ｅ（Ｅ０２）を要素Ｃから取り除き要素Ｂ０２の子要素として追加する（ステップＡ３０３）（図１０（ｃ））。要素Ｂ０２から要素Ｃを削除する（ステップＡ３０４）（図１０（ｄ））。総ての要素Ｃ、要素Ｅの変換が行われたＸＭＬデータ２２４をＸＭＬデータ２２６とする（図１０（ｅ））。
【００２１】
データデータ格納手段２０５は変換手段２０４から図１０（ｅ）に示すＸＭＬデータ２２６を、データ変換手段２０３から図９（ｄ）に示すＸＭＬデータ２２５を、データ変換手段２０３から図８（ｄ）に示すＸＭＬデータ２２３を入力する。ＸＭＬデータ２２６の最初の要素Ｂ、要素Ｂ０１を取り出す（図７のステップＡ４０２）。テーブルＡに行Ａ０１を追加する（ステップＡ４０３）（図１１（ａ））。要素Ｂ０１の子要素Ｅの内容を行Ａ１の列Ｅに格納する（ステップＡ４０４）（図１１（ｂ））。次の要素Ｂ、要素Ｂ０２を取り出す（ステップＡ４０６）。テーブルＡに行Ａ０２を追加する（ステップＡ４０３）（図１１（ｃ））。要素Ｂ０２の子要素Ｅの内容を行Ａ２の列Ｅに格納する（ステップＡ４０４）（図１１（ｄ））。ＸＭＬデータ２２５の最初の要素Ｄ、要素Ｄ０１を取り出す（ステップＡ４０８）。テーブルＹに行Ｙ０１を追加する（ステップＡ４０９）（図１１（ｅ））。ＸＭＬデータ２２５の要素Ｄ０１の子要素Ｇの内容を行Ｙ０１の列Ｇに格納する（ステップＡ４１０）（図１１（ｆ））。次の要素Ｄ、要素Ｄ０２を取り出す（ステップＡ４１２）。テーブルＹに行Ｙ０２を追加する（ステップＡ４０９）。要素Ｄ０２の子要素Ｇの内容を行Ｙ０２の列Ｇに格納する（ステップＡ４１０）。次の要素Ｄ、要素Ｄ０３を取り出す。テーブルＹに行Ｙ０３を追加する（ステップＡ４０９）。要素Ｄ０３の子要素Ｇの内容を行Ｙ０３の列Ｇに格納する（ステップＡ４１０）（図１１（ｇ））。ＸＭＬデータ２２３の最初の要素Ｆ、要素Ｆ０１を取り出す（ステップＡ４１４）。テーブルＸに行Ｘ０１を追加する（ステップＡ４１５）（図１１（ｈ））。要素Ｆ０１の子要素Ｉの内容を行Ｘ０１の列Ｉに格納する（ステップＡ４１６）。要素Ｆ０１の子要素Ｊの内容を行Ｘ０１の列Ｊに格納する（ステップＡ４１７）（図１１（ｉ））。次の要素Ｆ、要素Ｆ０２を取り出す（ステップＡ４１９）。テーブルＸに行Ｘ０２を追加する（ステップＡ４１５）（図１１（ｊ））。要素Ｆ０２の子要素Ｉの内容を行Ｘ０２の列Ｉに格納する（ステップＡ４１６）。要素Ｆ０２の子要素Ｊの内容を行Ｘ０２の列Ｊに格納する（ステップＡ４１７）（図１１（ｋ））。次の要素Ｆ、要素Ｆ０３を取り出す（ステップＡ４１９）。テーブルＸに行Ｘ０３を追加する（ステップＡ４１５）（図１１（ｌ））。要素Ｆ０３の子要素Ｉの内容を行Ｘ０３の列Ｉに格納する（ステップＡ４１６）。要素Ｆ０３の子要素Ｊの内容を行Ｘ０３の列Ｊに格納する（ステップＡ４１７）（図１１（ｍ））。
【００２２】
本実施形態によれば次のような効果を得ることができる。第１の効果は、タグの繰り返し構造を含むＸＭＬ文書をテーブル形式のデータベースに格納できることにある。その理由は、タグの繰り返し構造を持つ部分を別のデータとして括り出しテーブル形式に格納できる形に変換したためである。第２の効果は、再帰的タグ構造を含むＸＭＬ文書をテーブル形式のデータベースに格納できることにある。その理由は、再帰的タグ構造を持つ部分を別のデータとして括り出しテーブル形式に格納できる形に変換したためである。第３の効果は、３レベル以上のタグ入れ子構造を含むＸＭＬ文書をテーブル形式のデータベースに格納できることにある。その理由は、３レベル以上の入れ子構造を持つ部分の要素を上位の要素へ移動しテーブル形式のデータベースに格納できる形に変換したためである。
【００２３】
［発明の他の実施形態］
次に、本発明の他の実施形態について図面を参照して詳細に説明する。図１２を参照すると、本発明のＸＭＬデータ格納装置の他の実施形態は、ＸＭＬデータ入力手段３０１と、データ変換手段３０２と、データ格納手段３０３と、データベース３１１から構成されている。ＸＭＬデータ３２１の例を図１３に示す。ＸＭＬデータ３２１はルート要素Ａから構成されている。要素Ａは要素Ｂの繰り返し（Ｂ０１，Ｂ０２）で構成されている。要素Ｂ０１は要素Ｃ（ＣＴ０２）と文字列の繰り返し（ＣＴ０１，ＣＴ０３）から構成されている。要素Ｂ０２は要素Ｃの繰り返し（ＣＴ０４，ＣＴ０６）と文字列（ＣＴ０５）から構成されている。
【００２４】
上記各手段はそれぞれ概略つぎのような機能を有する。データ入力手段３０１は、ＸＭＬデータ３２１を入力しデータ変換手段３０２へ渡す。データ変換手段３０２は、ＸＭＬデータ３２１の内部の繰り返し要素である要素Ｃと文字列を分離しＸＭＬデータ３２２とＸＭＬデータ３２３とする。データ格納手段３０３は、ＸＭＬデータ３２３をテーブルＸに格納する。要素Ｃの内容を列Ｃに、文字列を列ＴＥＸＴに、要素Ｂの位置情報を列ＩＤに格納する。
【００２５】
次に、図１４及び図１５のフローチャートを参照して本実施形態の全体の動作について詳細に説明する。ＸＭＬデータ３２１をデータ変換手段３０２へ入力する。要素Ｘをルート要素とする新しいＸＭＬデータ３２３を生成する（図１４のステップＡ５０１）。ＸＭＬデータ３２１に要素Ｂがなければ終わる（ステップＡ５０２）。最初の要素Ｂを要素Ｂ１とする（ステップＡ５０３）。要素Ｂ１の順序位置をＩＤ１とする（ステップＡ５０４）。要素Ｂ１に要素Ｃか文字列がなければＡ５１４へ進む（ステップＡ５０５）。要素Ｂ１に要素Ｃか文字列があれば、要素Ｂ１の最初の要素Ｃか文字列をＣＴ１とする（ステップＡ５０６）。要素Ｙを生成しＹ１とする（ステップＡ５０７）。要素ＸにＹ１を追加する（ステップＡ５０８）。ＣＴ１をＢ１から取り除き要素Ｙ１に追加する（ステップＡ５０９）。要素Ｙ１に要素ＩＤを追加する（ステップＡ５１０）。要素ＩＤの内容としてＩＤ１を設定する（ステップＡ５１１）。要素Ｂ１に次の要素Ｃか文字列があればステップＡ５１３へ、なければステップＡ５１４へ進む（ステップＡ５１２）。次の要素Ｃか文字列をＣＴ１とする（ステップＡ５１３）。次の要素ＢがあればステップＡ５１５へ、なければ終わる（ステップＡ５１４）。次の要素ＢをＢ１とする（ステップＡ５１５）。要素Ｂの子の総ての要素Ｆと文字列が取り除かれたＸＭＬデータ３２１をＸＭＬデータ３２２とする。
【００２６】
データ格納手段３０３はデータ変換手段３０２からＸＭＬデータ３２３を入力する。ＸＭＬデータ３２３に要素Ｙがなければ終わる（図１５のステップＡ６０１）。ＸＭＬデータ３２３の最初の要素Ｙを要素Ｙ１とする（ステップＡ６０２）。テーブルＸに行Ｘ１を追加する（ステップＡ６０３）。要素Ｙ１に子要素Ｃがあれば内容を行Ｘ１の列Ｃに格納する（ステップＡ６０４、Ａ６０５）。要素Ｙ１に文字列があれば行Ｘ１の列ＴＥＸＴに格納する（ステップＡ６０６、Ａ６０７）。要素Ｙ１の子要素ＩＤの内容を列Ｘ１の要素ＩＤに格納する（ステップＡ６０８）。次の要素ＢがあればステップＡ６１０へ、なければ終わる。次の要素ＢをＢ１とする（ステップＡ６１０）。次に、具体例について説明する。
【００２７】
ＸＭＬデータ３２１をデータ変換手段３０２へ入力する。要素Ｘをルート要素とする新しいＸＭＬデータ３２３を生成する（図１４のステップＡ５０１）。最初の要素Ｂ、要素Ｂ０１を取り出す（ステップＡ５０３）。要素Ｂ０１の順序位置１をＩＤ１とする（ステップＡ５０４）。要素Ｂ０１の最初の文字列ＣＴ０１を取り出す（ステップＡ５０６）。要素Ｙを生成しＹ０１とする（ステップＡ５０７）。要素ＸにＹ０１を追加する（ステップＡ５０８）。ＣＴ０１をＢ０１から取り除き要素Ｙ０１に追加する（ステップＡ５０９）。要素Ｙ０１に要素ＩＤを追加する（ステップＡ５１０）。要素ＩＤの内容としてＩＤ１の値１を設定する（ステップＡ５１１）。次の要素Ｃ、要素ＣＴ０２を取り出す（ステップＡ５１３）。要素Ｙを生成しＹ０２とする（ステップＡ５０７）。要素ＸにＹ０２を追加する（ステップＡ５０８）。ＣＴ０２をＢ０１から取り除き要素Ｙ０２に追加する（ステップＡ５０９）。要素Ｙ０２に要素ＩＤを追加する（ステップＡ５１０）。要素ＩＤの内容としてＩＤ１の値１を設定する（ステップＡ５１１）。次の文字列、ＣＴ０３を取り出す（ステップＡ５１３）。要素Ｙを生成しＹ０３とする（ステップＡ５０７）。要素ＸにＹ０３を追加する（ステップＡ５０８）。ＣＴ０３をＢ０１から取り除き要素Ｙ０３に追加する（ステップＡ５０９）。要素Ｙ０３に要素ＩＤを追加する（ステップＡ５１０）。要素ＩＤの内容としてＩＤ１の値１を設定する（ステップＡ５１１）。次の要素Ｂ、要素Ｂ０２を取り出す（ステップＡ５１５）。要素Ｂ０２の順序位置２をＩＤ１とする（ステップＡ５０４）。要素Ｂ０２の最初の要素Ｃ、要素ＣＴ０４を取り出す（ステップＡ５０６）。要素Ｙを生成しＹ０４とする（ステップＡ５０７）。要素ＸにＹ０１を追加する（ステップＡ５０８）。ＣＴ０４をＢ０２から取り除き要素Ｙ０４に追加する（ステップＡ５０９）。要素Ｙ０４に要素ＩＤを追加する（ステップＡ５１０）。要素ＩＤの内容としてＩＤ１の値２を設定する（ステップＡ５１１）。次の文字列、ＣＴ０５を取り出す（ステップＡ５１３）。要素Ｙを生成しＹ０５とする（ステップＡ５０７）。要素ＸにＹ０５を追加する（ステップＡ５０８）。ＣＴ０５をＢ０２から取り除き要素Ｙ０５に追加する（ステップＡ５０９）。要素Ｙ０５に要素ＩＤを追加する（ステップＡ５１０）。要素ＩＤの内容としてＩＤ１の値２を設定する（ステップＡ５１１）。次の要素Ｃ、ＣＴ０６を取り出す（ステップＡ５１３）。要素Ｙを生成しＹ０６とする（ステップＡ５０７）。要素ＸにＹ０６を追加する（ステップＡ５０８）。ＣＴ０６をＢ０２から取り除き要素Ｙ０６に追加する（ステップＡ５０９）。要素Ｙ０６に要素ＩＤを追加する（ステップＡ５１０）。要素ＩＤの内容としてＩＤ１の値２を設定する（ステップＡ５１１）。要素Ｂの子の総ての要素Ｃと文字列が取り除かれたＸＭＬデータ３２１をＸＭＬデータ３２２とする。作成されたＸＭＬデータ３２３を図１６に示す。
【００２８】
データ格納手段３０３はデータ変換手段３０２からＸＭＬデータ３２３を入力する。ＸＭＬデータ３２３の最初の要素Ｙ、要素Ｙ０１を取り出す（図１５のステップＡ６０２）。テーブルＸに行Ｘ０１を追加する（ステップＡ６０３）。要素Ｙ０１の文字列を行Ｘ０１の列ＴＥＸＴに格納する（ステップＡ６０７）。要素Ｙ０１の子要素ＩＤの内容を行Ｘ０１の列ＩＤに格納する（ステップＡ６０８）。次の要素Ｙ、要素Ｙ０２を取り出す（ステップＡ６１０）。テーブルＸに行Ｘ０２を追加する（ステップＡ６０３）。要素Ｙ０２の子要素Ｃの内容を行Ｘ０２の列Ｃに格納する（ステップＡ６０５）。要素Ｙ０２の子要素ＩＤの内容を行Ｘ０２の列ＩＤに格納する（ステップＡ６０８）。次の要素Ｙ、要素Ｙ０３を取り出す（ステップＡ６１０）。テーブルＸに行Ｘ０３を追加する（ステップＡ６０３）。要素Ｙ０３の文字列を行Ｘ０１の列ＴＥＸＴに格納する（ステップＡ６０７）。要素Ｙ０３の子要素ＩＤの内容を列Ｘ０３の列ＩＤに格納する（ステップＡ６０８）。次の要素Ｙ、要素Ｙ０４を取り出す（ステップＡ６１０）。テーブルＸに行Ｘ０４を追加する（ステップＡ６０３）。要素Ｙ０４の子要素Ｃの内容を行Ｘ０４の列Ｃに格納する（ステップＡ６０５）。要素Ｙ０４の子要素ＩＤの内容を行Ｘ０４の列ＩＤに格納する（ステップＡ６０８）。次の要素Ｙ、要素Ｙ０５を取り出す（ステップＡ６１０）。テーブルＸに行Ｘ０５を追加する（ステップＡ６０３）。要素Ｙ０５の文字列を行Ｘ０５の列ＴＥＸＴに格納する（ステップＡ６０７）。要素Ｙ０５の子要素ＩＤの内容を列Ｘ０５の列ＩＤに格納する（ステップＡ６０８）。次の要素Ｙ、要素Ｙ０６を取り出す（ステップＡ６１０）。テーブルＸに行Ｘ０６を追加する（ステップＡ６０３）。要素Ｙ０６の子要素Ｃの内容を行Ｘ０６の列Ｃに格納する（ステップＡ６０５）。要素Ｙ０６の子要素ＩＤの内容を行Ｘ０６の列ＩＤに格納する（ステップＡ６０８）。作成されたテーブルＸを図１７に示す。
【００２９】
本実施形態の第１の効果は、文字列、要素が混在した繰り返しがテーブル形式のデータベースに格納できることである。その理由は、タグと文字列の繰り返し構造を持つ部分を要素あるいは文字列を内容として持つ要素で括り、別のデータとして括り出しテーブル形式に格納できる形に変換したためである。第２の効果は、テーブル形式のデータベースに格納された文字列、要素のデータの元の位置情報を格納できることにある。その理由は、文字列や要素を括り出す時に親要素の位置情報を付加し、その情報もテーブルに格納したためである。
【００３０】
なお、本発明の実施の形態は上述した形態に限定されるものではなく適宜変更可能である。例えば、各実施形態における変換手段の位置は他の変換手段と交換可能である。図２を参照して説明した実施形態に、図１８を参照して説明した実施形態で用いた要素の位置情報を表す要素を追加する構成を付加することなどが考えられる。また、本発明のデータ格納装置は、コンピュータとそのコンピュータで実行されるプログラムとを用いて実現することができ、そのコンピュータで実行されるプログラムはコンピュータ読み取り可能な記録媒体あるいは通信回線を介して頒布することが可能である。
【００３１】
【発明の効果】
以上説明したように発明によれば、次のような効果を得ることができる。第１の効果は、タグの繰り返し構造を含むＸＭＬ文書をテーブル形式のデータベースに格納できることにある。その理由は、タグの繰り返し構造を持つ部分を別のデータとして括り出しテーブル形式に格納できる形に変換したためである。第２の効果は、再帰的タグ構造を含むＸＭＬ文書をテーブル形式のデータベースに格納できることにある。その理由は、再帰的タグ構造を持つ部分を別のデータとして括り出しテーブル形式に格納できる形に変換したためである。第３の効果は、３レベル以上のタグ入れ子構造を含むＸＭＬ文書をテーブル形式のデータベースに格納できることにある。その理由は、３レベル以上の入れ子構造を持つ部分の要素を上位の要素へ移動しテーブル形式のデータベースに格納できる形に変換したためである。第４の効果は、文字列、要素が混在した繰り返しがテーブル形式のデータベースに格納できることである。その理由は、タグと文字列の繰り返し構造を持つ部分を要素あるいは文字列を内容として持つ要素で括り、別のデータとして括り出しテーブル形式に格納できる形に変換したためである。第５の効果は、テーブル形式のデータベースに格納された文字列、要素のデータの元の位置情報を格納できることにある。その理由は、文字列や要素を括り出す時に親要素の位置情報を付加し、その情報もテーブルに格納したためである。
【図面の簡単な説明】
【図１】本発明によるデータ格納装置の一実施の形態の構成を示すブロック図。
【図２】本発明によるデータ格納装置の一実施の形態の構成を示すブロック図。
【図３】図２のＸＭＬデータ２２１の一例を示す図。
【図４】図２の構成（データ変換手段２０２）の動作を示すフローチャート。
【図５】図２の構成（データ変換手段２０３）の動作を示すフローチャート。
【図６】図２の構成（データ変換手段２０４）の動作を示すフローチャート。
【図７】図２の構成（データ格納手段２０５）の動作を示すフローチャート。
【図８】図２におけるデータ変換手段２０２によるＸＭＬデータ２２３の作成経過（ａ）〜（ｄ）と作成されたＸＭＬデータ２２２（ｅ）を示す図。
【図９】図２におけるデータ変換手段２０３によるＸＭＬデータ２２５の作成経過（ａ）〜（ｄ）と作成されたＸＭＬデータ２２４（ｅ）を示す図。
【図１０】図２におけるデータ変換手段２０４によるＸＭＬデータ２２４の変換経過（ａ）〜（ｄ）と作成されたＸＭＬデータ２２６（ｅ）を示す図。
【図１１】図２におけるデータ格納手段２０５によるデータベース２１１内のテーブルＡ，Ｘ，Ｙの作成経過（ａ）〜（ｍ）を示す図。
【図１２】本発明によるデータ格納装置の他の実施の形態の構成を示すブロック図。
【図１３】図１２のＸＭＬデータ３２１の一例を示す図。
【図１４】図１２の構成（データ変換手段３０２）の動作を示すフローチャート。
【図１５】図１２の構成（データ格納手段３０３）の動作を示すフローチャート。
【図１６】図１２においてデータ変換手段３０２によって作成されたＸＭＬデータ３２３の内容の一例を示す図。
【図１７】図１２のデータ格納手段３０５によるデータベース３１１内のテーブルＸの作成例を示す図。
【図１８】従来のデータ格納装置の構成を示すブロック図。
【図１９】図１８のＸＭＬデータ４２１の内容を示す図。
【図２０】図１８のテーブル４２２の内容を示す図。
【符号の説明】
１０１〜１０６，２２１〜２２６，３２１〜３２３…ＸＭＬデータ
１１１〜１１３，２０２〜２０４，３０２…データ変換手段
２０１，３０１…データ入力手段
１２１，２１１，３１１…データベース
１１４…データ更新手段
２０５，３０３…データ格納手段
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a data storage device and method suitable for storing, in a table-format database, an XML (Extensible Markup Language) document that contains nested elements (tags), repetition, and recursive structures.
[0002]
[Prior art]
XML (eXtensible Markup Language) is a data format for exchanging structured documents on the Web, and is being standardized by the W3C (World Wide Web Consortium). XML presupposes the use of SGML (Standard Generalized Markup Language) [ISO 8879] and is designed as a subset of it. An XML document begins with a document entity called the root and contains declarations, elements, comments, character references, and processing instructions, each of which is marked up and nested. Each document contains one or more elements, exactly one of which, called the root or document element, is not contained in the content of any other element. Each element is delimited by a start tag and an end tag, and elements nest: if the start tag of an element lies in the content of another element, the corresponding end tag lies in the content of that same element. When element B is contained in the content of element A and is not contained in any other element within the content of element A, element A is called the parent of element B, and element B is called a child of element A.
[0003]
With reference to FIG. 18, an XML data storage method for storing the data that makes up an XML document (hereinafter, XML data) in a table-format database will be described. In FIG. 18, the data storage unit 401 analyzes the content of the XML data 421, extracts each element contained in the XML data 421, and stores it in the table 422 in the database 402. The operation of the data storage unit 401 is described using, as an example, XML data 421 with the content shown in FIG. 19. In this case, the XML data 421 has one root element 421a and, one level below it, a repetition of the element 421b (421b1, 421b2) as child elements of the element 421a. Each of the elements 421b1 and 421b2 contains the same sequence of child elements 421c, 421d, and 421e. The data storage unit 401 receives the XML data 421 as input and stores the child elements 421b1 and 421b2 of the element 421a as the rows B01 and B02 of the table 422. It stores the child elements 421c to 421e of each of the child elements 421b1 and 421b2 in the columns C, D, and E of the corresponding row. An example of the table 422 is shown in FIG. 20.
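The conventional mapping described above can be sketched in a few lines. This is only an illustration under assumptions: the function name `store_flat`, the sample text values, and the use of Python's `xml.etree.ElementTree` are not part of the patent; the tag names A to E follow the rows and columns named in the text.

```python
import xml.etree.ElementTree as ET

def store_flat(xml_text):
    """Map each child of the root to a row and each grandchild to a column."""
    root = ET.fromstring(xml_text)
    rows = []
    for child in root:  # elements 421b1, 421b2 become rows B01, B02
        # elements 421c to 421e become columns C, D, E of that row
        rows.append({g.tag: (g.text or "") for g in child})
    return rows

# Hypothetical sample in the shape of XML data 421 (text values are invented).
SAMPLE = (
    "<A>"
    "<B><C>c1</C><D>d1</D><E>e1</E></B>"
    "<B><C>c2</C><D>d2</D><E>e2</E></B>"
    "</A>"
)
rows = store_flat(SAMPLE)
for row in rows:
    print(row)
```

If a grandchild had child elements of its own, `g.text` would capture only its leading text (if any) and the nested elements would be lost, which is the kind of limitation the problems listed in the text refer to.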
[0004]
[Problems to be solved by the invention]
The conventional technology described above has the following problems. The first problem is that when the child elements 421c to 421e of the child elements 421b1 and 421b2 of the element 421a themselves have child elements, those cannot be stored in correspondence with table columns. The reason is that the table columns correspond to the child elements 421c to 421e of the child elements 421b1 and 421b2 of the element 421a. The second problem is that XML data cannot be stored in the table when there is repetition inside the child elements 421b1 and 421b2 of the element 421a. The reason is that the repetition represented by the table rows corresponds only to the child elements of the element 421a. The third problem is that a recursive tag structure cannot be stored in a table. The reason is that a table cannot store a recursive data structure as it is.
[0005]
An object of the present invention is to provide a data storage device and method that can store, in a table-format database, an XML document containing three or more levels of element (tag) nesting, repetition, or recursive structure, which conventional configurations could not handle.
[0006]
[Means for Solving the Problems]
In order to solve the above problem, the invention according to claim 1 is a data storage device that converts XML (eXtensible Markup Language) data into tabular data and stores it in predetermined storage means, the device comprising: first data conversion means (202) that separates an element F, which is a repeating element inside the input first XML data (221) and is not the child element B of the root element A, and creates second XML data (222) from which the element F has been removed and third XML data (223) in which the element F has been added to new XML data; second data conversion means (203) that separates a recursive element D inside the second XML data (222) and creates fourth XML data (224) from which the element D has been removed and fifth XML data (225) in which the element D has been added to new XML data; third data conversion means (204) that creates sixth XML data (226) by moving an element E, nested three or more levels deep inside the fourth XML data (224), so that it becomes a child element of the child element B of the root element A; and data storage means (205) that stores the sixth XML data (226) in a first table, the fifth XML data (225) in a second table, and the third XML data (223) in a third table.
[0007]
The invention according to claim 2 is a data storage device that converts XML (eXtensible Markup Language) data into tabular data and stores it in predetermined storage means, the device comprising: data conversion means (302) that separates an element C and a character string, which repeat inside the input XML data (321), and creates XML data (323) by adding to new XML data the child element B of the root element A, with the element C and the character string as its content, together with an element ID indicating the ordinal position of the element B; and data storage means (303) that stores the XML data (323) in a table and, in doing so, stores the element C and the character string together with the element ID.
[0008]
The invention according to claim 3 is a data storage method for converting XML (eXtensible Markup Language) data into tabular data and storing it in predetermined storage means, the method comprising: a first data conversion process of separating an element F, which is a repeating element inside the input first XML data (221) and is not the child element B of the root element A, and creating second XML data (222) from which the element F has been removed and third XML data (223) in which the element F has been added to new XML data; a second data conversion process of separating a recursive element D inside the second XML data (222) and creating fourth XML data (224) from which the element D has been removed and fifth XML data (225) in which the element D has been added to new XML data; a third data conversion process of creating sixth XML data (226) by moving an element E, nested three or more levels deep inside the fourth XML data (224), so that it becomes a child element of the child element B of the root element A; and a data storage process of storing the sixth XML data (226) in a first table, the fifth XML data (225) in a second table, and the third XML data (223) in a third table.
[0009]
DETAILED DESCRIPTION OF THE INVENTION
Embodiments of a data storage device according to the present invention will be described below with reference to the drawings.
[0010]
FIG. 1 is a block diagram illustrating an embodiment of a data storage device according to the present invention. In FIG. 1, the data conversion unit 111 separates from the XML data 101 the portions that, according to the document type definition, repeat somewhere other than directly under the root element. It outputs the XML data 102, from which the separated portions have been removed, and the XML data 103 and other XML data, which are created by placing each separated portion under a new root element so that the same element repeats there. The data conversion unit 112 separates from the XML data 102 the portions that have a recursive structure in the document definition. It outputs the XML data 105 and other XML data, created by converting each separated portion so that the recursively occurring element becomes a repetition of child elements of a root element, and the XML data 104, from which the separated portions have been removed. The other XML data output by the data conversion unit 111, such as the XML data 103, are converted in the same way as the XML data 102. The data conversion unit 113 converts elements that are nested three or more levels deep in the document definition of the XML data 104 into child elements of the child elements of the root element, producing the XML data 106. The other XML data output by the data conversion unit 112, such as the XML data 105, are converted in the same way. Through the conversions by the data conversion units 111 to 113, the XML data 101 is converted into the XML data 106 and other XML data, each of which has a repetition of the same element as the child elements of its root element and the same sequence of elements inside each child element. The data update unit 114 stores these XML data in tables of the database 121 by mapping the child elements of the child elements of each root element to table columns. In this way, an XML document containing three or more levels of tag nesting, repetition, and recursive tag structure can be stored in a table-format database.
[0011]
Next, another embodiment of the data storage device according to the present invention will be described with reference to FIG. 2. Referring to FIG. 2, this embodiment of the XML data storage device comprises an XML data input unit 201, a data conversion unit 202, a data conversion unit 203, a data conversion unit 204, a data storage unit 205, and a database 211. An example of the XML data 221 is shown in FIG. 3. The XML data 221 consists of a root element A (start tag: <A>, end tag: </A>; the same applies to the other elements, although in the drawings the leader line of each reference sign is attached to the start tag). Element A consists of a repetition of element B (B01, B02). Element B01 consists of an element C (C01) and an element D (D01, D02). Element B02 consists of an element C (C02) and an element D (D03). Element C01 consists of one element E (E01) and a repetition of element F (F01, F02). Element C02 consists of one element E (E02) and an element F (F03). Element D01 consists of one element G (G01) and one element D (D02). Element D03 consists of one element G (G03). Each element F (F01, F02, F03) consists of an element I and an element J. Elements E, G, I, and J consist of character strings.
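To make the structure of the XML data 221 easier to picture, the following sketch writes it out and lists the tag path of every element. FIG. 3 is not reproduced in this text, so the document below is a reconstruction from the prose only: the text values (`e01`, `i01`, and so on) are invented placeholders, and the child G of D02 is taken from the storage walkthrough rather than from this paragraph. The helper `tag_paths` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Reconstruction of XML data 221 from the description (values are placeholders).
XML_221 = """
<A>
  <B>
    <C>
      <E>e01</E>
      <F><I>i01</I><J>j01</J></F>
      <F><I>i02</I><J>j02</J></F>
    </C>
    <D>
      <G>g01</G>
      <D><G>g02</G></D>
    </D>
  </B>
  <B>
    <C>
      <E>e02</E>
      <F><I>i03</I><J>j03</J></F>
    </C>
    <D><G>g03</G></D>
  </B>
</A>
"""

def tag_paths(elem, prefix=""):
    """Return the tag path of every element, in document order."""
    path = f"{prefix}/{elem.tag}"
    result = [path]
    for child in elem:
        result.extend(tag_paths(child, path))
    return result

root = ET.fromstring(XML_221)
paths = tag_paths(root)
for p in paths:
    print(p)
```

The listing shows the three features the embodiment deals with: `/A/B/C/F` repeats below the second level, `/A/B/D/D` is recursive, and `/A/B/C/E` sits three levels deep.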
[0012]
Each of the above means generally operates as follows. The data input unit 201 inputs the XML data 221 and passes it to the data conversion unit 202. The data conversion unit 202 separates the element F (F01, F02), which is a repetitive element inside the XML data 221, into XML data 222 and XML data 223. Element B (B01, B02) is also repeated, but is not converted because it is a child element of root element A. The data conversion unit 203 separates the recursive element D (D01, D02) inside the XML data 222 into XML data 224 and XML data 225. The data conversion unit 204 moves the element E nested in three or more levels inside the XML data 224 as a child element of the child element B of the root element A and sets it as XML data 226. The data storage unit 205 stores the XML data 226 in the table A. At this time, the contents of element E are stored in column E. The XML data 225 is stored in the table Y. At this time, the contents of the element G are stored in the column G. The XML data 223 is stored in the table X. At this time, the contents of element I are stored in column I, and the contents of element J are stored in column J.
[0013]
[Description of Operation of Embodiment]
Next, the overall operation of this embodiment will be described in detail with reference to the flowcharts of FIGS. 4, 5, 6, and 7.
[0014]
The XML data 221 is input to the data conversion unit 202. New XML data 223 having an element X as its root element is generated (step A101 in FIG. 4). Here, X is a variable. If the XML data 221 has no element B, the process ends (step A102); otherwise the first element B is set as element B1 (step A103). Here, B and B1 are variables. Next, if the element B1 has no element F, the process ends (step A104); otherwise the first element F of the element B1 is set as F1 (step A105). Here, F and F1 are variables. The element F1 is removed from the element B1 and added to the element X (step A106). If the element B1 has a next element F, the process proceeds to step A108; if not, it proceeds to step A109 (step A107). In step A108, the next element F is set as F1, and the process returns to step A106. In step A109, if there is a next element B, the process proceeds to step A110; if not, the process ends. In step A110, the next element B is set as B1, and the process returns to step A104. On completion, the XML data 221 from which all elements F have been removed becomes the XML data 222.
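The flow of steps A101 to A110 might be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_repeating`, its parameters, and the sample text values are assumptions, and the sketch simply moves on to the next element B when a B contains no F, whereas the text says the process ends at step A104 in that case.

```python
import xml.etree.ElementTree as ET

def split_repeating(root, child_tag="B", repeat_tag="F", new_root_tag="X"):
    """Move every repeat_tag element found under each child_tag to a new root."""
    new_root = ET.Element(new_root_tag)            # step A101
    for b in root.findall(child_tag):              # steps A102-A103, A109-A110
        # ElementTree has no parent pointers, so build a child -> parent map.
        parent_of = {c: p for p in b.iter() for c in p}
        for f in list(b.iter(repeat_tag)):         # steps A104-A105, A107-A108
            parent_of[f].remove(f)                 # step A106: detach F ...
            f.tail = None
            new_root.append(f)                     # ... and add it under X
    return root, new_root

# Placeholder input in the shape of XML data 221 (text values are invented).
xml_221 = ET.fromstring(
    "<A>"
    "<B><C><E>e01</E><F><I>i01</I><J>j01</J></F><F><I>i02</I><J>j02</J></F></C>"
    "<D><G>g01</G><D><G>g02</G></D></D></B>"
    "<B><C><E>e02</E><F><I>i03</I><J>j03</J></F></C><D><G>g03</G></D></B>"
    "</A>"
)
xml_222, xml_223 = split_repeating(xml_221)
print(ET.tostring(xml_223, encoding="unicode"))
```

The function mutates its input, mirroring the text's statement that the XML data 221 with all elements F removed becomes the XML data 222.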
[0015]
The data conversion unit 203 receives the XML data 222 from the data conversion unit 202. New XML data 225 having an element Y as its root element is generated (step A201 in FIG. 5). Here, Y is a variable. If the XML data 222 has no element B, the process ends (step A202); otherwise the first element B is set as element B1 (step A203). Next, the element D that is a child element of the element B1 is set as D1 (step A204). Here, D and D1 are variables. Next, the element D1 is removed from its parent element and added as a child element of the element Y (step A205). If the element D1 has an element D among its child elements, the process proceeds to step A207; if not, it proceeds to step A208 (step A206). In step A207, the child element D is set as D1, and the process returns to step A205. In step A208, if there is a next element B, the process proceeds to step A209; if not, the process ends. In step A209, the next element B is set as B1, and the process returns to step A204. On completion, the XML data 222 from which all elements D have been removed becomes the XML data 224.
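Steps A201 to A209 unroll a chain of nested D elements into siblings under a new root. The sketch below illustrates that idea under assumptions: `flatten_recursive` and its parameters are hypothetical names, the sample values are placeholders, and, like the flowchart as described, it follows only one D child per level.

```python
import xml.etree.ElementTree as ET

def flatten_recursive(root, child_tag="B", rec_tag="D", new_root_tag="Y"):
    """Unroll nested rec_tag elements into siblings under a new root."""
    new_root = ET.Element(new_root_tag)         # step A201
    for b in root.findall(child_tag):           # steps A202-A203, A208-A209
        parent, d = b, b.find(rec_tag)          # step A204
        while d is not None:
            parent.remove(d)                    # step A205: detach D1 ...
            new_root.append(d)                  # ... and add it under Y
            parent, d = d, d.find(rec_tag)      # steps A206-A207: descend
    return root, new_root

# Placeholder input in the shape of XML data 222 (elements F already removed).
xml_222 = ET.fromstring(
    "<A>"
    "<B><C><E>e01</E></C><D><G>g01</G><D><G>g02</G></D></D></B>"
    "<B><C><E>e02</E></C><D><G>g03</G></D></B>"
    "</A>"
)
xml_224, xml_225 = flatten_recursive(xml_222)
print(ET.tostring(xml_225, encoding="unicode"))
```

Because each D is detached from its parent before the walk descends, every D under the new root ends up with its own G and no nested D, which is the shape a table with a single column G can hold.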
[0016]
The data conversion unit 204 receives the XML data 224 from the data conversion unit 203. If the XML data 224 has no element B, the process ends (step A301 in FIG. 6); otherwise the first element B is set as element B1 (step A302). The child element E of the child element C of the element B1 is removed from the element C and added as a child element of the element B1 (step A303). Here, C and E are variables. The element C is then deleted from the element B1 (step A304). If there is a next element B, the process proceeds to step A306; if not, the process ends (step A305). In step A306, the next element B is set as B1, and the process returns to step A303. On completion, the XML data 224 in which all elements C and E have been converted becomes the XML data 226.
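Steps A301 to A306 lift the third-level element E up one level and drop the now-empty wrapper C. A minimal sketch, assuming the hypothetical name `hoist_nested` and placeholder text values:

```python
import xml.etree.ElementTree as ET

def hoist_nested(root, child_tag="B", mid_tag="C", leaf_tag="E"):
    """Move each leaf_tag from under mid_tag up to child_tag, then drop mid_tag."""
    for b in root.findall(child_tag):        # steps A301-A302, A305-A306
        for c in b.findall(mid_tag):
            for e in c.findall(leaf_tag):    # step A303: move E from C to B1
                c.remove(e)
                b.append(e)
            b.remove(c)                      # step A304: delete C from B1
    return root

# Placeholder input in the shape of XML data 224 (F and D already removed).
xml_224 = ET.fromstring("<A><B><C><E>e01</E></C></B><B><C><E>e02</E></C></B></A>")
xml_226 = hoist_nested(xml_224)
result = ET.tostring(xml_226, encoding="unicode")
print(result)
```

After this step every element B has only leaf children, so the conventional row-and-column mapping applies directly.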
[0017]
The data storage unit 205 receives the XML data 226 from the data conversion unit 204, and the XML data 225 and the XML data 223 from the data conversion unit 203. If there is no element B in the XML data 226, the process proceeds to step A407 (step A401 in FIG. 7). If there is an element B in the XML data 226, the first element B of the XML data 226 is set as element B1 (step A402). Next, a row A1 is added to the table A (step A403). Here, A1 is a variable. Next, the contents of the child element E of the element B1 are stored in column E of row A1 (step A404). If there is a next element B, the process proceeds to step A406; if not, the process proceeds to step A407 (step A405). In step A406, the next element B is set as B1, and the process returns to step A403. In step A407, if there is no element D in the XML data 225, the process proceeds to step A413; if there is an element D, the process proceeds to step A408. In step A408, the first element D of the XML data 225 is set as element D1. A row Y1 is added to the table Y (step A409). Here, Y1 is a variable. Next, the contents of the child element G of the element D1 are stored in column G of row Y1 (step A410). Here, G is a variable. If there is a next element D, the process proceeds to step A412; otherwise, the process proceeds to step A413 (step A411). In step A412, the next element D is set as D1, and the process returns to step A409. In step A413, if there is no element F in the XML data 223, the process ends; if there is an element F, the process proceeds to step A414. In step A414, the first element F of the XML data 223 is set as element F1. A row X1 is added to the table X (step A415). Next, the contents of the child element I of the element F1 are stored in column I of row X1 (step A416). Here, X1 and I are variables. Next, the contents of the child element J of the element F1 are stored in column J of row X1 (step A417). Here, J is a variable. Next, if there is a next element F, the process proceeds to step A419; if not, the process ends (step A418). In step A419, the next element F is set as F1, and the process returns to step A415.
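The three storage loops of FIG. 7 share one pattern: one row per repeated element, one column per child element. The following Python sketch shows that pattern under stated assumptions: an in-memory SQLite database stands in for the database 211, the helper name `store_rows` is hypothetical, and the sample XML strings are illustrative rather than copied from the figures.

```python
import sqlite3
import xml.etree.ElementTree as ET

def store_rows(conn, table, root, row_tag, columns):
    """Add one row per row_tag child of root and fill one column per child tag.

    Table and column names are interpolated directly, so they must be trusted.
    """
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_list})")
    for elem in root.findall(row_tag):                       # e.g. steps A402, A406
        values = [elem.findtext(col) for col in columns]     # e.g. step A404
        conn.execute(
            f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})", values
        )                                                    # e.g. step A403

# Hypothetical stand-ins for XML data 226, 225 and 223.
xml_226 = ET.fromstring("<A><B><E>e1</E></B><B><E>e2</E></B></A>")
xml_225 = ET.fromstring("<Y><D><G>g1</G></D><D><G>g2</G></D><D><G>g3</G></D></Y>")
xml_223 = ET.fromstring("<X><F><I>i1</I><J>j1</J></F><F><I>i2</I><J>j2</J></F></X>")

conn = sqlite3.connect(":memory:")
store_rows(conn, "A", xml_226, "B", ["E"])        # steps A401-A406
store_rows(conn, "Y", xml_225, "D", ["G"])        # steps A407-A412
store_rows(conn, "X", xml_223, "F", ["I", "J"])   # steps A413-A419
rows_x = conn.execute("SELECT I, J FROM X ORDER BY rowid").fetchall()
print(rows_x)
```

Because each converted document is at most two levels deep below its root, a single generic loop is enough for all three tables.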
[0018]
Next, a specific example will be described for the case where the XML data 221 shown in FIG. 3 is input. The XML data 221 is input to the data conversion unit 202. The data conversion unit 202 generates new XML data 223 having the element X as its root element (step A101 in FIG. 4) (FIG. 8(a)). The first element B of the XML data 221 shown in FIG. 3, element B01, is extracted (step A103). The first element F of the element B01, element F01, is extracted (step A105). F01 is removed from B01 and added to the element X (step A106) (FIG. 8(b), upper right of FIG. 8). The next element F of the element B01, element F02, is extracted (step A108). F02 is removed from B01 and added to the element X (step A106) (FIG. 8(c)). The next element B, element B02, is extracted (step A110). The first element F of the element B02, element F03, is extracted (step A105). F03 is removed from B02 and added to the element X (step A106) (FIG. 8(d)). The XML data 221 from which all elements F have been removed is taken as XML data 222 (FIG. 8(e)).
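The extraction traced in this example can be sketched in Python as follows. The function name `extract_repeated` and the sample document are assumptions for illustration; the sample only imitates the shape described in the text (two elements B, three elements F in total) and is not a copy of FIG. 3.

```python
import xml.etree.ElementTree as ET

def extract_repeated(doc_root, tag, new_root_tag):
    """Detach every `tag` child of each element B and collect it under a new root."""
    collected = ET.Element(new_root_tag)          # step A101: new root element X
    for b in doc_root.findall("B"):               # steps A103, A110
        for f in b.findall(tag):                  # steps A105, A108
            b.remove(f)                           # step A106: remove F from B
            collected.append(f)                   # step A106: add F to X
    return doc_root, collected

# Hypothetical stand-in for XML data 221.
xml_221 = ET.fromstring(
    "<A>"
    "<B><F><I>i1</I></F><F><I>i2</I></F><C><E>e1</E></C></B>"
    "<B><F><I>i3</I></F><C><E>e2</E></C></B>"
    "</A>"
)
xml_222, xml_223 = extract_repeated(xml_221, "F", "X")
print(ET.tostring(xml_223, encoding="unicode"))
print(ET.tostring(xml_222, encoding="unicode"))
```

The first printed document corresponds to XML data 223 (all elements F under X), and the second to XML data 222 (the input with every element F removed).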
[0019]
The data conversion unit 203 receives the XML data 222 shown in FIG. 8(e) from the data conversion unit 202. New XML data 225 having the element Y as its root element is generated (step A201 in FIG. 5) (FIG. 9(a)). The first element B of the XML data 222 in FIG. 8(e), element B01, is extracted (step A203). The child element D of the element B01, element D01, is extracted (step A204). D01 is removed from B01 and added as a child element of the element Y (step A205) (FIG. 9(b), upper right of FIG. 9). The element D that is a child element of the element D01, element D02, is extracted (step A207). D02 is removed from D01 and added as a child element of the element Y (step A205) (FIG. 9(c)). The next element B, element B02, is extracted (step A209). The element D that is a child element of the element B02, element D03, is removed from B02 and added as a child element of the element Y (step A205) (FIG. 9(d)). The XML data 222 from which all elements D have been removed is taken as XML data 224 (FIG. 9(e)).
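The flattening of the recursive element D traced above can be sketched in Python as follows. The function name `flatten_recursive`, the work-list approach, and the sample document are assumptions for illustration; the flowchart of FIG. 5 may organize the loop differently.

```python
import xml.etree.ElementTree as ET

def flatten_recursive(doc_root, tag, new_root_tag):
    """Detach every `tag` element under each B, including nested ones,
    and append them as siblings under a new root in document order."""
    collected = ET.Element(new_root_tag)      # step A201: new root element Y
    for b in doc_root.findall("B"):           # steps A203, A209
        pending = b.findall(tag)              # step A204: top-level D of this B
        for d in pending:
            b.remove(d)
        while pending:
            d = pending.pop(0)
            nested = d.findall(tag)           # step A207: D nested inside D
            for n in nested:
                d.remove(n)
            collected.append(d)               # step A205: add D under Y
            pending = nested + pending        # visit nested D before later siblings
    return doc_root, collected

# Hypothetical stand-in for XML data 222: D01 contains D02, B02 contains D03.
xml_222 = ET.fromstring(
    "<A>"
    "<B><D><G>g1</G><D><G>g2</G></D></D><C><E>e1</E></C></B>"
    "<B><D><G>g3</G></D><C><E>e2</E></C></B>"
    "</A>"
)
xml_224, xml_225 = flatten_recursive(xml_222, "D", "Y")
print(ET.tostring(xml_225, encoding="unicode"))
```

Each element D ends up as a direct child of Y with only its non-recursive children left, which is the shape a single table row can hold.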
[0020]
The data conversion unit 204 receives the XML data 224 shown in FIG. 9(e) from the data conversion unit 203. The first element B of the XML data 224, element B01, is extracted (step A302). The child element E (E01) of the child element C (C01) of the element B01 is removed from the element C and added as a child element of the element B01 (step A303) (FIG. 10(a)). The element C is deleted from the element B01 (step A304) (FIG. 10(b)). The next element B, element B02, is extracted (step A306). The child element E (E02) of the child element C (C02) of the element B02 is removed from the element C and added as a child element of the element B02 (step A303) (FIG. 10(c)). The element C is deleted from the element B02 (step A304) (FIG. 10(d)). The XML data 224 in which all the elements C and E have been converted is taken as XML data 226 (FIG. 10(e)).
[0021]
The data storage unit 205 receives the XML data 226 shown in FIG. 10(e) from the data conversion unit 204, and the XML data 225 shown in FIG. 9(d) and the XML data 223 shown in FIG. 8(d) from the data conversion unit 203. The first element B of the XML data 226, element B01, is extracted (step A402 in FIG. 7). A row A01 is added to the table A (step A403) (FIG. 11(a)). The contents of the child element E of the element B01 are stored in column E of row A01 (step A404) (FIG. 11(b)). The next element B, element B02, is extracted (step A406). A row A02 is added to the table A (step A403) (FIG. 11(c)). The contents of the child element E of the element B02 are stored in column E of row A02 (step A404) (FIG. 11(d)). The first element D of the XML data 225, element D01, is extracted (step A408). A row Y01 is added to the table Y (step A409) (FIG. 11(e)). The contents of the child element G of the element D01 are stored in column G of row Y01 (step A410) (FIG. 11(f)). The next element D, element D02, is extracted (step A412). A row Y02 is added to the table Y (step A409). The contents of the child element G of the element D02 are stored in column G of row Y02 (step A410). The next element D, element D03, is extracted (step A412). A row Y03 is added to the table Y (step A409). The contents of the child element G of the element D03 are stored in column G of row Y03 (step A410) (FIG. 11(g)). The first element F of the XML data 223, element F01, is extracted (step A414). A row X01 is added to the table X (step A415) (FIG. 11(h)). The contents of the child element I of the element F01 are stored in column I of row X01 (step A416). The contents of the child element J of the element F01 are stored in column J of row X01 (step A417) (FIG. 11(i)). The next element F, element F02, is extracted (step A419). A row X02 is added to the table X (step A415) (FIG. 11(j)). The contents of the child element I of the element F02 are stored in column I of row X02 (step A416). The contents of the child element J of the element F02 are stored in column J of row X02 (step A417) (FIG. 11(k)). The next element F, element F03, is extracted (step A419). A row X03 is added to the table X (step A415) (FIG. 11(l)). The contents of the child element I of the element F03 are stored in column I of row X03 (step A416). The contents of the child element J of the element F03 are stored in column J of row X03 (step A417) (FIG. 11(m)).
[0022]
According to the present embodiment, the following effects can be obtained. The first effect is that an XML document including a tag repeating structure can be stored in a database in a table format. The reason is that the portion having the tag repetitive structure is converted into a form that can be stored as another data in a table format. The second effect is that an XML document including a recursive tag structure can be stored in a table format database. The reason is that the portion having the recursive tag structure is converted into a form that can be stored as separate data in a table format. The third effect is that an XML document including a tag nesting structure of three or more levels can be stored in a table format database. The reason is that the element of the part having a nested structure of three or more levels is moved to a higher element and converted into a form that can be stored in a table format database.
[0023]
[Other Embodiments of the Invention]
Next, another embodiment of the present invention will be described in detail with reference to the drawings. Referring to FIG. 12, another embodiment of the XML data storage device of the present invention includes an XML data input means 301, a data conversion means 302, a data storage means 303, and a database 311. An example of the XML data 321 is shown in FIG. 13. The XML data 321 has a root element A. The element A consists of a repetition of element B (B01, B02). The element B01 consists of a repetition of an element C (CT02) and character strings (CT01, CT03). The element B02 consists of a repetition of elements C (CT04, CT06) and a character string (CT05).
[0024]
Each of the above means generally has the following functions. The data input unit 301 inputs XML data 321 and passes it to the data conversion unit 302. The data conversion unit 302 separates the element C and the character string, which are repetitive elements inside the XML data 321, into XML data 322 and XML data 323. The data storage unit 303 stores the XML data 323 in the table X. The contents of element C are stored in column C, the character string is stored in column TEXT, and the position information of element B is stored in column ID.
[0025]
Next, the overall operation of this embodiment will be described in detail with reference to the flowcharts of FIGS. 14 and 15. The XML data 321 is input to the data conversion unit 302. New XML data 323 having the element X as its root element is generated (step A501 in FIG. 14). If there is no element B in the XML data 321, the process ends (step A502). The first element B is set as element B1 (step A503). The order position of the element B1 is set as ID1 (step A504). If there is no element C or character string in the element B1, the process proceeds to step A514 (step A505). If the element B1 contains an element C or a character string, the first element C or character string of the element B1 is set as CT1 (step A506). An element Y is generated and set as Y1 (step A507). Y1 is added to the element X (step A508). CT1 is removed from B1 and added to the element Y1 (step A509). An element ID is added to the element Y1 (step A510). ID1 is set as the content of the element ID (step A511). If there is a next element C or character string in the element B1, the process proceeds to step A513; if not, the process proceeds to step A514 (step A512). The next element C or character string is set as CT1 (step A513). If there is a next element B, the process proceeds to step A515; if not, the process ends (step A514). The next element B is set as B1 (step A515). The XML data 321 from which all the elements C and character strings that are children of the element B have been removed is taken as XML data 322.
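The conversion of FIG. 14 can be sketched in Python as follows. Several points are assumptions for illustration: the function name `split_mixed_content`, the sample document, the premise that each element B contains only elements C and character strings, and the fact that `xml.etree.ElementTree` exposes mixed content through the `text` and `tail` attributes rather than as separate nodes.

```python
import xml.etree.ElementTree as ET

def split_mixed_content(doc_root):
    """Wrap each element C or character string of every B in an element Y
    under a new root X, tagging it with the order position of its parent B."""
    collected = ET.Element("X")                                     # step A501
    for position, b in enumerate(doc_root.findall("B"), start=1):   # steps A503, A504, A515
        items = []
        if b.text and b.text.strip():
            items.append(b.text)                # leading character string
        for child in list(b):
            tail, child.tail = child.tail, None
            b.remove(child)                     # step A509: remove CT1 from B1
            items.append(child)
            if tail and tail.strip():
                items.append(tail)              # character string after the element
        b.text = None
        for item in items:                      # steps A506, A513
            y = ET.SubElement(collected, "Y")   # steps A507, A508
            if isinstance(item, str):
                y.text = item                   # step A509 (character string)
            else:
                y.append(item)                  # step A509 (element C)
            ET.SubElement(y, "ID").text = str(position)   # steps A510, A511
    return doc_root, collected

# Hypothetical stand-in for XML data 321, following the description of B01 and B02.
xml_321 = ET.fromstring(
    "<A><B>CT01<C>CT02</C>CT03</B><B><C>CT04</C>CT05<C>CT06</C></B></A>"
)
xml_322, xml_323 = split_mixed_content(xml_321)
for y in xml_323:
    print(y.text, y.findtext("C"), y.findtext("ID"))
```

Each printed line shows, for one element Y, its character string (if any), the content of its element C (if any), and the position of the element B it came from.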
[0026]
The data storage unit 303 receives the XML data 323 from the data conversion unit 302. If there is no element Y in the XML data 323, the process ends (step A601 in FIG. 15). The first element Y of the XML data 323 is set as element Y1 (step A602). A row X1 is added to the table X (step A603). If there is a child element C in the element Y1, its contents are stored in column C of row X1 (steps A604 and A605). If there is a character string in the element Y1, it is stored in column TEXT of row X1 (steps A606 and A607). The contents of the child element ID of the element Y1 are stored in column ID of row X1 (step A608). If there is a next element Y, the process proceeds to step A610. The next element Y is set as Y1 (step A610). Next, a specific example will be described.
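The storage loop of FIG. 15 can be sketched in Python as follows. An in-memory SQLite database stands in for the database 311, and the function name `store_mixed_rows` and the sample document are assumptions for illustration. Column names are quoted because TEXT is also a common SQL type name.

```python
import sqlite3
import xml.etree.ElementTree as ET

def store_mixed_rows(conn, xml_323):
    """Add one row to table X per element Y, filling columns C, TEXT and ID."""
    conn.execute('CREATE TABLE IF NOT EXISTS X ("C", "TEXT", "ID")')
    for y in xml_323.findall("Y"):                                   # steps A602, A610
        c_value = y.findtext("C")                                    # steps A604, A605
        text_value = y.text if y.text and y.text.strip() else None   # steps A606, A607
        id_value = y.findtext("ID")                                  # step A608
        conn.execute(
            'INSERT INTO X ("C", "TEXT", "ID") VALUES (?, ?, ?)',
            (c_value, text_value, id_value),
        )                                                            # step A603

# Hypothetical stand-in for part of XML data 323.
xml_323 = ET.fromstring(
    "<X><Y>CT01<ID>1</ID></Y><Y><C>CT02</C><ID>1</ID></Y><Y>CT03<ID>1</ID></Y></X>"
)
conn = sqlite3.connect(":memory:")
store_mixed_rows(conn, xml_323)
rows = conn.execute('SELECT "C", "TEXT", "ID" FROM X ORDER BY rowid').fetchall()
print(rows)
```

In each row exactly one of the columns C and TEXT is filled, and the column ID records which element B the item originally belonged to.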
[0027]
The XML data 321 is input to the data conversion unit 302. New XML data 323 having the element X as its root element is generated (step A501 in FIG. 14). The first element B, element B01, is extracted (step A503). The order position 1 of the element B01 is set as ID1 (step A504). The first character string of the element B01, CT01, is extracted (step A506). An element Y is generated and set as Y01 (step A507). Y01 is added to the element X (step A508). CT01 is removed from B01 and added to the element Y01 (step A509). An element ID is added to the element Y01 (step A510). The value 1 of ID1 is set as the content of the element ID (step A511). The next element C, CT02, is extracted (step A513). An element Y is generated and set as Y02 (step A507). Y02 is added to the element X (step A508). CT02 is removed from B01 and added to the element Y02 (step A509). An element ID is added to the element Y02 (step A510). The value 1 of ID1 is set as the content of the element ID (step A511). The next character string, CT03, is extracted (step A513). An element Y is generated and set as Y03 (step A507). Y03 is added to the element X (step A508). CT03 is removed from B01 and added to the element Y03 (step A509). An element ID is added to the element Y03 (step A510). The value 1 of ID1 is set as the content of the element ID (step A511). The next element B, element B02, is extracted (step A515). The order position 2 of the element B02 is set as ID1 (step A504). The first element C of the element B02, CT04, is extracted (step A506). An element Y is generated and set as Y04 (step A507). Y04 is added to the element X (step A508). CT04 is removed from B02 and added to the element Y04 (step A509). An element ID is added to the element Y04 (step A510). The value 2 of ID1 is set as the content of the element ID (step A511). The next character string, CT05, is extracted (step A513). An element Y is generated and set as Y05 (step A507). Y05 is added to the element X (step A508). CT05 is removed from B02 and added to the element Y05 (step A509). An element ID is added to the element Y05 (step A510). The value 2 of ID1 is set as the content of the element ID (step A511). The next element C, CT06, is extracted (step A513). An element Y is generated and set as Y06 (step A507). Y06 is added to the element X (step A508). CT06 is removed from B02 and added to the element Y06 (step A509). An element ID is added to the element Y06 (step A510). The value 2 of ID1 is set as the content of the element ID (step A511). The XML data 321 from which all the elements C and character strings that are children of the element B have been removed is taken as XML data 322. The created XML data 323 is shown in FIG. 16.
[0028]
The data storage unit 303 receives the XML data 323 from the data conversion unit 302. The first element Y of the XML data 323, element Y01, is extracted (step A602 in FIG. 15). A row X01 is added to the table X (step A603). The character string of the element Y01 is stored in column TEXT of row X01 (step A607). The contents of the child element ID of the element Y01 are stored in column ID of row X01 (step A608). The next element Y, element Y02, is extracted (step A610). A row X02 is added to the table X (step A603). The contents of the child element C of the element Y02 are stored in column C of row X02 (step A605). The contents of the child element ID of the element Y02 are stored in column ID of row X02 (step A608). The next element Y, element Y03, is extracted (step A610). A row X03 is added to the table X (step A603). The character string of the element Y03 is stored in column TEXT of row X03 (step A607). The contents of the child element ID of the element Y03 are stored in column ID of row X03 (step A608). The next element Y, element Y04, is extracted (step A610). A row X04 is added to the table X (step A603). The contents of the child element C of the element Y04 are stored in column C of row X04 (step A605). The contents of the child element ID of the element Y04 are stored in column ID of row X04 (step A608). The next element Y, element Y05, is extracted (step A610). A row X05 is added to the table X (step A603). The character string of the element Y05 is stored in column TEXT of row X05 (step A607). The contents of the child element ID of the element Y05 are stored in column ID of row X05 (step A608). The next element Y, element Y06, is extracted (step A610). A row X06 is added to the table X (step A603). The contents of the child element C of the element Y06 are stored in column C of row X06 (step A605). The contents of the child element ID of the element Y06 are stored in column ID of row X06 (step A608). The created table X is shown in FIG. 17.
[0029]
The first effect of this embodiment is that repetitions in which character strings and elements are mixed can be stored in a table format database. The reason is that each item in a portion where tags and character strings repeat is wrapped in an element whose content is that element or character string, so that the items can be gathered as separate data and converted into a form that can be stored in a table format. The second effect is that the original position information of the character strings and element data stored in the table format database can be preserved. The reason is that the position information of the parent element is added when each character string or element is wrapped, and that information is also stored in the table.
[0030]
Note that embodiments of the present invention are not limited to the forms described above and can be modified as appropriate. For example, the position of a conversion means in each embodiment can be exchanged with that of another conversion means. It is also conceivable to add a configuration for adding an element representing position information of an element used in the embodiment described with reference to FIG. 18 to the embodiment described with reference to FIG. The data storage device of the present invention can be realized by using a computer and a program executed by the computer, and the program can be distributed via a computer-readable recording medium or a communication line.
[0031]
[Effects of the Invention]
As described above, according to the invention, the following effects can be obtained. The first effect is that an XML document including a repeating tag structure can be stored in a table format database. The reason is that the portion having the repeating tag structure is converted into a form that can be stored as separate data in a table format. The second effect is that an XML document including a recursive tag structure can be stored in a table format database. The reason is that the portion having the recursive tag structure is converted into a form that can be stored as separate data in a table format. The third effect is that an XML document including a tag nesting structure of three or more levels can be stored in a table format database. The reason is that the elements in the portion nested three or more levels deep are moved up to a higher element and converted into a form that can be stored in a table format database. The fourth effect is that repetitions in which character strings and elements are mixed can be stored in a table format database. The reason is that each item in a portion where tags and character strings repeat is wrapped in an element whose content is that element or character string, so that the items can be gathered as separate data and converted into a form that can be stored in a table format. The fifth effect is that the original position information of the character strings and element data stored in the table format database can be preserved. The reason is that the position information of the parent element is added when each character string or element is wrapped, and that information is also stored in the table.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of an embodiment of a data storage device according to the present invention.
FIG. 2 is a block diagram showing a configuration of an embodiment of a data storage device according to the present invention.
FIG. 3 is a diagram showing an example of XML data 221 in FIG. 2.
FIG. 4 is a flowchart showing the operation of the configuration of FIG. 2 (data conversion means 202).
FIG. 5 is a flowchart showing the operation of the configuration of FIG. 2 (data conversion means 203).
FIG. 6 is a flowchart showing the operation of the configuration of FIG. 2 (data conversion means 204).
FIG. 7 is a flowchart showing the operation of the configuration of FIG. 2 (data storage means 205).
FIG. 8 is a diagram showing the creation progress (a) to (d) of XML data 223 by the data conversion means 202 in FIG. 2 and the created XML data 222 (e).
FIG. 9 is a diagram showing the creation progress (a) to (d) of XML data 225 by the data conversion means 203 in FIG. 2 and the created XML data 224 (e).
FIG. 10 is a diagram showing the conversion progress (a) to (d) of XML data 224 by the data conversion means 204 in FIG. 2 and the created XML data 226 (e).
FIG. 11 is a diagram showing the creation progress (a) to (m) of tables A, X, and Y in the database 211 by the data storage means 205 in FIG. 2.
FIG. 12 is a block diagram showing a configuration of another embodiment of a data storage device according to the present invention.
FIG. 13 is a diagram showing an example of XML data 321 in FIG. 12.
FIG. 14 is a flowchart showing the operation of the configuration of FIG. 12 (data conversion means 302).
FIG. 15 is a flowchart showing the operation of the configuration of FIG. 12 (data storage means 303).
FIG. 16 is a diagram showing an example of the contents of XML data 323 created by the data conversion means 302 in FIG. 12.
FIG. 17 is a diagram showing an example of creating a table X in the database 311 by the data storage means 303 in FIG. 12.
FIG. 18 is a block diagram showing a configuration of a conventional data storage device.
FIG. 19 is a diagram showing the contents of XML data 421 in FIG. 18.
FIG. 20 is a diagram showing the contents of a table 422 in FIG. 18.
[Explanation of symbols]
101-106, 221-226, 321-323 ... XML data
111-113, 202-204, 302 ... data conversion means
201, 301 ... Data input means
121, 211, 311 ... database
114 ... Data updating means
205, 303 ... Data storage means

Claims

A data storage device for converting XML (eXtensible Markup Language) data into tabular data and storing it in predetermined storage means, the data storage device comprising:
first data conversion means (202) for separating an element F, which is a repeating element inside input first XML data (221) and is not the child element B of the root element A, and generating second XML data (222) from which the element F has been removed and third XML data (223) in which the element F has been added to new XML data;
second data conversion means (203) for separating a recursive element D inside the second XML data (222), and creating fourth XML data (224) from which the element D has been removed and fifth XML data (225) in which the element D has been added to new XML data;
third data conversion means (204) for generating sixth XML data (226) by moving an element E, which is nested three or more levels deep inside the fourth XML data (224), so that it becomes a child element of the child element B of the root element A; and
data storage means (205) for storing the sixth XML data (226) in a first table, the fifth XML data (225) in a second table, and the third XML data (223) in a third table.

A data storage device for converting XML (eXtensible Markup Language) data into tabular data and storing it in predetermined storage means, the data storage device comprising:
data conversion means (302) for separating an element C and character strings, which are repeating elements inside input XML data (321), and creating XML data (323) in which the element C and the character strings are each added to new XML data as an element, together with an element ID indicating the order position of the child element B of the root element A; and
data storage means (303) for storing the XML data (323) in a table such that the element C and the character strings are stored together with the element ID.

A data storage method for converting XML (eXtensible Markup Language) data into tabular data and storing it in predetermined storage means, the data storage method comprising:
a first data conversion process of separating an element F, which is a repeating element inside input first XML data (221) and is not the child element B of the root element A, and creating second XML data (222) from which the element F has been removed and third XML data (223) in which the element F has been added to new XML data;
a second data conversion process of separating a recursive element D inside the second XML data (222), and generating fourth XML data (224) from which the element D has been removed and fifth XML data (225) in which the element D has been added to new XML data;
a third data conversion process of generating sixth XML data (226) by moving an element E, which is nested three or more levels deep inside the fourth XML data (224), so that it becomes a child element of the child element B of the root element A; and
a data storage process of storing the sixth XML data (226) in a first table, the fifth XML data (225) in a second table, and the third XML data (223) in a third table.