JP3620203B2 - Database search processing method

Info

Publication number: JP3620203B2
Application number: JP05880197A
Authority: JP
Inventors: 真二藤原; 一智牛嶋
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1996-03-13
Filing date: 1997-03-13
Publication date: 2005-02-16
Anticipated expiration: 2017-03-13
Also published as: JPH09305614A

Description

【０００１】
【発明の属する技術分野】
本発明はテーブルの参照を行うデータベースシステムに係り、特に複数の検索処理を同時に実行する検索処理方法に関する。
【０００２】
【従来の技術】
商業用応用並列サーバ市場の拡大に伴って実用化されつつあるデータウェアハウスは、一般的には、膨大なデータを蓄積する企業データウェアハウスと部門毎に抽出されたデータを格納する部門データウェアハウスと多数のクライアント端末からなる３階層の企業情報システムである。
【０００３】
このデータウェアハウスでは、生データを保持するテーブルを様々な角度から分析する多次元解析が頻繁に行われる。そのため、一つのテーブルに対して種々の条件で検索が繰り返し実行される。データベースシステムでは、これらの検索を別々に実行していた。データウェアハウスが保持する生データはＴＢ規模の膨大な量でありデータを全件スキャンするだけでも多くの時間を要する。そのため、種々の条件で検索を繰り返し実行した場合、生データの全件スキャンを何度も繰り返す必要があり、膨大な処理時間が必要であるという問題点があった。
【０００４】
図３は複数の検索処理を含む検索要求のＳＱＬ文の例である。図３は、ＷＩＴＨ句で指示される２つの検索処理Ｑ１およびＱ２を含む。検索処理Ｑ１はテーブルＴ１とＤ３列が１００未満のテーブルＴ２をＣ１列をキーとしてジョインし、Ｃ１列とＣ２列でグループ化し、Ｄ１列とＤ２列のグループ毎の合計を求める。検索処理Ｑ２はＤ４列が１００未満のテーブルＴ２とテーブルＴ３をＣ３列をキーとしてジョインし、Ｃ２列とＣ３列でグループ化し、Ｄ１列とＤ２列のグループ毎の合計を求める。図３のＳＱＬでは、上記のＱ１及びＱ２の処理をＵＮＩＯＮ ＡＬＬで一つの検索結果としてまとめる。この際、取り出した結果がＱ１の検索処理の結果かＱ２の検索処理の結果かが分かるように先頭に識別子“Ｑ１”および“Ｑ２”をそれぞれ付加してまとめる。本ＳＱＬでは、テーブルＴ１とＴ２を結合してグループ化する処理と、テーブルＴ２とＴ３を結合してグループ化する処理を含んでいる。本検索要求を従来方式で実行すると、テーブルＴ２が２回スキャンされる。このときの処理の流れを図２に示す。
【０００５】
図２は、複数の２次記憶装置２８〜３０に格納されているテーブルＴ１、Ｔ２、Ｔ３をスキャンバックエンドサーバ（ＳｃａｎＢＥＳ）６で読み込み、ジョインバックエンドサーバ（ＪｏｉｎＢＥＳ）４及び５に転送して処理する様子を示している。スキャンバックエンドサーバ６は、２次記憶から読み込んだデータを主記憶上にキャッシングするためのデータベースバッファ１７を有し、テーブルＴ１とＴ２を結合するためのテーブルスキャン処理１４及び３１と、テーブルＴ２とＴ３を結合するためのテーブルスキャン処理３２と１６の合計４つのテーブルスキャン処理を独立に実行する。
【０００６】
図２の処理では、テーブルＴ２のスキャン処理が２回重複して実行されるため、データベースバッファからテーブルスキャン処理がデータを取ってくる処理を２回重複して行わなければならず、さらに、Ｔ２のスキャン処理３１と３２の処理速度に差がある場合には、データベースバッファ１７上にそれぞれのスキャン処理が参照すべきテーブルＴ２のキャッシュ効果がなくなり、最悪の場合、２次記憶に対するＩ／Ｏ処理を２回発行する事になってしまう。
【０００７】
これらの問題を解決するために、従来の技術では特願平６−１３５４６４号公報に記載されているように複数の検索処理間で同一のデータをアクセスする場合には、同時実行制御を行うことにより、１回のＩ／Ｏでデータベースバッファに取り込まれたデータを複数の検索処理で利用可能にする技術が提案されてきた。
【０００８】
【発明が解決しようとする課題】
従来の複数の検索処理の同時実行制御方式では、データベースバッファのヒット率を高めるために各検索処理間で同期制御を行う必要があった。しかしながら、各検索要求はデータベースシステムに非同期に投入されるため、その同期制御を行うためのオーバヘッドが大きかった。さらに、各検索処理間でデータをスキャンする速度に差が生じる場合には、頻繁に同期制御を行わないとデータベースバッファに保持されているデータを共有できないという問題点があった。
【０００９】
また、従来の方式では、２次記憶からデータベースバッファにデータを読み出す処理は共通化されていたが、データベースバッファから各検索処理のローカルバッファにデータを読み出す処理を各検索処理毎に行っていたため、十分な処理の共通化が行えなかった。
【００１０】
複数の検索処理を一連の検索要求として一括化して発行する手段としてＳＱＬのストアドプロシジャやＳＱＬ３で標準化されるＷＩＴＨ句が提供されている。データウェアハウスの多次元解析では、複数の検索処理が同時に発行されるため、これらのストアドプロシジャやＷＩＴＨ句を用いた手段で１回の検索要求にまとめることができる。
【００１１】
複数の検索処理を一括化した一連の検索要求を発行する手段としては、上記以外にもたとえば、データベースシステムの問い合わせ受け付けサーバで複数のＳＱＬ文をバッファリングして一括化して発行する手段等も考えられる。
【００１２】
本発明は、上記のような一連の検索要求の中に内在する複数の検索処理を効率良く実行する方法を提供することを目的とする。
【００１３】
本発明は、複数の検索処理で構成される一連の検索要求をコンパイルして検索実行グラフを生成する方法を提供することを目的とする。
【００１４】
さらに、本発明は、複数の検索処理を同時に実行する時に発生しうるデッドロック状態をコンパイル時に検出し、それを回避する方法を提供することを目的とする。
【００１５】
【課題を解決するための手段】
本発明の実施例では、一つのテーブルに対して複数の検索処理を実行するときに、該複数の検索処理で共通な部分検索処理を一つにまとめて複数の検索処理間で共有する処理方法を提供する。この処理方法は、複数の検索処理を含む一つの検索要求を解析して検索処理の実行グラフを生成するステップと、前記生成された実行グラフに従い検索処理を実行するステップとで構成される。
【００１６】
検索要求を解析して検索処理の実行グラフを生成するステップは、該検索要求に含まれる複数の検索処理を解析して各検索処理毎の実行木を作成するステップと、各検索処理の実行木の間で共通な部分を切り出して、該共通な部分を共有した実行グラフに変換するステップと、該変換された実行グラフを解析して実行時にデッドロック状態になる可能性があるかどうかを検出してデッドロック状態になりうる場合には共通な部分グラフの一部を非共有化してデッドロックフリーな実行グラフに変換するステップとで構成される。
【００１７】
検索処理を実行するステップは、同一テーブルに対する複数の検索処理を実行する一括スキャン処理ステップを有し、一括スキャン処理ステップは、入力バッファに読み込まれた一つのデータを複数の検索条件に共通な条件で予め絞り込むステップと、該共通条件に適合したデータをさらに各検索条件に複数回照合して該検索条件に適合した全ての検索処理に対応する複数の出力バッファにそれぞれ出力するステップとで構成される。
【００１８】
【発明の実施の形態】
図１は本発明を実施するデータベースにおける複数の検索処理を同時に実行する方法を示した図である。端末１が図３で示される検索要求をデータベースシステム２に対して発行すると、検索要求受付サーバ（フロントエンドサーバ）３が該検索要求を受け付け、これをコンパイルする。今、コンパイルの結果、図４で示す様な検索処理の実行グラフを生成したとする。以下、図４で示される検索処理を実行するものとして説明する。コンパイル後、次に検索処理を実行するのに必要な各種バックエンドサーバを起動し、検索処理の実行を指示する。図４に示す検索実行グラフは２つの結合演算と３つのテーブルスキャン処理を含んでいるので、ジョインバックエンドサーバ４及び５とスキャンバックエンドサーバ６が起動される。
【００１９】
ジョインバックエンドサーバでは、ハッシュジョインとグループ化を行う処理が実行される。
【００２０】
テーブルＴ１またはＴ３をスキャンするスキャンバックエンドサーバでは、一つの条件でヒット判定を行い、ヒットしたデータを出力先（ジョインバックエンドサーバ）に出力するシングルスキャン処理１４、１６が実行される。
【００２１】
一方、テーブルＴ２をスキャンするスキャンバックエンドサーバでは、複数の条件でヒット判定を行いヒットした条件に対応する出力先（ジョインバックエンドサーバ）にデータを出力する一括スキャン処理１５が実行される。この一括スキャン処理では、１回のスキャン処理で読み込んだデータを複数の検索処理の検索条件でヒット判定する。
【００２２】
テーブルＴ１のシングルスキャン処理１４は、２次記憶からＤＢバッファ１７内にテーブルＴ１のデータを読み込み、更にそのデータをシングルスキャン処理のバッファ内に取り込み、当該処理済みのデータをジョインバックエンドサーバに出力する。
【００２３】
テーブルＴ３のシングルスキャン処理１６は、シングルスキャン処理１４と同様に、２次記憶からＤＢバッファ１７内にテーブルＴ３のデータを読み込み、更にそのデータをシングルスキャン処理のバッファ内に取り込み、当該処理済みのデータをジョインバックエンドサーバに出力する。
【００２４】
テーブルＴ２の一括スキャン処理１５は、２次記憶からＤＢバッファ内にテーブルＴ２のデータを読み込みさらにそのデータを一括スキャン処理のバッファ内に取り込み、それぞれの結合演算に対応する条件で取り込んだデータを複数回判定する（２２）。各条件に適合したデータは一括スキャン処理によって対応する出力バッファに格納され、それぞれのジョインバックエンドサーバに出力される。
【００２５】
各スキャン処理が出力したデータは、２次記憶装置に格納されることなくジョインバックエンドサーバで順次処理される。さらにジョインバックエンドサーバで求められた演算結果はフロントエンドサーバ３に順次出力され、最終的に検索要求を発行した端末１に出力される。このように本方式では各サーバ間のデータの受け渡しに２次記憶を用意する必要がないため、各処理を独立した制御装置で実行した場合には処理がパイプライン的に実行できる。
【００２６】
本実施例では各処理において出力したデータが次の処理の入力に転送されるように記述しているが、入出力のバッファを共有することによりデータの転送を省略することももちろん可能である。
【００２７】
また、本実施例では一つの端末からあらかじめ複数の検索処理を含む検索要求が発行されている例を示しているが、検索要求受け付けサーバで複数の検索処理をまとめることも可能であり、そのような場合でも本発明による検索処理の同時実行方式が適用できることは明らかである。
【００２８】
図５は一括スキャン処理の詳細な方法を示す図である。一括スキャン処理１５は、一括スキャン処理ノード（Ｍ−ＳＣＡＮＰｒｏｃ）６４をルートとする木を実行する。Ｍ−ＳＣＡＮＰｒｏｃノードは、左の部分木として出力先ノード（ＤＳＴ）を複数持つことができる（６５及び６６）。一括スキャン処理のＤＳＴノードは、出力を識別する情報である出力識別子と各ＤＳＴ毎の検索条件を示す個別検索条件式が記載された出力先情報（６６及び６７）を有する。
【００２９】
本実施例では、ＤＳＴノード６５では出力識別子が２（ジョインバックエンドサーバ４に対する出力）で個別検索条件がＴ２．Ｄ３＜１００となっており、ＤＳＴノード６６では出力識別子が３（ジョインバックエンドサーバ５に対する出力）で個別検索条件がＴ２．Ｄ４＜１００となっている。個別検索条件に適合したデータは各ＤＳＴノードが持つ出力バッファ１９及び２０にそれぞれ出力され、次の処理の入力データとして消費される。
【００３０】
Ｍ−ＳＣＡＮＰｒｏｃノードの右の部分木には該Ｍ−ＳＣＡＮＰｒｏｃで共通化した処理が記述される。本実施例ではテーブルＴ２の全件スキャン処理を共通化したので、その実行を指示する部分木がＳＣＡＮノード６３及びＯＢＪノード６２で表現されている。ＯＢＪノードには入力識別子及び共通検索条件が記述される（６１）。
【００３１】
入力識別子には入力元がデータベースのテーブルか他のサーバの出力バッファか等の、入力元を識別特定するための情報が記述され、例えば、入力元がテーブルの場合にはテーブルを示す識別子が、入力元が出力バッファの場合には出力バッファを示す出力識別子が記述される。本実施例では、入力元はテーブルでテーブル識別子がＴ２となっている。一方、ＯＢＪノードの共通検索条件には、本一括スキャン処理で絞り込むデータの共通部分（各個別条件の論理和）が記述される。
【００３２】
なお、ＤＳＴノードで指定される個別検索条件には、共通検索条件で絞り込めなかった残りの条件を記述する。例えば、出力先が２つで、それぞれの検索条件が「１００≦Ｄ１＜２００」と「２００≦Ｄ１＜３００」の時には共通条件は「１００≦Ｄ１＜３００」となり、各個別条件が、それぞれ「Ｄ１＜２００」と「２００≦Ｄ１」となる。本実施例では、共通検索条件がテーブルの全データとなるので共通条件は記載されない。
【００３３】
次に一括スキャン処理のデータの流れについて説明する。一括スキャン処理は、まずＯＢＪノードに記述されている入力識別子に従ってデータを読み込み、共通条件に適合するデータを入力バッファ６０に取り込む。入力バッファに取り込まれたデータは、ＤＳＴノード６５の出力先情報６６に記載されている個別検索条件と照合され、検索条件に合致した場合には出力バッファ２０に出力される。さらに次のＤＳＴノード６６でも同様の処理を行い、個別検索条件に合致したデータが出力バッファ１９に出力される。本発明では１回のデータ入力に対して複数の出力先にデータを出力するため、データの入力処理が１回で済み、オーバヘッドが削減される。
【００３４】
図６は、シングルスキャン処理の詳細な方法を示す。シングルスキャン処理１４は、シングルスキャン処理ノード（Ｓ−ＳＣＡＮＰｒｏｃ）７４をルートとする木を実行する。Ｓ−ＳＣＡＮＰｒｏｃノード７４は、左の部分木として出力先ノード（ＤＳＴ）を一つだけ持つことができる（７６）。シングルスキャン処理のＤＳＴノードは、出力識別子が記載された出力先情報７５を有する。本実施例では出力先ノード７５の出力識別子が１となっている。一括スキャン処理では出力先情報に個別検索条件式が記載されていたが、シングルスキャン処理では、出力先は唯一つであるため、検索条件はＯＢＪノード７２に記載される（７１）。
【００３５】
Ｓ−ＳＣＡＮＰｒｏｃノードの右の部分木にはスキャンするデータの入力元や検索条件等が記述される。図６の例ではテーブルＴ１の無条件全件スキャン処理の実行を指示する木がＳＣＡＮノード７４及びＯＢＪノード７２で表現されている。ＯＢＪノードには入力識別子及び検索条件が記述される（７１）。入力識別子には入力元がデータベースのテーブルか他のサーバの出力バッファか等を識別するための情報が記述され、テーブルの場合にはテーブル識別子が、出力バッファの場合には出力識別子が記述される。図６の例では、入力元はテーブルでテーブル識別子がＴ１となっている。ＯＢＪノードの検索条件には何も記述されず全件スキャンを指示している。
【００３６】
同一テーブルに対する複数のスキャンを従来のようにシングルスキャン処理ノード（Ｓ−ＳＣＡＮＰｒｏｃ）のみで行った場合の処理コストは、ＤＢバッファへのページ単位の入力処理コストをＤ＿ＩＮ、ＤＢバッファから１行毎にスキャン処理にデータを持ってきて選択条件で絞り込む処理コストをＤ＿ＦＥＴＣＨ、出力バッファにヒットしたデータを書き込む処理コストをＤ＿ＳＥＴ、出力バッファのデータをページ単位で出力する処理コストをＤ＿ＯＵＴとし、テーブルスキャンの多重度、即ち一括処理する検索処理の数をＮとしたとき、従来の方法では、
（Ｄ＿ＩＮ×（テーブルページ数）＋Ｄ＿ＦＥＴＣＨ×（テーブル行数））×Ｎ
＋（Ｄ＿ＳＥＴ×（ヒット件数）＋Ｄ＿ＯＵＴ×（スキャン結果のデータのページ数））×Ｎ
となる。
【００３７】
一方、本実施例の方法では前半の処理が共通化されるため、テーブルスキャンコストが
（Ｄ＿ＩＮ×（テーブルページ数）＋Ｄ＿ＦＥＴＣＨ×（テーブル行数））×１
＋（Ｄ＿ＳＥＴ×（ヒット件数）＋Ｄ＿ＯＵＴ×（スキャン結果のデータのページ数））×Ｎ
となる。
【００３８】
いま、Ｄ＿ＩＮ、Ｄ＿ＦＥＴＣＨ、Ｄ＿ＳＥＴ、Ｄ＿ＯＵＴの処理コストの比を５：２：１：５とし、テーブルのレコード長を５１２Ｂｙｔｅ、レコード数を十万行、スキャン結果のレコード長を５０Ｂｙｔｅ、各スキャン条件の選択率を５０％、ページサイズを４０９６Ｂｙｔｅとすると、図７に示すような結果となり処理コストを削減する効果があることがわかる。
【００３９】
一般に大規模な生データを保持するテーブルを多次元解析する場合には、膨大な生データに対して種々の角度から検索要求が出ることが多い。本実施例の方法を適用することによりこのような大規模なデータのスキャン処理を削減することができる。
【００４０】
この結果から全ての場合について共有可能なテーブルスキャン処理を一括スキャン処理で共有化する方式が考えられるが、実際にはテーブルスキャンを共有化すると検索処理のデッドロックが発生するような問い合わせ要求が存在する。
【００４１】
図８は、テーブルスキャンの共有化によりデッドロックを発生する問い合わせ要求の一例である。いま、８９で示すような検索要求が発行されたとする。本検索要求はテーブルＴ４（８５）をＣ１列でグループ化して各グループ毎のＤの平均値を求めその中間結果をＱ１（８４）とする。次に、テーブルＴ４と中間結果Ｑ１をＣ１列をキーとしてジョインし、ＤとＤの平均（ＤＡＶＧ）の差分を求める（８２）。
【００４２】
本検索処理ではＱ１を生成する時と、結合演算を実行するときにテーブルＴ４のスキャンを行う。そこで、Ｍ−ＳＣＡＮＰｒｏｃ８０によりテーブルＴ４のスキャン処理を共通化したとする。Ｍ−ＳＣＡＮＰｒｏｃ８０はグループ化をして平均を求める処理ノード８１とジョイン演算をするノード８２に結果を出力する（８６及び８８）。また、グループ化して平均を求める処理ノードはジョイン演算をするノード８２に結果を出力する（８７）。
【００４３】
グループ化を行い平均を求めるノード８１はテーブルＴ４の全データのスキャンが完了するまで結果を出力することはできない。このように入力データの終端（エンドオブデータ）を受け取らないと結果が出力されないノードをブロック処理ノード（ＢｌｏｃｋＰｒｏｃ）と呼ぶ。
【００４４】
ジョインを行うノード８２は２つの入力データの一方が到着しないと他方の入力を抑止する。この例の場合、ノード８１からＱ１のデータを受信しない間はノード８０からのデータの入力を抑止する。このように、複数の入力データを持つノードで、一方のデータの入力状況が他方のデータの入力を抑止するノードを同期処理ノード（ＳｙｎｃＰｒｏｃ）と呼ぶ。
【００４５】
一括スキャン処理ノードとブロック処理ノード、同期処理ノードが図８で示すような関係で結合されている場合には検索処理全体がデッドロックする。
【００４６】
図９は、図８で示した検索処理がデッドロック状態に遷移する過程を説明する図である。図９の（ａ）は、検索処理開始直後の様子を示している。一括スキャン処理ノード８０の入力バッファ９０にはテーブルＴ４のデータが読み込まれ、次の処理ノードへの出力バッファ９１及び９２に蓄積されていく。
【００４７】
図９の（ｂ）では、出力バッファ９１及び９２から次の処理ノード８２及び８１にデータが出力され、それぞれの入力バッファ９６及び９３に蓄積されている。
【００４８】
図９の（ｃ）では、グループ化及び平均値を求める処理として入力バッファ９３に到着したデータをワークバッファ領域９４上に足し込んでいく処理を行っている。しかし、全ての入力データが到着するまで出力バッファ９５にデータは出力されない。従って、ジョインを行うノード８２の入力バッファ９７にはデータが到着しない。
【００４９】
すると、図９の（ｄ）で示すような待ち状態となり、デッドロックが発生する。即ち、ジョインノード８２の入力バッファ９６に到着したデータは、ハッシュテーブル９８が未完成のためバッファ内で待たされる。ハッシュテーブル９８は入力バッファ９７にＱ１のデータが到着しないため作成できない。Ｑ１のデータはテーブルＴ４のデータがノード８１の入力バッファ９３に全て到着していないため、集計処理が完了せず出力ノード９５に出力することができない。テーブルＴ４のデータは一括スキャン処理ノード８０の出力ノード９１が満杯のため読み込むことができない。出力ノード９１は次のジョインを実行するノード８２の入力バッファ９６が満杯のためデータを送信することができない、というように全てが互いに他を待ち続ける状況となる。
【００５０】
このような状態を避けるためには、あらかじめ検索処理の実行グラフを作成したときにデッドロックが起きうるかどうかを検出して、デッドロックが起きる可能性があるときには、一括スキャン処理の一部を非共有化する必要がある。
【００５１】
図１０は、検索要求の実行グラフを作成する方法を示す流れ図である。検索要求のコンパイル処理では、まず検索要求に含まれる各検索毎の実行木を生成する。次に、各検索の実行木の共通部分（即ち共通部分グラフ）を切り出す。共通部分グラフは、複数の検索木間及び同一の検索実行木内の全ての場合について、実行木から取り出される。共通部分グラフの抽出には、単純なマッチングアルゴリズムのほかグラフ構造の変換を伴うあらゆる方式が適用可能である。なお共通部分グラフの抽出時には、テーブルの検索絞り込み条件の違いは無視して抽出する。
【００５２】
次に、抽出した共通部分グラフを一括スキャン処理ノードに変換する。この際、絞り込み条件が異なる検索処理に対しては、一括スキャン処理ノードの共通検索条件と個別検索条件とで当該検索処理を共通化する。最後に、共通部分木を一括スキャン処理ノードに変換した検索実行グラフに対して、デッドロック検出処理をおこない、デッドロック状態が検出された場合には一部の一括スキャン処理ノードの複製を作ることによりデッドロック状態の回避処理を行う。
【００５３】
図１１の（ａ）は、デッドロック状態を検出する方法を説明するのに使用する記号を説明する図である。符号１０５で示される図形は複数の入力間で同期処理を行う同期処理ノード（ＳｙｎｃＰｒｏｃ）を示し、符号１０６で示される図形は一括スキャン処理ノード（Ｍ−ＳＣＡＮＰｒｏｃ）を示し、符号１０７で示される図形は入力側のデータを全て読み込むまでデータを出力しないブロック処理ノード（ＢｌｏｃｋＰｒｏｃ）を示す。またパイプライン的に処理できる通常のパイプライン処理ノードは符号１０８で示す図形で示す。
【００５４】
検索実行グラフにはデータの流れに沿って有向線分が記されている。本発明では、検索実行グラフにおいて一括スキャン処理ノードから同期処理ノードに至るデータの流れをパイプラインストリーム１０９とブロッキングストリーム１１０の２種類に識別してデッドロック状態を検出することを特徴とする。パイプラインストリームは一括スキャン処理ノードから同期処理ノードに至る経路で途中にブロック処理ノードがないデータストリームであり、ブロッキングストリームは一括スキャン処理ノードから同期処理ノードに至る経路で途中にブロック処理ノードがあるデータストリームである。
【００５５】
図１１の（ｂ）、（ｃ）はパイプラインストリーム並びにブロッキングストリームを見つけるために検索グラフを簡素なものに変換する方法を示している。図１１の（ｂ）は、通常の処理ノードを除去し、入力と出力を短絡する変換処理１を示す。図１１の（ｃ）は、複数の連続するブロック処理ノードを一つにまとめる変換処理２を示している。これらの変換を検索実行グラフに適用することにより、一括スキャン処理ノードから同期処理ノードに至る全ての経路は、直接結合されているか、または、ブロック処理ノードを一つ含むかのいずれかに分類される。このとき直接接続されている経路をパイプラインストリーム、ブロック処理ノードを介して接続されている経路をブロックストリームとする。図１１（ｃ）では、説明を解かりやすくするためにブロッキングストリームを点線の有向線分で示している。
【００５６】
図１１の（ｄ）は、デッドロック状態を検出するためにブロックストリームの矢印の向きを反転させてブロック処理ノードを除去する処理を示している。本発明では、検索実行グラフのデッドロック状態の検出を、ブロックストリームの有向線分の向きを反転させてループの検出を行うことにより、実行することを特徴とする。
【００５７】
図１２は、一括スキャン処理ノードへの変換により生成された検索実行グラフがデッドロック状態を有する簡単な例を示している。図１２の（ａ）は最も簡単なデッドロック状態を示すグラフである。本グラフは図８で示した検索処理を実行する際に作成される検索実行グラフである。符号１２０が図８のＳｙｎｃＰｒｏｃ８２に、符号１２１が図８のＭ−ＳＣＡＮＰｒｏｃ８０に、符号１２２が図８のＢｌｏｃｋＰｒｏｃ８１にそれぞれ対応する。
【００５８】
図１２の（ｃ）は、このグラフがデッドロック状態を有するかどうかを検出するために、図１１の（ｄ）で示したデッドロック検出手続きにより、ブロッキングストリームの矢印の向きを反転させたグラフである。このようにすると、ノード１２０及び１２１と有向線分１２３と１２４でループが形成されていることがわかる。本発明によるデッドロックの検出方法は変換した検索実行グラフにループがあるかどうかを検出することによって行われる。グラフに内在するループの検出方法についてはグラフ理論等で提示されている任意の方法を適用することができる。
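The loop test described in this paragraph can be sketched in Python. This is a minimal illustration, not the patent's implementation. It assumes the search execution graph has already been reduced by the transformations of FIG. 11 (b) and (c), so that every stream is a triple `(source, destination, kind)` with `kind` either `"pipeline"` or `"blocking"`. The function name `has_deadlock`, the node labels, and the choice of depth-first search are assumptions; the text allows any loop detection method.

```python
from collections import defaultdict


def has_deadlock(streams):
    """Reverse every blocking stream, then report whether the graph has a loop."""
    graph = defaultdict(list)
    for src, dst, kind in streams:
        if kind == "blocking":
            graph[dst].append(src)   # blocking stream: arrow direction reversed
        else:
            graph[src].append(dst)   # pipeline stream: arrow direction kept

    WHITE, GRAY, BLACK = 0, 1, 2     # unvisited, on the current path, finished
    color = defaultdict(int)

    def visit(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:   # edge back into the current path: a loop
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in list(graph))


# FIG. 8 / FIG. 12 (a): one shared scan feeds the join directly and via a block node.
fig8 = [("M-SCAN 80", "Join 82", "pipeline"), ("M-SCAN 80", "Join 82", "blocking")]
# Hypothetical unshared variant: two separate scans feed the same join.
unshared = [("SCAN a", "Join 82", "pipeline"), ("SCAN b", "Join 82", "blocking")]
print(has_deadlock(fig8))      # True
print(has_deadlock(unshared))  # False
```

The recursive search is adequate for execution graphs of the size shown in the figures; a very deep graph would call for an iterative traversal.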
【００５９】
図１２の（ｂ）は、複数の検索処理間でのデッドロックの状態を示す図である。この例では、ノード１２５または１２６をルートとする検索実行グラフは単独ではデッドロックに陥らないが、ノード１２７及び１２８の一括スキャン処理を共有するとデッドロック状態に陥る。なぜなら、ブロックストリーム１３０及び１３１の影響でパイプラインストリーム１２９及び１３２が、ブロックされ一括スキャン処理ノード１２７及び１２８がブロックされるためである。このような複数の検索処理間のデッドロックも本発明で示したデッドロック検出手続きを適用すると容易に検出可能である。即ち、図１２の（ｄ）に示すようにブロッキングストリームの矢印の向きを反転させれば、有向線分１２９〜１３１〜１３２〜１３０で示される横８の字型のループが形成されるため、グラフのループ検出処理により容易にデッドロック状態が検出できる。
【００６０】
図１３はより複雑な検索要求の検索実行グラフを示した図である。本図は、図１１の（ｂ）及び（ｃ）の変換処理を施した後の検索実行グラフを示す。本検索実行グラフは一括スキャン処理ノードを３つ（１４６、１４５、１４３）と同期処理ノードを４つ（１４０、１４１、１４２、１４４）含み、さらに４つのブロック処理ノードを含む。
【００６１】
図１４は図１３の検索実行処理のデッドロック状態を検出する手続きを示した図である。図１４の（ａ）は、検索実行グラフに図１１の（ｄ）で示したブロッキングストリームの矢印の反転処理を施した後の状態を示す図である。この処理によりブロッキングストリーム１５０、１５１、１５２、１５６の矢印の向きが反転される。次にこのグラフに対してループ検出処理を適用すると図１４の（ｂ）で示すような２つのループが検出される。即ち、有向線分１５１〜１５５〜１５４〜１５２〜１５３で示される第一のループと、有向線分１５２〜１５６〜１５５〜１５４で示される第二のループが検出される。従って本検索実行グラフはデッドロック状態を有することが解かる。
【００６２】
図１５は、本検索実行グラフのデッドロック状態を回避する手段を示した図である。デッドロック状態を回避するためには、デッドロックループを切断すればよい。そのためにはループ内に含まれる一括スキャン処理ノードの一部を複製して、該一括スキャン処理ノードから出力されるパイプラインストリームとブロッキングストリームを各々別のスキャン処理ノードで実行するようにする。本実施例では、一括スキャン処理ノード１４５を複製してスキャンノード１４７を作成し、一括スキャン処理１４５から出力されていた全てのブロッキングストリーム（本実施例では１５１と１５６）を割り当てる。複製元の一括スキャン処理ノード１４５には残りの全てのパイプラインストリームを割り当てる。デッドロックの回避処理では、検索実行グラフの全てのループが切断されるまでこの一括スキャン処理ノードの複製によるループの切断を行う。本実施例では、一括スキャン処理ノード１４５を複製すれば前記の第一のループ及び第二のループは同時に解消するため、これ以上の複製は行わない。
【００６３】
本実施例ではデッドロックの状態に陥っている一括スキャン処理ノードとしてはノード１４５の他にノード１４３がある。図１６にノード１４３の複製を行うことによりループの切断を試みた場合を示す。一括スキャン処理ノード１４３はブロッキングストリーム１５２とパイプラインストリーム１５３をそれぞれ１つずつしか持たないため、複製を作成すると共にシングルスキャン処理ノード１４８及び１４９となる。すると前記第一のループは一時的に解消されるものの、ノード１４２〜１４９〜１４５〜１４４で形成される前記第二のループは切断されずに残る。このように一括スキャン処理ノードを複製したばあい、選択した一括スキャン処理ノードによりループが切断できる場合とできない場合がある。このように１回の一括スキャン処理ノードの複製によりループを切断できない場合には他の一括スキャン処理ノードをさらに複製し、これを繰り返すことでデッドロック状態のループを切断することができる。
【００６４】
しかしながら、検索実行時の処理性能を考慮すると、なるべく少ない一括スキャン処理ノードの複製でデッドロック状態のループを解消できることが望ましい。本発明では、以下のように、より少ない複製処理でループを切断しうる一括スキャンノードの候補を選択する方法を提供することも特徴とする。
【００６５】
検索実行グラフのループを切断するために複製する一括スキャンノードの候補としては、デッドロック状態のループに陥っているノードのなかでストリームのより上流になっている一括スキャン処理ノードを一つ選択する。本実施例ではデッドロック状態のループに陥っている一括スキャン処理ノードはノード１４３とノード１４５があったのであるが、ノード１４５の出力がノード１４３に入力されているため、ノード１４５がより上流ノードにあたる。従ってノード１４５の複製処理をおこないデッドロック状態を回避する処理を行う。そして、この回避処理を検索実行グラフのループが解消されるまで繰り返し適用する。
【００６６】
上記デッドロック状態の回避処理は、少なくとも検索実行グラフの全ての一括スキャン処理ノードを複製すればループが無くなる。検索実行グラフの中に含まれる一括スキャン処理ノードの数は有限であるため、本デッドロック回避方法は有限回で停止することが保証される。
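The avoidance procedure described above (duplicate a batch scan node on a loop, give the copy the blocking streams, repeat until no loop remains) can be sketched as follows. This is a hedged illustration under several assumptions: streams are `(source, destination, kind)` triples on the already reduced graph, only batch scan nodes have more than one output, the caller supplies the batch scan nodes ordered from upstream to downstream, and the names `find_loop`, `split_scan`, and `avoid_deadlock` are hypothetical.

```python
def find_loop(streams):
    """Return the nodes of one loop after reversing blocking streams, or None."""
    graph = {}
    for src, dst, kind in streams:
        a, b = (dst, src) if kind == "blocking" else (src, dst)
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, [])
    state, path = {}, []

    def visit(node):
        state[node] = "open"
        path.append(node)
        for nxt in graph[node]:
            if state.get(nxt) == "open":
                return path[path.index(nxt):]
            if nxt not in state:
                loop = visit(nxt)
                if loop:
                    return loop
        state[node] = "done"
        path.pop()
        return None

    for node in graph:
        if node not in state:
            loop = visit(node)
            if loop:
                return loop
    return None


def split_scan(streams, scan):
    """Duplicate `scan`; the copy takes over every blocking stream leaving it."""
    copy = scan + "'"
    result = []
    for src, dst, kind in streams:
        if src == scan and kind == "blocking":
            result.append((copy, dst, kind))
        else:
            result.append((src, dst, kind))
            if dst == scan:                 # the copy reads the same input
                result.append((src, copy, kind))
    return result


def avoid_deadlock(streams, scans_upstream_first):
    """Split batch scan nodes, most upstream first, until no loop remains."""
    streams = list(streams)
    while (loop := find_loop(streams)) is not None:
        scan = next(
            (s for s in scans_upstream_first
             if s in loop
             and {k for src, _, k in streams if src == s} == {"pipeline", "blocking"}),
            None,
        )
        if scan is None:
            raise ValueError("no splittable batch scan node on the loop")
        streams = split_scan(streams, scan)
    return streams


fig8 = [("M-SCAN 80", "Join 82", "pipeline"), ("M-SCAN 80", "Join 82", "blocking")]
print(avoid_deadlock(fig8, ["M-SCAN 80"]))
```

Each split leaves one fewer node with both pipeline and blocking outputs, which is why the loop in `avoid_deadlock` stops after a finite number of steps, in line with the termination argument in the text.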
【００６７】
なお、本実施例ではデッドロックの検出／回避処理をブロッキングストリームの有向線分の矢印の向きを反転させて行っていたが、本質的にはデッドロックループの検出並びにループ切断が行えればよいので、パイプラインストリームの有向線分の向きを反転させても全く同様のデッドロック検出／回避処理ができることは明らかである。
【００６８】
以上の実施例では、一つの検索要求に複数の検索処理が含まれる場合について説明を行なった。しかし、本発明は、複数のクライアント端末から同一データに対する検索要求が発行される場合においても適用可能である。図１７は、複数のクライアント端末１７１・１７２から複数の検索要求Ｑ１・Ｑ２が発行された場合に本発明を適用する例を説明する図である。端末１７１から発行される検索処理Ｑ１はテーブルＴ１とＤ３列が１００未満のテーブルＴ２をＣ１列をキーとしてジョインし、Ｃ１列とＣ２列でグループ化し、Ｄ１列とＤ２列のグループ毎の合計を求める。端末１７２から発行される検索処理Ｑ２はＤ４列が１００未満のテーブルＴ２とテーブルＴ３をＣ３列をキーとしてジョインし、Ｃ２列とＣ３列でグループ化し、Ｄ１列とＤ２列のグループ毎の合計を求める。本実施例では、それぞれの端末が発行した検索要求はクエリスケジューラ（ＱｕｅｒｙＳｃｈｅｄｕｌｅｒ：ＱＳＣＤ）１７５に送られ、該スケジューラが有する検索要求バッファ１７６にバッファリングされる。クエリスケジューラはバッファリングされた各問い合わせをコンパイルし、共通部分の切り出しを行ない、共通部分がある問い合わせを一括化した検索実行グラフを生成し、実行グラフの各部分木を各サーバに割り当てる。共通部分がある問い合わせを一括化した検索実行グラフの生成法は先に述べた実施例と同様の方法が適用可能である。
【００６９】
本実施例では、クエリスケジューラで生成された検索実行グラフの各部分木は、フロントエンドサーバ１７８及び１７９、ジョインバックエンドサーバ４及び５、スキャンバックエンドサーバ６にそれぞれ割り当てられる（１７７）。先の実施例では、検索要求を発行した端末が一つであったため、フロントエンドサーバは一つであったが、本実施例では、複数の端末からの検索要求を一括して実行するため、各検索要求毎に異なるフロントエンドサーバが割り当てられる。そして、各検索処理の結果は対応するフロントエンドサーバから、検索要求を発行した端末にそれぞれ返される（１８０、１８１）。本実施例では、クエリスケジューラ１７５が検索要求のバッファリング機能を持つことにより複数の検索要求に存在する共通処理を一括化することを可能としている。バッファリングされた検索要求を一括化するタイミングとしては、一定時間毎やデータベースの負荷状態に応じて決める等、種々の方法が適用できる。
【００７０】
【発明の効果】
本発明の検索処理の共通化によりデータベースシステムにおける複数の検索処理の同時実行が可能となる。
【００７１】
また、本発明ではかかる検索処理の共通化に伴い問題となるデッドロック状態をあらかじめ検出し回避する手段を提供することで検索実行時にデッドロックの検出を行う必要がないという効果がある。
【００７２】
更に、デッドロックの回避では、より少ない一括スキャンノードの複製によりデッドロック状態を回避する方法を提供することにより検索実行時間を短縮する。
【図面の簡単な説明】
【図１】複数の検索処理を同時に実行する方法を示す図である。
【図２】従来方式における処理概要図である。
【図３】複数の検索処理を含むＳＱＬ文の例を示す図である。
【図４】検索要求の検索実行グラフの例を示す図である。
【図５】一括スキャン処理の詳細な流れを示す図である。
【図６】シングルスキャン処理の詳細な流れを示す図である。
【図７】本発明と従来方法との処理コストの差を示す図である。
【図８】一括スキャン処理を用いたときにデッドロックが起こる場合の例を示す図である。
【図９】デッドロックに陥る状態の遷移を示す図である。
【図１０】検索処理要求をコンパイルして検索実行グラフを作成する処理の流れ図である。
【図１１】検索実行グラフのデッドロック検出手段の説明をする図である。
【図１２】簡単なデッドロック検出手順を示す図である。
【図１３】複雑な検索実行グラフを示す図である。
【図１４】複雑な検索実行グラフのデッドロック検出手段を説明する図である。
【図１５】複雑な検索実行グラフのデッドロック回避処理を説明する図である。
【図１６】一括スキャン処理ノードの複製によりループが解消できない場合を説明する図である。
【図１７】複数の検索処理を同時に実行する方法の他の例を示す図である。
【符号の説明】
１端末
２データベースシステム
３フロントエンドサーバ
４、５ジョインバックエンドサーバ
６スキャンバックエンドサーバ
７データベース２次記憶装置
１４、１６シングルスキャン処理手段
１５一括スキャン処理手段
１７データベースバッファ
１８、１９、２０、２１出力バッファ
２２多重選択手段。

[0001]
[Technical field of the invention]
The present invention relates to a database system for referring to a table, and more particularly to a search processing method for simultaneously executing a plurality of search processes.
[0002]
[Prior art]
The data warehouse, which is coming into practical use as the market for commercial parallel servers expands, is generally a three-tier corporate information system consisting of an enterprise data warehouse that stores huge amounts of data, departmental data warehouses that store data extracted for each department, and a large number of client terminals.
[0003]
In such a data warehouse, multidimensional analysis, in which a table holding raw data is analyzed from various angles, is performed frequently. Searches are therefore executed repeatedly on a single table under various conditions. Conventional database systems executed these searches separately. The raw data held by a data warehouse is enormous, on the order of terabytes, and merely scanning all of it takes a long time. Consequently, when searches are executed repeatedly under various conditions, the full scan of the raw data must be repeated many times, which requires an enormous amount of processing time.
[0004]
FIG. 3 is an example of an SQL statement for a search request that includes a plurality of search processes. It contains two search processes, Q1 and Q2, specified by a WITH clause. Search process Q1 joins table T1 with the rows of table T2 whose D3 column is less than 100, using the C1 column as the key, groups the result by the C1 and C2 columns, and obtains the sums of the D1 and D2 columns for each group. Search process Q2 joins the rows of table T2 whose D4 column is less than 100 with table T3, using the C3 column as the key, groups the result by the C2 and C3 columns, and obtains the sums of the D1 and D2 columns for each group. The SQL of FIG. 3 combines the results of Q1 and Q2 into a single search result with UNION ALL. The identifiers “Q1” and “Q2” are prepended to the respective results so that it is possible to tell whether a retrieved result came from search process Q1 or from search process Q2. This SQL thus includes a process that joins and groups tables T1 and T2 and a process that joins and groups tables T2 and T3. When this search request is executed by the conventional method, table T2 is scanned twice. The processing flow in that case is shown in FIG. 2.
[0005]
FIG. 2 shows how the tables T1, T2, and T3 stored in a plurality of secondary storage devices 28 to 30 are read by the scan back-end server (Scan BES) 6 and transferred to the join back-end servers (Join BES) 4 and 5 for processing. The scan back-end server 6 has a database buffer 17 for caching, in main memory, the data read from the secondary storage. It independently executes a total of four table scan processes: table scan processes 14 and 31 for joining tables T1 and T2, and table scan processes 32 and 16 for joining tables T2 and T3.
[0006]
In the processing of FIG. 2, the scan of table T2 is executed twice, so the table scan processes must fetch the same data from the database buffer twice. Furthermore, if the T2 scan processes 31 and 32 run at different speeds, the caching effect of the database buffer 17 for the T2 data that each scan process refers to is lost, and in the worst case the I/O to the secondary storage is issued twice.
[0007]
To solve these problems, a technique has been proposed, as described in Japanese Patent Application No. 6-135464, in which concurrent execution control is performed when a plurality of search processes access the same data, so that data brought into the database buffer by a single I/O can be used by the plurality of search processes.
[0008]
[Problems to be solved by the invention]
In the conventional method of controlling the simultaneous execution of a plurality of search processes, synchronization control between the search processes is needed to raise the hit rate of the database buffer. However, because each search request is submitted to the database system asynchronously, the overhead of this synchronization control is large. Furthermore, when the search processes scan data at different speeds, the data held in the database buffer cannot be shared unless synchronization control is performed frequently.
[0009]
In addition, the conventional method shares the process of reading data from the secondary storage into the database buffer, but the process of reading data from the database buffer into the local buffer of each search process is performed separately for each search process, so the processing could not be shared sufficiently.
[0010]
SQL stored procedures and the WITH clause standardized in SQL3 are provided as means for issuing a plurality of search processes together as a single series of search requests. In the multidimensional analysis of a data warehouse, a plurality of search processes are issued at the same time, so they can be combined into a single search request by means of these stored procedures or the WITH clause.
[0011]
Other conceivable means of issuing a series of search requests in which a plurality of search processes are batched include, for example, having the query reception server of the database system buffer a plurality of SQL statements and issue them together.
[0012]
It is an object of the present invention to provide a method for efficiently executing a plurality of search processes inherent in a series of search requests as described above.
[0013]
An object of the present invention is to provide a method for generating a search execution graph by compiling a series of search requests including a plurality of search processes.
[0014]
A further object of the present invention is to provide a method for detecting, at compile time, a deadlock state that may occur when a plurality of search processes are executed simultaneously, and for avoiding it.
[0015]
[Means for Solving the Problems]
An embodiment of the present invention provides a processing method in which, when a plurality of search processes are executed on a single table, the partial search processing common to those search processes is combined into one and shared among them. This processing method consists of a step of analyzing a single search request that includes a plurality of search processes and generating an execution graph of the search processing, and a step of executing the search processing according to the generated execution graph.
[0016]
The step of analyzing the search request and generating the execution graph consists of: a step of analyzing the plurality of search processes included in the search request and creating an execution tree for each search process; a step of cutting out the parts common to the execution trees of the search processes and converting them into an execution graph that shares the common parts; and a step of analyzing the converted execution graph to detect whether it may deadlock at execution time and, if it may, unsharing part of the common subgraph to convert it into a deadlock-free execution graph.
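As a rough illustration of the second step, the sketch below groups the table scans of several search processes by table and marks a table scanned by more than one search process as a candidate for a shared batch scan. It is a deliberately simplified assumption: real execution trees contain joins and grouping as well, the deadlock check of the third step is omitted, and the function name `share_scans` and the tuple layout are hypothetical.

```python
from collections import defaultdict


def share_scans(scans):
    """Group scans by table.

    `scans` is a list of (search_process, table, condition) tuples, where
    `condition` is the scan's selection condition as text, or None for a
    full scan. A table scanned by more than one search process becomes a
    batch scan ("M-SCAN") with one destination per search process; any
    other table keeps a single scan ("S-SCAN").
    """
    by_table = defaultdict(list)
    for process, table, condition in scans:
        by_table[table].append((process, condition))
    return {
        table: ("M-SCAN" if len(dests) > 1 else "S-SCAN", dests)
        for table, dests in by_table.items()
    }


# The scans implied by the search request of FIG. 3.
plan = share_scans([
    ("Q1", "T1", None),
    ("Q1", "T2", "D3 < 100"),
    ("Q2", "T2", "D4 < 100"),
    ("Q2", "T3", None),
])
for table, (kind, dests) in plan.items():
    print(table, kind, dests)
```

Differences in the selection conditions are ignored when grouping; they reappear only as the per-destination conditions of the shared scan.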
[0017]
The step of executing the search processing has a batch scan processing step that executes a plurality of search processes on the same table. The batch scan processing step consists of a step of first narrowing down each data item read into the input buffer by a condition common to the plurality of search conditions, and a step of further matching the data that satisfies the common condition against each search condition in turn and outputting it to each of the output buffers corresponding to all the search processes whose search conditions it satisfies.
[0018]
DETAILED DESCRIPTION OF THE INVENTION
FIG. 1 is a diagram showing a method for simultaneously executing a plurality of search processes in a database implementing the present invention. When the terminal 1 issues the search request shown in FIG. 3 to the database system 2, the search request receiving server (front-end server) 3 receives the search request and compiles it. Assume that a search processing execution graph as shown in FIG. 4 is generated as a result of compilation. In the following description, it is assumed that the search process shown in FIG. 4 is executed. After compiling, various back-end servers necessary for executing search processing are started next, and execution of search processing is instructed. Since the search execution graph shown in FIG. 4 includes two join operations and three table scan processes, the join back-end servers 4 and 5 and the scan back-end server 6 are activated.
[0019]
In the join back-end server, processing for performing hash join and grouping is executed.
[0020]
The scan back-end server that scans table T1 or T3 executes single scan processes 14 and 16, which perform hit determination under a single condition and output the hit data to the output destination (a join back-end server).
[0021]
The scan back-end server that scans table T2, on the other hand, executes a batch scan process 15, which performs hit determination under a plurality of conditions and outputs the data to the output destination (a join back-end server) corresponding to each condition that was hit. In this batch scan process, the data read by a single scan is tested against the search conditions of a plurality of search processes.
[0022]
The single scan process 14 for table T1 reads the data of table T1 from the secondary storage into the DB buffer 17, fetches that data into the buffer of the single scan process, and outputs the processed data to the join back-end server.
[0023]
Like the single scan process 14, the single scan process 16 for table T3 reads the data of table T3 from the secondary storage into the DB buffer 17, fetches that data into the buffer of the single scan process, and outputs the processed data to the join back-end server.
[0024]
The batch scan process 15 for table T2 reads the data of table T2 from the secondary storage into the DB buffer, fetches that data into the buffer of the batch scan process, and evaluates the fetched data multiple times, once against the condition corresponding to each join operation (22). The batch scan process stores the data that satisfies each condition in the corresponding output buffer, from which it is output to the respective join back-end server.
[0025]
Data output by each scan process is sequentially processed by the join back-end server without being stored in the secondary storage device. Further, the calculation results obtained by the join back-end server are sequentially output to the front-end server 3 and finally output to the terminal 1 that issued the search request. As described above, in this method, since it is not necessary to prepare a secondary storage for data transfer between the servers, when each process is executed by an independent control device, the process can be executed in a pipeline manner.
[0026]
In this embodiment, it is described that the data output in each process is transferred to the input of the next process, but it is of course possible to omit the data transfer by sharing the input / output buffer.
[0027]
Moreover, although the present embodiment shows an example in which a search request including a plurality of search processes is issued in advance from one terminal, it is also possible to combine a plurality of search processes at a search request receiving server. Even in such a case, it is obvious that the simultaneous execution method of search processing according to the present invention can be applied.
[0028]
FIG. 5 is a diagram showing the batch scan process in detail. The batch scan process 15 executes a tree whose root is the batch scan processing node (M-SCAN Proc) 64. The M-SCAN Proc node can have a plurality of output destination nodes (DST) as its left subtree (65 and 66). Each DST node of the batch scan process has output destination information (66 and 67) that describes an output identifier, which identifies the output, and an individual search condition expression, which gives the search condition for that DST.
[0029]
In this embodiment, the DST node 65 has output identifier 2 (output to the join back-end server 4) and the individual search condition T2.D3 < 100, and the DST node 66 has output identifier 3 (output to the join back-end server 5) and the individual search condition T2.D4 < 100. Data that satisfies an individual search condition is output to the output buffer 19 or 20 of the respective DST node and consumed as input data for the next process.
[0030]
The right subtree of the M-SCAN Proc node describes the processing shared by that M-SCAN Proc. In this embodiment the full scan of table T2 is shared, so the subtree that instructs its execution is represented by the SCAN node 63 and the OBJ node 62. The OBJ node describes an input identifier and a common search condition (61).
[0031]
The input identifier describes information for identifying the input source, such as whether it is a database table or the output buffer of another server. For example, when the input source is a table, an identifier indicating the table is described, and when the input source is an output buffer, an output identifier indicating that buffer is described. In this embodiment, the input source is a table and the table identifier is T2. The common search condition of the OBJ node, on the other hand, describes the part of the data selection that is common to this batch scan process (the logical sum of the individual conditions).
[0032]
The individual search condition specified in a DST node describes the remaining condition that the common search condition could not enforce. For example, when there are two output destinations whose search conditions are “100 ≦ D1 < 200” and “200 ≦ D1 < 300”, the common condition is “100 ≦ D1 < 300” and the individual conditions are “D1 < 200” and “200 ≦ D1”, respectively. In this embodiment, the common search condition covers all the data in the table, so no common condition is described.
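The range example above can be reproduced with a small sketch. It assumes every search condition is a half-open range `lo ≦ D1 < hi` on a single column, and it takes the smallest range covering all of them as the common condition. That covering range equals the logical sum only when the ranges touch or overlap; otherwise it is a looser pre-filter that is still safe. The function name `split_conditions` and the use of `None` for "no bound needed" are assumptions for illustration.

```python
def split_conditions(ranges):
    """Derive a common range and per-destination residual bounds.

    `ranges` is a list of (lo, hi) pairs meaning lo <= D1 < hi.
    Returns ((common_lo, common_hi), residuals), where each residual is a
    (lo, hi) pair in which None means the common condition already
    enforces that bound.
    """
    common_lo = min(lo for lo, _ in ranges)
    common_hi = max(hi for _, hi in ranges)
    residuals = [
        (None if lo == common_lo else lo, None if hi == common_hi else hi)
        for lo, hi in ranges
    ]
    return (common_lo, common_hi), residuals


common, residuals = split_conditions([(100, 200), (200, 300)])
print(common)     # (100, 300)            i.e. 100 <= D1 < 300
print(residuals)  # [(None, 200), (200, None)]  i.e. D1 < 200 and 200 <= D1
```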
[0033]
Next, the data flow of batch scan processing will be described. In the batch scan process, first, data is read according to the input identifier described in the OBJ node, and data that meets the common conditions is read into the input buffer 60. The data fetched into the input buffer is collated with the individual search condition described in the output destination information 66 of the DST node 65, and is output to the output buffer 20 when the search condition is met. Further, similar processing is performed at the next DST node 66, and data matching the individual search condition is output to the output buffer 19. In the present invention, since data is output to a plurality of output destinations for one data input, the data input process is only required once, and overhead is reduced.
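The data flow just described can be sketched as a single loop over the input in which each row is read once and then tested against every destination's individual condition. This is a minimal sketch under assumptions: rows are dictionaries, conditions are Python callables, output buffers are plain lists keyed by output identifier, and the name `batch_scan` is hypothetical.

```python
def batch_scan(rows, common_condition, destinations):
    """Scan `rows` once and distribute the hits to several output buffers.

    `destinations` is a list of (output_id, individual_condition) pairs.
    A row that fails the common condition is dropped before any
    individual condition is evaluated.
    """
    buffers = {output_id: [] for output_id, _ in destinations}
    for row in rows:                          # each row is read exactly once
        if not common_condition(row):
            continue
        for output_id, condition in destinations:
            if condition(row):                # a row may go to several buffers
                buffers[output_id].append(row)
    return buffers


# Table T2 of the embodiment: no common condition, two individual conditions.
t2 = [
    {"C1": 1, "D3": 50, "D4": 500},
    {"C1": 2, "D3": 500, "D4": 50},
    {"C1": 3, "D3": 10, "D4": 20},
    {"C1": 4, "D3": 900, "D4": 900},
]
out = batch_scan(
    t2,
    common_condition=lambda row: True,
    destinations=[(2, lambda row: row["D3"] < 100),
                  (3, lambda row: row["D4"] < 100)],
)
print([row["C1"] for row in out[2]])  # [1, 3]
print([row["C1"] for row in out[3]])  # [2, 3]
```

The row with `C1 = 3` satisfies both conditions and is therefore written to both buffers, while the input is still traversed only once.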
[0034]
FIG. 6 shows a detailed method of the single scan process. The single scan processing 14 executes a tree having a single scan processing node (S-SCAN Proc) 74 as a root. The S-SCAN Proc node 74 can have only one output destination node (DST) as a left subtree (76). The DST node for single scan processing has output destination information 75 in which an output identifier is described. In this embodiment, the output identifier of the output destination node 75 is 1. In the batch scan process, the individual search condition formula is described in the output destination information. However, in the single scan process, since there is only one output destination, the search condition is described in the OBJ node 72 (71).
[0035]
The right subtree of the S-SCAN Proc node describes the input source of the data to be scanned, the search condition, and so on. In the example of FIG. 6, the tree that instructs execution of the unconditional full scan of table T1 is represented by the SCAN node 74 and the OBJ node 72. The OBJ node describes an input identifier and a search condition (71). The input identifier describes information for identifying whether the input source is a database table, the output buffer of another server, or the like: a table identifier in the case of a table, and an output identifier in the case of an output buffer. In the example of FIG. 6, the input source is a table and the table identifier is T1. Nothing is described in the search condition of the OBJ node, which instructs a full scan.
[0036]
Consider the processing cost when a plurality of scans of the same table are performed using only single scan processing nodes (S-SCAN Proc), as in the prior art. Let D_IN be the cost of inputting one page into the DB buffer, D_FETCH the cost of bringing one row from the DB buffer to the scan process and filtering it with the selection condition, D_SET the cost of writing one hit row to the output buffer, D_OUT the cost of outputting one page of output buffer data, and N the multiplicity, that is, the number of search processes to be batch processed. The cost of the conventional method is then
(D_IN × (number of table pages) + D_FETCH × (number of table rows)) × N
+ (D_SET × (number of hits) + D_OUT × (number of pages of scan result data)) × N
[0037]
On the other hand, in the method of this embodiment, the first half of the process is shared, so the table scan cost is incurred only once and the total cost is reduced to
(D_IN × (number of table pages) + D_FETCH × (number of table rows)) × 1
+ (D_SET × (number of hits) + D_OUT × (number of pages of scan result data)) × N
[0038]
Now, suppose that the processing cost ratio of D_IN, D_FETCH, D_SET, and D_OUT is 5:2:1:5, the table record length is 512 bytes, the number of records is 100,000 rows, the scan result record length is 50 bytes, the selection rate of each scan condition is 50%, and the page size is 4096 bytes. The result shown in FIG. 7 is then obtained, and it can be seen that the processing cost is reduced.
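The two cost formulas can be evaluated with the parameters given above. The following Python sketch is only a worked example: the function name `scan_costs` is hypothetical, and it assumes that page counts are obtained by rounding up (bytes divided by page size), which the text does not state. It is not claimed to reproduce the exact values plotted in FIG. 7.

```python
import math


def scan_costs(n, d_in=5, d_fetch=2, d_set=1, d_out=5,
               record_len=512, rows=100_000, result_len=50,
               selectivity=0.5, page_size=4096):
    """Return (conventional_cost, shared_cost) for multiplicity n."""
    # Assumption: pages = ceil(total bytes / page size).
    table_pages = math.ceil(rows * record_len / page_size)
    hits = int(rows * selectivity)
    result_pages = math.ceil(hits * result_len / page_size)

    scan_part = d_in * table_pages + d_fetch * rows      # first half
    output_part = d_set * hits + d_out * result_pages    # second half

    conventional = (scan_part + output_part) * n   # everything repeated N times
    shared = scan_part * 1 + output_part * n       # table scan done once
    return conventional, shared


for n in (1, 2, 4, 8):
    conventional, shared = scan_costs(n)
    print(n, conventional, shared, round(shared / conventional, 3))
```

Under these assumptions the two methods cost the same for N = 1, and the shared method becomes relatively cheaper as N grows because only the output part scales with N.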
[0039]
In general, when multidimensional analysis is performed on a table holding a large amount of raw data, search requests are often issued against that data from various angles. Applying the method of this embodiment reduces the cost of such large-scale data scans.
[0040]
From this result, it would seem desirable to convert every shareable table scan into batch scan processing. In practice, however, there are query requests for which sharing the table scan causes the search processing to deadlock.
[0041]
FIG. 8 is an example of a query request that causes a deadlock when table scans are shared. Assume that a search request as shown at 89 is issued. In this search request, the table T4 (85) is grouped by the C1 column to obtain the average value of D for each group, and the intermediate result is named Q1 (84). Next, the table T4 and the intermediate result Q1 are joined using the column C1 as a key, and the difference between D and the average of D (DAVG) is obtained (82).
[0042]
In this search process, the table T4 is scanned both when Q1 is generated and when the join operation is executed. Assume, therefore, that the scan processing of the table T4 is shared by M-SCAN Proc 80. The M-SCAN Proc 80 outputs its result both to the processing node 81, which performs grouping and obtains the average, and to the node 82, which performs the join operation (86 and 88). In addition, the processing node 81 that obtains the average by grouping outputs its result to the node 82 that performs the join operation (87).
[0043]
The node 81 that performs grouping and obtains the average cannot output the result until the scan of all data in the table T4 is completed. A node in which a result is not output unless the end of input data (end of data) is received is called a block processing node (Block Proc).
[0044]
If one of the two input data does not arrive, the node 82 that performs the join suppresses the other input. In this example, the input of data from the node 80 is suppressed while the data of Q1 is not received from the node 81. A node that has a plurality of input data and whose input status of one data suppresses the input of the other data is called a synchronization processing node (Sync Proc).
[0045]
When the batch scan processing node, the block processing node, and the synchronization processing node are connected in the relationship shown in FIG. 8, the entire search processing deadlocks.
[0046]
FIG. 9 is a diagram illustrating how the search process illustrated in FIG. 8 transitions to a deadlock state. FIG. 9A shows the state immediately after the search process is started. The data in the table T4 is read into the input buffer 90 of the batch scan processing node 80 and stored in the output buffers 91 and 92 for the next processing nodes.
[0047]
In FIG. 9B, data is output from the output buffers 91 and 92 to the next processing nodes 82 and 81 and stored in the input buffers 96 and 93, respectively.
[0048]
In (c) of FIG. 9, a process of adding data arriving at the input buffer 93 onto the work buffer area 94 is performed as a process for obtaining a grouping and an average value. However, data is not output to the output buffer 95 until all input data arrives. Therefore, data does not arrive at the input buffer 97 of the node 82 that performs the join.
[0049]
Then, a waiting state as shown in FIG. 9D arises, and a deadlock occurs. That is, the data arriving at the input buffer 96 of the join node 82 is held in the buffer because the hash table 98 is incomplete. The hash table 98 cannot be created because the data of Q1 does not arrive at the input buffer 97. The data of Q1 cannot be output to the output buffer 95 because not all of the data of the table T4 has arrived at the input buffer 93 of the node 81, so the aggregation process is not completed. The data of the table T4 cannot be read because the output buffer 91 of the batch scan processing node 80 is full. The output buffer 91 in turn cannot transmit its data because the input buffer 96 of the downstream join node 82 is full. Every node thus keeps waiting for another.
[0050]
To avoid this situation, it is necessary to detect in advance, when the execution graph for the search process is created, whether a deadlock can occur, and to share scans only in a way that does not lead to a deadlock.
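The waiting chain of FIG. 9 can be reproduced with a toy model. The following Python sketch is only an illustration under simplifying assumptions that are not in the original: a single bounded queue `pipe` stands for output buffer 91 and input buffer 96 together, the block processing node consumes its copy of each row immediately, and the function name `run_shared_scan` and its parameters are hypothetical.

```python
from collections import deque


def run_shared_scan(num_rows: int, pipe_capacity: int) -> bool:
    """Simulate the FIG. 8 plan with a shared scan.

    Returns True if the join finishes, False if no node can make
    progress any more (deadlock)."""
    pipe = deque()        # scan -> join, bounded (pipeline stream)
    scanned = 0           # rows emitted by the batch scan node
    aggregated = 0        # rows consumed by the block (group/avg) node
    q1_ready = False      # block node has output Q1 (needs end of data)
    joined = 0            # rows processed by the join node

    while True:
        progressed = False

        # Batch scan: one input row goes to BOTH outputs, so it needs
        # free space in the bounded pipeline buffer.
        if scanned < num_rows and len(pipe) < pipe_capacity:
            pipe.append(scanned)
            scanned += 1
            aggregated += 1
            progressed = True

        # Block node: emits its result only after all input has arrived.
        if not q1_ready and aggregated == num_rows:
            q1_ready = True
            progressed = True

        # Join (sync node): pipeline input is suppressed until Q1 arrives.
        if q1_ready and pipe:
            pipe.popleft()
            joined += 1
            progressed = True

        if joined == num_rows:
            return True
        if not progressed:
            return False


print(run_shared_scan(num_rows=10, pipe_capacity=4))
print(run_shared_scan(num_rows=10, pipe_capacity=10))
```

In this model the plan stalls whenever the table does not fit in the pipeline buffer, which is the normal case for large tables; this observation is an inference from the toy model rather than a statement of the source.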
[0051]
FIG. 10 is a flowchart illustrating a method for creating an execution graph of a search request. In the search request compiling process, first, an execution tree is generated for each search included in the search request. Next, the common parts (that is, the common subgraphs) of the search execution trees are cut out. Common subgraphs are extracted in all cases, both across a plurality of search execution trees and within a single search execution tree. For extracting common subgraphs, any method, including one that involves conversion of the graph structure, can be applied in addition to a simple matching algorithm. When extracting a common subgraph, differences in the narrowing-down conditions of the table search are ignored.
[0052]
Next, each extracted common subgraph is converted into a batch scan processing node. At this time, search processes with different narrowing-down conditions are shared by means of the common search condition and the individual search conditions of the batch scan processing node. Finally, deadlock detection processing is performed on the search execution graph obtained by converting the common subtrees into batch scan processing nodes, and if a deadlock state is detected, copies of some of the batch scan processing nodes are created to avoid the deadlock.
[0053]
(a) of FIG. 11 explains the symbols used to describe the method of detecting a deadlock state. The figure denoted by reference numeral 105 represents a synchronization processing node (Sync Proc) that performs synchronization processing between a plurality of inputs, the figure denoted by reference numeral 106 represents a batch scan processing node (M-SCAN Proc), and the figure denoted by reference numeral 107 represents a block processing node (Block Proc) that does not output data until all data on the input side has been read. A normal processing node that can be processed in a pipeline manner is indicated by the figure denoted by reference numeral 108.
[0054]
The search execution graph has directed lines along the data flow. The present invention is characterized in that the deadlock state is detected by classifying each data flow from a batch scan processing node to a synchronization processing node in the search execution graph into one of two types, a pipeline stream 109 or a blocking stream 110. A pipeline stream is a data stream with no block processing node on the route from the batch scan processing node to the synchronization processing node, and a blocking stream is a data stream with a block processing node on that route.
[0055]
(b) and (c) of FIG. 11 show a method of simplifying a search execution graph in order to find pipeline streams and blocking streams. (b) of FIG. 11 shows conversion process 1, which removes a normal processing node and short-circuits its input and output. (c) of FIG. 11 shows conversion process 2, which combines a plurality of consecutive block processing nodes into one. By applying these conversions to the search execution graph, every path from a batch scan processing node to a synchronization processing node is classified as either directly connected or containing exactly one block processing node. A directly connected path is a pipeline stream, and a path connected via a block processing node is a blocking stream. In (c) of FIG. 11, the blocking stream is indicated by a dotted directed line for easy understanding.
[0056]
(d) of FIG. 11 shows a process of removing the block processing node by inverting the direction of the arrow of the blocking stream in order to detect a deadlock state. In the present invention, the deadlock state of the search execution graph is detected by inverting the direction of the directed line segments of the blocking streams and then detecting a loop.
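The effect of the two conversions can be sketched as a graph traversal. The Python code below is a hedged illustration, not the procedure of the embodiment: instead of physically rewriting the graph, it walks from each batch scan node to the first synchronization node reached and records whether any block node was passed, which yields the same pipeline/blocking classification. The node kinds `"mscan"`, `"sync"`, `"block"`, and `"pipe"`, the function name `classify_streams`, and the extra node `"p"` are assumptions introduced for the example.

```python
from typing import Dict, List, Set, Tuple


def classify_streams(kinds: Dict[str, str],
                     edges: Dict[str, List[str]]) -> Set[Tuple[str, str, str]]:
    """Return {(scan_node, sync_node, 'pipeline' | 'blocking')}.

    Normal pipeline nodes are skipped (conversion 1) and any number of
    block nodes on a path counts as one (conversion 2)."""
    streams: Set[Tuple[str, str, str]] = set()
    for start, kind in kinds.items():
        if kind != "mscan":
            continue
        stack = [(succ, False) for succ in edges.get(start, [])]
        seen = set()
        while stack:
            node, blocked = stack.pop()
            if (node, blocked) in seen:
                continue
            seen.add((node, blocked))
            if kinds[node] == "sync":
                streams.add((start, node, "blocking" if blocked else "pipeline"))
                continue  # assumption: stop at the first sync node reached
            blocked = blocked or kinds[node] == "block"
            stack.extend((succ, blocked) for succ in edges.get(node, []))
    return streams


# Graph of FIG. 8, with one hypothetical pipeline node "p" added between
# the scan and the join to show that such nodes are short-circuited.
kinds = {"80": "mscan", "81": "block", "82": "sync", "p": "pipe"}
edges = {"80": ["81", "p"], "p": ["82"], "81": ["82"]}

streams = classify_streams(kinds, edges)
print(sorted(streams))
```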
[0057]
FIG. 12 shows a simple example in which the search execution graph generated by the conversion to the batch scan processing node has a deadlock state. FIG. 12A is a graph showing the simplest deadlock state. This graph is a search execution graph created when the search process shown in FIG. 8 is executed. Reference numeral 120 corresponds to Sync Proc 82 in FIG. 8, reference numeral 121 corresponds to M-SCAN Proc 80 in FIG. 8, and reference numeral 122 corresponds to Block Proc 81 in FIG.
[0058]
FIG. 12C is the graph obtained by reversing the direction of the arrow of the blocking stream, following the deadlock detection procedure shown in FIG. 11D, in order to detect whether or not this graph has a deadlock state. It can be seen that the nodes 120 and 121 and the directed line segments 123 and 124 form a loop. The deadlock detection method according to the present invention is performed by detecting whether or not there is a loop in the converted search execution graph. As the method for detecting a loop in a graph, any method known from graph theory or the like can be applied.
[0059]
FIG. 12B is a diagram illustrating a deadlock state between a plurality of search processes. In this example, the search execution graph having the node 125 or 126 as a root does not fall into a deadlock by itself, but if the batch scan processing of the nodes 127 and 128 is shared, the search execution graph falls into a deadlock state. This is because the pipeline streams 129 and 132 are blocked under the influence of the blocking streams 130 and 131, and the batch scan processing nodes 127 and 128 are blocked as a result. Such a deadlock between a plurality of search processes can be detected easily by applying the deadlock detection procedure of the present invention. That is, if the direction of the arrows of the blocking streams is reversed as shown in FIG. 12D, a loop in the shape of a figure eight lying on its side is formed by the directed line segments 129, 131, 132, and 130 in that order. The deadlock state can therefore be detected easily by the loop detection process on the graph.
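The detection step (reverse every blocking stream, then look for a loop) can be sketched as follows. This is a minimal Python illustration that uses an ordinary depth-first search as the loop detector, which is one of many applicable methods. The function name `has_deadlock` and the stream tuples are assumptions; in particular, the endpoints given for streams 129 to 132 of FIG. 12(b) are inferred from the loop order stated in the text, since the figure itself is not reproduced here.

```python
def has_deadlock(streams):
    """streams: iterable of (scan_node, sync_node, kind), where kind is
    'pipeline' or 'blocking'. Returns True if the graph with all
    blocking streams reversed contains a loop."""
    graph = {}
    for src, dst, kind in streams:
        # Reverse the arrow of every blocking stream.
        a, b = (dst, src) if kind == "blocking" else (src, dst)
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())

    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY                 # node is on the current DFS path
        for nxt in graph[node]:
            if color[nxt] == GREY:         # back edge: a loop exists
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


# FIG. 12(a): one batch scan node 121 feeds sync node 120 both directly
# and through a block node.
fig12a = [("121", "120", "pipeline"), ("121", "120", "blocking")]

# FIG. 12(b): endpoints are an assumption consistent with the loop
# 129 -> 131 -> 132 -> 130 described in the text.
fig12b = [("127", "125", "pipeline"),   # stream 129
          ("128", "125", "blocking"),   # stream 131
          ("128", "126", "pipeline"),   # stream 132
          ("127", "126", "blocking")]   # stream 130

# Hypothetical case with separate scans: no sharing, hence no loop.
separate = [("A", "S", "pipeline"), ("B", "S", "blocking")]

print(has_deadlock(fig12a), has_deadlock(fig12b), has_deadlock(separate))
```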
[0060]
FIG. 13 is a diagram showing a search execution graph of a more complicated search request. This figure shows the search execution graph after the conversion processing of (b) and (c) of FIG. 11 is performed. This search execution graph includes three batch scan processing nodes (146, 145, 143), four synchronization processing nodes (140, 141, 142, 144), and further includes four block processing nodes.
[0061]
FIG. 14 is a diagram showing a procedure for detecting a deadlock state in the search execution process of FIG. (A) of FIG. 14 is a figure which shows the state after performing the inversion process of the arrow of the blocking stream shown to (d) of FIG. 11 to the search execution graph. By this processing, the directions of the arrows of the blocking streams 150, 151, 152, and 156 are reversed. Next, when loop detection processing is applied to this graph, two loops as shown in FIG. 14B are detected. That is, the first loop indicated by the directed line segments 151 to 155 to 154 to 152 to 153 and the second loop indicated by the directed line segments 152 to 156 to 155 to 154 are detected. Therefore, it can be seen that this search execution graph has a deadlock state.
[0062]
FIG. 15 is a diagram showing means for avoiding a deadlock state of the search execution graph. To avoid the deadlock state, it suffices to cut the deadlock loop. For this purpose, one of the batch scan processing nodes included in the loop is duplicated, and the pipeline streams and the blocking streams output from that batch scan processing node are executed by different scan processing nodes. In this embodiment, the batch scan processing node 145 is duplicated to create a scan node 147, and all blocking streams output from the batch scan processing node 145 (151 and 156 in this embodiment) are assigned to the scan node 147. All the remaining pipeline streams are assigned to the original batch scan processing node 145. In the deadlock avoidance process, batch scan processing nodes are duplicated in this way until all the loops of the search execution graph are cut. In this embodiment, duplicating the batch scan processing node 145 eliminates the first loop and the second loop simultaneously, so no further duplication is performed.
[0063]
In the present embodiment, the node 143, in addition to the node 145, is a batch scan processing node involved in the deadlock state. FIG. 16 shows a case where cutting the loops is attempted by duplicating the node 143. Since the batch scan processing node 143 has only one blocking stream 152 and one pipeline stream 153, duplicating it yields the single scan processing nodes 148 and 149. Although the first loop is eliminated, the second loop formed by the nodes 142, 149, 145, and 144 remains uncut. Thus, when a batch scan processing node is duplicated, whether or not a loop is cut depends on which batch scan processing node is selected. When a loop cannot be cut by duplicating one batch scan processing node, another batch scan processing node in the loop is duplicated, and the loops in the deadlock state can be cut by repeating this.
[0064]
However, in consideration of processing performance at the time of search execution, it is desirable that a deadlock loop can be eliminated by copying as few batch scan processing nodes as possible. The present invention is also characterized by providing a method for selecting a batch scan node candidate that can cut a loop with less duplication processing as described below.
[0065]
As the batch scan node candidate to be duplicated to break the loop of the search execution graph, the batch scan processing node located most upstream in the stream is selected from among the nodes in the deadlock loop. In this embodiment, the nodes 143 and 145 are the batch scan processing nodes in the deadlock loop. Since the output of the node 145 is input to the node 143, the node 145 is the more upstream one. Therefore, the node 145 is duplicated to avoid the deadlock state. This avoidance process is applied repeatedly until the loops of the search execution graph are resolved.
[0066]
In the deadlock avoidance process, every loop is eliminated at the latest once all the batch scan processing nodes of the search execution graph have been duplicated. Since the number of batch scan processing nodes included in the search execution graph is finite, this deadlock avoidance method is guaranteed to terminate after a finite number of iterations.
[0067]
In this embodiment, the deadlock detection and avoidance processing is performed by inverting the direction of the directed line segments of the blocking streams. Essentially, however, it suffices that the deadlock loop can be detected and cut, so it is clear that exactly the same deadlock detection and avoidance processing can be performed by instead reversing the direction of the directed line segments of the pipeline streams.
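The avoidance loop (detect a loop, pick the most upstream batch scan node in it, move its blocking streams to a copy, repeat) can be sketched as below. This Python code is a simplified illustration under assumptions not found in the source: streams run only from scan nodes to synchronization nodes, so a copy has no inputs to duplicate; `upstream_rank` is a hypothetical stand-in for each scan node's position in the data flow (lower means more upstream); and all node names are made up. The conversion of a one-stream batch scan node into a single scan node is not modeled.

```python
def find_cycle(streams):
    """Return the set of nodes on one loop of the graph obtained by
    reversing all blocking streams, or None if there is no loop."""
    graph = {}
    for src, dst, kind in streams:
        a, b = (dst, src) if kind == "blocking" else (src, dst)
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, [])
    state, path = {}, []

    def visit(node):
        state[node] = "open"
        path.append(node)
        for nxt in graph[node]:
            if state.get(nxt) == "open":          # loop closed at nxt
                return set(path[path.index(nxt):])
            if nxt not in state:
                found = visit(nxt)
                if found:
                    return found
        state[node] = "done"
        path.pop()
        return None

    for node in graph:
        if node not in state:
            found = visit(node)
            if found:
                return found
    return None


def avoid_deadlock(streams, scans, upstream_rank):
    """Duplicate batch scan nodes until no loop remains."""
    streams, scans, copies = list(streams), set(scans), 0
    while True:
        loop = find_cycle(streams)
        if loop is None:
            return streams
        # Select the most upstream batch scan node on the loop.
        victim = min(loop & scans, key=lambda s: upstream_rank[s])
        copies += 1
        clone = f"{victim}_copy{copies}"
        # Blocking streams go to the copy; pipeline streams stay.
        streams = [(clone if src == victim and kind == "blocking" else src,
                    dst, kind) for src, dst, kind in streams]


deadlocked = [("scanA", "sync1", "pipeline"), ("scanB", "sync1", "blocking"),
              ("scanB", "sync2", "pipeline"), ("scanA", "sync2", "blocking")]
fixed = avoid_deadlock(deadlocked, {"scanA", "scanB"},
                       {"scanA": 0, "scanB": 1})
print(fixed)
```

In this model a duplicated node keeps only pipeline streams and its copy only blocking streams, so neither can lie on a loop again; each iteration therefore removes one batch scan node from consideration, matching the finiteness argument in the text.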
[0068]
In the above embodiment, the case where a plurality of search processes are included in one search request has been described. However, the present invention is also applicable when search requests for the same data are issued from a plurality of client terminals. FIG. 17 is a diagram for explaining an example in which the present invention is applied when search requests Q1 and Q2 are issued from a plurality of client terminals 171 and 172. The search process Q1 issued from the terminal 171 joins the table T1 with the table T2 whose D3 column is less than 100, using the C1 column as a key, groups by the C1 column and the C2 column, and obtains the sums of the D1 column and the D2 column for each group. The search process Q2 issued from the terminal 172 joins the table T2 whose D4 column is less than 100 with the table T3, using the C3 column as a key, groups by the C2 column and the C3 column, and obtains the sums of the D1 column and the D2 column for each group. In this embodiment, the search request issued by each terminal is sent to a query scheduler (QSCD) 175 and buffered in a search request buffer 176 included in the scheduler. The query scheduler compiles each buffered query, cuts out the common parts, generates a search execution graph in which the queries having common parts are integrated, and assigns each subtree of the execution graph to a server. A method similar to that of the above-described embodiment can be applied to generate the search execution graph in which queries having common parts are integrated.
[0069]
In this embodiment, each subtree of the search execution graph generated by the query scheduler is assigned to the front-end servers 178 and 179, the join back-end servers 4 and 5, and the scan back-end server 6 (177). In the previous embodiment, there was one front-end server because there was one terminal issuing a search request. In this embodiment, in order to execute search requests from a plurality of terminals at once, a different front-end server is assigned to each search request. The result of each search process is then returned from the corresponding front-end server to the terminal that issued the search request (180, 181). In this embodiment, the query scheduler 175 has a search request buffering function, so that common processes existing in a plurality of search requests can be integrated. Various methods can be applied to determine when the buffered search requests are processed as a batch, such as doing so at regular intervals or in accordance with the load state of the database.
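The buffering role of the query scheduler can be sketched in a few lines. This Python code is a hypothetical illustration only: the class `QueryScheduler`, its methods, and the rule "queries that scan the same table are candidates for integration" are assumptions standing in for the compilation and common-subgraph extraction described earlier, and the batching trigger is left to the caller.

```python
from collections import defaultdict


class QueryScheduler:
    """Toy model of QSCD 175 with its search request buffer 176."""

    def __init__(self):
        self.buffer = []  # buffered (terminal, query_id, tables) requests

    def submit(self, terminal, query_id, tables):
        """Buffer a request instead of executing it immediately."""
        self.buffer.append((terminal, query_id, frozenset(tables)))

    def flush(self):
        """Group the buffered queries by scanned table and return the
        tables that more than one query scans (sharing candidates)."""
        by_table = defaultdict(list)
        for _terminal, query_id, tables in self.buffer:
            for table in sorted(tables):
                by_table[table].append(query_id)
        self.buffer.clear()
        return {t: qs for t, qs in by_table.items() if len(qs) > 1}


scheduler = QueryScheduler()
scheduler.submit(171, "Q1", {"T1", "T2"})
scheduler.submit(172, "Q2", {"T2", "T3"})

# The caller decides when to flush, e.g. at regular intervals or
# according to the database load.
merged = scheduler.flush()
print(merged)
```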
[0070]
[Effect of the invention]
By sharing the search process of the present invention, a plurality of search processes in the database system can be executed simultaneously.
[0071]
Further, the present invention provides a means for detecting and avoiding a deadlock state that becomes a problem with the common use of the search processing in advance, so that there is an effect that it is not necessary to detect deadlock at the time of search execution.
[0072]
Further, in avoiding deadlock, search execution time is shortened by providing a method for avoiding a deadlock state by duplicating fewer batch scan nodes.
[Brief description of the drawings]
FIG. 1 is a diagram illustrating a method for executing a plurality of search processes simultaneously.
FIG. 2 is a schematic diagram of processing in a conventional method.
FIG. 3 is a diagram illustrating an example of an SQL sentence including a plurality of search processes.
FIG. 4 is a diagram illustrating an example of a search execution graph of a search request.
FIG. 5 is a diagram illustrating a detailed flow of batch scan processing.
FIG. 6 is a diagram illustrating a detailed flow of a single scan process.
FIG. 7 is a diagram showing a difference in processing costs between the present invention and a conventional method.
FIG. 8 is a diagram illustrating an example of a case where deadlock occurs when batch scan processing is used.
FIG. 9 is a diagram showing a transition of a state falling into a deadlock.
FIG. 10 is a flowchart of processing for compiling a search processing request to create a search execution graph.
FIG. 11 is a diagram for explaining deadlock detection means of a search execution graph.
FIG. 12 is a diagram showing a simple deadlock detection procedure.
FIG. 13 is a diagram illustrating a complicated search execution graph.
FIG. 14 is a diagram illustrating deadlock detection means for a complicated search execution graph.
FIG. 15 is a diagram illustrating deadlock avoidance processing for a complicated search execution graph.
FIG. 16 is a diagram illustrating a case where a loop cannot be resolved by copying a batch scan processing node.
FIG. 17 is a diagram illustrating another example of a method for simultaneously executing a plurality of search processes.
[Explanation of symbols]
1 Terminal
2 Database system
3 Front-end server
4, 5 Join backend server
6 Scan backend server
7 Database secondary storage
14, 16 Single scan processing means
15 Batch scan processing means
17 Database buffer
18, 19, 20, 21 Output buffer
22 Multiple selection means.

Claims

A database search processing method for executing a single search request including a plurality of search processes, or a search request in which a plurality of search processes are integrated, the method comprising, when the input search request includes a plurality of search processes for one table:
a first step of performing hit determination on the data read by a single scan of the one table, using as the search condition the logical sum of the individual search conditions of the plurality of search processes for the one table;
a second step of performing hit determination on the data hit in the first step against the individual search condition of each of the plurality of search processes, and setting the data in the output buffer, among a plurality of output buffers, that corresponds to each search process for the one table; and
a third step of passing the set data to the respective subsequent processing steps.

The database search processing method according to claim 1, further comprising, as a compilation process for generating a search execution graph of the input search request in order to execute the search request, a compilation processing procedure including:
analyzing the input search request and generating a search execution tree for each search process included in the search request;
extracting, from the generated search execution trees for the search processes, a common subtree having a common scan target table;
converting the search execution graph composed of the generated search execution trees by converting the common subtree into a batch scan processing node;
analyzing the converted search execution graph to detect whether there is a possibility of a deadlock state at the time of execution; and
when the possibility of a deadlock state is detected, generating a search execution graph in which the converted search execution graph is re-converted so as to avoid the deadlock state, and providing that search execution graph for execution of the first to third steps.

The database search processing method according to claim 2, wherein the step of detecting the possibility of a deadlock state includes:
identifying each node of the search execution graph composed of the search execution trees for the search processes as one of a batch scan processing node that outputs data to a plurality of nodes, a synchronization processing node that receives data from a plurality of nodes and processes the inputs in synchronization with each other, a block processing node that does not output data until all input data has been read, and any other pipeline processing node;
simplifying the search execution graph by removing pipeline processing nodes through short-circuiting their inputs and outputs and by combining a plurality of consecutive block processing nodes into one block processing node;
identifying each path from a batch scan processing node to a synchronization processing node as either a pipeline stream directly connecting the two nodes or a blocking stream connecting the two nodes via a block processing node;
reversing, in the directed graph representing the search execution graph, the direction of the arrows of the directed line segments representing either the pipeline streams or the blocking streams; and
detecting a loop in the search execution graph in which the direction of the arrows has been reversed.

The database search processing method according to claim 3, wherein the step of generating a search execution graph that avoids the deadlock state includes:
selecting one of the batch scan processing nodes belonging to the loop of the search execution graph found in the step of detecting the possibility of a deadlock state;
making a copy of the selected batch scan processing node, and assigning the pipeline streams and the blocking streams output from the selected batch scan processing node to separate scan processing nodes;
converting a batch scan processing node into a single scan processing node when only one stream is output from the scan processing node after the assignment;
applying the step of detecting the deadlock state to the new search execution graph generated by the conversion; and
returning to the step of selecting one batch scan processing node when a deadlock loop is detected in the step of detecting the deadlock state, and ending the step of avoiding the deadlock state when no deadlock loop is detected.

5. The database search processing method wherein, in the step of selecting one batch scan processing node, when a plurality of batch scan processing nodes exist, the batch scan processing node that is more upstream in the data flow is selected.