JP2008507008A

JP2008507008A - Efficient extraction of XML content stored in a LOB

Info

Publication number: JP2008507008A
Application number: JP2007516612A
Authority: JP
Inventors: チャンドラセカール，シバサンカラン; スソー，アシシュ; マーシー，ラビ; アガオル，ナイプン; セドラ，エリック; ムカマラ，スリーダー
Original assignee: オラクル・インターナショナル・コーポレイション
Priority date: 2004-06-16
Filing date: 2005-06-13
Publication date: 2008-03-06
Anticipated expiration: 2025-06-13
Also published as: JP4866844B2

Abstract

A method and system are provided for extracting a valid, self-contained fragment for a node in an XML document stored in a database management system. An XML index is used to identify the location of the XML fragment data that corresponds to the node. The ancestors of the node are identified and examined for any information needed to interpret the fragment correctly. If an ancestor node contains such information, the information is patched into the XML fragment so that the fragment is a valid, self-contained XML fragment.

Description

Field of the Invention
The present invention relates to managing information and, more specifically, to extracting valid, self-contained XML fragments, identified by XPath path expressions, from stored XML data.

Background
In recent years, database systems have been developed that allow the storage and querying of eXtensible Markup Language data ("XML data"). Although there are many evolving standards for querying XML, all of them include some variation of XPath. XPath is a language that describes a way to locate and process items in XML documents by using an addressing syntax based on a path through the document's logical structure or hierarchy. The portion of an XML document identified by an XPath "path expression" is the portion that resides, within the structure of the XML document, at the end of any path that matches the path expression.

XML documents managed by a relational database server are typically stored as unstructured serialized data in some form of LOB (Large Object) data type. For example, an XML document may be stored in unstructured storage such as a CLOB (Character LOB) or a BLOB (Binary LOB), or it may be stored as O-R (an object-relational structure that uses an XML schema).

However the XML documents are stored, satisfying many XPath queries requires a way to identify and extract the fragments of the stored XML documents that match an XPath path expression.

Unfortunately, even database systems that have built-in support for storing XML data are usually not optimized for handling path-based queries, and their query performance leaves much to be desired. In specific cases where an XML schema definition is available, the structure and data types used in XML instance documents can be used to optimize XPath queries. However, when no XML schema definition is available and the documents to be searched do not conform to any schema, there are no efficient techniques for path-based querying.

When no XML schema definition is available, ad hoc mechanisms such as full scans of all documents or text keyword-based indexes may be used to improve the performance of querying documents. However, these mechanisms do not satisfy the need for an efficient way to quickly identify and extract the fragments of stored XML documents that match an XPath path expression.

Even if a method is available for quickly identifying the location of a fragment of stored XML data, a way to efficiently extract the fragment from that location is still needed. The fragment residing at the identified location may not be a valid, self-contained XML document. For example, a namespace prefix used within the fragment may be declared outside the fragment, in which case the fragment retrieved from the identified location lacks a required declaration.
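By way of illustration only, the namespace problem described above can be reproduced with a short Python sketch. The document text, the `po` prefix, and the namespace URI are hypothetical, and the naive string replacement used to patch the declaration into the fragment is a stand-in chosen for brevity, not the patching mechanism of any particular embodiment.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the "po" prefix is declared only on the root element.
DOC = (
    '<po:PurchaseOrder xmlns:po="http://example.com/po">'
    "<po:Action><po:User>ZLOTKEY</po:User></po:Action>"
    "</po:PurchaseOrder>"
)


def parses_standalone(text: str) -> bool:
    """Return True if text parses on its own as a namespace-aware XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Cut the <po:Action> fragment out of the serialized document by position.
start = DOC.index("<po:Action>")
end = DOC.index("</po:Action>") + len("</po:Action>")
raw_fragment = DOC[start:end]

# Copy the declaration found on the ancestor into the fragment's start tag.
patched_fragment = raw_fragment.replace(
    "<po:Action>", '<po:Action xmlns:po="http://example.com/po">', 1
)
```

In this sketch the raw fragment fails to parse because the `po` prefix is unbound, whereas the patched fragment parses as a self-contained document.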

Based on the foregoing, there is a clear need for a system and method for identifying and extracting valid, self-contained XML fragments that match an XPath path expression.

The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.

The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings, in which like reference numerals refer to similar elements.

Detailed Description
In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.

Exemplary XML Documents
For the purposes of explanation, examples are given below with reference to the following two XML documents:

As indicated above, po1.xml and po2.xml are merely two examples of XML documents. The techniques described herein are not limited to XML documents having any particular type, structure, or content. Examples are given below of how such documents can be indexed and accessed according to various embodiments of the invention.

XML Index
U.S. patent application Ser. No. 10/884,311, entitled "INDEX FOR ACCESSING XML DATA" and filed on July 2, 2004 (hereinafter the "XML Index application"), describes various embodiments of an index that may be used to efficiently access XML documents managed by a relational database server, based on XPath queries. Such an index is referred to herein as an XML index.

An XML index as described in the XML Index application may be used to process XPath queries regardless of the format and data structures used to store the actual XML data (the "base structures"). For example, the actual XML data can reside in structures within or outside the database, in any form, such as CLOB (a character LOB storing the actual XML text), O-R (an object-relational structured form in the presence of an XML schema), or BLOB (a binary LOB storing some binary form of the XML data).

According to one embodiment, an XML index is a domain index that improves the performance of queries that include XPath-based predicates and/or XPath-based fragment extraction. For example, an XML index can be built on both XML schema-based and schema-less XML-type columns, stored either as CLOB or as structured storage. In one embodiment, an XML index is a logical index that results from the cooperative use of a path index, a value index, and an order index.

The path index provides the mechanism for looking up nodes based on simple (navigational) path expressions. The value index provides lookup based on value equality or range. There may be multiple secondary value indexes, one per data type. The order index associates hierarchical ordering information with indexed nodes. The order index is used to determine parent-child, ancestor-descendant, and sibling relationships between XML nodes.

When a user submits a query involving XPaths (as a predicate or as a fragment identifier), the XPath statement is decomposed into a SQL query that accesses the XML index table(s). The generated query typically performs a set of path, value, and order-constrained lookups and merges their results appropriately.

For the purposes of explanation, the techniques described herein are described in a context in which an XML index, as described in the XML Index application, is used to index the XML documents. However, the techniques described herein are not limited to any particular index structure or mechanism, and can be used to identify and extract valid, self-contained XML fragments regardless of which query method is used.

PATH Table
According to one embodiment, a logical XML index includes a PATH table and a set of secondary indexes. As mentioned above, each indexed XML document may include many indexed nodes. The PATH table contains one row per indexed node. For each indexed node, the row in the PATH table for the node contains various pieces of information associated with the node.

According to one embodiment, the information contained in the PATH table includes (1) a PATHID that indicates the path to the node, (2) "location data" for locating the fragment data for the node within the base structures, and (3) "hierarchy data" that indicates the position of the node within the structural hierarchy of the XML document that contains the node. Optionally, the PATH table may also contain value information for those nodes that are associated with values. Each of these types of information is described in greater detail below.

Paths
The structure of an XML document establishes parent-child relationships between the nodes within the XML document. The "path" for a node in an XML document reflects the series of parent-child links, starting from a "root" node, that lead to the particular node. For example, the path to the "User" node in po2.xml is /PurchaseOrder/Actions/Action/User, because the "User" node is a child of the "Action" node, the "Action" node is a child of the "Actions" node, and the "Actions" node is a child of the "PurchaseOrder" node.
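As a minimal sketch of how such paths may be derived, the following Python code walks an element tree and builds each node's path from its chain of parents. The document text is a hypothetical stand-in that reproduces only the element names mentioned above; it is not the actual content of po2.xml, and `node_paths` is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in containing only the element names cited in the text.
DOC = (
    "<PurchaseOrder><Actions><Action>"
    "<User>ZLOTKEY</User>"
    "</Action></Actions></PurchaseOrder>"
)


def node_paths(root):
    """Yield (path, element) for every element, in document order."""
    stack = [("/" + root.tag, root)]
    while stack:
        path, elem = stack.pop()
        yield path, elem
        # Push children in reverse so they are popped in document order.
        for child in reversed(list(elem)):
            stack.append((path + "/" + child.tag, child))


paths = [path for path, _ in node_paths(ET.fromstring(DOC))]
```

Under these assumptions, the last entry of `paths` is `/PurchaseOrder/Actions/Action/User`.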

The set of XML documents that an XML index indexes is referred to herein as the "indexed XML documents". According to one embodiment, an XML index may be built on all of the paths within all of the indexed XML documents, or on a subset of the paths within the indexed XML documents. Techniques for specifying which paths are indexed are described hereafter. The set of paths that are indexed by a particular XML index is referred to herein as the "indexed XML paths".

PATHID
According to one embodiment, each of the indexed XML paths is assigned a unique path identifier ("PATHID"). For example, the paths that exist in po1.xml and po2.xml may be assigned PATHIDs as illustrated in the following table.

Various techniques may be used to identify paths and assign PATHIDs to paths. For example, a user may explicitly enumerate paths and specify corresponding PATHIDs for the paths thus identified. Alternatively, the database server may parse each XML document as the document is added to the set of indexed XML documents. During the parsing operation, the database server identifies any paths that have not already been assigned a PATHID and automatically assigns new PATHIDs to those paths. The PATHID-to-path mapping may be stored within the database in a variety of ways. According to one embodiment, the PATHID-to-path mapping is stored as metadata separate from the XML index itself.
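The automatic assignment described above may be sketched as follows. This is an illustration under stated assumptions: the class name `PathIdRegistry`, the use of consecutive integers starting at 1, and the in-memory dictionary that stands in for the separately stored metadata are all hypothetical.

```python
import xml.etree.ElementTree as ET


class PathIdRegistry:
    """Maps each distinct path to a PATHID, assigning new ids on first sight."""

    def __init__(self):
        self._ids = {}  # path -> PATHID; stands in for the stored metadata

    def path_id(self, path: str) -> int:
        # Assign the next unused integer to a path not seen before.
        if path not in self._ids:
            self._ids[path] = len(self._ids) + 1
        return self._ids[path]

    def register_document(self, xml_text: str) -> None:
        """Parse a document and register every element path it contains."""

        def walk(elem, prefix):
            path = prefix + "/" + elem.tag
            self.path_id(path)
            for child in elem:
                walk(child, path)

        walk(ET.fromstring(xml_text), "")


registry = PathIdRegistry()
registry.register_document("<a><b/><c/></a>")
registry.register_document("<a><b/><d/></a>")  # only /a/d is new here
```

Paths shared by both documents keep the PATHID assigned when they were first encountered, so the second document contributes a single new identifier.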

According to one embodiment, the same access structures are used for XML documents that conform to different schemas. Because the indexed XML documents may conform to different schemas, each XML document will typically contain only a subset of the paths to which PATHIDs have been assigned.

Location Data
The location data associated with a node indicates (1) where the XML document that contains the node resides within the base structures and (2) where the XML fragment that corresponds to the node is located within the stored XML document. Thus, the nature of the location data will vary from implementation to implementation based on the nature of the base structures. Location data is typically added to the PATH table when the XML document is parsed.

For the purposes of explanation, it is assumed that (1) the base structure is a table within a relational database and (2) each indexed XML document is stored in a corresponding row of the base table. In such a context, the location data for a node may include, for example, (1) the row identifier ("RID") of the row, within the base table, in which the XML document containing the node is stored and (2) a locator that provides fast access, within the stored XML document, to the fragment data that corresponds to the node.

A locator is conceptually a piece of information that "points" into the original document and is typically used to retrieve the fragment data starting at that point. The locator depends on the actual storage used for the XML documents and may differ for CLOB, O-R, and BLOB forms of storage. For example, the locator for a node in an XML document stored in a CLOB may be the starting character offset within the CLOB at which the node begins. In addition, the byte length of the node may be stored as part of the locator. Together, this information gives the starting and ending positions within the stored XML document and can be used to extract the XML fragment efficiently. For example, a locator may be used to retrieve the XML fragment that contains a node matching a specified XPath query by starting at the character offset specified by the locator and reading the number of bytes indicated by the locator.
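The offset-and-length form of locator may be sketched as follows. The sketch assumes that the stored document is held as a Python string, that offsets are 1-based, and that the length is counted in characters; the names `Locator` and `extract_fragment` and the sample text are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locator:
    offset: int  # 1-based position at which the node begins (an assumption)
    length: int  # number of characters occupied by the node


def extract_fragment(clob: str, loc: Locator) -> str:
    """Read loc.length characters of clob starting at loc.offset."""
    start = loc.offset - 1  # convert to Python's 0-based indexing
    return clob[start:start + loc.length]


# Hypothetical stored document and a locator for its <Action> element.
CLOB = "<Actions><Action><User>ZLOTKEY</User></Action></Actions>"
ACTION = "<Action><User>ZLOTKEY</User></Action>"
action_locator = Locator(offset=CLOB.index(ACTION) + 1, length=len(ACTION))
```

Because the locator gives both ends of the fragment, extraction is a single slice and the surrounding document does not need to be parsed.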

However, a locator may be more complex than a character or byte offset. For example, a locator could include certain flags. As another example, if the XML document is stored shredded into relational tables, the locator could include the appropriate table and/or row identifiers.

Hierarchy Data
The PATH table row for a node also includes information that indicates where the node resides within the hierarchical structure of the XML document containing the node. Such hierarchy information is referred to herein as the "OrderKey" of the node.

According to one embodiment, the hierarchical order information is represented using a Dewey-type value. Specifically, in one embodiment, the OrderKey of a node is created by appending a value to the OrderKey of the node's immediate parent, where the appended value indicates the position of that particular child node among the children of the parent node.

For example, assume that a particular node D is the child of a node C, which itself is a child of a node B that is a child of a node A. Assume further that node D has the OrderKey 1.2.4.3. The final "3" in the OrderKey indicates that node D is the third child of its parent node C. Similarly, the 4 indicates that node C is the fourth child of node B. The 2 indicates that node B is the second child of node A. The leading 1 indicates that node A is the root node (i.e., it has no parent).

As mentioned above, the OrderKey of a child may be easily created by appending to the OrderKey of the parent a value that corresponds to the child's position among its siblings. Similarly, the OrderKey of the parent is easily derived from the OrderKey of the child by removing the last number in the child's OrderKey.
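The two derivations just described can be sketched in a few lines of Python, assuming that an OrderKey is held as a dotted string such as "1.2.4.3". The function names are hypothetical.

```python
from typing import Optional


def child_order_key(parent_key: str, position: int) -> str:
    """Append the child's 1-based position among its siblings to the parent key."""
    return f"{parent_key}.{position}"


def parent_order_key(key: str) -> Optional[str]:
    """Drop the last component; the root key has no parent and yields None."""
    head, sep, _ = key.rpartition(".")
    return head if sep else None
```

For node D in the example above, `child_order_key("1.2.4", 3)` produces "1.2.4.3", and `parent_order_key("1.2.4.3")` recovers the key of node C.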

According to one embodiment, the composite numbers represented by each OrderKey are converted into byte-comparable values, so that a mathematical comparison between two OrderKeys indicates the relative position, within the structural hierarchy of an XML document, of the nodes to which the OrderKeys correspond.

For example, the node associated with the OrderKey 1.2.7.7 precedes the node associated with the OrderKey 1.3.1 in the hierarchical structure of an XML document. Thus, the database server uses a conversion mechanism that converts OrderKey 1.2.7.7 to a first value and OrderKey 1.3.1 to a second value, where the first value is less than the second value. By comparing the second value to the first value, the database server can easily determine that the node associated with the first value precedes the node associated with the second value. Various conversion techniques may be used to achieve this result, and the invention is not limited to any particular conversion technique.
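One possible conversion, shown purely as an assumption and not as the technique of any particular embodiment, encodes each component of the OrderKey as a fixed-width, big-endian unsigned integer. Plain byte-wise comparison of the encoded values then agrees with document order, and an ancestor's encoding is a prefix of, and therefore sorts before, the encodings of its descendants. The four-byte width and the name `encode_order_key` are hypothetical choices.

```python
import struct


def encode_order_key(key: str) -> bytes:
    """Encode each dotted component as a 4-byte big-endian unsigned integer."""
    return b"".join(struct.pack(">I", int(part)) for part in key.split("."))


first = encode_order_key("1.2.7.7")
second = encode_order_key("1.3.1")

# Sorting by the encoded form yields document order, including cases such as
# 1.9 versus 1.10 that a naive string comparison would order incorrectly.
in_document_order = sorted(
    ["1.3.1", "1.2.7.7", "1.10", "1.9", "1"], key=encode_order_key
)
```

A fixed width wastes space for small component values; a variable-length encoding could be substituted as long as it preserves the same ordering property.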

Value Information
Some nodes within an indexed document may be attribute nodes or nodes that correspond to simple elements. As used herein, a "simple element" is an element that has no attributes or child elements and whose value is a single text string. For example, in po1.xml, the "Reference" element is a simple element with the single text value "SBELL-2002100912333601PDT".

According to one embodiment, for attribute nodes and simple elements, the PATH table row also stores the actual value of the attribute or simple element. Such values may be stored, for example, in a "value column" of the PATH table. The secondary "value indexes", which are described in greater detail hereafter, are built on the value column.
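The definition of a simple element can be sketched with Python's ElementTree as follows. The document text is a hypothetical stand-in that borrows only the "Reference" value quoted above, the function names are illustrative, and attribute nodes are not covered because ElementTree does not model them as separate nodes.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def is_simple_element(elem) -> bool:
    """True when the element has neither attributes nor child elements."""
    return len(elem) == 0 and not elem.attrib


def value_for_path_row(elem) -> Optional[str]:
    """Text to place in the value column, or None for non-simple elements."""
    return (elem.text or "") if is_simple_element(elem) else None


# Hypothetical stand-in; not the actual content of po1.xml.
root = ET.fromstring(
    "<PurchaseOrder>"
    "<Reference>SBELL-2002100912333601PDT</Reference>"
    "<Actions><Action><User>SVOLLMAN</User></Action></Actions>"
    "</PurchaseOrder>"
)
reference_value = value_for_path_row(root.find("Reference"))
```

Here `reference_value` holds the text of the "Reference" element, while the root element, which has children, yields no value.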

PATH Table Example
According to one embodiment, the PATH table includes columns defined as specified in the following table.

As explained above, the PATHID is an identifier assigned to the node and uniquely represents the fully expanded path to the node. The ORDER_KEY is a system representation of the Dewey ordering number associated with the node. According to one embodiment, the internal representation of the OrderKey also preserves document ordering.

The VALUE column stores the effective text value for simple element nodes (i.e., elements with no child elements) and attribute nodes. According to one embodiment, adjacent text nodes are coalesced by concatenation. As described in the XML Index application, a mechanism is provided that allows a user to customize the effective text value stored in the VALUE column by specifying options during index creation; for example, the handling of mixed text, whitespace, and case sensitivity can be customized. The user can store the VALUE column in any number of formats, including a bounded RAW column or a BLOB. If the user chooses bounded storage, any overflow during index creation is flagged as an error.

The following table is an example of a PATH table that (1) has the columns described above and (2) is populated with entries for po1.xml and po2.xml. Specifically, each row of the PATH table corresponds to an indexed node of either po1.xml or po2.xml. In this example, it is assumed that po1.xml and po2.xml are stored in rows R1 and R2 of the base table, respectively.

In this example, the rowid column stores a unique identifier for each row of the PATH table. Depending on the database system in which the PATH table is created, the rowid column may be an implicit column. For example, the disk location of a row may be used as the unique identifier for the row. As shall be described in greater detail hereafter, the secondary order and value indexes use the rowid values of the PATH table to locate rows within the PATH table.

In the embodiment illustrated above, the PATHID, ORDER_KEY, and VALUE of a node are all contained in a single table. In alternative embodiments, separate tables may be used to map the PATHID, ORDER_KEY, and VALUE information to the corresponding location data (e.g., the base table RID and LOCATOR).

In the embodiment illustrated above, the information in the "RID" and "LOCATOR" columns of the PATH table is used to identify the location at which an indexed node is stored. In this example, each row in the base table corresponds to an indexed XML document, and each row in the base table uses a CLOB to store the associated XML document. The RID column in the PATH table identifies the row in the base table in which the XML document is stored as a CLOB, and the LOCATOR column stores the character offset into the CLOB at which the indexed node begins, together with the character length of the node.

For example, the sample XML documents po1.xml and po2.xml described above are stored in unstructured serialized form, as CLOB data structures, in rows R1 and R2 of the base table. The node identified by rowid "1" in the PATH table is located in row R1 of the base table, begins at character 1 of the stored CLOB, and has a length of 350 characters. As another example, the node identified by rowid "9" is located in row R2 of the base table, begins at character 72, and has a length of 36 characters. This row of the PATH table corresponds to the first <Action> node of po2.xml, shown below:

<Action>
<User>ZLOTKEY</User>
</Action>

The example shown in the populated PATH table above illustrates an embodiment in which locator information is not stored for simple elements and attribute nodes. In other embodiments, locator information could be stored and maintained for all nodes, including simple elements. Further, the example shown in the populated PATH table illustrates an embodiment in which the LOCATOR column stores both offset and length information. In alternative embodiments, only the offset information may be stored. Alternatively, as mentioned above, other types of locator information may be stored in the LOCATOR column. The techniques described herein do not depend on any particular type of location data.

Secondary Indexes
The PATH table includes the information required to locate the XML documents and/or XML fragments that satisfy a wide range of queries. However, without secondary access structures, using the PATH table to satisfy such queries will often require a full scan of the PATH table. Therefore, according to one embodiment, a variety of secondary indexes are created by the database server to accelerate queries that (1) perform path lookups and/or (2) identify order-based relationships. According to one embodiment, the following secondary indexes are created on the PATH table:

- PATHID_INDEX on (PATHID, RID)
- ORDERKEY_INDEX on (RID, ORDER_KEY)
- VALUE INDEXES
- PARENT_ORDERKEY_INDEX on (RID, SYS_DEWEY_PARENT(ORDER_KEY))
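As a rough sketch only, a PATH table with the first two of these secondary indexes could be set up as shown below. The sketch uses SQLite through Python's standard `sqlite3` module rather than the database server contemplated herein; the table name, column types, locator text format, and row contents are hypothetical. The value indexes and the PARENT_ORDERKEY_INDEX are omitted, the latter because SYS_DEWEY_PARENT has no SQLite counterpart, and ORDER_KEY is kept as plain text instead of a byte-comparable encoding.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE path_table (
        pathid    INTEGER NOT NULL,  -- identifier of the node's path
        rid       TEXT    NOT NULL,  -- base table row holding the document
        order_key TEXT    NOT NULL,  -- Dewey-style key, kept as text for brevity
        locator   TEXT,              -- "offset,length"; NULL where not stored
        value     TEXT               -- text of simple elements and attributes
    );
    CREATE INDEX pathid_index   ON path_table (pathid, rid);
    CREATE INDEX orderkey_index ON path_table (rid, order_key);
    """
)

# Hypothetical rows; SQLite assigns the implicit rowid values 1 through 5.
conn.executemany(
    "INSERT INTO path_table VALUES (?, ?, ?, ?, ?)",
    [
        (1, "R1", "1", "1,120", None),
        (2, "R1", "1.1", None, "REF-001"),
        (3, "R1", "1.2", "40,60", None),
        (1, "R2", "1", "1,90", None),
        (3, "R2", "1.1", "20,50", None),
    ],
)


def rows_for_path(pathid: int, rid: str):
    """Return (rowid, locator) pairs for one PATHID within one base table row."""
    return conn.execute(
        "SELECT rowid, locator FROM path_table "
        "WHERE pathid = ? AND rid = ? ORDER BY rowid",
        (pathid, rid),
    ).fetchall()
```

With the composite index on (pathid, rid) in place, a lookup such as `rows_for_path(3, "R1")` can be answered from the index without scanning the whole table.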

PATHID_INDEX
The PATHID_INDEX is built on the PATHID and RID columns of the PATH table. Thus, entries in the PATHID_INDEX have the form (keyvalue, rowid), where keyvalue is a composite value representing a particular PATHID/RID combination and rowid identifies a particular row of the PATH table.

When (1) the base table row and (2) the PATHID of a node are known, the PATHID_INDEX may be used to quickly locate the row, within the PATH table, for the node. For example, based on the key value "3.R1", the PATHID_INDEX may be traversed to find the entry associated with the key value "3.R1". Assuming that the PATH table is populated as illustrated above, the index entry would have a rowid value of 3. The rowid value of 3 points to the third row of the PATH table, which is the row for the node associated with PATHID 3 and RID R1.

ORDERKEY_INDEX
ORDERKEY_INDEX is built on the RID and ORDER_KEY columns of the PATH table. Thus, entries in ORDERKEY_INDEX have the form (keyvalue, rowid), where keyvalue is a composite value representing a particular RID/ORDER_KEY combination, and rowid identifies a particular row of the PATH table.

When (1) the base table row and (2) the ORDER_KEY of a node are known, ORDERKEY_INDEX may be used to quickly locate the row for that node within the PATH table. For example, based on the key value "R1.'1.2'", ORDERKEY_INDEX may be traversed to find the entry associated with that key value. Assuming that the PATH table is populated as illustrated above, the index entry would have a rowid value of 3. The rowid value of 3 points to the third row of the PATH table, which is the row for the node associated with ORDER_KEY 1.2 and RID R1.
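
The two lookups described above can be sketched in Python with a toy in-memory PATH table. This is only an illustration: row 3 (PATHID 3, RID R1, ORDER_KEY 1.2) is taken from the text, while rows 1 and 2, the `build_index` helper and the tuple-shaped keys are assumptions and not the database server's actual structures.

```python
# Toy PATH table: rowid -> (rid, pathid, order_key).
# Only row 3 comes from the text; rows 1 and 2 are illustrative assumptions.
PATH_TABLE = {
    1: ("R1", 1, "1"),
    2: ("R1", 2, "1.1"),
    3: ("R1", 3, "1.2"),
}


def build_index(key_of):
    """Map a composite key to the rowid of the PATH table row that carries it."""
    return {key_of(row): rowid for rowid, row in PATH_TABLE.items()}


# PATHID_INDEX is keyed on (PATHID, RID); ORDERKEY_INDEX on (RID, ORDER_KEY).
PATHID_INDEX = build_index(lambda row: (row[1], row[0]))
ORDERKEY_INDEX = build_index(lambda row: (row[0], row[2]))

print(PATHID_INDEX[(3, "R1")])        # rowid for key "3.R1"
print(ORDERKEY_INDEX[("R1", "1.2")])  # rowid for key "R1.'1.2'"
```

A dictionary keeps one row per key, which is enough for this toy table; a real index would have to allow several rows under the same PATHID/RID key.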

Value indexes
Just as queries based on path lookups can be accelerated using PATHID_INDEX, queries based on value lookups can be accelerated by an index built on the VALUE column of the PATH table. However, the VALUE column of the PATH table can hold values of a variety of data types. Therefore, according to one embodiment, a separate value index is built for each data type stored in the VALUE column. Thus, in an implementation in which the VALUE column holds strings, numbers and timestamps, the following value (secondary) indexes are also created:

・ STRING_INDEX on SYS_XMLVALUE_TO_STRING(value)
・ NUMBER_INDEX on SYS_XMLVALUE_TO_NUMBER(value)
・ TIMESTAMP_INDEX on SYS_XMLVALUE_TO_TIMESTAMP(value)

These value indexes are used to perform data type based comparisons (equality and range). For example, the NUMBER value index is used to handle number-based comparisons within user XPaths. Entries in NUMBER_INDEX may, for example, have the form (number, rowid), where rowid points to the row of the PATH table for the node associated with the value of "number". Similarly, entries in STRING_INDEX may have the form (string, rowid), and entries in TIMESTAMP_INDEX may have the form (timestamp, rowid).

The format of the values in the PATH table may not correspond to the native format of the data type. Therefore, when using a value index, the database server may call conversion functions to convert the value bytes from the stored format to the specified data type. In addition, the database server applies any necessary transformations, as described below. According to one embodiment, the conversion functions operate on both RAW and BLOB values and return NULL if the conversion is not possible.
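
As a rough sketch of a typed value index, the following Python builds a number index from a mixed VALUE column and answers a range lookup. The helper names (`to_number`, `build_number_index`, `range_lookup`) and the sample values are hypothetical; `to_number` only mimics the "return NULL when conversion is not possible" behavior described above, using `None` in place of NULL.

```python
from bisect import bisect_left, bisect_right


def to_number(raw):
    """Convert a stored value to a float, or return None when it is not numeric."""
    try:
        return float(raw)
    except (TypeError, ValueError):
        return None


def build_number_index(values_by_rowid):
    """Return sorted (number, rowid) entries, skipping values that do not convert."""
    entries = []
    for rowid, raw in values_by_rowid.items():
        number = to_number(raw)
        if number is not None:
            entries.append((number, rowid))
    return sorted(entries)


def range_lookup(index, low, high):
    """Return the rowids whose number lies in the closed range [low, high]."""
    start = bisect_left(index, (low, float("-inf")))
    stop = bisect_right(index, (high, float("inf")))
    return [rowid for _, rowid in index[start:stop]]


# Hypothetical VALUE column contents keyed by rowid.
values = {2: "SBELL-2002100912333601PDT", 5: "12.5", 9: "40", 11: None}
number_index = build_number_index(values)
print(range_lookup(number_index, 10, 20))
```

Rows whose value does not convert simply never appear in the number index, so a numeric predicate cannot match them.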

By default, the value indexes are created when the XML index is created. However, users can suppress the creation of one or more of the value indexes based on knowledge of the query workload. For example, if all XPath predicates involve only string comparisons, the NUMBER and TIMESTAMP value indexes can be omitted.

PARENT_ORDERKEY_INDEX
According to one embodiment, the set of secondary indexes built on the PATH table includes a PARENT_ORDERKEY_INDEX. Like ORDERKEY_INDEX, PARENT_ORDERKEY_INDEX is built on the RID and ORDER_KEY columns of the PATH table. Consequently, index entries in PARENT_ORDERKEY_INDEX have the form (keyvalue, rowid), where keyvalue is a composite value that corresponds to a particular RID/ORDER_KEY combination. However, unlike ORDERKEY_INDEX, the rowid in a PARENT_ORDERKEY_INDEX entry does not point to the PATH table row that has that particular RID/ORDER_KEY combination. Rather, the rowid of each PARENT_ORDERKEY_INDEX entry points to the PATH table row of the node that is the immediate parent of the node associated with the RID/ORDER_KEY combination.

For example, in the populated PATH table illustrated above, the RID/ORDER_KEY combination "R1.'1.2'" corresponds to the node in row 3 of the PATH table. The immediate parent of the node in row 3 is the node represented by row 1 of the PATH table. Consequently, the PARENT_ORDERKEY_INDEX entry associated with the key value "R1.'1.2'" would have a rowid that points to row 1 of the PATH table (i.e., rowid = 1).
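
One way to picture the parent lookup is to derive the parent's Dewey order key by dropping the last component of the child's key. The Python below is a sketch under that assumption; `dewey_parent` is a hypothetical stand-in for whatever SYS_DEWEY_PARENT actually computes, and the index contents are illustrative apart from the row 3 to row 1 relationship given in the text.

```python
def dewey_parent(order_key):
    """Return the order key of the immediate parent, or None for a root node."""
    head, sep, _ = order_key.rpartition(".")
    return head if sep else None


# (rid, order_key) -> rowid of the node itself; entries are illustrative.
ORDERKEY_INDEX = {("R1", "1"): 1, ("R1", "1.1"): 2, ("R1", "1.2"): 3}


def build_parent_index(orderkey_index):
    """Map each (rid, order_key) to the rowid of its parent node's row."""
    parent_index = {}
    for rid, order_key in orderkey_index:
        parent_key = dewey_parent(order_key)
        if parent_key is not None:
            parent_index[(rid, order_key)] = orderkey_index[(rid, parent_key)]
    return parent_index


PARENT_ORDERKEY_INDEX = build_parent_index(ORDERKEY_INDEX)
print(PARENT_ORDERKEY_INDEX[("R1", "1.2")])  # rowid of the parent row
```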

Using the XML index to process XPath queries
As described above, the XML index improves the performance of XPath-based queries and fragment extraction by capturing the essential parts of an XML document, namely its tags, values and nesting information, in the PATH, VALUE and ORDER indexes. The PATH index is used to index tags and provides a mechanism to identify fragments based on simple path expressions. The VALUE index allows XML values to be indexed. The ORDER index associates hierarchical ordering information with indexed nodes and is used to determine parent-child, ancestor-descendant and sibling relationships between XML nodes.

When a user submits a query involving XPaths, the XPath expressions can be decomposed into SQL queries that access the XML index tables. The generated queries typically perform a set of path, value and order-constrained lookups and merge their results appropriately.

In particular, co-pending U.S. patent application Ser. No. 10/944,170, entitled "EFFICIENT QUERY PROCESSING OF XML DATA USING XML INDEX" and filed on September 16, 2004 (hereinafter the "Query Processing application"), describes various embodiments of a method for executing "index-enabled" queries, which use the XML index to identify the XML data that corresponds to a specified path. Specifically, the Query Processing application describes techniques for evaluating XPath operators using the XML index.

More specifically, the Query Processing application describes techniques for (1) decomposing a generic path expression into simpler components such as simple paths, predicates and structural joins; (2) generating a SQL query against the tables of the XML index, which may involve expressing the structural joins using SQL predicates on the Dewey order keys of the indexed path components; and (3) extracting fragments using locators that point to the original data.

An index-enabled query is generated based on the path expression and accesses the PATH table of the XML index. The path expression of the path-based query, or fragments thereof, are matched against templates. Each template is associated with a rule. When a fragment of the specified path has a format that matches a template, the corresponding rule is used to generate the SQL for the index-enabled query. This process is described in detail in the Query Processing application.

Using the XML index to process the extract() operator
One XPath operator that can be evaluated using the techniques described in the Query Processing application is the extract() operator. The result of the XPath extract() operator is an XMLType containing the XML fragments of the XML documents that satisfy the specified XPath expression.

As described in the Query Processing application, the extract() operator can be rewritten as a SQL query on the XML index tables. For example, the extract() operator for an XPath query on the /PurchaseOrder/Actions node may be translated into a SQL query as follows:

where :B1 = pathid('/PurchaseOrder/Actions') (pathid() is an internal function used to look up the PATHID associated with the given path), and po_tab is the base table containing the stored XML documents.

The SYS_XMLINDEX_MKXML() operator constructs an XMLType image based on the values of the index columns. In one embodiment, this lookup may be implemented using a SYS_XMLINDEX_GETFRAG() operator. Given a row identifier and a locator, the SYS_XMLINDEX_GETFRAG() operator constructs an XMLType image consisting of the XML fragment that corresponds to the row identifier and locator.

XMLAGG() is an operator that concatenates the fragments generated by the SYS_XMLINDEX_MKXML() operator. Using the example above, for each row that contains the node '/PurchaseOrder/Actions', a fragment is retrieved from the base table, and the fragments are aggregated into a single XMLType image.

For example, using the populated PATH table above, the output of

would be

In one embodiment, the returned output is a single long string created by concatenating the above results, including their start and end tags.

The techniques described herein are used to implement the SYS_XMLINDEX_GETFRAG() operator, which obtains the actual text fragment that corresponds to a node.

Efficient extraction process
Process 200, shown in FIG. 2, illustrates the steps of one technique for extracting XML fragments according to an embodiment of the invention. As shown, a node is first identified in step 210. Any technique, such as the techniques described in the XML Index and Query Processing applications, can be used to identify the nodes that match a path expression.

Next, in step 215, the node is examined to determine whether it is a simple element or a complex element. As described above, a simple element is an element that has no children or attributes and whose value is a single text value. A complex element is an element that has attributes or child elements.

If the node is a simple element, the fragment can be constructed using the information stored in the XML index, without consulting the original XML document, as shown by step 220. If the node is a complex element, the original XML document stored in the base table is consulted to extract the fragment, as shown by step 230, and the extracted fragment is patched as needed for proper interpretation. Each process is described in more detail below.

Although the embodiment of the process shown in FIG. 2 makes use of the information stored in the XML index to construct fragments without consulting the original XML document, it is not a requirement that simple and complex elements be treated differently. Fragments that match either type, simple or complex, can be extracted from the stored XML data.

Simple element fragments
When a stored XML document is indexed with an XML index, the values of simple elements are present in the VALUE column of the PATH table. Therefore, the XML fragment for a simple element can be constructed without consulting the base table that stores the original XML document. The fragment is built by adding the appropriate start and end tags to the value obtained from the VALUE column of the PATH table for the identified node.

For example, the node '/PurchaseOrder/Reference' is a simple element in the XML documents po1.xml and po2.xml described above. First, the PATHID of the expression '/PurchaseOrder/Reference' is determined. In this example, the PATHID is "2". The PATH table is examined to determine whether any nodes correspond to this PATHID (step 210). In this example, the nodes with rowids "2" and "7" match PATHID = 2. The process of FIG. 2 is performed for each matching node.

In step 215, each of node 2 and node 7 can be determined to be a simple element by examining the LOCATOR and VALUE columns of their rows: there is no locator information, and the VALUE column contains a simple text string. For each of these simple element nodes, the process proceeds to step 220. In step 220, the fragment for the node can be built by creating a string that contains a start tag, the value and an end tag. The start tag is created by extracting the last component of the path associated with the PATHID ("Reference" in this example). The VALUE that corresponds to the node in the PATH table is placed in the fragment after the start tag. For example, the VALUE component of the fragment for node 2 is "SBELL-2002100912333601PDT". An end tag, consisting of the closing character "/" and the component string determined above (e.g., "Reference"), completes the fragment string. By following this process, the fragment for node 2 is determined to be "<Reference>SBELL-2002100912333601PDT</Reference>". This matches the fragment of the original XML document po1.xml that corresponds to this node.
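
The construction in step 220 can be sketched in a few lines of Python. The function name `simple_fragment` is hypothetical, and escaping the value with `xml.sax.saxutils.escape` is an added assumption: the text does not say whether the VALUE column holds escaped or raw text.

```python
from xml.sax.saxutils import escape


def simple_fragment(path, value):
    """Wrap a VALUE in start and end tags named after the last path component."""
    tag = path.rstrip("/").rsplit("/", 1)[-1]
    # Escaping is an assumption; it leaves values without markup characters unchanged.
    return f"<{tag}>{escape(value)}</{tag}>"


print(simple_fragment("/PurchaseOrder/Reference", "SBELL-2002100912333601PDT"))
```

No access to the base table is needed: the path supplies the tag name and the PATH table row supplies the value.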

Queries that extract only attributes can be treated in the same way as simple elements. However, an element that contains attributes is treated as a complex element, as described in more detail below.

Because the system can add the namespaces and generated prefixes, simple elements do not require patching for proper interpretation, and for simple elements the process proceeds to step 290.

Using the XML index to extract complex elements
For a complex element node, the fragment must be parsed out of the base table that stores the XML document associated with the complex element. As described above, each row of the PATH table corresponds to a node in an XML document and includes the RID of the row in the base table that contains the original XML document, as well as a locator for finding the node within the XML document stored in the base table.

For example, an XPath extract() on the node /PurchaseOrder/Actions should yield the aggregated fragments.

However, unlike the simple elements described above, these fragments are extracted from the stored XML documents. For example, the path expression "/PurchaseOrder/Actions" corresponds to PATHID 3. In the PATH table, the nodes with rowids 3 and 8 match this PATHID. The VALUE column of these rows is empty, and the LOCATOR column provides the offset and length information for extracting the fragments. Thus, in step 215, each of these nodes is determined to correspond to a complex element, and the process proceeds to step 230.

In step 230, the fragment text that corresponds to the node is located and read. For example, for node 3, the RID column indicates that the stored XML data is located in row R1 of the base table, and the LOCATOR field indicates that the fragment starts at character 64 and has a length of 56. Thus, the fragment text that corresponds to node 3 can be created by extracting characters 64-120 from the CLOB in row R1 of the base table, which contains "po1.xml". The XML fragment that corresponds to node 8 can similarly be created by extracting characters 63-152 from the CLOB in row R2 of the base table, which contains "po2.xml".
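
Reading a fragment by locator amounts to slicing the stored document text. The Python below is a sketch only: the stored document is a made-up stand-in (not the po1.xml of the text), `read_fragment` is a hypothetical helper, and offsets are treated as 0-based because the text does not state the convention.

```python
def read_fragment(base_table, rid, locator):
    """Slice the stored document text using an (offset, length) locator.

    Offsets are treated as 0-based here; the actual convention is not specified.
    """
    offset, length = locator
    return base_table[rid][offset:offset + length]


# Hypothetical stored document keyed by base table row id.
doc = (
    "<PurchaseOrder><Reference>X1</Reference>"
    "<Actions><Action/></Actions></PurchaseOrder>"
)
base_table = {"R1": doc}

# The locator an index build might have recorded for the <Actions> node.
fragment = "<Actions><Action/></Actions>"
locator = (doc.index(fragment), len(fragment))

print(read_fragment(base_table, "R1", locator))
```

Because the locator already carries the offset and length, the stored document does not need to be parsed to find the node.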

In these examples, the extracted XML fragments happen to be valid. In many cases, however, an XML fragment extracted using these methods may not be free-standing. For example, the extracted fragment may contain or use references that are not defined within the fragment. The methods described herein make it possible to "patch" a fragment created using the above technique to ensure that the resulting fragment is valid and free-standing.

Prefixes and namespaces
Because element names in XML are not fixed, a name conflict can occur when two different documents use the same name to describe two different types of elements. One standard way to avoid name conflicts is to use a prefix with the name.

For example, Table 1 and Table 2 illustrate XML documents that both use a "table" element.

If both of these XML documents were stored in a database, there could be an element name conflict, because both documents contain a <table> element with different content and a different definition. One standard way to resolve and prevent these types of conflicts is through the use of namespace prefixes. As an example, Table 1A and Table 2A below illustrate how the XML documents of Table 1 and Table 2 could be modified to avoid the element name conflict.

As shown in Table 1A and Table 2A, the element name conflict is no longer a problem, because the two documents use different names for their <table> elements (i.e., <h:table> and <f:table>). By using prefixes, two different types of <table> elements are possible.

A prefix typically refers to an XML document that carries information about the elements. Table 1B and Table 2B show how prefixes can be defined to refer to particular namespaces.

Instead of using only a prefix, an xmlns attribute has been added to the <table> tag to give the element prefix a qualified name associated with a namespace. Typically, the namespace attribute is placed in the start tag of an element using the following syntax:

xmlns:namespace-prefix="namespace"

As shown by Table 1B and Table 2B, the namespace itself can be defined using an Internet address, although any Uniform Resource Identifier (URI) can be used. Multiple namespace prefixes can be declared as attributes of a single element.

When a namespace is defined as an attribute in the start tag of an element, all child elements with the same prefix are associated with the same namespace. In addition, a default namespace can be used for an element, as shown in Table 1C and Table 2C. When a default namespace is used, the prefix does not have to be used in all of the child elements. A default namespace declaration applies to all unprefixed element names within its scope.

The prefix provides the namespace prefix part of a qualified name and must be associated with a namespace reference in a namespace declaration. A prefix functions only as a placeholder for a namespace name. The namespace name, not the prefix, is used in constructing names whose scope extends beyond the containing document. Prefix and namespace declarations can apply to attributes as well as elements.

The scope of a namespace declaration that declares a prefix extends from the beginning of the start tag in which it appears to the end of the corresponding end tag, excluding the scope of any inner declarations that use the same prefix name. Such a namespace declaration applies to all element and attribute names within its scope whose prefix matches the one specified in the declaration.

A namespace prefix must have been declared in a namespace declaration attribute in either the start tag of the element in which the prefix is used or in an ancestor element. This constraint may lead to problems when the namespace declaration attribute is provided not directly in the XML document but through a default attribute declared in an external entity.

This is particularly problematic in the context of fragment extraction. Not only are declarations in external documents a problem, but an extracted XML fragment may also use a prefix that was declared in an earlier section of the document from which the fragment is extracted. Furthermore, a fragment may be extracted that appears valid on its face because it contains no direct references to any namespace; nevertheless, if the fragment lies within the scope of an ancestor element's default namespace declaration, it should use that declaration.

The techniques described herein solve this problem by building a list of the namespace declarations from the desired node and all of its ancestors. This list is built by querying the PATH table. The list is then spliced into the fragment created in step 230 to obtain a complete, valid, free-standing XML fragment.

Handling namespace declarations in fragment extraction
As described above, when the XPath extract() operator is evaluated for a simple element, the desired fragment can be constructed using only the PATH table. When a complex element is extracted, the fragment is read from the original data using the location information from the PATH table. However, when a prefix is used in the extracted XML fragment, the extracted fragment must also account for that prefix. In addition, any default namespace declarations used in the ancestor elements of the extracted node must be taken into account.

For example, consider the exemplary XML document "po3.xml" in Table 3.

If the XPath query "extract(/po:purchaseOrder/po:lineItem/myns:SomeOtherTag)" were evaluated using only the process described above, the resulting fragment returned by the query would consist of lines 101-104 of Table 3. However, this XML fragment refers to the namespace prefix "po", which is not defined anywhere in the fragment extracted according to the locator information (i.e., lines 101-104). Instead, this prefix is declared and mapped to the namespace "po.xsd" in line 1 of Table 3.

The declaration xmlns:po="po.xsd" needs to be spliced into the fragment created in step 230 in order for the fragment to be properly interpreted, i.e., to be "free-standing".
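
The effect of the missing declaration can be demonstrated with Python's standard `xml.etree.ElementTree` parser, which is namespace-aware and rejects a prefix that is not bound. The fragment below is a hypothetical stand-in for the lines extracted from Table 3, and `is_free_standing` is an illustrative helper that only checks whether the text parses on its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical extracted fragment: it uses the "po" prefix without declaring it.
fragment = "<po:lineItem><po:qty>2</po:qty></po:lineItem>"


def is_free_standing(xml_text):
    """Return True when the text parses on its own with namespace processing."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Splice the ancestor's declaration into the fragment's start tag.
patched = fragment.replace("<po:lineItem", '<po:lineItem xmlns:po="po.xsd"', 1)

print(is_free_standing(fragment))
print(is_free_standing(patched))
```

The unpatched text fails with an unbound-prefix error, while the patched text parses, which is the property the patching step is meant to guarantee.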

In one embodiment, the namespace declarations can be maintained in the locator itself. However, this information would then be present at every level. In a preferred embodiment, the declaration information is built using the information stored in the PATH table. In this embodiment, a SQL query is used to identify all ancestor nodes of the node being extracted, and the namespace declarations are gathered from the ancestor nodes. Moreover, the techniques described herein resolve the namespace declarations correctly, i.e., in reverse order, with deeper declarations overriding shallower ones, so as to comply with the XML namespace scoping rules described previously.

As shown by step 240 in FIG. 2, the ancestors of the node are identified. When an XML index is used, this is a simple query, because ancestor information is stored using the OrderKey. In step 250, the information required for proper interpretation of the XML fragment is retrieved for each identified ancestor. If any declarations or other information retrieved from the ancestors are needed for proper interpretation of the fragment, this information is patched into the fragment in step 280. For example, the namespace declaration for any prefix that is used but not defined within the fragment is retrieved from the nearest ancestor node and patched into the fragment created in step 230.
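
Because a Dewey order key encodes the path from the root, the ancestors of a node can be listed directly from its key. The Python sketch below assumes dot-separated order keys as in the examples above; `ancestor_order_keys`, `ancestor_rows` and the index contents are hypothetical.

```python
def ancestor_order_keys(order_key):
    """Return the order keys of all ancestors, nearest ancestor first."""
    parts = order_key.split(".")
    return [".".join(parts[:n]) for n in range(len(parts) - 1, 0, -1)]


# (rid, order_key) -> rowid; illustrative entries only.
ORDERKEY_INDEX = {("R1", "1"): 1, ("R1", "1.2"): 3, ("R1", "1.2.1"): 4}


def ancestor_rows(rid, order_key):
    """Look up the PATH table rowids of the ancestors, nearest first."""
    return [ORDERKEY_INDEX[(rid, key)] for key in ancestor_order_keys(order_key)]


print(ancestor_order_keys("1.2.1"))
print(ancestor_rows("R1", "1.2.1"))
```

Returning the nearest ancestor first matches the order in which declarations have to be considered when scoping rules are applied.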

For example, the following SQL query could be used to examine all of the ancestor nodes in order to gather the namespace declarations and resolve them correctly (:B1 = RID of the document under consideration, :B2 = OrderKey of the node being extracted):

As shown, the outer subquery selects all namespace declarations in the given document. For each such declaration, the exists() subquery determines whether the declaration is present in an ancestor element.

To account for the scoping rules correctly, a declaration that is present in an ancestor element should be ignored if it is also present in a descendant, because the descendant overrides the parent's declaration. Likewise, a declaration present in a parent element overrides the same declaration in the grandparent element, and so on. In step 250, the list of declarations that need to be added to the fragment is created by considering each ancestor in the proper order and applying the scoping rules. To apply the scoping rules, the ancestor nodes are considered from nearest to farthest. When a declaration is found in an ancestor, it is ignored if it has already been accounted for, either as part of the fragment itself or in a previously considered ancestor node. Otherwise, the declaration is added to the string that is patched into the fragment.
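The nearest-to-farthest resolution described above can be sketched as follows. This is only an illustration under stated assumptions: declarations are modeled as prefix-to-URI dictionaries, and the function name `collect_declarations` and the sample URIs are hypothetical rather than taken from the described system.

```python
def collect_declarations(fragment_decls, ancestors_nearest_first):
    """Build the list of namespace declarations to patch into a fragment.

    fragment_decls: dict of prefix -> URI declared inside the fragment itself.
    ancestors_nearest_first: list of dicts (prefix -> URI), one per ancestor,
        ordered from the nearest ancestor to the farthest one.
    """
    seen = set(fragment_decls)  # prefixes the fragment already declares
    to_patch = []
    for decls in ancestors_nearest_first:
        for prefix, uri in decls.items():
            if prefix in seen:
                # A deeper declaration has already been accounted for,
                # so this shallower one is ignored.
                continue
            seen.add(prefix)
            to_patch.append((prefix, uri))
    return to_patch


if __name__ == "__main__":
    fragment = {"myns": "myns.xsd"}
    ancestors = [
        {"po": "po2.xsd"},                       # parent
        {"po": "po.xsd", "myns": "other.xsd"},   # grandparent
    ]
    print(collect_declarations(fragment, ancestors))
```

Because each prefix is recorded the first time it is encountered, walking the ancestors from nearest to farthest is sufficient to let deeper declarations win without any later reordering.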

For example, consider the following XPath query for the nodes in Table 3.
extract('/po:purchaseOrder/po:lineItem/myns:SomeOtherTag')
The fragment extracted from Table 3 in step 230 is as follows.

The prefix "po" is not defined in this fragment.
When the ancestors of this fragment are considered in step 250, the following list of definitions is created.

After the list of definitions has been spliced into the fragment in step 280, the resulting fragment is as follows.
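The splicing of step 280 can be sketched as a string operation on the fragment's first start tag. This is a simplified illustration, not the described implementation: it assumes the fragment text begins with an element start tag (no leading comment or processing instruction), and the function name `patch_declarations` and the sample fragment are made up for the example.

```python
import re
from xml.sax.saxutils import quoteattr


def patch_declarations(fragment: str, declarations) -> str:
    """Insert xmlns declarations into the first start tag of a fragment.

    declarations: iterable of (prefix, uri) pairs; an empty prefix stands
    for the default namespace.
    """
    # Match optional whitespace, '<', and the element name of the root tag.
    match = re.match(r"\s*<[^\s/>!?]+", fragment)
    if match is None:
        raise ValueError("fragment does not start with an element tag")
    attrs = "".join(
        f" {'xmlns:' + prefix if prefix else 'xmlns'}={quoteattr(uri)}"
        for prefix, uri in declarations
    )
    # Splice the declarations in right after the element name.
    return fragment[:match.end()] + attrs + fragment[match.end():]


if __name__ == "__main__":
    print(patch_declarations("<po:item>x</po:item>", [("po", "po.xsd")]))
```

Inserting directly after the element name leaves the rest of the stored text untouched, so the bytes read from the stored document can be passed through unchanged apart from the added attributes.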

The declaration xmlns:ps2="po2.xsd" is not needed to make this example fragment free-standing, but including it neither invalidates the fragment nor changes its meaning. In an alternative embodiment, before a declaration is patched into the fragment, the declaration is examined to determine whether it is required by the node being extracted.
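The alternative embodiment, in which only required declarations are patched, might be approximated as below. The sketch rests on several assumptions: prefixes are detected with a regular expression over the tags rather than with a real XML parser, so it may over-report prefixes (which is harmless, since extra declarations do not change the fragment's meaning); the helper names and the sample fragment are hypothetical.

```python
import re


def used_prefixes(fragment: str) -> set:
    """Approximate the set of namespace prefixes used in element and
    attribute names of a fragment (over-approximation is acceptable)."""
    prefixes = set()
    # Look only inside tags, skipping comments, DOCTYPE and PIs.
    for tag in re.findall(r"<[^!?][^>]*>", fragment):
        for prefix in re.findall(r"[<\s/]([A-Za-z_][\w.\-]*):[A-Za-z_]", tag):
            prefixes.add(prefix)
    # "xmlns" and "xml" are reserved and never need a declaration.
    return prefixes - {"xmlns", "xml"}


def filter_needed(fragment: str, declarations):
    """Keep only the (prefix, uri) pairs whose prefix the fragment uses."""
    needed = used_prefixes(fragment)
    return [(p, u) for p, u in declarations if p in needed]


if __name__ == "__main__":
    frag = ('<myns:SomeOtherTag xmlns:myns="myns.xsd">'
            '<po:qty>1</po:qty></myns:SomeOtherTag>')
    print(filter_needed(frag, [("po", "po.xsd"), ("ps2", "po2.xsd")]))
```

The trade-off is an extra pass over the fragment text in exchange for a smaller result; the unfiltered approach skips that pass and accepts a few redundant declarations.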

The free-standing fragment created in step 280, which contains all the information necessary for proper interpretation, is then returned in step 290.

Although the techniques described herein have been described in the context of namespace declarations and prefixes, the techniques can be used in other situations. For example, the presence of entity or macro references similarly complicates the free-standing nature of a fragment. As with namespaces, a fragment identified by CLOB offsets cannot simply be streamed out, because the DTD (Document Type Definition) declarations for any entity references need to be prepended first.

Hardware Overview
FIG. 1 is a block diagram that illustrates a computer system 100 upon which an embodiment of the invention may be implemented. Computer system 100 includes a bus 102 or other communication mechanism for communicating information, and a processor 104 coupled with bus 102 for processing information. Computer system 100 also includes a main memory 106, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 102 for storing information and instructions to be executed by processor 104. Main memory 106 may also be used to store temporary variables or other intermediate information during execution of instructions by processor 104. Computer system 100 further includes a read only memory (ROM) 108 or other static storage device coupled to bus 102 for storing static information and instructions for processor 104. A storage device 110, such as a magnetic disk or optical disk, is provided and coupled to bus 102 for storing information and instructions.

Computer system 100 may be coupled via bus 102 to a display 112, such as a cathode ray tube (CRT), for displaying information to a computer user. An input device 114, including alphanumeric and other keys, is coupled to bus 102 for communicating information and command selections to processor 104. Another type of user input device is a cursor control device 116, such as a mouse, a trackball, or cursor direction keys, for communicating direction information and command selections to processor 104 and for controlling cursor movement on display 112. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), which allows the device to specify positions in a plane.

The invention is related to the use of computer system 100 for implementing the techniques described herein. According to one embodiment of the invention, those techniques are performed by computer system 100 in response to processor 104 executing one or more sequences of one or more instructions contained in main memory 106. Such instructions may be read into main memory 106 from another machine-readable medium, such as storage device 110. Execution of the sequences of instructions contained in main memory 106 causes processor 104 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware circuitry and software.

The term "machine-readable medium" as used herein refers to any medium that participates in providing data that causes a machine to operate in a specific fashion. In an embodiment implemented using computer system 100, various machine-readable media are involved, for example, in providing instructions to processor 104 for execution. Such a medium may take many forms, including but not limited to non-volatile media, volatile media, and transmission media. Non-volatile media include, for example, optical or magnetic disks, such as storage device 110. Volatile media include dynamic memory, such as main memory 106. Transmission media include coaxial cables, copper wire and fiber optics, including the wires that comprise bus 102. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infrared data communications.

Common forms of machine-readable media include, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.

Various forms of machine-readable media may be involved in carrying one or more sequences of one or more instructions to processor 104 for execution. For example, the instructions may initially be carried on a magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 100 can receive the data on the telephone line and use an infrared transmitter to convert the data to an infrared signal. An infrared detector can receive the data carried in the infrared signal, and appropriate circuitry can place the data on bus 102. Bus 102 carries the data to main memory 106, from which processor 104 retrieves and executes the instructions. The instructions received by main memory 106 may optionally be stored on storage device 110 either before or after execution by processor 104.

Computer system 100 also includes a communication interface 118 coupled to bus 102. Communication interface 118 provides a two-way data communication coupling to a network link 120 that is connected to a local network 122. For example, communication interface 118 may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 118 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 118 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.

Network link 120 typically provides data communication through one or more networks to other data devices. For example, network link 120 may provide a connection through local network 122 to a host computer 124 or to data equipment operated by an Internet Service Provider (ISP) 126. ISP 126 in turn provides data communication services through the worldwide packet data communication network now commonly referred to as the "Internet" 128. Local network 122 and Internet 128 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks, and the signals on network link 120 and through communication interface 118, which carry the digital data to and from computer system 100, are exemplary forms of carrier waves transporting the information.

Computer system 100 can send messages and receive data, including program code, through the network(s), network link 120 and communication interface 118. In the Internet example, a server 130 might transmit requested code for an application program through Internet 128, ISP 126, local network 122 and communication interface 118.

The received code may be executed by processor 104 as it is received, and/or stored in storage device 110 or other non-volatile storage for later execution. In this manner, computer system 100 may obtain application code in the form of a carrier wave.

In the foregoing specification, embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what is the invention, and what is intended by the applicants to be the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense.

FIG. 1 is a block diagram of a system in which the techniques described herein may be implemented.
FIG. 2 is a flowchart illustrating steps for efficiently providing a free-standing XML fragment in response to a request.

Claims

A method for providing a free-standing XML fragment for a node in an XML document managed by a database management system, the method comprising:
A computer-implemented step of receiving a request for an XML fragment, wherein the request includes an XML path expression;
A computer-implemented step of identifying a node in the XML document managed by the database management system that matches the XML path expression;
A computer-implemented step of extracting a first XML fragment corresponding to the identified node;
A computer-implemented step of identifying ancestor nodes of the identified node;
A computer-implemented step of determining, for each identified ancestor node, whether the ancestor node contains information necessary for proper interpretation of the first XML fragment and, if the ancestor node contains the necessary information, inserting a second XML fragment into the first XML fragment; and
A computer-implemented step of providing the first XML fragment in response to the request.

The method of claim 1, wherein the database management system includes an index that indexes XML documents stored in the database management system, and wherein identifying a node in the XML document includes using the index to identify the node.

The method of claim 2, wherein the index comprises a path, value, and order index.

The method of claim 1, wherein extracting the first XML fragment comprises:
Determining the location of stored XML data corresponding to the identified node; and
Reading XML data from the determined location.

The method of claim 4, wherein determining the location of stored XML data corresponding to the identified node comprises reading location information from an index that indexes XML documents stored in the database management system.

The method of claim 2, wherein extracting the first XML fragment comprises constructing an XML fragment using information in the index.

The method of claim 3, wherein identifying the ancestor nodes includes using the order index.

The method of claim 1, wherein the information required for proper interpretation of the first XML fragment is a namespace declaration.

The method of claim 8, wherein determining whether an ancestor node contains information necessary for proper interpretation comprises determining whether the namespace declaration has been declared at a previously considered ancestor node.

The method of claim 8, wherein determining whether an ancestor node contains information necessary for proper interpretation comprises determining whether a namespace declaration has been declared in the first XML fragment.

The method of claim 1, wherein the step of determining whether an ancestor node contains information necessary for proper interpretation is performed for each ancestor node in order from the nearest ancestor node to the root ancestor node.

The method of claim 11, wherein the information required for proper interpretation of the first XML fragment is a namespace declaration, and wherein, if a namespace declaration in an ancestor node matches a namespace declaration in an ancestor node already considered, it is determined that the namespace declaration is not required for proper interpretation.

A computer readable medium carrying one or more sequences of instructions that, when executed by one or more processors, cause one or more processors to perform the method of claim 1.

A computer readable medium carrying one or more sequences of instructions that, when executed by one or more processors, cause one or more processors to perform the method of claim 2.

A computer readable medium carrying one or more sequences of instructions that, when executed by one or more processors, cause one or more processors to perform the method of claim 3.

A computer readable medium carrying one or more sequences of instructions that, when executed by one or more processors, cause one or more processors to perform the method of claim 4.

A computer readable medium carrying one or more sequences of instructions that, when executed by one or more processors, cause one or more processors to perform the method of claim 5.

A computer readable medium carrying one or more sequences of instructions that, when executed by one or more processors, cause one or more processors to perform the method of claim 6.

A computer readable medium carrying one or more sequences of instructions that, when executed by one or more processors, cause one or more processors to perform the method of claim 7.

A computer readable medium carrying one or more sequences of instructions that, when executed by one or more processors, cause one or more processors to perform the method of claim 8.

A computer readable medium carrying one or more sequences of instructions that, when executed by one or more processors, cause one or more processors to perform the method of claim 9.

A computer readable medium carrying one or more sequences of instructions that, when executed by one or more processors, cause one or more processors to perform the method of claim 10.

A computer readable medium carrying one or more sequences of instructions that, when executed by one or more processors, cause one or more processors to perform the method of claim 11.

A computer readable medium carrying one or more sequences of instructions that, when executed by one or more processors, cause one or more processors to perform the method of claim 12.

A method comprising a computer-implemented step of receiving a request for an XML fragment, wherein the request includes an XML path expression, the method further comprising:
A computer-implemented step of using an index to identify a node in a database management system that matches the XML path expression, wherein
the node resides in an XML document managed by the database management system, and
the XML document is stored in one or more base structures managed by the database management system;
A computer-implemented step of determining whether the node is for a simple element; and
If the node is for a simple element,
Constructing an XML fragment for the node based on information contained in the index, without accessing the one or more base structures, and
Providing the XML fragment in response to the request.

A computer readable medium carrying one or more sequences of instructions that, when executed by one or more processors, cause one or more processors to perform the method of claim 25.