JP2011258024A

JP2011258024A - Information processing unit and information processing method

Info

Publication number: JP2011258024A
Application number: JP2010132411A
Authority: JP
Inventors: Keisuke Tamiya; 圭介田宮
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2010-06-09
Filing date: 2010-06-09
Publication date: 2011-12-22

Abstract

PROBLEM TO BE SOLVED: To provide a technology for correcting a value of an element in a structured document to a proper value with a more compact structure when the value is determined to be improper. SOLUTION: A data type comparator 118 compares a data type of an element in a structured document 142 with a data type in a data type definition table 116 to determine whether or not the data types are identical to each other. When the data types are not identical to each other, a data correction value acquisition section 120 acquires a corresponding correction value from a data correction value table 119. A data correction section 121 updates the data type of the element in the structured document 142 with the correction value acquired by the data correction value acquisition section 120.

Description

The present invention relates to a technique for analyzing a structured document described in a language such as XML.

Conventionally, a syntax analysis program called an XML parser is known as a means for analyzing a structured document described in the XML language, whose specification is defined by the W3C. In addition, when a structured document is parsed using an XML parser, a schema verification technique is known that verifies the validity of the document structure, data types, and the like by comparison with a schema definition described in an XML schema description language.

For example, Patent Document 1 describes a technique for simplifying the validity verification process and improving the processing speed. Typical XML schema description languages include the W3C standard XML Schema and the ISO standard Relax NG.

JP 2003-84987 A

While an application program analyzes an XML document using an XML parser, it may perform validity verification by comparing the XML document with a schema definition. If the verification finds that the XML document differs from the schema definition, the XML parser notifies the application program of an error message.

When the error message is notified, if the structure of the XML document is sound but a character string in the element content or an attribute value is at fault, the application program often performs error recovery, for example by correcting the value to an appropriate one, and continues processing.

However, if the appropriate value differs depending on the execution environment, or may be changed later even in the same execution environment, processing for dynamically acquiring the correction value at run time must be added to the application program.

When there are a plurality of such application programs, this processing must be added to each application, which increases the total size of the application programs.

The present invention has been made in view of the above problem, and provides a technique for realizing, with a simpler configuration, an operation that corrects the value of a component in a structured document to an appropriate value when that value is at fault.

To achieve the object of the present invention, an information processing apparatus of the present invention has, for example, the following configuration. That is, the apparatus comprises: means for acquiring a structured document to be analyzed; determination means for determining whether the value of a component in the structured document satisfies a condition, using table information in which are registered condition information set in advance as the condition to be satisfied by the value of the component and a correction value set in advance as a value to be used in place of that value when it does not satisfy the condition; and update means for, when the determination means determines that the value of the component does not satisfy the condition, acquiring the correction value from the table information and updating the structured document by replacing the value of the component with the acquired correction value.

According to the present invention, an operation that corrects the value of a component in a structured document to an appropriate value when that value is at fault can be realized with a simpler configuration.
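The determination and update means described above can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not the claimed implementation: the `TABLE` layout and the `correct_document` function are hypothetical names, and the single entry borrows the camera name example (at most 30 characters, correction value "my camera") from the first embodiment.

```python
import xml.etree.ElementTree as ET

# Hypothetical table information: element name -> (condition the value must
# satisfy, correction value used when the condition is not satisfied).
TABLE = {"name": (lambda text: len(text) <= 30, "my camera")}


def correct_document(root):
    """Replace every value that fails its condition; return the corrected element names."""
    corrected = []
    for node in root.iter():
        entry = TABLE.get(node.tag)
        if entry is None:
            continue  # no condition registered for this element
        condition, correction = entry
        if not condition(node.text or ""):
            node.text = correction  # update the structured document in place
            corrected.append(node.tag)
    return corrected


doc = ET.fromstring("<camera_settings><name>" + "x" * 40 + "</name></camera_settings>")
correct_document(doc)
print(doc.find("name").text)  # my camera
```

Because the condition and the correction value live in one table, an application that consumes the document needs no recovery logic of its own in this sketch.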

FIG. 1 is a block diagram showing a configuration example of a structured document analysis apparatus 100.
FIG. 2 is a diagram showing a configuration example of schema definition information 141.
FIG. 3 is a diagram showing a configuration example of a structured document 142 described in the XML language.
FIG. 4 is a diagram showing a configuration example of a structure definition list 113.
FIG. 5 is a diagram showing a configuration example of a data type definition list 116.
FIG. 6 is a diagram showing a configuration example of a data correction value list 119.
FIG. 7 is a flowchart of processing performed by the structured document analysis apparatus 100.
FIG. 8 is a flowchart showing details of the processing in step S705.
FIG. 9 is a diagram showing a configuration example of the schema definition information 141.
FIG. 10 is a diagram showing a configuration example of the data correction value list 119.
FIG. 11 is a flowchart showing details of the processing in step S705.

Preferred embodiments of the present invention will be described below with reference to the accompanying drawings. The embodiment described below is one example of a specific implementation of the present invention, and is one specific example of the configurations described in the claims.

[First Embodiment]
First, a functional configuration example of the structured document analysis apparatus 100, which serves as the information processing apparatus according to the present embodiment, will be described with reference to the block diagram of FIG. 1. FIG. 1 shows only the main components used in the processing described below. The configuration is not limited to that shown in FIG. 1 as long as the processing described below can be realized.

A storage device 140 is connected to the structured document analysis apparatus 100. In the present embodiment, the storage device 140 is described as a hard disk drive, but it may be a storage medium such as a CD-ROM or a DVD-ROM. In that case, a drive device capable of reading information from the storage medium must be connected to the structured document analysis apparatus 100.

The storage device 140 stores, as data (files), the structured document 142 to be analyzed and the schema definition information 141 that defines the schema (the grammar of the description language) of the structured document 142.

The structured document analysis apparatus 100 includes a CPU 130 and a memory 110. The CPU 130 executes each process described below using the computer programs and data stored in the memory 110. The memory 110 stores the computer programs and data described below.

The schema definition analysis unit 111 is implemented as a computer program. By executing the schema definition analysis unit 111, the CPU 130 realizes a function of reading the schema definition information 141 from the storage device 140 and analyzing it. Through this analysis, the structure definition list 113, the data type definition list 116, and the data correction value list 119 are created as data and stored in the memory 110. Each of these lists will be described later.

The structured document analysis unit 112 is implemented as a computer program. By executing the structured document analysis unit 112, the CPU 130 realizes a function of reading the structured document 142 from the storage device 140 and analyzing it.

The structure definition acquisition unit 114 is implemented as a computer program. By executing the structure definition acquisition unit 114, the CPU 130 realizes a function of reading necessary information from the structure definition list 113.

The structure definition comparison unit 115 is implemented as a computer program. By executing the structure definition comparison unit 115, the CPU 130 realizes a function of comparing the information acquired from the structure definition list 113 by the structure definition acquisition unit 114 with the structure of the structured document 142 analyzed by the structured document analysis unit 112.

The data type definition acquisition unit 117 is implemented as a computer program. By executing the data type definition acquisition unit 117, the CPU 130 realizes a function of reading necessary information from the data type definition list 116.

The data type comparison unit 118 is implemented as a computer program. By executing the data type comparison unit 118, the CPU 130 realizes a function of comparing the information acquired from the data type definition list 116 by the data type definition acquisition unit 117 with the data type in the structured document 142 analyzed by the structured document analysis unit 112.

The data correction value acquisition unit 120 is implemented as a computer program. By executing the data correction value acquisition unit 120, the CPU 130 realizes a function of reading necessary information from the data correction value list 119.

The data correction unit 121 is implemented as a computer program. By executing the data correction unit 121, the CPU 130 realizes functions such as updating a value in the structured document 142 with the correction value acquired by the data correction value acquisition unit 120.

The data notification unit 122 is implemented as a computer program. By executing the data notification unit 122, the CPU 130 realizes a function of notifying the application program of the analysis result of the structured document 142.

In the following description, for convenience, the units described above as being stored in the memory 110 as computer programs are sometimes treated as the subject of an operation. In practice, however, the corresponding function is realized by the CPU 130 executing each unit, as described above.

FIG. 2 shows a configuration example of the schema definition information 141, described in Relax NG, an XML schema description language standardized by ISO. Note, however, that the namespace declaration part 200 used for generating the data correction value list 119 and the attribute definition parts 201, 202, 203, 204, and 205 used for specifying the correction values are extensions to the Relax NG specification.

According to the schema defined by the schema definition information 141 shown in FIG. 2, the structured document 142 to be analyzed (here, an XML document) has the camera_settings element as its root element, and this root element has the child elements name, pan, and tilt.

The data type of the name element is a character string of at most 30 characters. If the name element in the structured document 142 does not conform to this data type, the value used in place of the value of the name element (the correction value) is defined as "my camera".

The data type of the pan element is an integer greater than -180 and not greater than 180. If the pan element in the structured document 142 does not conform to this data type, the values used in place of the value of the pan element (the correction values) are defined as "-179" and "180".

Further, the data type of the tilt element is an integer not less than 0 and not greater than 90. If the tilt element in the structured document 142 does not conform to this data type, the values used in place of the value of the tilt element (the correction values) are defined as "0" and "90".

The following description uses the schema definition information 141 shown in FIG. 2, but the essence of the description remains the same even if the element names, values, or conditions differ.
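For readers without access to FIG. 2, the three child-element rules described above can be summarized in a small in-memory form. The Python sketch below is not the Relax NG syntax of FIG. 2; the `ElementRule` class and its field names are hypothetical, and assigning the lower correction value to values below the range and the upper one to values above it is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ElementRule:
    """Hypothetical summary of one child-element rule from the schema."""
    kind: str                             # "string" or "integer"
    max_length: Optional[int] = None      # strings: maximum number of characters
    min_exclusive: Optional[int] = None   # integers: value must be greater than this
    min_inclusive: Optional[int] = None   # integers: value must be at least this
    max_inclusive: Optional[int] = None   # integers: value must be at most this
    default_fix: Optional[str] = None     # correction value for a string
    low_fix: Optional[int] = None         # assumed: used when the value is too small
    high_fix: Optional[int] = None        # assumed: used when the value is too large


RULES = {
    "name": ElementRule("string", max_length=30, default_fix="my camera"),
    "pan": ElementRule("integer", min_exclusive=-180, max_inclusive=180,
                       low_fix=-179, high_fix=180),
    "tilt": ElementRule("integer", min_inclusive=0, max_inclusive=90,
                        low_fix=0, high_fix=90),
}
```

Note that the lower correction value for pan is -179 rather than -180, since -180 itself lies outside the permitted range.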

FIG. 3 shows configuration examples of the structured document 142 described in the XML language. The value "-30" of the pan element in the structured document 142 shown in FIG. 3(a) satisfies the condition that the schema definition information 141 of FIG. 2 defines for the pan element, "an integer greater than -180 and not greater than 180". Likewise, the value "60" of the tilt element in the structured document 142 shown in FIG. 3(a) satisfies the condition defined for the tilt element, "an integer not less than 0 and not greater than 90". That is, the structured document 142 shown in FIG. 3(a) conforms to the schema defined by the schema definition information 141 shown in FIG. 2.

The value "-200" of the pan element in the structured document 142 shown in FIG. 3(b) does not satisfy the condition that the schema definition information 141 of FIG. 2 defines for the pan element, "an integer greater than -180 and not greater than 180". The value "60" of the tilt element in the structured document 142 shown in FIG. 3(b) satisfies the condition defined for the tilt element, "an integer not less than 0 and not greater than 90". That is, the structured document 142 shown in FIG. 3(b) does not conform to the schema defined by the schema definition information 141 shown in FIG. 2.

The value "minus thirty" of the pan element in the structured document 142 shown in FIG. 3(c) is not an integer, and so does not satisfy the condition that the schema definition information 141 of FIG. 2 defines for the pan element, "an integer greater than -180 and not greater than 180". The value "60" of the tilt element in the structured document 142 shown in FIG. 3(c) satisfies the condition defined for the tilt element, "an integer not less than 0 and not greater than 90". That is, the structured document 142 shown in FIG. 3(c) does not conform to the schema defined by the schema definition information 141 shown in FIG. 2.
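The three cases of FIG. 3 can be checked mechanically. The following Python sketch is illustrative only; the function names and the `FIG3` dictionary are hypothetical, and the name element is omitted because its value in FIG. 3 is not stated in the text.

```python
def pan_ok(text: str) -> bool:
    """True if text is an integer greater than -180 and not greater than 180."""
    try:
        value = int(text)
    except ValueError:
        return False  # not an integer at all, as in FIG. 3(c)
    return -180 < value <= 180


def tilt_ok(text: str) -> bool:
    """True if text is an integer not less than 0 and not greater than 90."""
    try:
        value = int(text)
    except ValueError:
        return False
    return 0 <= value <= 90


# pan and tilt values of FIG. 3(a), (b), and (c) as given in the description.
FIG3 = {
    "a": {"pan": "-30", "tilt": "60"},
    "b": {"pan": "-200", "tilt": "60"},
    "c": {"pan": "minus thirty", "tilt": "60"},
}


def conforms(doc: dict) -> bool:
    return pan_ok(doc["pan"]) and tilt_ok(doc["tilt"])


for label, doc in FIG3.items():
    print(label, conforms(doc))
```

Only document (a) passes; (b) fails on range and (c) fails because the pan value cannot be read as an integer.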

Next, the processing performed by the structured document analysis apparatus 100 will be described with reference to FIG. 7, which shows a flowchart of the processing. The processing according to the flowchart of FIG. 7 is started when the CPU 130 detects an instruction to analyze the schema definition information 141. This analysis instruction is input via an application program loaded in the memory 110. The instruction may also be input using a user interface (not shown), such as a keyboard or a mouse, provided in the structured document analysis apparatus 100; in either case, the instruction is notified to the CPU 130 via the application program.

In step S701, the schema definition analysis unit 111 reads the schema definition information 141 from the storage device 140 into the memory 110. In step S702, the schema definition analysis unit 111 reads, from the schema definition information 141 thus read, the information defining the structure of the structured document, and writes the read information into the structure definition list 113, thereby completing the structure definition list 113.

FIG. 4 shows a configuration example of the structure definition list 113 created in step S702 from the schema definition information 141 of FIG. 2. The element name column 401 holds the element name camera_settings of the root element and the element names name, pan, and tilt of the child elements defined in the schema definition information 141. These are the elements that the structured document 142 should have. The structure information column 402 holds the rule that the root element must follow, "has the child elements name, pan, and tilt in this order", and the rule that each child element must follow, "has, as element content, a character string in the format defined by <data>". These are the rules that the structured document 142 should follow.

Returning to FIG. 7, in step S703 the schema definition analysis unit 111 reads, from the schema definition information 141 read in step S701, the information defining the data types of the child elements in the structured document. The schema definition analysis unit 111 then writes the read information into the data type definition list 116, thereby completing the data type definition list 116.

FIG. 5 shows a configuration example of the data type definition list 116 created in step S703 from the schema definition information 141 of FIG. 2. The element name column 501 holds the element names name, pan, and tilt of the child elements defined in the schema definition information 141. These are the elements that the structured document 142 should have. The data type information column 502 holds the data type of each child element. For example, the element name "name" of the name element is registered in column 501, and its data type, "a character string of at most 30 characters", is registered in column 502. In this way, each element name is registered in the data type definition list 116 in association with the data type of the child element having that name. These data types are the data types that the child elements in the structured document 142 should have.

In step S704, the schema definition analysis unit 111 reads, from the schema definition information 141 read in step S701, the correction values to be used when the data type of a child element in the structured document under analysis does not match the data type defined in the schema definition information 141. The schema definition analysis unit 111 then writes the correction values read for each element into the data correction value list 119, thereby completing the data correction value list 119.

FIG. 6 shows a configuration example of the data correction value list 119 created in step S704 from the schema definition information 141 of FIG. 2. The element name column 601 holds the element names name, pan, and tilt of the child elements defined in the schema definition information 141. These are the elements that the structured document 142 should have. The condition column 602 holds, for each child element, the conditions under which a value fails to satisfy the data type registered in column 502.

For example, since the data type of the name element is, from FIG. 5, "a character string of at most 30 characters", the condition under which this data type is not satisfied is "more than 30 characters". Accordingly, the condition "more than 30 characters" is written in column 602 at the position corresponding to the name element.

Similarly, since the data type of the pan element is, from FIG. 5, "an integer greater than -180 and not greater than 180", the conditions under which this data type is not satisfied are "-180 or less" and "greater than 180". Accordingly, the conditions "-180 or less" and "greater than 180" are written in column 602 at the position corresponding to the pan element.

The correction value column 603 holds the correction value to be used when the condition written in column 602 is met. For example, "my camera" is registered as the correction value in column 603 at the position corresponding to the name element. If the value of the name element in the structured document 142 is longer than 30 characters, the value of the name element is replaced with this correction value "my camera".
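A lookup against a table shaped like FIG. 6 might look as follows. This Python sketch makes several assumptions: the row layout and the `lookup_correction` name are hypothetical, the pairing of "-180 or less" with -179 and "greater than 180" with 180 is inferred from the listed values, and the tilt rows are derived from its data type because the text above spells out only the name and pan rows.

```python
from typing import Callable, List, Optional, Tuple

# Rows mirror columns 601 (element name), 602 (violation condition), 603 (correction value).
CORRECTIONS: List[Tuple[str, Callable[[str], bool], str]] = [
    ("name", lambda v: len(v) > 30, "my camera"),
    ("pan", lambda v: int(v) <= -180, "-179"),   # assumed pairing
    ("pan", lambda v: int(v) > 180, "180"),      # assumed pairing
    ("tilt", lambda v: int(v) < 0, "0"),         # inferred row
    ("tilt", lambda v: int(v) > 90, "90"),       # inferred row
]


def lookup_correction(element: str, value: str) -> Optional[str]:
    """Return the correction value of the first violated row, or None if no row applies."""
    for name, violated, correction in CORRECTIONS:
        if name != element:
            continue
        try:
            if violated(value):
                return correction
        except ValueError:
            # Non-numeric text: the passage above does not say which row applies.
            return None
    return None


print(lookup_correction("pan", "-200"))  # -179
print(lookup_correction("pan", "-30"))   # None
```

Returning `None` for a conforming value keeps the caller's decision simple: replace the value only when a correction is found.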

By performing the processing of steps S701 to S704 described above, each piece of table information (the structure definition list 113, the data type definition list 116, and the data correction value list 119) can be completed. The description example of the schema definition information 141 in FIG. 2 covers a structured document without attribute definitions, but when attribute definitions are included, processing similar to steps S701 to S704 can be performed by reading "element name" as "attribute name".
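Steps S701 to S704 can be sketched in Python with the standard `xml.etree.ElementTree` module. The schema text below is a stand-in written for this illustration, not a reproduction of FIG. 2: the extension namespace URI and the attribute names `fix:default`, `fix:low`, and `fix:high` are hypothetical, as is the `build_tables` function.

```python
import xml.etree.ElementTree as ET

RNG = "http://relaxng.org/ns/structure/1.0"
FIX = "http://example.com/ns/fix"  # hypothetical extension namespace

SCHEMA = f"""
<element name="camera_settings" xmlns="{RNG}" xmlns:fix="{FIX}"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <element name="name">
    <data type="string" fix:default="my camera">
      <param name="maxLength">30</param>
    </data>
  </element>
  <element name="pan">
    <data type="integer" fix:low="-179" fix:high="180">
      <param name="minExclusive">-180</param>
      <param name="maxInclusive">180</param>
    </data>
  </element>
  <element name="tilt">
    <data type="integer" fix:low="0" fix:high="90">
      <param name="minInclusive">0</param>
      <param name="maxInclusive">90</param>
    </data>
  </element>
</element>
"""


def build_tables(schema_text):
    root = ET.fromstring(schema_text.strip())        # S701: read the schema
    children = root.findall(f"{{{RNG}}}element")
    # S702: structure definition list (root element -> ordered child names)
    structure = {root.get("name"): [c.get("name") for c in children]}
    data_types, corrections = {}, {}
    prefix = f"{{{FIX}}}"
    for child in children:
        data = child.find(f"{{{RNG}}}data")
        params = {p.get("name"): p.text for p in data.findall(f"{{{RNG}}}param")}
        # S703: data type definition list
        data_types[child.get("name")] = (data.get("type"), params)
        # S704: correction value list, taken from the extension attributes
        corrections[child.get("name")] = {
            key[len(prefix):]: value
            for key, value in data.attrib.items() if key.startswith(prefix)
        }
    return structure, data_types, corrections


structure, data_types, corrections = build_tables(SCHEMA)
print(structure)
```

Keeping the correction values in a separate namespace means a validator that ignores foreign attributes could still read the same schema, which is one plausible reason for expressing them as extensions.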

Next, when the CPU 130 detects an instruction to analyze the structured document 142, input via the application program loaded in the memory 110, the processing in step S705 is started. In step S705, the structured document 142 is analyzed using the table information completed in steps S701 to S704, and the analysis result is notified to the application program that instructed the analysis of the structured document 142. The processing in step S705 will be described with reference to the flowchart of FIG. 8, which shows its details. The processing described below (steps S802 to S812 and steps S814 to S816) is performed for each node (a structural unit of the document, such as an element or an attribute) included in the structured document 142.

In step S802, the structured document analysis unit 112 reads the next node, that is, a node not yet read, from the structured document 142 that has already been read from the storage device 140 into the memory 110.

In step S803, the structure definition acquisition unit 114 refers to the structure definition list 113 and reads the rule (in FIG. 4, the structure information registered in column 402) that the node name (element name) of the node read in step S802 must follow. For example, if the node read in step S802 is the name element, the rule corresponding to the name element, "has, as element content, a character string in the format defined by <data>", is read from the structure definition list 113 in step S803.

In step S804, the structure definition comparison unit 115 compares the node structure of the node read in step S802 with the node structure indicated by the rule read in step S803, and determines whether they match. If they are determined to match, the process proceeds to step S806 via step S805; if they are determined not to match, the process proceeds to step S814 via step S805.

For example, when the structured document 142 has the structure shown in FIG. 3(a), the structure of the root element camera_settings in the structured document 142 matches the structure indicated by the structure information shown in FIG. 4 (it has the child elements name, pan, and tilt in this order). In this case, therefore, the process proceeds to step S806 via step S805.

In step S814, the data notification unit 122 notifies the application program that instructed the analysis of the structured document 142 of a structure mismatch error, indicating that the structure of the structured document 142 does not match the structure defined by the schema definition information 141. The processing then ends.
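The structure comparison and the error path can be sketched as below. This Python illustration simplifies the per-node flow of FIG. 8 to a single check of the root element's children; `StructureMismatchError`, `STRUCTURE`, and `check_structure` are hypothetical names, and raising an exception stands in for the notification performed by the data notification unit 122.

```python
import xml.etree.ElementTree as ET


class StructureMismatchError(Exception):
    """Stands in for the structure mismatch error notified in step S814."""


# Simplified structure definition list: root element -> child names in required order.
STRUCTURE = {"camera_settings": ["name", "pan", "tilt"]}


def check_structure(doc_text: str) -> None:
    root = ET.fromstring(doc_text)
    expected = STRUCTURE.get(root.tag)            # S803: rule for this node
    actual = [child.tag for child in root]
    if actual != expected:                        # S804/S805: compare and branch
        raise StructureMismatchError(             # S814: report and stop
            f"<{root.tag}> has children {actual}, expected {expected}"
        )
```

With a sample document such as `<camera_settings><name>camera A</name><pan>-30</pan><tilt>60</tilt></camera_settings>` (the name value is made up for this example), `check_structure` returns without raising; swapping the order of `name` and `pan` raises `StructureMismatchError`.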

On the other hand, in step S806, the data type definition acquisition unit 117 reads, from the data type definition list 116, the data type corresponding to the node name (element name) of the node read in step S802 (in FIG. 5, the data type information registered in column 502). For example, if the node read in step S802 is the name element, the data type "character string of up to 30 characters" is read from the data type definition list 116 in step S806.

In step S807, the data type comparison unit 118 compares the data type of the node read in step S802 with the data type read in step S806, and determines whether they match. If they match, the process proceeds to step S815 via step S808; if they do not match, the process proceeds to step S809 via step S808.

For example, assume that the structured document 142 has the structure shown in FIG. 3(a) and that the node read in step S802 is the pan element. In this case, the value "-30" of the pan element satisfies the condition defined for the pan element in the data type definition list 116, "an integer value greater than -180 and less than or equal to 180". The process therefore proceeds to step S815 via step S808.

As another example, assume that the structured document 142 has the structure shown in FIG. 3(b) and that the node read in step S802 is the pan element. In this case, the value "-200" of the pan element in the structured document 142 does not satisfy the condition defined for the pan element in the data type definition list 116, "an integer value greater than -180 and less than or equal to 180". The process therefore proceeds to step S809 via step S808.
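The data type check of steps S807 and S808 can be sketched as follows, using the two data types quoted above (a string of up to 30 characters for name, and an integer greater than -180 and at most 180 for pan). The names `DATA_TYPES`, `is_valid_name`, `is_valid_pan`, and `data_type_matches` are hypothetical, and representing each data type as a predicate function is an assumption made for illustration.

```python
def is_valid_name(text):
    """name: a character string of up to 30 characters."""
    return len(text) <= 30


def is_valid_pan(text):
    """pan: an integer value greater than -180 and at most 180."""
    try:
        value = int(text)
    except ValueError:
        return False  # not an integer at all
    return -180 < value <= 180


# Hypothetical stand-in for the data type definition list 116.
DATA_TYPES = {"name": is_valid_name, "pan": is_valid_pan}


def data_type_matches(element, text):
    """Return True if the node value satisfies its registered data type."""
    return DATA_TYPES[element](text)


print(data_type_matches("pan", "-30"))   # True: continue to S815
print(data_type_matches("pan", "-200"))  # False: continue to S809
```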

In step S815, the data notification unit 122 notifies the application program that instructed the analysis of the structured document 142 of the value of the node read in step S802. For example, when the structured document 142 has the structure shown in FIG. 3(a) and the node read in step S802 is the pan element, the value "-30" of the pan element is notified. Likewise, when the node read in step S802 is the tilt element, the value "60" of the tilt element is notified.

On the other hand, in step S809, the data correction value acquisition unit 120 reads, from the data correction value list 119, the correction value corresponding to the node name (element name) of the node read in step S802 (in FIG. 6, the correction value registered in column 603). In doing so, it reads the correction value that corresponds to the condition satisfied by the node read in step S802.

For example, assume that the structured document 142 has the structure shown in FIG. 3(b) and that the node read in step S802 is the pan element. In this case, the value "-200" of the pan element falls under "-180 or less", one of the two conditions defined for the pan element in the data correction value list 119, so the correction value "-179" corresponding to this condition is read in step S809.

If no correction value corresponding to the condition is registered in the data correction value list 119 (that is, if no correction value could be read in step S809), the process proceeds to step S816 via step S810. In step S816, the data notification unit 122 notifies the application program that instructed the analysis of the structured document 142 of a data type mismatch error, which indicates that the data type of the node does not match the data type defined by the schema definition information 141.

On the other hand, if a correction value corresponding to the condition is registered in the data correction value list 119 (that is, if a correction value could be read in step S809), the process proceeds to step S811 via step S810.

In step S811, the data correction unit 121 updates the value of the node read in step S802 with the correction value read by the data correction value acquisition unit 120. For example, assume that the structured document 142 has the structure shown in FIG. 3(b) and that the node read in step S802 is the pan element. In this case, the value "-200" of the pan element is updated to the correction value "-179", which updates the structured document 142. The data correction unit 121 then transfers the correction value to the data notification unit 122. Note that updating the value of the node read in step S802 is not essential and may be omitted.

In step S812, the data notification unit 122 notifies the application program that instructed the analysis of the structured document 142 of the correction value received from the data correction unit 121 in step S811.
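The branch taken after a data type mismatch (steps S809 to S812, and S816 when no correction value is found) can be sketched as a table lookup. This is a minimal illustration under assumptions: `CORRECTION_TABLE` and `resolve_mismatch` are hypothetical names, conditions are modeled as predicates over integers, and only the one pan condition quoted in the text ("-180 or less" mapping to "-179") is included, because the second pan condition is not stated in this section.

```python
# Hypothetical stand-in for the data correction value list 119 (FIG. 6):
# element name -> list of (condition, correction value) pairs.
CORRECTION_TABLE = {
    "pan": [
        (lambda v: v <= -180, "-179"),  # "-180 or less" -> "-179"
        # The second pan condition is not given in the text and is omitted.
    ],
}


def resolve_mismatch(element, text):
    """Decide what to notify for a node whose data type did not match."""
    try:
        value = int(text)
    except ValueError:
        # Assumption: a non-integer value satisfies no numeric condition.
        return ("error", "data type mismatch")       # S816
    for condition, correction in CORRECTION_TABLE.get(element, []):
        if condition(value):                          # S809: condition met
            return ("value", correction)              # S811, S812
    return ("error", "data type mismatch")            # S816


print(resolve_mismatch("pan", "-200"))
print(resolve_mismatch("pan", "abc"))
```

Returning a tagged tuple is just one way to model the two kinds of notification; the description itself does not prescribe how the data notification unit 122 encodes them.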

When the structured document 142 has the structure shown in FIG. 3(b), performing the processing according to the flowchart of FIG. 8 results in the following values being notified to the application program that instructed the analysis of the structured document 142.

- "camera0001" as the value of the name element (as written in the structured document 142)
- "-179" as the value of the pan element (correction value)
- "60" as the value of the tilt element (as written in the structured document 142)

When the structured document 142 has the structure shown in FIG. 3(c), performing the processing according to the flowchart of FIG. 8 results in the following values being notified to the application program that instructed the analysis of the structured document 142.

- "camera0001" as the value of the name element (as written in the structured document 142)
- a data type mismatch error as the value of the pan element
- "60" as the value of the tilt element (as written in the structured document 142)

In this way, the application program that instructed the analysis of the structured document 142 can receive, from the structured document analysis unit 112, the correction value to be used when the data types do not match.

In essence, the processing according to the flowchart of FIG. 8 is built on the following processing. The processing of FIG. 8 uses table information in which are registered a correction value, set in advance as a value to be used instead of the value of a component in the structured document to be analyzed, and condition information, set in advance as the condition that the value must satisfy in order for the correction value to be used instead of it.

That is, the table information is first used to determine whether the value of the component satisfies the condition. If it is determined that the value satisfies the condition, the correction value is acquired from the table information, and the structured document is updated by replacing the value of the component with the acquired correction value.
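The "replace the value and thereby update the document" part of this skeleton can be sketched with Python's standard `xml.etree.ElementTree` module. The table `TABLE`, the function `apply_corrections`, and the sample XML are hypothetical; the XML is only a reconstruction from the values quoted for FIG. 3(b) (camera0001, -200, 60), not a copy of the figure.

```python
import xml.etree.ElementTree as ET

# Hypothetical table information: element name -> (condition, correction).
TABLE = {"pan": [(lambda v: v <= -180, "-179")]}


def apply_corrections(root):
    """Replace, in place, every node value that satisfies a condition."""
    for node in root.iter():
        for condition, correction in TABLE.get(node.tag, []):
            try:
                satisfied = condition(int(node.text))
            except (TypeError, ValueError):
                satisfied = False  # empty or non-integer content
            if satisfied:
                node.text = correction  # update the structured document
                break


# Reconstructed from the values described for FIG. 3(b); an assumption.
doc = ET.fromstring(
    "<camera_settings><name>camera0001</name>"
    "<pan>-200</pan><tilt>60</tilt></camera_settings>"
)
apply_corrections(doc)
print(ET.tostring(doc, encoding="unicode"))
```

Updating the tree in place mirrors step S811; as the description notes, that update is optional, and a parser could equally notify the correction value without touching the document.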

[Second Embodiment]
FIG. 9 shows a configuration example of the schema definition information 141 according to this embodiment. In FIG. 9, the schema definition information 141 is written in Relax NG, an XML schema description language standardized by ISO. Note, however, that the namespace declaration part 200, which is used in generating the data correction value list 119, and the definition parts 901, 902, and 903, which specify the values an element can take and the correction value to use when the element's value matches one of those values, are extensions to the Relax NG specification.

In FIG. 9, the values that the name element can take are "camera0001", "camera0002", and "camera0003". If the value of the name element in the structured document to be analyzed is "camera0001", the correction value to be used is "1". Likewise, if the value is "camera0002", the correction value to be used is "2", and if the value is "camera0003", the correction value to be used is "3".

In this embodiment, the processing of steps S701, S702, S704, and S705 in the flowchart of FIG. 7 is performed. Accordingly, the structured document analysis apparatus 100 according to this embodiment does not need the data type definition list 116, the data type definition acquisition unit 117, or the data type comparison unit 118.

The data correction value list 119 generated in step S704 has the configuration shown in FIG. 10. The element name column 1001 holds "name", the element name of the name element collected from the schema definition information 141. The condition column 1002 holds the values that the name element can take, also collected from the schema definition information 141 (more precisely, each entry is the condition that the element takes that value). The correction value column 1003 holds, for each condition registered in column 1002, the correction value to be used when that condition is satisfied.

In this embodiment, processing according to the flowchart shown in FIG. 11 is performed in step S705. In FIG. 11, processing steps that are the same as those in FIG. 7 are given the same step numbers, and their description is omitted.

In step S1108, the data correction unit 121 determines whether the value of the node read in step S802 satisfies any of the conditions (data types) in column 1002 of the data correction value list 119. If it satisfies none of them, the process proceeds to step S816. If, on the other hand, column 1002 of the data correction value list 119 contains a condition that the value of the node read in step S802 satisfies, the process proceeds to step S1109.

In step S1109, the data correction value acquisition unit 120 acquires, from column 1003 of the data correction value list 119, the correction value corresponding to the condition satisfied by the value of the node read in step S802.

If there is no corresponding correction value, the process proceeds to step S815 via step S1110. If, on the other hand, there is a corresponding correction value (that is, if the acquisition in step S1109 succeeded), the process proceeds to step S1111 via step S1110. In step S1111, the data correction value acquisition unit 120 transfers the correction value acquired in step S1109 to the data notification unit 122.
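The three outcomes of FIG. 11 (error, original value, or correction value) can be sketched as follows. The table contents come from the FIG. 9 example in the text; everything else is an assumption for illustration, including the names `NAME_CORRECTIONS` and `notify_for_name` and the use of `None` to model a registered condition that has no correction value.

```python
# Hypothetical stand-in for the data correction value list 119 of FIG. 10:
# allowed value of the name element -> correction value.
# None would model a condition with no registered correction value.
NAME_CORRECTIONS = {
    "camera0001": "1",
    "camera0002": "2",
    "camera0003": "3",
}


def notify_for_name(value, table=NAME_CORRECTIONS):
    """Return what would be notified to the application program."""
    if value not in table:                       # S1108: no condition met
        return ("error", "data type mismatch")   # S816
    correction = table[value]                    # S1109
    if correction is None:                       # S1110: nothing registered
        return ("value", value)                  # S815: original value
    return ("value", correction)                 # S1111: correction value


print(notify_for_name("camera0002"))
print(notify_for_name("camera9999"))
```

Because the lookup key is the element's whole value, a plain dictionary is enough here, in contrast to the range conditions used for the pan element in the first embodiment.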

Thus, according to this embodiment, the following becomes possible when analyzing a structured document that contains a node (here, the name element) whose possible values are known in advance from the schema definition. Namely, instead of the result that the structured document analysis unit 112 evaluated during data type verification (the value of the name element), a substitute value can be notified to the application program. As a result, particularly when the node value is large data such as Base64-typed data, the application program no longer has to receive from the structured document analysis unit 112 an evaluation result that it does not need.

In essence, the processing according to the flowchart of FIG. 11 is built on the following processing. The processing of FIG. 11 uses table information in which are registered a set value, set in advance as a value that a component in the structured document to be analyzed can take, and a correction value, set in advance as a value to be used instead of the component's value when that value matches the set value.

That is, the table information is first used to determine whether the value of the component matches the set value. If it is determined that the value matches the set value, the set value is acquired from the table information, and the structured document is updated by replacing the value of the component with the acquired set value.

[Third Embodiment]
In the first embodiment, the structured document 142 was in XML format. This embodiment describes the case where the structured document 142 is in EXI format, one of the binary XML formats standardized by the W3C.

Even when the structured document 142 to be analyzed is in EXI format, the configuration of the structured document analysis apparatus 100 is the same as that shown in FIG. 1. However, this embodiment differs from the first embodiment in that, when the data notification unit 122 notifies the application program of an analysis result or a correction value, it does so by means of events such as AT (attribute) and CH (element content).

In EXI, the grammatical information that the schema definition analysis unit 111 generates from the schema definition information 141 is called a grammar. The grammar is the grammatical information that the structured document analysis unit 112 needs in order to analyze the structured document 142. The information in the structure definition list 113, the data type definition list 116, and the data correction value list 119 is included in the grammar. In all other respects, this embodiment is the same as the first embodiment.

According to the embodiments described above, an application program that has read an XML document whose data values do not conform to the schema can obtain correct values and continue processing, without error recovery processing having to be added to the application program.

Therefore, even when there are multiple application programs that analyze XML documents requiring validity verification, the size of the application programs can be reduced. In addition, multiple correction values can be specified according to the conditions at analysis time.

Furthermore, even when the correction values differ depending on the execution environment of the application program, preparing a schema definition for each execution environment removes the need to modify the application program itself.

(Other Embodiments)
The present invention can also be realized by executing the following processing: software (a program) that realizes the functions of the embodiments described above is supplied to a system or apparatus via a network or any of various storage media, and a computer (or CPU, MPU, or the like) of the system or apparatus reads out and executes the program.

Claims

An information processing apparatus comprising:
means for acquiring a structured document to be analyzed;
determination means for determining whether the value of a component in the structured document satisfies a condition, using table information in which are registered a correction value set in advance as a value to be used instead of the value of the component, and condition information set in advance as the condition that the value must satisfy in order for the correction value to be used instead of the value; and
update means for updating the structured document by, when the determination means determines that the value of the component satisfies the condition, acquiring the correction value from the table information and replacing the value of the component with the acquired correction value.

The information processing apparatus according to claim 1, further comprising means for issuing, when the determination means determines that the value of the component satisfies the condition, a notification to the application program that issued the instruction to analyze the structured document.

An information processing apparatus comprising:
means for acquiring a structured document to be analyzed;
determination means for determining whether the value of a component in the structured document matches a set value, using table information in which are registered the set value, set in advance as a value that the component can take, and a correction value, set in advance as a value to be used instead of the value of the component when that value matches the set value; and
update means for updating the structured document by, when the determination means determines that the value of the component matches the set value, acquiring the set value from the table information and replacing the value of the component with the acquired set value.

4. The information processing apparatus according to claim 1, wherein the table information is information created from schema definition information that defines a schema of the structured document.

An information processing method performed by an information processing apparatus, comprising:
an acquisition step in which acquisition means of the information processing apparatus acquires a structured document to be analyzed;
a determination step in which determination means of the information processing apparatus determines whether the value of a component in the structured document satisfies a condition, using table information in which are registered a correction value set in advance as a value to be used instead of the value of the component, and condition information set in advance as the condition that the value must satisfy in order for the correction value to be used instead of the value; and
an update step in which update means of the information processing apparatus, when it is determined in the determination step that the value of the component satisfies the condition, updates the structured document by acquiring the correction value from the table information and replacing the value of the component with the acquired correction value.

An information processing method performed by an information processing apparatus, comprising:
an acquisition step in which acquisition means of the information processing apparatus acquires a structured document to be analyzed;
a determination step in which determination means of the information processing apparatus determines whether the value of a component in the structured document matches a set value, using table information in which are registered the set value, set in advance as a value that the component can take, and a correction value, set in advance as a value to be used instead of the value of the component when that value matches the set value; and
an update step in which update means of the information processing apparatus, when it is determined in the determination step that the value of the component matches the set value, updates the structured document by acquiring the set value from the table information and replacing the value of the component with the acquired set value.

A computer program for causing a computer to function as each means of the information processing apparatus according to any one of claims 1 to 4.

A computer-readable storage medium storing the computer program according to claim 7.