JP2009059215A

JP2009059215A - Structured document processor, and structured document processing method

Info

Publication number: JP2009059215A
Application number: JP2007226694A
Authority: JP
Inventors: Wataru Shimizu; 渉清水
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2007-08-31
Filing date: 2007-08-31
Publication date: 2009-03-19
Also published as: US20090063954A1

Abstract

PROBLEM TO BE SOLVED: To provide a technique that enables a single application to handle XML documents in a plurality of formats, and further to provide a technique for efficiently handling binary XML documents written with encodings that depend on the data type.

SOLUTION: An XML document is analyzed by either a text XML parser 105 or a binary XML parser 106, depending on the format of the XML document. A type-compatible application 111 receives a request to acquire an element described in the XML document in a specified type. When the analyzed type matches the specified type, the type-compatible application 111 outputs the element to the request source; when the analyzed type does not match the specified type, it converts the element to the specified type and then outputs it to the request source.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a technique for processing a structured document.

Currently, XML (Extensible Markup Language: http://www.w3.org/TR/2004/REC-xml-20040204/) is used as a format for various kinds of data handled on computers. XML does not depend on any particular computer or operating system. Because this makes communication between different types of computers and devices on a network easy, XML is especially widespread as a format for data exchanged over networks.

Recently, a growing range of devices other than personal computers and servers, such as mobile phones, copiers, and digital cameras, have become networked. As a result, these devices increasingly handle XML as well.

Under these circumstances, the processing speed and efficiency of XML have become a major problem. Because the XML format was not designed to prioritize processing speed, parsing takes time. In addition, the redundancy of the notation makes the data large. These problems are serious for small devices with slow processors and limited memory. Even on resource-rich equipment such as servers, XML parsing time becomes a major problem when a very large number of XML documents must be processed.

For this reason, formats that are semantically equivalent to XML but can be processed more efficiently have come into use. Such formats are generally called “binary XML”. In contrast, text-format XML that conforms to the XML specification is referred to as “text XML” in this specification.

There is not one binary XML format specification but several. Many of these formats reduce size and improve processing efficiency by eliminating redundancy and by encoding values according to their data type.

Eliminating redundancy means omitting end tag names and replacing frequently occurring character strings, such as element names, attribute names, and attribute values, with integers. Because an end tag must always have the same name as its corresponding start tag, the name of the end tag can be omitted. For example, in an XHTML document containing many images, the character string “img” appears frequently. Replacing such frequent strings with integers that are as small as possible reduces the document size.

Encoding according to the data type means changing the encoding method used for element content, attribute values, and the like according to their type (integer, floating point, date, and so on). For example, in text XML, even if “12345” in the element <x>12345</x> represents the integer 12345, it is written in the document as the character string “12345”. Therefore, when the character encoding of the document is UTF-8, the data is the byte sequence 0x31, 0x32, 0x33, 0x34, 0x35.

Because the format written in the XML document thus differs from the format handled inside the computer, a format conversion is required when an XML document is read and processed inside the computer. For example, if a structured document processing apparatus handles integers internally as 4-byte big-endian values, the integer 12345 becomes the byte sequence 0x00, 0x00, 0x30, 0x39. This kind of type conversion is time-consuming, especially for floating-point numbers.

In binary XML, by contrast, integer and floating-point values are written in the same format that the computer uses internally. No format conversion is therefore needed, and processing can be faster.
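By way of illustration only, the two representations of the integer 12345 can be compared with a short sketch. Python and its standard `struct` module are used here as an assumption for demonstration; the specification does not prescribe any language or library.

```python
import struct

text_form = "12345".encode("utf-8")      # how text XML stores the value
binary_form = struct.pack(">i", 12345)   # 4-byte big-endian signed integer

print([hex(b) for b in text_form])    # ['0x31', '0x32', '0x33', '0x34', '0x35']
print([hex(b) for b in binary_form])  # ['0x0', '0x0', '0x30', '0x39']

# Reading the text form needs a string-to-integer conversion,
# whereas the binary form is unpacked directly.
assert int(text_form.decode("utf-8")) == struct.unpack(">i", binary_form)[0]
```

The text form grows with the number of digits, while the binary form has a fixed width and maps directly onto the in-memory representation.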

Fast Infoset (ITU-T Rec. X.891 | ISO/IEC 24824-1) is an example of a format that removes redundancy by indexing and encodes values according to their data type.

Binary XML is normally handled with a parser dedicated to binary XML data (hereinafter, a binary XML parser). The interface of a binary XML parser is often the same as that of a text XML parser, because with a common interface an application that uses a text XML parser can be made to work with a binary XML parser without modification. One binary XML parser that has the same interface as a text XML parser is the parser of the Fast Infoset Project of Sun Microsystems Inc.

Patent Document 1 describes converting data so that it can be processed both by a system that uses XML data and by a system that uses legacy file data.
Fast Infoset Project (https://fi.dev.java.net/)
Patent Document 1: JP 2004-318420 A

However, a binary XML parser that has the same interface as a text XML parser cannot exploit the advantages of a binary XML format that encodes values according to their data type.

This is because the interface of a text XML parser passes all data as character strings, so a binary XML parser with the same interface can handle only string data. Consequently, even when a binary XML document contains floating-point data in IEEE 754 format, the binary XML parser converts it to a character string before passing it on, and the application then converts it back to IEEE 754 format. This round trip is wasted work.
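The wasted round trip can be sketched as follows. This is an illustrative assumption rather than code from the specification: the value is taken to be an 8-byte big-endian IEEE 754 double, and Python's standard `struct` module stands in for the parser's decoding step.

```python
import struct

# An IEEE 754 double as it might appear in a binary document (layout assumed).
raw = struct.pack(">d", 160.5)

# String-only interface: the parser formats the number as text,
# and the application parses that text back into a number.
as_text = repr(struct.unpack(">d", raw)[0])
via_string = float(as_text)

# Typed interface: the value is unpacked once and handed over as-is.
direct = struct.unpack(">d", raw)[0]

assert via_string == direct == 160.5
```

Both paths yield the same value, but the string path performs one formatting step and one parsing step that the typed path avoids.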

Conversely, if the interface of the binary XML parser is made different from that of the text XML parser, an application written for the binary XML parser cannot use a text XML parser. In other words, the application can no longer handle text XML documents.

The present invention has been made in view of the above problems, and an object of the present invention is to provide a technique that enables a single application to handle XML documents in a plurality of formats.

A further object is to provide a technique for efficiently handling binary XML documents written with encodings that depend on the data type.

In order to achieve the object of the present invention, for example, a structured document processing apparatus of the present invention comprises the following arrangement.

That is, a structured document processing apparatus for processing a structured document comprises:
acquisition means for acquiring the format of the structured document;
analysis means for analyzing the structured document by an analysis method according to the format acquired by the acquisition means;
means for receiving a request to acquire an element described in the structured document in a specified type;
determination means for determining whether or not the type analyzed by the analysis means for the element matches the specified type; and
output means for outputting the element to the request source when the determination means determines that the types match, and for converting the type of the element to the specified type and then outputting the element to the request source when the determination means determines that the types do not match.

In order to achieve the object of the present invention, for example, a structured document processing method of the present invention comprises the following arrangement.

That is, a structured document processing method performed by a structured document processing apparatus that processes a structured document comprises:
an acquisition step of acquiring the format of the structured document;
an analysis step of analyzing the structured document by an analysis method according to the format acquired in the acquisition step;
a step of receiving a request to acquire an element described in the structured document in a specified type;
a determination step of determining whether or not the type analyzed in the analysis step for the element matches the specified type; and
an output step of outputting the element to the request source when it is determined in the determination step that the types match, and of converting the type of the element to the specified type and then outputting the element to the request source when it is determined in the determination step that the types do not match.

According to the configuration of the present invention, a single application can handle XML documents in a plurality of formats.

Furthermore, binary XML documents written with encodings that depend on the data type can be handled efficiently.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.

[First Embodiment]
FIG. 1 is a block diagram illustrating an example of the hardware configuration of a computer applicable to the structured document processing apparatus according to the present embodiment. Note that the configuration of an apparatus applicable to the structured document processing apparatus according to the present embodiment is not limited to that shown in FIG. 1, and those skilled in the art can conceive of various modifications. Furthermore, the structured document processing apparatus according to the present embodiment need not be realized by a single apparatus; it may be realized by the cooperative operation of a plurality of apparatuses. In that case, the apparatuses are connected to one another via a network such as a LAN.

In FIG. 1, a CPU 101 controls the entire computer 100 using programs and data stored in a ROM 102 and a RAM 103, and executes each of the processes described below as being performed by the computer 100.

The ROM 102 stores setting data for the computer 100, a boot program, parameter data that does not need to be changed, and the like.

The RAM 103 has an area for temporarily storing programs and data loaded from the storage device 104, data received from outside via the network interface 150, and the like. The RAM 103 also has a work area that the CPU 101 uses when executing various processes.

The storage device 104 is a large-capacity information storage device, typified by a hard disk drive. The storage device 104 stores an OS (operating system) as well as the programs and data that cause the CPU 101 to execute each of the processes described below as being performed by the computer 100. The storage device 104 also stores, as files, the data of the XML documents that serve as the structured documents to be processed as described below. The programs and data stored in the storage device 104 are loaded into the RAM 103 as appropriate under the control of the CPU 101 and are then processed by the CPU 101.

The software stored in the storage device 104 is described below.

Upon receiving a request to analyze an XML document in text format (hereinafter, a text XML document), the text XML parser 105 analyzes the XML document and returns the result.

Upon receiving a request to analyze an XML document in binary format (hereinafter, a binary XML document), the binary XML parser 106 analyzes the XML document and returns the result.

Given three inputs, namely the type of the data before conversion, the type of the data after conversion, and the data to be converted, the data type conversion unit 107 converts the data to the post-conversion type and returns it.
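A minimal sketch of a converter with this three-argument shape is given below for illustration. The function name `convert`, the type labels "string", "int", and "double", and the lookup-table design are all assumptions; the specification states only the three inputs and the returned result.

```python
from typing import Callable, Dict, Tuple, Union

Value = Union[str, int, float]

# (source type, target type) -> conversion function; type names are hypothetical labels.
_CONVERTERS: Dict[Tuple[str, str], Callable[[Value], Value]] = {
    ("string", "int"): int,
    ("string", "double"): float,
    ("int", "string"): str,
    ("int", "double"): float,
    ("double", "string"): repr,
    ("double", "int"): int,  # truncates toward zero; rounding is not specified in the text
}


def convert(source_type: str, target_type: str, value: Value) -> Value:
    """Return value converted from source_type to target_type."""
    if source_type == target_type:
        return value  # nothing to do when the types already match
    try:
        converter = _CONVERTERS[(source_type, target_type)]
    except KeyError:
        raise ValueError(
            f"unsupported conversion: {source_type} -> {target_type}"
        ) from None
    return converter(value)
```

For example, `convert("string", "double", "160.5")` yields the floating-point value 160.5, while a call whose source and target types are equal returns its input untouched.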

The format discrimination unit 108 determines the format of given data.

The common XML parser 109 analyzes both text XML documents and binary XML documents by using the text XML parser 105 or the binary XML parser 106 as appropriate.

The legacy application 110 performs processing using the API (Application Programming Interface) of the text XML parser 105.

The type-compatible application 111 performs processing using the API of the common XML parser 109. The legacy application 110 and the type-compatible application 111 both function as services that process XML documents received over the network.

The processing realized by each piece of software described as being stored in the storage device 104 will be described later.

The network interface 150 connects the computer 100 to a LAN, the Internet, or the like, and the computer 100 can exchange data with external devices via the network interface 150.

Reference numeral 112 denotes a bus that connects the units described above.

FIG. 2 is a diagram illustrating a configuration example of a network to which the computer 100 is applied.

As shown in FIG. 2, the computer 100 is connected to a network 201 as a server. The network 201 is a LAN, the Internet, or the like. Reference numerals 202 and 203 denote client terminals, each connected to the network 201.

Here, the client terminal 202 generates a binary XML document and transmits it to the computer 100. The client terminal 203, on the other hand, generates a text XML document and transmits it to the computer 100.

Next, the API of the text XML parser 105 will be described with reference to FIG. 3. FIG. 3 is a diagram illustrating an example of the API of the text XML parser 105.

“SetDocument” is a function for opening the XML document to be analyzed.

“Read” is a function that advances the read position by one node, starting from the beginning of the XML document to be analyzed. Here, a node is a unit that makes up an XML document; node types include the start tag (StartElement), the end tag (EndElement), and element content (Content).

“GetNodeType” is a function that returns the type of the currently referenced node (the node type), with values such as “StartElement” and “EndElement”.

“GetName” is a function that returns the name of the currently referenced node. That is, if the currently referenced node is a start tag, it returns the tag name of that start tag.

“GetValue” is a function that returns the value of the currently referenced node. That is, if the currently referenced node is “Content”, it returns the content of the element. Because a text XML document is written entirely in text, the return value of “GetValue” is also of type string.

“Close” is a function that ends the analysis and releases resources such as allocated memory.

Next, the API of the binary XML parser 106 will be described with reference to FIG. 4. FIG. 4 is a diagram illustrating an example of the API of the binary XML parser 106.

The functions “SetDocument”, “Read”, “GetNodeType”, “GetName”, and “Close” are the same as those shown in FIG. 3 and are as described above. That is, these functions play the same roles as the functions of the same names in the API of the text XML parser 105. The functions for obtaining the value of a node, however, differ greatly between the text XML parser 105 and the binary XML parser 106.

“GetValueType” is a function that returns the type of the value of the currently referenced node. For example, it returns “int” if the value of the currently referenced node is written in the binary XML document as an integer, and “double” if it is written as a floating-point number.

“GetStringValue” is a function for obtaining the value of the currently referenced node when that node is of string type.

“GetIntValue” is a function for obtaining the value of the currently referenced node when that node is of integer type.

“GetDoubleValue” is a function for obtaining the value of the currently referenced node when that node is of floating-point type.

In other words, the API of the binary XML parser 106 returns the value of the currently referenced node in the type in which it is written in the XML document.

Next, the API of the common XML parser 109 will be described with reference to FIG. 5. FIG. 5 is a diagram illustrating an example of the API of the common XML parser 109.

The functions “SetDocument”, “Read”, “GetNodeType”, “GetName”, and “Close” are the same as those shown in FIG. 3 and are as described above. That is, these functions play the same roles as the functions of the same names in the APIs of the text XML parser 105 and the binary XML parser 106.

“GetValueAsString” is a function that obtains the value of the currently referenced node as a character string.

“GetValueAsInt” is a function that obtains the value of the currently referenced node as an integer.

“GetValueAsDouble” is a function that obtains the value of the currently referenced node as a floating-point number.

Next, the operation of the computer 100 when processing an XML document having the structure illustrated in FIG. 6 will be described. FIG. 6 is a diagram illustrating a structural example of an XML document to be processed by the computer 100. The XML document shown in FIG. 6 is personal information data that stores a person's name (name) and height (height).

Text enclosed between “<” and “>” represents a start tag. In FIG. 6, reference numerals 602, 603, and 606 denote start tags.

“</>” represents an end tag. In FIG. 6, reference numerals 605, 608, and 609 denote end tags.

The content portions 604 and 607 of the elements each begin with a symbol such as S or F, followed by the actual value. The leading “S” in the content portion 604 indicates that the value that follows is a character string written in UTF-8. “F” indicates that the value that follows is written as a 4-byte floating-point number in IEEE 754 format.

The IEEE 754 format is the same format as the floating-point type handled by the application. The leading portion of the XML document, indicated by 601, is called a magic number; the format of the XML document can be recognized by examining the first few bytes of the document. In the present embodiment, “0x01, 0x02, 0x03” is used as the magic number 601 to indicate that the XML document is a binary XML document.
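The type markers and magic number described for FIG. 6 can be sketched as follows. This is only an illustration under stated assumptions: the function names `encode_content` and `decode_content` are hypothetical, big-endian byte order is assumed for the 4-byte float, and each content value is handled in isolation because the text does not say how tags or string lengths are laid out in the byte stream.

```python
import struct

# Leading bytes that mark a binary XML document in this embodiment.
MAGIC = bytes([0x01, 0x02, 0x03])


def encode_content(value):
    """Encode one element content value with a leading type marker."""
    if isinstance(value, str):
        return b"S" + value.encode("utf-8")
    if isinstance(value, float):
        # 4-byte IEEE 754 value; big-endian order is an assumption.
        return b"F" + struct.pack(">f", value)
    raise TypeError(f"unsupported content type: {type(value).__name__}")


def decode_content(data):
    """Return (type_name, value) for bytes produced by encode_content."""
    marker, payload = data[:1], data[1:]
    if marker == b"S":
        return "string", payload.decode("utf-8")
    if marker == b"F":
        # Reported as "double", the label used for floating-point values in the API text.
        return "double", struct.unpack(">f", payload)[0]
    raise ValueError(f"unknown type marker: {marker!r}")
```

Because 160.5 is exactly representable as a 4-byte float, `decode_content(encode_content(160.5))` returns the value unchanged; values that are not exactly representable would come back rounded to single precision.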

Next, the processing that the computer 100 performs after the data of the XML document shown in FIG. 6 has been loaded from the storage device 104 into the RAM 103 will be described with reference to FIG. 8. FIG. 8 is a flowchart of the processing performed when the CPU 101 executes the program of the type-compatible application 111. In this processing, the name and the height are obtained from the XML document shown in FIG. 6, that is, from the personal information data, as a character string and a floating-point value, respectively.

First, in step S802, the function “SetDocument” is executed to open the XML document shown in FIG. 6. When the processing in step S802 is performed, the processing according to the flowchart shown in FIG. 9 is started. The flowchart of FIG. 9 will be described later.

Next, in step S803, to confirm that the first start tag is “person”, the function “Read” is executed, and the functions “GetNodeType” and “GetName” are executed for the current reference position reached by that call. The function “Read” is executed repeatedly until “GetNodeType” returns a start tag and “GetName” returns “person”.

Next, in step S804, the function “Read” is executed, and the functions “GetNodeType” and “GetName” are executed for the current reference position reached by that call. The function “Read” is executed repeatedly until “GetNodeType” returns a start tag and “GetName” returns “name”.

Next, in step S805, the function “GetValueAsString” is executed to obtain the content of the name element, that is, “Alice”, as a character string. Details of the processing in step S805 will be described later with reference to FIG. 10.

Next, in step S806, the function “Read” is executed, and the functions “GetNodeType” and “GetName” are executed for the current reference position reached by that call. The function “Read” is executed repeatedly until “GetNodeType” returns a start tag and “GetName” returns “height”.

Next, in step S807, the function “GetValueAsDouble” is executed to obtain the content of the height element, that is, “160.5”, as a floating-point value. Details of the processing in step S807 will be described later with reference to FIG. 10.

Next, in step S808, the function “Close” is executed to release resources such as the memory allocated in the RAM 103.
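The flow of steps S803 to S808 can be sketched as follows. This is a sketch under stated assumptions, not the implementation of the embodiment: `InMemoryParser` is a hypothetical stand-in for the common XML parser 109 that is fed already-parsed nodes, the method names are Python-style renderings of the API functions, and the extra `read()` call that moves from a start tag to its Content node is an assumption that the text does not spell out.

```python
class InMemoryParser:
    """Hypothetical stand-in for the common XML parser, fed with pre-parsed nodes."""

    def __init__(self, nodes):
        self._nodes = nodes  # list of (node_type, name, value) tuples
        self._pos = -1

    def read(self):
        """Advance one node; return False when the document is exhausted."""
        self._pos += 1
        return self._pos < len(self._nodes)

    def get_node_type(self):
        return self._nodes[self._pos][0]

    def get_name(self):
        return self._nodes[self._pos][1]

    def get_value_as_string(self):
        return str(self._nodes[self._pos][2])

    def get_value_as_double(self):
        return float(self._nodes[self._pos][2])

    def close(self):
        self._nodes = []


def skip_to_start_tag(parser, name):
    """Call read() until the current node is the start tag with the given name."""
    while parser.read():
        if parser.get_node_type() == "StartElement" and parser.get_name() == name:
            return
    raise ValueError(f"start tag <{name}> not found")


def read_person(parser):
    """Extract the name as a string and the height as a floating-point value."""
    skip_to_start_tag(parser, "person")    # S803
    skip_to_start_tag(parser, "name")      # S804
    parser.read()                          # assumed: move onto the Content node
    name = parser.get_value_as_string()    # S805
    skip_to_start_tag(parser, "height")    # S806
    parser.read()                          # assumed: move onto the Content node
    height = parser.get_value_as_double()  # S807
    parser.close()                         # S808
    return name, height


# Node sequence corresponding to the document structure of FIG. 6.
FIG6_NODES = [
    ("StartElement", "person", None),
    ("StartElement", "name", None),
    ("Content", None, "Alice"),
    ("EndElement", "name", None),
    ("StartElement", "height", None),
    ("Content", None, 160.5),
    ("EndElement", "height", None),
    ("EndElement", "person", None),
]
```

Calling `read_person(InMemoryParser(FIG6_NODES))` walks the node list in document order and returns the pair of name and height. The application code is the same whether the nodes originated from a text or a binary document, which is the point of the common API.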

Next, the processing that starts when step S802 is executed will be described with reference to FIG. 9, which shows a flowchart of that processing. The processing according to the flowchart of FIG. 9 is performed when the CPU 101 executes the program of the common XML parser 109.

In step S902, the format discrimination unit 108 is executed and made to acquire the magic number (601 in FIG. 6) of the XML document opened in step S802. The common XML parser 109 then obtains the magic number acquired by the format discrimination unit 108 and uses it to determine the format of the XML document, that is, whether the XML document is a text XML document or a binary XML document.

In this determination, if the magic number begins with the character string “<?”, the document is determined to be a text XML document; if it begins with the byte sequence “0x01, 0x02, 0x03”, the document is determined to be a binary XML document. The XML document shown in FIG. 6 is therefore determined to be a binary XML document.

However, the method for determining the format of the XML document is not limited to this, and various methods are conceivable. For example, the format may be determined by referring to the Content-Type field of the HTTP header, or by referring to the file extension of the XML document.
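The magic-number check can be sketched as follows. The function name `detect_format` and the returned labels are assumptions for illustration, and the text prefix test presumes an ASCII-compatible encoding with no byte order mark before “<?”.

```python
BINARY_MAGIC = bytes([0x01, 0x02, 0x03])  # magic number 601 of this embodiment
TEXT_PREFIX = b"<?"                       # start of an XML declaration


def detect_format(document: bytes) -> str:
    """Return "binary" or "text" based on the leading bytes of the document."""
    if document.startswith(BINARY_MAGIC):
        return "binary"
    if document.startswith(TEXT_PREFIX):
        return "text"
    raise ValueError("unrecognized document format")
```

A document that matches neither prefix raises an error in this sketch; an implementation that also consults the Content-Type header or the file extension could fall back to those sources instead.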

そして、ステップＳ９０２における判別処理の結果、テキストＸＭＬ文書であると判別された場合には、処理をステップＳ９０３を介してステップＳ９０４に処理を進める。一方、バイナリＸＭＬ文書であると判別された場合には、処理をステップＳ９０３を介してステップＳ９０５に進める。 If it is determined as a text XML document as a result of the determination process in step S902, the process proceeds to step S904 via step S903. On the other hand, if it is determined that the document is a binary XML document, the process proceeds to step S905 via step S903.

ステップＳ９０４では、共通ＸＭＬパーサ１０９は、テキストＸＭＬパーサ１０５の関数「ＳｅｔＤｏｃｕｍｅｎｔ」を呼び出し、ＸＭＬ文書をテキストＸＭＬパーサ１０５に渡す。これにより、テキストＸＭＬパーサ１０５にこのＸＭＬ文書の解析を行わせる。 In step S904, the common XML parser 109 calls the function “SetDocument” of the text XML parser 105 and passes the XML document to the text XML parser 105. This causes the text XML parser 105 to analyze this XML document.

一方、ステップＳ９０５では、共通ＸＭＬパーサ１０９は、バイナリＸＭＬパーサ１０６の関数「ＳｅｔＤｏｃｕｍｅｎｔ」を呼び出し、ＸＭＬ文書をバイナリＸＭＬパーサ１０６に渡す。これにより、バイナリＸＭＬパーサ１０６にこのＸＭＬ文書の解析を行わせる。 On the other hand, in step S905, the common XML parser 109 calls the function “SetDocument” of the binary XML parser 106 and passes the XML document to the binary XML parser 106. This causes the binary XML parser 106 to analyze this XML document.

テキストＸＭＬパーサ１０５、バイナリＸＭＬパーサ１０６は何れも、ＸＭＬ文書中（構造化文書中）に記述されている要素に対する解析処理を行う。即ち、ＸＭＬ文書のフォーマットに応じた解析処理を実現する。 Both the text XML parser 105 and the binary XML parser 106 perform analysis processing on elements described in the XML document (in the structured document). That is, an analysis process according to the format of the XML document is realized.

共通ＸＭＬパーサ１０９の関数「Ｒｅａｄ」、「ＧｅｔＮｏｄｅＴｙｐｅ」、「ＧｅｔＮａｍｅ」、「Ｃｌｏｓｅ」は、テキストＸＭＬパーサ１０５、バイナリＸＭＬパーサ１０６の同名の関数をそのまま呼び出し、戻り値もそのまま渡すだけのラッパである。 The functions “Read”, “GetNodeType”, “GetName”, and “Close” of the common XML parser 109 are wrappers that simply call the same-named functions of the text XML parser 105 or the binary XML parser 106 and pass their return values through unchanged.
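
The dispatch in steps S902 to S905 and the pass-through wrappers can be sketched as below. This is an illustration under stated assumptions, not the actual implementation: `StubParser` is a hypothetical stand-in for the text XML parser 105 and the binary XML parser 106, and the snake_case method names are Python renderings of “SetDocument”, “Read”, “GetNodeType”, “GetName”, and “Close”.

```python
class StubParser:
    """Hypothetical stand-in for parser 105 or 106; it only records the document."""

    def __init__(self, label):
        self.label = label
        self.document = None

    def set_document(self, document):
        self.document = document

    def read(self):
        return False  # a real parser would advance to the next node

    def get_node_type(self):
        return None

    def get_name(self):
        return None

    def close(self):
        self.document = None


class CommonParser:
    """Selects a parser in set_document and forwards every other call unchanged."""

    def __init__(self, text_parser, binary_parser):
        self._text = text_parser
        self._binary = binary_parser
        self._active = None

    def set_document(self, document: bytes):
        # Steps S902 and S903: decide the format from the magic number.
        if document.startswith(b"<?"):
            self._active = self._text      # step S904
        elif document.startswith(b"\x01\x02\x03"):
            self._active = self._binary    # step S905
        else:
            raise ValueError("unknown XML document format")  # assumption
        self._active.set_document(document)

    # The remaining functions are plain wrappers around the selected parser.
    def read(self):
        return self._active.read()

    def get_node_type(self):
        return self._active.get_node_type()

    def get_name(self):
        return self._active.get_name()

    def close(self):
        return self._active.close()
```

Because the application only ever talks to `CommonParser`, it does not need to know which concrete parser was selected.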

次に、上記ステップＳ８０５，Ｓ８０７における処理の詳細について、図１０を用いて説明する。図１０は、ステップＳ８０５，Ｓ８０７における処理の詳細を示すフローチャートである。 Next, details of the processing in steps S805 and S807 will be described with reference to FIG. 10. FIG. 10 is a flowchart showing details of the processing in steps S805 and S807.

先ず、ステップＳ１００２では、上記ステップＳ９０２における判別処理の結果として、テキストＸＭＬパーサ１０５、バイナリＸＭＬパーサ１０６の何れに解析処理を行わせているのかを判別する。係る判別の結果、現在テキストＸＭＬパーサ１０５に解析処理を行わせている場合には、処理をステップＳ１００８に進める。一方、現在バイナリＸＭＬパーサ１０６に解析処理を行わせている場合には、処理をステップＳ１００３に進める。図６に示したＸＭＬ文書の場合、バイナリＸＭＬパーサ１０６を用いてこのＸＭＬ文書に対する解析処理を行わせていることになるので、処理はステップＳ１００３に進むことになる。 First, in step S1002, it is determined which of the text XML parser 105 and the binary XML parser 106 is performing analysis processing as a result of the determination processing in step S902. If it is determined that the text XML parser 105 is currently performing analysis processing, the process advances to step S1008. On the other hand, if the binary XML parser 106 is currently performing analysis processing, the process advances to step S1003. In the case of the XML document shown in FIG. 6, since the analysis processing is performed on the XML document using the binary XML parser 106, the process proceeds to step S1003.

ステップＳ１００３以降の処理を、ステップＳ８０５において行う場合と、ステップＳ８０７において行う場合とに分けて説明する。 The processing after step S1003 will be described separately for the case where it is performed in step S805 and the case where it is performed in step S807.

先ず、ステップＳ８０５においてステップＳ１００３以降の処理を行う場合について説明する。 First, the case where the process after step S1003 is performed in step S805 will be described.

ステップＳ１００３では、関数「ＧｅｔＶａｌｕｅＴｙｐｅ」を実行することで、バイナリＸＭＬパーサ１０６が解析した結果を取得する。ステップＳ８０５では、関数「ＧｅｔＶａｌｕｅＡｓＳｔｒｉｎｇ」が実行されるので、バイナリＸＭＬパーサ１０６はｎａｍｅタグの型を取得することになり、図６に示したＸＭＬ文書の場合、ｓｔｒｉｎｇ型を取得する。従って、ステップＳ１００３では、このｓｔｒｉｎｇ型を「型情報」として取得する。 In step S1003, the function “GetValueType” is executed to obtain a result analyzed by the binary XML parser 106. In step S805, since the function “GetValueAsString” is executed, the binary XML parser 106 acquires the type of the name tag. In the case of the XML document shown in FIG. 6, the string type is acquired. Therefore, in step S1003, this string type is acquired as “type information”.

次に、ステップＳ１００４では、関数「ＧｅｔＳｔｒｉｎｇＶａｌｕｅ」を実行することで、バイナリＸＭＬパーサ１０６が解析した結果を取得する。ステップＳ８０５では、関数「ＧｅｔＶａｌｕｅＡｓＳｔｒｉｎｇ」が実行されるので、バイナリＸＭＬパーサ１０６はｎａｍｅタグの内容を取得することになり、図６に示したＸＭＬ文書の場合、文字列”Ａｌｉｃｅ”を取得する。従って、ステップＳ１００４では、この文字列”Ａｌｉｃｅ”を取得する。 Next, in step S1004, the result of analysis by the binary XML parser 106 is acquired by executing the function “GetStringValue”. In step S805, since the function “GetValueAsString” is executed, the binary XML parser 106 acquires the contents of the name tag. In the case of the XML document shown in FIG. 6, the character string “Alice” is acquired. Therefore, in step S1004, this character string “Alice” is acquired.

次にステップＳ１００５では、ステップＳ８０５で実行した関数が要求する（受け付けた）データの型（依頼された型）と、ステップＳ１００３で取得した型とが一致しているか否かを判断する。係る判断の結果、一致する場合には、処理をステップＳ１００７に進める。図６に示したＸＭＬ文書の場合、ステップＳ８０５で実行した関数が要求するデータの型はｓｔｒｉｎｇ型であるし、ステップＳ１００３で取得した型もまたｓｔｒｉｎｇ型であるので、一致すると判断される。この場合、ステップＳ１００７では、上記ステップＳ１００４で取得したデータ（文字列）を、要求元（型対応アプリケーション１１１）に出力する。 Next, in step S1005, it is determined whether the data type requested by the function executed in step S805 (the requested type) matches the type acquired in step S1003. If they match, the process proceeds to step S1007. In the case of the XML document shown in FIG. 6, the data type requested by the function executed in step S805 is the string type, and the type acquired in step S1003 is also the string type, so they are determined to match. In this case, in step S1007, the data (character string) acquired in step S1004 is output to the request source (type-compatible application 111).

一方、ステップＳ１００５における判断の結果、一致していない場合には、処理をステップＳ１００６に進める。ステップＳ１００６では、ステップＳ１００４で取得したデータの型を、ステップＳ８０５で実行した関数が要求するデータの型に変換する。そして、その後、ステップＳ１００７では、ステップＳ１００６で型を変換したデータを、上記要求元に対して出力する。 On the other hand, if the result of determination in step S1005 is that they do not match, the process proceeds to step S1006. In step S1006, the data type acquired in step S1004 is converted into the data type required by the function executed in step S805. Thereafter, in step S1007, the data whose type has been converted in step S1006 is output to the request source.

次に、ステップＳ８０７においてステップＳ１００３以降の処理を行う場合について説明する。 Next, the case where the process after step S1003 is performed in step S807 will be described.

ステップＳ１００３では、関数「ＧｅｔＶａｌｕｅＴｙｐｅ」を実行することで、バイナリＸＭＬパーサ１０６が解析した結果を取得する。ステップＳ８０７では、関数「ＧｅｔＶａｌｕｅＡｓＤｏｕｂｌｅ」が実行されるので、バイナリＸＭＬパーサ１０６はｈｅｉｇｈｔタグの型を取得することになり、図６に示したＸＭＬ文書の場合、ｄｏｕｂｌｅ型を取得する。従って、ステップＳ１００３では、このｄｏｕｂｌｅ型を「型情報」として取得する。 In step S1003, the function “GetValueType” is executed to obtain a result analyzed by the binary XML parser 106. In step S807, since the function “GetValueAsDouble” is executed, the binary XML parser 106 acquires the type of the height tag, and in the case of the XML document shown in FIG. 6, acquires the double type. Accordingly, in step S1003, this double type is acquired as “type information”.

次に、ステップＳ１００４では、関数「ＧｅｔＳｔｒｉｎｇＶａｌｕｅ」を実行することで、バイナリＸＭＬパーサ１０６が解析した結果を取得する。ステップＳ８０７では、関数「ＧｅｔＶａｌｕｅＡｓＤｏｕｂｌｅ」が実行されるので、バイナリＸＭＬパーサ１０６はｈｅｉｇｈｔタグの内容を取得することになり、図６に示したＸＭＬ文書の場合、実数値”１６０．５”を取得する。従って、ステップＳ１００４では、この実数値”１６０．５”を取得する。 Next, in step S1004, the result of analysis by the binary XML parser 106 is acquired by executing the function “GetStringValue”. In step S807, since the function “GetValueAsDouble” is executed, the binary XML parser 106 acquires the content of the height tag. In the case of the XML document shown in FIG. 6, the real value “160.5” is acquired. . Accordingly, in step S1004, the real value “160.5” is acquired.

次にステップＳ１００５では、ステップＳ８０７で実行した関数が要求するデータの型（依頼された型）と、ステップＳ１００３で取得した型とが一致しているか否かを判断する。係る判断の結果、一致する場合には、処理をステップＳ１００７に進める。図６に示したＸＭＬ文書の場合、ステップＳ８０７で実行した関数が要求するデータの型はｄｏｕｂｌｅ型であるし、ステップＳ１００３で取得した型もまたｄｏｕｂｌｅ型であるので、一致すると判断される。この場合、ステップＳ１００７では、上記ステップＳ１００４で取得したデータ（実数値）を、要求元（型対応アプリケーション１１１）に出力する。 In step S1005, it is determined whether the data type requested by the function executed in step S807 (requested type) matches the type acquired in step S1003. As a result of the determination, if they match, the process proceeds to step S1007. In the case of the XML document shown in FIG. 6, since the data type requested by the function executed in step S807 is a double type, and the type acquired in step S1003 is also a double type, it is determined that they match. In this case, in step S1007, the data (real value) acquired in step S1004 is output to the request source (type-compatible application 111).

一方、ステップＳ１００５における判断の結果、一致していない場合には、処理をステップＳ１００６に進める。ステップＳ１００６では、ステップＳ１００４で取得したデータの型を、ステップＳ８０７で実行した関数が要求するデータの型に変換する。そして、その後、ステップＳ１００７では、ステップＳ１００６で型を変換したデータを、上記要求元に対して出力する。 On the other hand, if the result of determination in step S1005 is that they do not match, the process proceeds to step S1006. In step S1006, the data type acquired in step S1004 is converted into the data type required by the function executed in step S807. Thereafter, in step S1007, the data whose type has been converted in step S1006 is output to the request source.
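
The binary-document path of FIG. 10 (steps S1003 to S1007) can be sketched as follows. This is a simplified illustration: `BinaryParserStub` and its method `get_typed_value` are hypothetical stand-ins for the binary XML parser 106 and its value getters, and Python's `str` and `float` stand in for the string and double types.

```python
class BinaryParserStub:
    """Hypothetical stand-in for binary XML parser 106, positioned on one element."""

    def __init__(self, value):
        self._value = value

    def get_value_type(self):
        # Plays the role of "GetValueType": the type recorded in the document.
        return type(self._value)

    def get_typed_value(self):
        # Hypothetical getter: returns the value in the type it is stored in.
        return self._value


def get_value_from_binary(parser, requested_type):
    stored_type = parser.get_value_type()   # step S1003: obtain the type information
    value = parser.get_typed_value()        # step S1004: obtain the value
    if stored_type is requested_type:       # step S1005: compare the two types
        return value                        # step S1007: output without conversion
    return requested_type(value)            # step S1006: convert, then step S1007
```

For example, `get_value_from_binary(BinaryParserStub(160.5), float)` takes the no-conversion branch, which is the case the description identifies as the efficient one.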

次に、図６に示したＸＭＬ文書の代わりに、図７に示した構成を有するＸＭＬ文書を処理対象とした場合における、コンピュータ１００の動作について説明する。図７は、コンピュータ１００による処理対象としてのＸＭＬ文書の構成例を示す図である。図７に示す構成を有するＸＭＬ文書は、図６に示したＸＭＬ文書と同様の内容が記述されている個人情報データであるが、図６に示したＸＭＬ文書がバイナリＸＭＬ文書であるのに対し、図７に示すＸＭＬ文書は、テキストＸＭＬ文書である。 Next, the operation of the computer 100 when the XML document having the configuration shown in FIG. 7 is set as the processing target instead of the XML document shown in FIG. 6 will be described. FIG. 7 is a diagram illustrating a configuration example of an XML document as a processing target by the computer 100. The XML document having the configuration shown in FIG. 7 is personal information data in which the same contents as the XML document shown in FIG. 6 are described, whereas the XML document shown in FIG. 6 is a binary XML document. The XML document shown in FIG. 7 is a text XML document.

タグ７０１は、このＸＭＬ文書がテキスト形式のものであることを示すものである。 A tag 701 indicates that the XML document is in a text format.

タグ７０２，７０３，７０５，７０６，７０８，７０９はそれぞれ、図６のタグ６０２，６０３，６０５，６０６，６０８，６０９に対応するもので、テキスト形式固有の表現となっている。 Tags 702, 703, 705, 706, 708, and 709 correspond to the tags 602, 603, 605, 606, 608, and 609 in FIG. 6, respectively, and are written in the representation specific to the text format.

７０４，７０７はそれぞれ、人の名前を示す文字列、身長を示す実数値、であり、内容が異なるのみで、実質的には図６の６０４，６０７と同じである。 704 and 707 are a character string indicating a person's name and a real value indicating height, respectively; apart from their contents, they are substantially the same as 604 and 607 in FIG. 6.

図７に示したＸＭＬ文書を処理対象とする場合に図８〜図１０に示したフローチャートに従った処理を行う場合、図８〜図１０について説明した上記処理と異なる点は以下の通りである。 When the XML document shown in FIG. 7 is the processing target and the processing according to the flowcharts shown in FIGS. 8 to 10 is performed, the differences from the processing described above for FIGS. 8 to 10 are as follows.

ステップＳ９０２では、フォーマット判別部１０８を実行し、フォーマット判別部１０８に、上記ステップＳ８０２で開いたＸＭＬ文書におけるマジックナンバー（図７における７０１）を取得させる。そして、フォーマット判別部１０８が取得したマジックナンバーを、共通ＸＭＬパーサ１０９が取得する。そしてこの取得したマジックナンバーを用いて、ＸＭＬ文書のフォーマットを判別する。即ち、ＸＭＬ文書がテキストＸＭＬ文書であるのか、バイナリＸＭＬ文書であるのかを判別する。図７に示したＸＭＬ文書の場合、テキストＸＭＬ文書と判別されることになる。従って、処理はステップＳ９０３を介してステップＳ９０４に進み、ステップＳ９０４では、共通ＸＭＬパーサ１０９は、テキストＸＭＬパーサ１０５の関数「ＳｅｔＤｏｃｕｍｅｎｔ」を呼び出し、ＸＭＬ文書をテキストＸＭＬパーサ１０５に渡す。これにより、テキストＸＭＬパーサ１０５にこのＸＭＬ文書の解析を行わせる。 In step S902, the format discrimination unit 108 is executed to cause the format discrimination unit 108 to acquire the magic number (701 in FIG. 7) in the XML document opened in step S802. Then, the common XML parser 109 acquires the magic number acquired by the format determination unit 108. Then, the format of the XML document is determined using the acquired magic number. That is, it is determined whether the XML document is a text XML document or a binary XML document. In the case of the XML document shown in FIG. 7, it is determined as a text XML document. Therefore, the process proceeds to step S904 via step S903. In step S904, the common XML parser 109 calls the function “SetDocument” of the text XML parser 105 and passes the XML document to the text XML parser 105. This causes the text XML parser 105 to analyze this XML document.

先ず、ステップＳ１００２では、上記ステップＳ９０２における判別処理の結果として、テキストＸＭＬパーサ１０５、バイナリＸＭＬパーサ１０６の何れに解析処理を行わせているのかを判別する。図７に示したＸＭＬ文書の場合、テキストＸＭＬパーサ１０５を用いてこのＸＭＬ文書に対する解析処理を行わせていることになるので、処理はステップＳ１００８に進むことになる。 First, in step S1002, it is determined which of the text XML parser 105 and the binary XML parser 106 is performing analysis processing as a result of the determination processing in step S902. In the case of the XML document shown in FIG. 7, since the text XML parser 105 is used to perform analysis processing on this XML document, the processing proceeds to step S1008.

ステップＳ１００８以降の処理を、ステップＳ８０５において行う場合と、ステップＳ８０７において行う場合とに分けて説明する。 The processing after step S1008 will be described separately for the case where it is performed in step S805 and the case where it is performed in step S807.

先ず、ステップＳ８０５においてステップＳ１００８以降の処理を行う場合について説明する。 First, the case where the process after step S1008 is performed in step S805 will be described.

ステップＳ１００８では、関数「ＧｅｔＶａｌｕｅ」を実行することで、テキストＸＭＬパーサ１０５が解析した結果を取得する。ステップＳ８０５では、関数「ＧｅｔＶａｌｕｅＡｓＳｔｒｉｎｇ」が実行されるので、テキストＸＭＬパーサ１０５はｎａｍｅタグの内容を取得することになり、図７に示したＸＭＬ文書の場合、文字列”Ｂｏｂ”を取得する。従って、ステップＳ１００８では、この文字列”Ｂｏｂ”を取得する。 In step S1008, the function “GetValue” is executed to obtain the result of analysis by the text XML parser 105. In step S805, since the function “GetValueAsString” is executed, the text XML parser 105 acquires the contents of the name tag. In the case of the XML document shown in FIG. 7, the character string “Bob” is acquired. Therefore, in step S1008, this character string “Bob” is acquired.

次にステップＳ１００９では、ステップＳ８０５で実行した関数が要求するデータの型（依頼された型）が、ｓｔｒｉｎｇ型（文字列型）若しくは指定無しであるか否かを判断する。係る判断の結果、ｓｔｒｉｎｇ型若しくは指定無しである場合には、処理をステップＳ１００７に進める。図７に示したＸＭＬ文書の場合、ステップＳ８０５で実行した関数が要求するデータの型はｓｔｒｉｎｇ型であるので、処理をステップＳ１００７に進める。ステップＳ１００７では、上記ステップＳ１００８で取得したデータ（文字列）を、要求元（型対応アプリケーション１１１）に出力する。 In step S1009, it is determined whether the data type requested by the function executed in step S805 (requested type) is a string type (character string type) or not specified. If the result of this determination is string or no designation, the process advances to step S1007. In the case of the XML document shown in FIG. 7, since the data type requested by the function executed in step S805 is the string type, the process advances to step S1007. In step S1007, the data (character string) acquired in step S1008 is output to the request source (type-compatible application 111).

一方、ステップＳ１００９における判断の結果、ｓｔｒｉｎｇ型若しくは指定無しではない場合には、処理をステップＳ１０１０に進める。ステップＳ１０１０では、上記ステップＳ１００６と同様の処理を行う。そして、その後、ステップＳ１００７では、ステップＳ１０１０で型を変換したデータを、上記要求元に対して出力する。 On the other hand, if the result of determination in step S1009 is not string type or no designation, processing proceeds to step S1010. In step S1010, the same processing as in step S1006 is performed. Thereafter, in step S1007, the data whose type has been converted in step S1010 is output to the request source.

次に、ステップＳ８０７においてステップＳ１００８以降の処理を行う場合について説明する。 Next, the case where the process after step S1008 is performed in step S807 will be described.

ステップＳ１００８では、関数「ＧｅｔＶａｌｕｅ」を実行することで、テキストＸＭＬパーサ１０５が解析した結果を取得する。ステップＳ８０７では、関数「ＧｅｔＶａｌｕｅＡｓＤｏｕｂｌｅ」が実行されるので、テキストＸＭＬパーサ１０５はｈｅｉｇｈｔタグの内容を取得することになり、図７に示したＸＭＬ文書の場合、文字列”１７５．３”を取得する。従って、ステップＳ１００８では、この文字列”１７５．３”を取得する。 In step S1008, the function “GetValue” is executed to obtain the result of analysis by the text XML parser 105. In step S807, since the function “GetValueAsDouble” is executed, the text XML parser 105 acquires the contents of the height tag. In the case of the XML document shown in FIG. 7, the character string “175.3” is acquired. . Therefore, in step S1008, this character string “175.3” is acquired.

次にステップＳ１００９では、ステップＳ８０７で実行した関数が要求するデータの型（依頼された型）が、ｓｔｒｉｎｇ型（文字列型）若しくは指定無しであるか否かを判断する。係る判断の結果、ｓｔｒｉｎｇ型若しくは指定無しである場合には、処理をステップＳ１００７に進める。一方、ステップＳ１００９における判断の結果、ｓｔｒｉｎｇ型、指定無しの何れでもない場合には、処理をステップＳ１０１０に進める。 In step S1009, it is determined whether the data type requested by the function executed in step S807 (the requested type) is the string type (character string type) or unspecified. If it is the string type or unspecified, the process advances to step S1007. On the other hand, if the result of the determination in step S1009 is that the requested type is neither the string type nor unspecified, the process proceeds to step S1010.

図７に示したＸＭＬ文書の場合、ステップＳ８０７で実行した関数が要求するデータの型はｄｏｕｂｌｅ型（浮動小数点型）であり、ｓｔｒｉｎｇ型、指定無しの何れでもない。従ってこの場合、処理をステップＳ１０１０に進める。 In the case of the XML document shown in FIG. 7, the type of data requested by the function executed in step S807 is a double type (floating point type), which is neither a string type nor an unspecified type. Therefore, in this case, the process proceeds to step S1010.

ステップＳ１０１０では、ステップＳ１００８で取得したデータの型を、ステップＳ８０７で実行した関数が要求するデータの型に変換する。その結果、ＩＥＥＥ７５４形式の１７５．３という浮動小数値を取得することができる。 In step S1010, the data type acquired in step S1008 is converted into the data type required by the function executed in step S807. As a result, a floating point value of 175.3 in the IEEE 754 format can be acquired.

そして、その後、ステップＳ１００７では、ステップＳ１０１０で型を変換したデータを、上記要求元に対して出力する。 Thereafter, in step S1007, the data whose type has been converted in step S1010 is output to the request source.
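
The text-document path (steps S1008 to S1010 and S1007) can be sketched as below. This is a minimal sketch: the function name `get_value_from_text` is hypothetical, `raw` stands for the string returned by “GetValue”, and `None` is used here to represent an unspecified requested type.

```python
def get_value_from_text(raw: str, requested_type=None):
    """Return raw as-is for string or unspecified requests, otherwise convert it."""
    # Step S1009: is the requested type the string type or unspecified?
    if requested_type is None or requested_type is str:
        return raw                   # step S1007: output the string unchanged
    # Step S1010: convert the string to the requested type, then step S1007.
    return requested_type(raw)


name = get_value_from_text("Bob", str)        # "GetValueAsString" case
height = get_value_from_text("175.3", float)  # "GetValueAsDouble" case
```

In Python the converted height is a double-precision IEEE 754 value, which matches the floating-point result the description mentions for step S1010.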

次に、レガシーアプリケーション１１０の動作について説明する。レガシーアプリケーション１１０は元々、バイナリＸＭＬ文書を対象としていなかったものであるため、テキストＸＭＬパーサ１０５のＡＰＩを使用して作られている。このレガシーアプリケーション１１０が個人情報データを扱う場合にコンピュータ１００が行う処理は、図１１に示したフローチャートに従ったものとなる。 Next, the operation of the legacy application 110 will be described. Since the legacy application 110 was originally not intended for binary XML documents, it is created using the API of the text XML parser 105. When the legacy application 110 handles personal information data, the processing performed by the computer 100 follows the flowchart shown in FIG.

図１１は、レガシーアプリケーション１１０が個人情報データを扱う場合にコンピュータ１００が行う処理のフローチャートである。 FIG. 11 is a flowchart of processing performed by the computer 100 when the legacy application 110 handles personal information data.

ステップＳ１１０２〜ステップＳ１１０４、ステップＳ１１０６、ステップＳ１１０８はそれぞれ、図８に示したステップＳ８０２〜ステップＳ８０４、ステップＳ８０６、ステップＳ８０８と同じである。以下では、ステップＳ１１０５，Ｓ１１０７における処理について説明する。 Step S1102 to step S1104, step S1106, and step S1108 are the same as step S802 to step S804, step S806, and step S808 shown in FIG. 8, respectively. Hereinafter, processing in steps S1105 and S1107 will be described.

ステップＳ１１０５，Ｓ１１０７において、ノードの値はすべて関数「ＧｅｔＶａｌｕｅ」を用いて取得する。ステップＳ１１０５，Ｓ１１０７における処理の詳細は、図１０に示したフローチャートに従ったものとなる。 In steps S1105 and S1107, the values of all nodes are acquired using the function “GetValue”. Details of the processing in steps S1105 and S1107 are according to the flowchart shown in FIG.

この場合、テキストＸＭＬパーサ１０５を用いることになるので、ステップＳ１００２からステップＳ１００８に処理を進めることになる。 In this case, since the text XML parser 105 is used, the processing proceeds from step S1002 to step S1008.

ステップＳ１００８では、関数「ＧｅｔＶａｌｕｅ」を実行することで、テキストＸＭＬパーサ１０５が解析した結果を取得するので、ステップＳ１１０５では、図６に示したＸＭＬ文書の場合、文字列”Ａｌｉｃｅ”を取得する。従って、ステップＳ１００８では、この文字列”Ａｌｉｃｅ”を取得する。 In step S1008, the function “GetValue” is executed to obtain the result of analysis by the text XML parser 105. In step S1105, the character string “Alice” is obtained in the case of the XML document shown in FIG. Therefore, in step S1008, this character string “Alice” is acquired.

次に、ステップＳ１１０５で実行した関数が要求するデータの型（依頼された型）はｓｔｒｉｎｇ型であるので、処理をステップＳ１００９を介してステップＳ１００７に進める。ステップＳ１００７では、上記ステップＳ１００８で取得したデータ（文字列）を、要求元（レガシーアプリケーション１１０）に出力する。 Next, since the data type requested by the function executed in step S1105 (the requested type) is the string type, the process proceeds to step S1007 via step S1009. In step S1007, the data (character string) acquired in step S1008 is output to the request source (legacy application 110).

また、ステップＳ１００８では、関数「ＧｅｔＶａｌｕｅ」を実行することで、テキストＸＭＬパーサ１０５が解析した結果を取得するので、ステップＳ１１０７では、図６に示したＸＭＬ文書の場合、文字列”１６０．５”を取得する。従って、ステップＳ１００８では、この文字列”１６０．５”を取得する。 In step S1008, the function “GetValue” is executed to obtain the result of analysis by the text XML parser 105. In step S1107, the character string “160.5” is obtained in the case of the XML document shown in FIG. To get. Accordingly, in step S1008, this character string “160.5” is acquired.

次に、ステップＳ１１０７で実行した関数が要求するデータの型（依頼された型）はｄｏｕｂｌｅ型（浮動小数点型）であり、ｓｔｒｉｎｇ型ではないので、処理をステップＳ１００９を介してステップＳ１０１０に進める。 Next, since the data type requested by the function executed in step S1107 (requested type) is a double type (floating point type) and not a string type, the process proceeds to step S1010 via step S1009.

ステップＳ１０１０では、ステップＳ１００８で取得したデータの型を、ステップＳ１１０７で実行した関数が要求するデータの型に変換する。その結果、ＩＥＥＥ７５４形式の１６０．５という浮動小数値を取得することができる。 In step S1010, the data type acquired in step S1008 is converted to the data type required by the function executed in step S1107. As a result, a floating point value of 160.5 in the IEEE754 format can be acquired.

そしてその後、ステップＳ１００７では、この浮動小数値１６０．５を上記要求元に対して出力する。 Thereafter, in step S1007, the floating point value 160.5 is output to the request source.

このようにして、レガシーアプリケーション１１０はバイナリＸＭＬ文書から値を取得することができる。 In this way, legacy application 110 can obtain values from a binary XML document.

レガシーアプリケーション１１０に対してテキストＸＭＬ文書が渡された場合、共通ＸＭＬパーサ１０９は特別な処理を行わず、テキストＸＭＬパーサ１０５の単なるラッパとして振舞うことになるため、正常に値を取得することができる。 When a text XML document is passed to the legacy application 110, the common XML parser 109 does not perform special processing and behaves as a simple wrapper of the text XML parser 105, so that a value can be acquired normally. .

以上説明したように、本実施形態によれば、共通ＸＭＬパーサ１０９は、２種類のアプリケーションと２種類の形式のＸＭＬ文書の組み合わせ、つまり計４つの場合のすべてにおいて正しく値を取得する機能を提供することができる。 As described above, according to the present embodiment, the common XML parser 109 can provide a function for correctly acquiring values in every combination of the two types of applications and the two formats of XML documents, that is, in all four cases.

さらに、型対応アプリケーション１１１がバイナリＸＭＬ文書を扱う場合は、途中で値の型の変換が行われないため、無駄のない高速な処理が可能となる。これにより、ＸＭＬ文書を使用するアプリケーションにおいて、バイナリＸＭＬ文書を用いた高速な処理をサポートし、かつテキストＸＭＬ文書も扱うことを可能にする。 Further, when the type-compatible application 111 handles a binary XML document, value type conversion is not performed in the middle, so that high-speed processing without waste is possible. As a result, an application using an XML document supports high-speed processing using a binary XML document and can handle a text XML document.

また、テキストＸＭＬ文書用に作成されたアプリケーションでバイナリＸＭＬ文書を扱うことが可能になる。 In addition, a binary XML document can be handled by an application created for a text XML document.

［第２の実施形態］ [Second Embodiment]
図１２は、本実施形態に係る構造化文書処理装置に適用可能なコンピュータ１２００のハードウェア構成を示すブロック図である。図１２において、図１に示したものと同じものについては同じ番号を付けており、その説明は省略する。即ち、図１２に示した構成は、図１に示したバイナリＸＭＬパーサ１０６の代わりに、ＦａｓｔＩｎｆｏｓｅｔパーサ１２０６が記憶装置１０４内に保存されている構成となっている。 FIG. 12 is a block diagram showing a hardware configuration of a computer 1200 applicable to the structured document processing apparatus according to this embodiment. In FIG. 12, the same components as those shown in FIG. 1 are denoted by the same reference numerals, and their description is omitted. That is, in the configuration shown in FIG. 12, a Fast Infoset parser 1206 is stored in the storage device 104 instead of the binary XML parser 106 shown in FIG. 1.

ＦａｓｔＩｎｆｏｓｅｔパーサ１２０６は、バイナリＸＭＬフォーマットの一つであるＦａｓｔＩｎｆｏｓｅｔ形式のＸＭＬ文書を解析するパーサである。 A Fast Infoset parser 1206 is a parser that analyzes an XML document in a Fast Infoset format, which is one of binary XML formats.

本実施形態において、型対応アプリケーション１１１による処理対象となるＸＭＬ文書の一例を、図１３，１４に示す。図１４は、テキストＸＭＬ文書の構成例を示す図であり、図１３は、図１４に示したＸＭＬ文書をＦａｓｔＩｎｆｏｓｅｔ形式で表現した場合の構成例を示す図である。 In this embodiment, an example of an XML document to be processed by the type correspondence application 111 is shown in FIGS. FIG. 14 is a diagram showing a configuration example of a text XML document, and FIG. 13 is a diagram showing a configuration example when the XML document shown in FIG. 14 is expressed in the Fast Infoset format.

図１３において、１３０１で示す「Ｅ０００」はマジックナンバーであり、係るＸＭＬ文書がＦａｓｔＩｎｆｏｓｅｔ形式であることを示している。 In FIG. 13, “E000” indicated by reference numeral 1301 is a magic number, which indicates that the XML document is in the Fast Infoset format.

次の１３０２で示す「０００１」はＦａｓｔＩｎｆｏｓｅｔのバージョンであり、この例では１である。 Next, “0001” indicated by 1302 is the version of Fast Infoset, which is 1 in this example.

次の１３０３で示す「００」はオプションとなるデータの存在を示すものであり、「００」は存在しないことを意味する。 Next, “00” indicated by 1303 indicates the existence of optional data, and “00” means that there is no data.

次の１３０４で示す「３Ｃ００」は１ビットごとに意味を持つため多くの意味を持つが、主に次のノードが要素であることを意味している。他に、属性の有無や名前空間名の存在、要素名のバイト数などの情報も含まれているが、ここでの説明の本質とは関連が薄いため詳細な説明は省略する。 “3C00” indicated by the next 1304 carries many meanings because each bit has its own meaning, but it mainly means that the next node is an element. It also includes information such as the presence or absence of attributes, the presence of a namespace name, and the number of bytes in the element name, but a detailed description is omitted because these have little bearing on the essence of this explanation.

次の１３０５で示す「６１」はＵＴＦ−８でエンコードされた要素名「ａ」である。 Next, “61” indicated by 1305 is an element name “a” encoded by UTF-8.

次の１３０６で示す「９Ｃ１Ａ」という２バイトも同様に多くの意味を持つが、主に次のノードが要素の内容であり、またその値が浮動小数型であることを意味している。その他にもバイト数などの情報も含まれている。 The next two bytes “9C1A” shown by 1306 have many meanings as well, but mainly the next node is the content of the element, and the value is a floating-point type. In addition, information such as the number of bytes is also included.

次の１３０７で示す「Ｃ２ＥＤ４０００」はＩＥＥＥ７５４形式でエンコードされた−１１８．６２５という浮動小数の値である。 Next, “C2ED4000” indicated by 1307 is a floating-point value of −118.625 encoded in the IEEE754 format.

最後の１３０８で示す「ＦＦ」は、はじめのＦが要素の終端、次のＦが文書の終端を表している。すなわち、図１３に示したＸＭＬ文書は、図１４のテキストＸＭＬ文書とほぼ同じ意味である。また文書の意味だけでなく、ノードの出現する順序も同じである。 In “FF” indicated by 1308 at the end, the first F represents the end of the element and the second F represents the end of the document. That is, the XML document shown in FIG. 13 has almost the same meaning as the text XML document shown in FIG. 14. Not only the meaning of the document but also the order in which the nodes appear is the same.
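
The byte walkthrough above can be checked with a short script. This is only a sketch that slices the example into the fields 1301 to 1308 using the lengths given in the description; it is not a general Fast Infoset decoder, and the names `DOCUMENT`, `FIELDS`, and `split_fields` are hypothetical. Reading 1307 as a big-endian single-precision value is an assumption that is consistent with the stated value.

```python
import struct

# Byte sequence assembled from the fields 1301 to 1308 described above.
DOCUMENT = bytes.fromhex("E000" "0001" "00" "3C00" "61" "9C1A" "C2ED4000" "FF")

# (reference numeral, length in bytes), in order of appearance.
FIELDS = [(1301, 2), (1302, 2), (1303, 1), (1304, 2),
          (1305, 1), (1306, 2), (1307, 4), (1308, 1)]


def split_fields(data: bytes) -> dict:
    """Slice the example into its labelled fields (fixed layout of this example only)."""
    result, offset = {}, 0
    for numeral, length in FIELDS:
        result[numeral] = data[offset:offset + length]
        offset += length
    return result


fields = split_fields(DOCUMENT)
element_name = fields[1305].decode("utf-8")     # 0x61 is "a" in UTF-8
(value,) = struct.unpack(">f", fields[1307])    # big-endian IEEE 754 single precision
print(element_name, value)                      # prints: a -118.625
```

The value -118.625 is exactly representable in binary floating point, so the decoded number matches the description without rounding.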

ここで、型対応アプリケーション１１１が、図１３に示したａ要素の値を取得する場合は、図１５に示したフローチャートに従った処理を、共通ＸＭＬパーサ１０９が行う。 Here, when the type corresponding application 111 acquires the value of the a element shown in FIG. 13, the common XML parser 109 performs processing according to the flowchart shown in FIG. 15.

図１５は、型対応アプリケーション１１１が、図１３に示したａ要素の値を取得する場合に、コンピュータ１２００が行う処理のフローチャートである。 FIG. 15 is a flowchart of processing performed by the computer 1200 when the type-corresponding application 111 acquires the value of the a element shown in FIG.

先ず、ステップＳ１５０２では、関数「ＳｅｔＤｏｃｕｍｅｎｔ」を実行し、図１３に示すＸＭＬ文書を開く。なお、ステップＳ１５０２における処理を行うと、図９に示したフローチャートに従った処理が第１の実施形態と同様に開始される。なお、図９のフローチャートに従った処理において、フォーマットの判別処理は、ＦａｓｔＩｎｆｏｓｅｔ形式であるか否かを判断するのであるが、これは、マジックナンバーとして「Ｅ０００」が記述されているのかを判断することで行う。マジックナンバーとして「Ｅ０００」が記述されていれば、ＦａｓｔＩｎｆｏｓｅｔパーサ１２０６を使用する。記述されていなければ、テキストＸＭＬパーサ１０５を使用する。 First, in step S1502, the function “SetDocument” is executed to open the XML document shown in FIG. 13. When the process in step S1502 is performed, the process according to the flowchart shown in FIG. 9 is started in the same manner as in the first embodiment. In the process according to the flowchart of FIG. 9, the format determination checks whether the document is in the Fast Infoset format, which is done by checking whether “E000” is described as the magic number. If “E000” is described as the magic number, the Fast Infoset parser 1206 is used. Otherwise, the text XML parser 105 is used.

次にステップＳ１５０３では、関数「Ｒｅａｄ」を実行し、係る実行により進めた現在の参照位置に対して関数「ＧｅｔＮｏｄｅＴｙｐｅ」、関数「ＧｅｔＮａｍｅ」を実行する。そして、関数「ＧｅｔＮｏｄｅＴｙｐｅ」の返り値が開始タグ、且つ関数「ＧｅｔＮａｍｅ」による返り値が「ａ」となるまで、関数「Ｒｅａｄ」を実行する。ＦａｓｔＩｎｆｏｓｅｔ形式もテキストＸＭＬ形式と同様、まず要素の開始を表すバイト列と要素の名前を示すバイト列が現れるため、最初のノードは開始タグａとなる。 In step S1503, the function “Read” is executed, and the function “GetNodeType” and the function “GetName” are executed on the current reference position advanced by that execution. The function “Read” is repeated until the return value of “GetNodeType” is a start tag and the return value of “GetName” is “a”. In the Fast Infoset format, as in the text XML format, a byte sequence indicating the start of an element and a byte sequence indicating the element name appear first, so the first node is the start tag a.
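
The read loop of step S1503 follows the usual pull-parser pattern. The sketch below illustrates it for the text XML case using Python's standard `xml.dom.pulldom` module as a stand-in for the “Read”, “GetNodeType”, and “GetName” functions; the helper names are hypothetical, and the sample document `<a>-118.625</a>` is an assumption about FIG. 14, which is not reproduced here.

```python
from xml.dom import pulldom


def read_until_start_tag(xml_text: str, name: str):
    """Advance node by node until a start tag with the given name is found."""
    events = pulldom.parseString(xml_text)
    for event, node in events:                 # each iteration plays the role of "Read"
        # "GetNodeType" corresponds to event, "GetName" to node.tagName.
        if event == pulldom.START_ELEMENT and node.tagName == name:
            events.expandNode(node)            # load the element's child nodes
            return node
    return None


def text_content(node) -> str:
    """Concatenate the text nodes directly under the element."""
    return "".join(c.data for c in node.childNodes if c.nodeType == c.TEXT_NODE)


node = read_until_start_tag('<?xml version="1.0"?><a>-118.625</a>', "a")
value = float(text_content(node))              # the "GetValueAsDouble" step for text XML
```

For a Fast Infoset document the loop would look the same from the application's side, because the common XML parser hides which concrete parser produces the nodes.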

次にステップＳ１５０４では、関数「ＧｅｔＶａｌｕｅＡｓＤｏｕｂｌｅ」を実行し、ａタグ（要素）の内容、即ち、「−１１８．６２５」を実数値として取得する。ステップＳ１５０４における処理の詳細については、図１０に示したフローチャートに従ったものとなる。 In step S1504, the function “GetValueAsDouble” is executed to acquire the content of the a tag (element), that is, “−118.625” as a real value. Details of the processing in step S1504 are according to the flowchart shown in FIG.

即ち、ＦａｓｔＩｎｆｏｓｅｔパーサ１２０６を使用しているため、先ず、ステップＳ１００３では、ＦａｓｔＩｎｆｏｓｅｔパーサ１２０６からデータの型情報を受け取る。ＦａｓｔＩｎｆｏｓｅｔパーサ１２０６は図１３の１３０６で示す「９Ｃ１Ａ」の部分により、このデータの値が浮動小数であることを判断するので、ｄｏｕｂｌｅ型を型情報として返す。そしてステップＳ１００４では、その型で値、つまり「−１１８．６２５」を取得する。 That is, since the Fast Infoset parser 1206 is used, first, in step S1003, data type information is received from the Fast Infoset parser 1206. The Fast Infoset parser 1206 determines that the value of this data is a floating-point number based on the portion “9C1A” indicated by 1306 in FIG. 13, and therefore returns a double type as type information. In step S1004, a value of that type, that is, “−118.625” is acquired.

次にステップＳ１００５では、ステップＳ１５０４で実行した関数が要求するデータの型（依頼された型）と、ステップＳ１００３で取得した型とが一致しているか否かを判断する。係る判断の結果、一致する場合には、処理をステップＳ１００７に進める。図１３に示したＸＭＬ文書の場合、ステップＳ１５０４で実行した関数が要求するデータの型はｄｏｕｂｌｅ型であるし、ステップＳ１００３で取得した型もまたｄｏｕｂｌｅ型であるので、一致すると判断される。この場合、ステップＳ１００７では、上記ステップＳ１００４で取得したデータ（実数値）を、要求元（型対応アプリケーション１１１）に出力する。 In step S1005, it is determined whether the data type requested by the function executed in step S1504 (requested type) matches the type acquired in step S1003. As a result of the determination, if they match, the process proceeds to step S1007. In the case of the XML document shown in FIG. 13, since the data type requested by the function executed in step S1504 is a double type, and the type acquired in step S1003 is also a double type, it is determined that they match. In this case, in step S1007, the data (real value) acquired in step S1004 is output to the request source (type-compatible application 111).

これにより、無駄な変換を行うことなく、アプリケーションにデータを渡すことができる。 As a result, data can be passed to the application without performing unnecessary conversion.

なお、図１４に示したテキストＸＭＬ文書を処理対象とする場合であっても、第１の実施形態と同様に処理すれば良い。 Even if the text XML document shown in FIG. 14 is a processing target, the processing may be performed in the same manner as in the first embodiment.

以上のようにして、従来のテキストＸＭＬ形式のＸＭＬ文書とＦａｓｔＩｎｆｏｓｅｔ形式の文書の両方に対応可能で、かつ無駄なデータ型変換を行わずに処理が可能な構造化文書処理装置が実現できる。 As described above, it is possible to realize a structured document processing apparatus that can handle both an XML document in the conventional text XML format and a Fast Infoset format document and that can perform processing without performing unnecessary data type conversion.

ここで、上記コンピュータ１００、１２００としては、携帯電話や複写機など、ＸＭＬ文書が使用可能な通信機器を用いることができる。 Here, as the computers 100 and 1200, communication devices that can use XML documents, such as mobile phones and copiers, can be used.

［その他の実施形態］ [Other Embodiments]
また、本発明の目的は、以下のようにすることによって達成されることはいうまでもない。即ち、前述した実施形態の機能を実現するソフトウェアのプログラムコードを記録した記録媒体（または記憶媒体）を、システムあるいは装置に供給する。係る記憶媒体は言うまでもなく、コンピュータ読み取り可能な記憶媒体である。そして、そのシステムあるいは装置のコンピュータ（またはＣＰＵやＭＰＵ）が記録媒体に格納されたプログラムコードを読み出し実行する。この場合、記録媒体から読み出されたプログラムコード自体が前述した実施形態の機能を実現することになり、そのプログラムコードを記録した記録媒体は本発明を構成することになる。 Needless to say, the object of the present invention can be achieved as follows. That is, a recording medium (or storage medium) in which a program code of software that realizes the functions of the above-described embodiments is recorded is supplied to the system or apparatus. Needless to say, such a storage medium is a computer-readable storage medium. Then, the computer (or CPU or MPU) of the system or apparatus reads and executes the program code stored in the recording medium. In this case, the program code itself read from the recording medium realizes the functions of the above-described embodiment, and the recording medium on which the program code is recorded constitutes the present invention.

また、コンピュータが読み出したプログラムコードを実行することにより、そのプログラムコードの指示に基づき、コンピュータ上で稼働しているオペレーティングシステム（ＯＳ）などが実際の処理の一部または全部を行う。その処理によって前述した実施形態の機能が実現される場合も含まれることは言うまでもない。 Further, by executing the program code read by the computer, an operating system (OS) or the like running on the computer performs part or all of the actual processing based on the instruction of the program code. Needless to say, the process includes the case where the functions of the above-described embodiments are realized.

さらに、記録媒体から読み出されたプログラムコードが、コンピュータに挿入された機能拡張カードやコンピュータに接続された機能拡張ユニットに備わるメモリに書込まれたとする。その後、そのプログラムコードの指示に基づき、その機能拡張カードや機能拡張ユニットに備わるＣＰＵなどが実際の処理の一部または全部を行い、その処理によって前述した実施形態の機能が実現される場合も含まれることは言うまでもない。 Furthermore, it is assumed that the program code read from the recording medium is written in a memory provided in a function expansion card inserted into the computer or a function expansion unit connected to the computer. After that, based on the instruction of the program code, the CPU included in the function expansion card or function expansion unit performs part or all of the actual processing, and the function of the above-described embodiment is realized by the processing. Needless to say.

When the present invention is applied to the above recording medium, the recording medium stores program code corresponding to the flowcharts described above.

FIG. 1 is a block diagram showing an example of the hardware configuration of a computer applicable to the structured document processing apparatus according to the first embodiment of the present invention.
FIG. 2 is a diagram showing a configuration example of a network to which the computer 100 is applied.
FIG. 3 is a diagram showing an example of the API of the text XML parser 105.
FIG. 4 is a diagram showing an example of the API of the binary XML parser 106.
FIG. 5 is a diagram showing an example of the API of the common XML parser 109.
FIG. 6 is a diagram showing a configuration example of an XML document to be processed by the computer 100.
FIG. 7 is a diagram showing a configuration example of an XML document to be processed by the computer 100.
FIG. 8 is a flowchart of the processing performed when the CPU 101 executes the program of the type-corresponding application 111.
FIG. 9 is a flowchart of the processing that starts when the processing in step S802 is executed.
FIG. 10 is a flowchart showing details of the processing in steps S805 and S807.
FIG. 11 is a flowchart of the processing performed by the computer 100 when the legacy application 110 handles personal information data.
FIG. 12 is a block diagram showing the hardware configuration of a computer 1200 applicable to the structured document processing apparatus according to the second embodiment of the present invention.
FIG. 13 is a diagram showing a configuration example in which the XML document shown in FIG. 14 is expressed in the Fast Infoset format.
FIG. 14 is a diagram showing a configuration example of a text XML document.
FIG. 15 is a flowchart of the processing performed by the computer 1200 when the type-corresponding application 111 acquires the value of the a element shown in FIG. 13.

Claims

A structured document processing apparatus for processing a structured document, comprising:
acquisition means for acquiring the format of the structured document;
analysis means for analyzing the structured document by an analysis method according to the format acquired by the acquisition means;
means for receiving a request to acquire an element described in the structured document in a specified type;
determination means for determining whether the type of the element as analyzed by the analysis means matches the specified type; and
output means for outputting the element to the request source when the determination means determines that the types match, and for converting the type of the element to the specified type and then outputting the element to the request source when the determination means determines that the types do not match.

The structured document processing apparatus according to claim 1, wherein the analysis means comprises a binary XML parser and a text XML parser,
analyzes the structured document with the binary XML parser when the format acquired by the acquisition means is binary XML, and
analyzes the structured document with the text XML parser when the format acquired by the acquisition means is text XML.
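The format-dependent choice of parser recited above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration only: the `Format` enum, the `analyze` function, and in particular the toy binary encoding (a one-byte name length, an ASCII name, and a four-byte big-endian integer per record), which is not Fast Infoset or any other real binary XML format. The sketch also shows why the analyzed type can differ by format: the text parser yields strings, while the binary parser yields native integers.

```python
import struct
import xml.etree.ElementTree as ET
from enum import Enum


class Format(Enum):
    """Hypothetical labels for the acquired document format."""
    TEXT_XML = "text"
    BINARY_XML = "binary"


def parse_text_xml(data: bytes) -> dict:
    """Parse text XML; every element value comes back as a string."""
    root = ET.fromstring(data)
    return {child.tag: child.text for child in root}


def parse_toy_binary(data: bytes) -> dict:
    """Parse the toy binary encoding (an assumption, not a real format):
    repeated records of [1-byte name length][ASCII name][4-byte big-endian int]."""
    values = {}
    pos = 0
    while pos < len(data):
        name_len = data[pos]
        pos += 1
        name = data[pos:pos + name_len].decode("ascii")
        pos += name_len
        (value,) = struct.unpack_from(">i", data, pos)
        pos += 4
        values[name] = value  # already a native int
    return values


# One analysis method per format, selected by the acquired format.
PARSERS = {
    Format.TEXT_XML: parse_text_xml,
    Format.BINARY_XML: parse_toy_binary,
}


def analyze(data: bytes, fmt: Format) -> dict:
    """Analyze the document with the parser that matches its format."""
    return PARSERS[fmt](data)
```

For example, `analyze(b"<p><age>30</age></p>", Format.TEXT_XML)` yields the string value for `age`, whereas the same value encoded in the toy binary form is returned as an integer.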

The structured document processing apparatus according to claim 1, wherein the analysis means comprises a Fast Infoset parser and a text XML parser,
analyzes the structured document with the Fast Infoset parser when the format acquired by the acquisition means is the Fast Infoset format, and
analyzes the structured document with the text XML parser when the format acquired by the acquisition means is text XML.

A structured document processing method performed by a structured document processing apparatus for processing a structured document, comprising:
an acquisition step of acquiring the format of the structured document;
an analysis step of analyzing the structured document by an analysis method according to the format acquired in the acquisition step;
a step of receiving a request to acquire an element described in the structured document in a specified type;
a determination step of determining whether the type of the element as analyzed in the analysis step matches the specified type; and
an output step of outputting the element to the request source when it is determined in the determination step that the types match, and of converting the type of the element to the specified type and then outputting the element to the request source when it is determined in the determination step that the types do not match.
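The receiving, determination, and output steps can be sketched in Python as follows. This is only an illustration under stated assumptions: `ParsedElement`, `CONVERTERS`, and `get_element` are hypothetical names that do not appear in the specification, and the conversion table covers only a few simple type pairs.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass
class ParsedElement:
    """An element as produced by the analysis step (hypothetical structure)."""
    name: str
    value: Any  # carries whatever type the parser produced


# Hypothetical conversion table: (analyzed type, specified type) -> converter.
CONVERTERS: Dict[Tuple[type, type], Callable[[Any], Any]] = {
    (str, int): int,
    (str, float): float,
    (int, str): str,
    (float, str): str,
    (int, float): float,
}


def get_element(element: ParsedElement, specified: type) -> Any:
    """Return the element value in the type specified by the requester."""
    analyzed = type(element.value)
    # Determination step: does the analyzed type match the specified type?
    if analyzed is specified:
        return element.value  # output unchanged
    # Types differ: convert to the specified type before output.
    try:
        convert = CONVERTERS[(analyzed, specified)]
    except KeyError:
        raise TypeError(
            f"no conversion from {analyzed.__name__} to {specified.__name__}"
        ) from None
    return convert(element.value)
```

With this sketch, a requester asking for an integer receives the same result whether the parser produced the string "30" (as from text XML) or the integer 30 (as from a typed binary encoding), so one application can handle both formats.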

A program for causing a computer to execute the structured document processing method according to claim 4.

A computer-readable storage medium storing the program according to claim 5.