JP4750581B2

JP4750581B2 - Storage area management method, storage area management apparatus, storage area management system, and storage area management method

Info

Publication number: JP4750581B2
Application number: JP2006049970A
Authority: JP
Inventors: 淳平羽藤; 幹郎佐々木
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2006-02-27
Filing date: 2006-02-27
Publication date: 2011-08-17
Anticipated expiration: 2026-02-27
Also published as: JP2007226724A

Description

The present invention relates to determining in advance the amount of storage area needed to store information in a converted data format, when structured data representing a data-specific structure and the relationships between data is converted into information in a data format that an information processing device can process. In particular, it relates to a storage area management scheme, a storage area management apparatus, a storage area management system, and a storage area management method that determine the storage area amount from storage area estimation information, which estimates the storage area amount matching analysis result information obtained by analyzing the structured data.

Digital circuits and software that process structured data such as XML (eXtensible Markup Language) often use memory to hold the data temporarily. In doing so, they generally do not load the structured data into memory as-is. Instead they analyze it, convert it into an internal data representation that is easier to process, and load that representation into memory.
For example, XML structured data is usually converted into an internal data representation based on the DOM (Document Object Model) specified by the W3C (World Wide Web Consortium) and then loaded into memory. Each node in the data is represented by an instance of an object that depends on the node type: a tag becomes an Element object, an attribute an Attr object, and text a Text object. When structured data is converted into an internal data representation and loaded into memory in this way, how efficiently the required memory area can be secured is an important issue. It is especially important now that demand is growing for applying structured data to embedded devices with tight memory constraints.

In one conventional memory allocation method (conventional method 1), the minimum necessary memory amount (storage area amount) is allocated as needed during analysis of the structured data. In theory, this method has the advantage that no excess memory remains allocated when the analysis ends.
Another conventional method (conventional method 2) estimates, before analysis, the memory amount (storage area amount) that will be required, based on the data size of the structured data. Because the estimated amount is allocated all at once, no further memory allocation is needed until that area is used up, so this method has the advantage of being faster than conventional method 1.
Further, for example, Japanese Patent Laid-Open No. 5-101111 (Patent Document 1, conventional method 3) describes a technique that obtains the file size of the data, determines the storage area size to be used according to that file size, and secures the storage area.
JP-A-5-101111

However, conventional method 1 must perform memory allocation every time memory is needed, which slows processing. In addition, because of address boundary effects, it consumes more memory than the theoretical memory amount (storage area amount).
In conventional methods 2 and 3, structured data of the same data size can vary enormously in the complexity of its internal structure, so the required memory amount (storage area amount) varies widely. When the internal structure is very simple, far more memory than is actually required is allocated. Conversely, when the internal structure is very complex, less memory than is actually required is allocated, and memory allocation has to be repeated.
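As a rough illustration of the address boundary effect noted for conventional method 1, the following Python sketch assumes that every individual allocation is rounded up to an 8-byte boundary. The boundary, the request size, and the function name `aligned` are assumptions made for illustration and do not come from this publication.

```python
def aligned(size, boundary=8):
    """Round a requested size up to the next multiple of the boundary."""
    return (size + boundary - 1) // boundary * boundary


# Hypothetical workload: 100 separate requests of 20 bytes each.
requests = [20] * 100

theoretical = sum(requests)                   # bytes actually asked for
consumed = sum(aligned(r) for r in requests)  # bytes used after rounding each request
overhead = consumed - theoretical
```

Under these assumed numbers, each 20-byte request occupies 24 bytes, so many small allocations consume noticeably more than the theoretical total.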

In view of the above problems of the prior art, an object of the present invention is to make the error between the estimated memory amount (storage area amount) and the memory amount (storage area amount) actually required smaller than in the prior art.

A storage area management scheme according to the present invention determines in advance the storage area amount of a data storage area that stores information in a converted data format, when structured data representing a data-specific structure and the relationships between data is converted into information in a data format that an information processing device can process. The scheme comprises:
a central processing unit (CPU) that executes processing;
a storage unit that stores the results of processing performed by the CPU;
an analysis processing unit that receives structured data, analyzes it with the CPU to generate analysis result information serving as a parameter for estimating the storage area amount, and stores the generated analysis result information in the storage unit with the CPU;
a storage area estimation information storage unit that stores in advance a plurality of parameters for estimating the storage area amount and, in association with each of the parameters, storage area estimation information for estimating the storage area amount;
a storage area determination unit that reads from the storage unit the analysis result information generated by the analysis processing unit, acquires from the storage area estimation information storage unit the storage area estimation information corresponding to the parameter that matches the analysis result information, determines the storage area amount based on the acquired storage area estimation information, and stores the determined storage area amount in the storage unit with the CPU; and
a storage area management unit that reads from the storage unit the storage area amount determined by the storage area determination unit and, with the CPU, secures a storage area of that amount as the data storage area for storing the information in the converted data format.

In the storage area management scheme of the present invention, the analysis processing unit receives structured data and analyzes it with the CPU to generate analysis result information serving as a parameter for estimating the storage area amount. The storage area estimation information storage unit stores in advance a plurality of such parameters and, in association with each of them, storage area estimation information for estimating the storage area amount. The storage area determination unit receives the analysis result information generated by the analysis processing unit, acquires from the storage area estimation information storage unit the storage area estimation information corresponding to the matching parameter, determines the storage area amount based on it, and stores the determined amount in the storage unit with the CPU.
Consequently, if, for example, the number of occurrences of an object is used as the analysis result information, and the storage area estimation information is obtained from that number of occurrences, or by predicting the number of occurrences of other objects from the number of occurrences of one object, the storage area estimation information can be obtained quickly even when the internal structure of the structured data is complex. The process of obtaining the storage area estimation information is also simple, so it runs fast. Further, if the storage area amount is estimated from occurrence counts, for example by multiplying the size of a single object by the number of occurrences of that object, or by multiplying the total number of occurrences of several objects by their average size, the difference between the estimated storage area amount and the amount actually required can be kept small.
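The two estimation examples mentioned here (size of one object times its occurrence count, and total occurrence count times an average size) can be sketched in Python as follows. The object sizes, the occurrence counts, and the function names are hypothetical values chosen for illustration. The publication does not give concrete numbers.

```python
def estimate_per_type(counts, unit_sizes):
    """Sum, over object types, of (occurrences of the type) x (size of one object)."""
    return sum(count * unit_sizes[kind] for kind, count in counts.items())


def estimate_from_average(counts, average_size):
    """(Total occurrences of all object types) x (average object size)."""
    return sum(counts.values()) * average_size


# Hypothetical sizes in bytes and hypothetical occurrence counts.
UNIT_SIZES = {"Element": 48, "Text": 32}
counts = {"Element": 5, "Text": 2}

per_type = estimate_per_type(counts, UNIT_SIZES)      # 5*48 + 2*32
averaged = estimate_from_average(counts, 40)          # (5+2)*40
```

With these assumed figures the two estimates differ slightly (304 versus 280 bytes). The trade-off is that the averaged form needs only one stored size.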

Embodiment 1.
FIG. 1 is a functional block diagram of a structured data memory management apparatus that executes the storage area management scheme of this embodiment.
FIG. 2 is a diagram showing a system configuration including the structured data memory management apparatus that implements the storage area management scheme of FIG. 1.
FIG. 3 is a diagram showing an example of the hardware resources of the system including the structured data memory management apparatus of FIG. 2.

Before the elements of the structured data memory management apparatus of FIGS. 1 to 3 are described, the structured data and the internal data format used in this embodiment and the other embodiments are explained.
Structured data is data that embeds in itself a data-specific structure and relationships, so that an information processing device can process it and so that meaning can be exchanged, whether that meaning is an extension of existing knowledge or is commonly understood within a particular group or organization. XML and HTML (HyperText Markup Language) are typical examples of structured data.
FIG. 4(a) shows HTML content as an example of structured data. In FIG. 4(a), “<html>”, “<title>”, “<body>”, “<b>”, and “<br>” are Element objects, which express the meaning of tags, and “sample1” and “sample” are Text objects, which express the meaning of text.

The internal data format is an internal representation into which the structured data is converted so that the information processing device can handle it more easily. For XML or HTML, for example, it covers anything from an internal representation conforming to the standard DOM (Document Object Model) to a proprietary internal representation. In general, an internal data format has several internal representations, each suited to a kind of data, so that it can retain the data-specific structure, relationships, and meaning held by the structured data. In the DOM, for example, a Document object represents the XML document, an Element object expresses the meaning of a tag, and a Text object expresses the meaning of text. The structural relationships among the objects are expressed as a tree whose nodes are the objects. FIG. 4(b) is an example block diagram of the HTML content of FIG. 4(a) expressed in a DOM-compliant internal data format. Rectangular blocks represent Document objects, hexagonal blocks represent Element objects, and elliptical blocks represent Text objects. Vertical arrows between blocks represent parent-child relationships in the tree, and horizontal arrows represent sibling relationships. A sibling relationship links objects on the same level of the content, and a parent-child relationship links objects on adjacent levels of the hierarchy.
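A tree with parent-child and sibling links of this kind can be sketched in Python as below. This is a minimal illustration, not the DOM API and not the layout of FIG. 4(b): the class `Node`, its field names, and the nesting of the example tags are assumptions.

```python
class Node:
    """One object in the internal tree: a Document, Element, or Text node."""

    def __init__(self, kind, value=""):
        self.kind = kind            # "Document", "Element", or "Text"
        self.value = value          # tag name or text content
        self.parent = None          # parent-child link (upward)
        self.first_child = None     # parent-child link (downward)
        self.next_sibling = None    # sibling link on the same level

    def append(self, child):
        """Attach child as the last child of this node and return it."""
        child.parent = self
        if self.first_child is None:
            self.first_child = child
        else:
            last = self.first_child
            while last.next_sibling is not None:
                last = last.next_sibling
            last.next_sibling = child
        return child


def walk(node, depth=0):
    """Yield (depth, kind, value) for every node in document order."""
    yield depth, node.kind, node.value
    child = node.first_child
    while child is not None:
        yield from walk(child, depth + 1)
        child = child.next_sibling


# Assumed nesting, for illustration only.
doc = Node("Document")
html = doc.append(Node("Element", "html"))
title = html.append(Node("Element", "title"))
title.append(Node("Text", "sample1"))
body = html.append(Node("Element", "body"))
body.append(Node("Text", "sample"))

nodes = list(walk(doc))
```

Every node in such a tree is a separate object that needs storage, which is why the number of objects, rather than the byte size of the markup, drives the memory requirement.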

Information processing devices include stationary devices such as PCs (personal computers) and workstations, as well as information devices known as embedded devices, such as mobile phones, car navigation systems, digital televisions, and information appliances.

Next, the structured data memory management apparatus that executes the storage area management scheme of FIG. 1 is described.
In FIG. 1, the structured data memory management apparatus 10 is an example of an apparatus that executes the storage area management scheme, an example of a storage area management apparatus, and an example of an apparatus that executes the storage area management method. The apparatus includes a pre-analysis processing unit 13, an example of the analysis processing unit, which receives structured data, analyzes it with the CPU to generate analysis result information serving as a parameter for estimating the storage area amount, and stores the generated analysis result information in the storage unit with the CPU. It includes a storage area estimation information storage unit 16, an example of the storage area estimation information storage unit, which stores in advance a plurality of parameters for estimating the storage area amount and, in association with each parameter, storage area estimation information for estimating the storage area amount. It includes a storage area determination unit 15, which reads from the storage unit the analysis result information generated by the pre-analysis processing unit 13, acquires from the storage area estimation information storage unit 16 the storage area estimation information corresponding to the parameter that matches the analysis result information, determines the storage area amount based on the acquired information, and stores the determined amount in the storage unit with the CPU. It includes a storage area management unit 17, which reads from the storage unit the storage area amount determined by the storage area determination unit 15 and, with the CPU, secures a storage area of that amount as the data storage area for storing the information in the converted data format.
The apparatus further includes a structured data analysis unit 14 that performs memory management by a conventional memory management scheme; a structured data storage unit 11, an example of a structured data storage unit, that stores structured data; an input unit 18 that receives analysis range information 100 specifying the range of the structured data to be analyzed; and an analysis processing control unit 12 that controls the operation of each of these elements.

The operation of each element of the structured data memory management apparatus 10 is described below, focusing on the operation of the analysis processing control unit 12 and referring to the sequence diagram of FIG. 5, which is centered on the analysis processing control unit 12.
The structured data storage unit 11 (an example of a structured data storage unit) is a storage area that stores, permanently or temporarily, at least one piece of structured data itself or information referring to it (a URL, a pointer, or other reference information). Concretely, it may be internal memory such as RAM or ROM, a hard disk, an FDD (flexible disk drive), a memory card, a CD-ROM, a DVD, or the like. The structured data stored in the structured data storage unit 11 may be managed by an ordinary file system; by assigning a key to each piece of structured data, as in a database, and managing it through that key; or by a list, queue, or ring buffer in which the structured data awaiting analysis is arranged in request order. Related information accompanying each piece of structured data may also be stored and managed with it. Such related information may include the source of the structured data (a URL, for example), its creation and update dates and times, and its creating and updating authors.

The analysis processing control unit 12 is the central control unit that controls the analysis of structured data. On receiving a structured data analysis request from an upper module, it acquires the structured data corresponding to the request from the structured data storage unit 11. If the data cannot be acquired properly, it returns a content acquisition error response to the upper module.
If the data is acquired successfully, the analysis processing control unit 12 issues a pre-analysis request to the pre-analysis processing unit 13 so that the content is pre-analyzed. If a range of the structured data to be pre-analyzed has been specified, it receives the analysis range information 100 indicating that range from the input unit 18 and includes it in the pre-analysis request issued to the pre-analysis processing unit 13. If the pre-analysis does not end normally, the analysis processing control unit 12 either returns a pre-analysis error response to the upper module or issues a conventional analysis request to the structured data analysis unit 14 so that analysis continues under the conventional memory management scheme. “Continuing the analysis” here means that, so that processing is not interrupted if the pre-analysis fails, the structured data analysis unit 14 is instructed to analyze the structured data using its conventional memory management, without the storage area amount being estimated by the storage area management scheme. Because no analysis result information is generated by the pre-analysis processing unit 13 in this case, the structured data analysis unit 14 secures storage in the data storage area on its own. In most cases it either secures a fixed storage area amount before analysis regardless of the content, or secures the required storage area amount each time storage is needed.

When the pre-analysis by the pre-analysis processing unit 13 ends normally, the analysis processing control unit 12 issues an area determination request to the storage area determination unit 15 in order to determine, from the analysis result information, the storage area size (storage area amount) needed to convert the pre-analyzed structured data into the internal data format. If area determination by the storage area determination unit 15 does not end normally, the analysis processing control unit 12 either returns an area determination error response to the upper module or issues a conventional analysis request to the structured data analysis unit 14 so that analysis continues under the conventional memory management scheme. “Continuing the analysis” has the same meaning here as in the case where the pre-analysis processing unit 13 fails to generate analysis result information.
When area determination by the storage area determination unit 15 ends normally, the analysis processing control unit 12 issues an area reservation request to the storage area management unit 17 so that a storage area of the determined size (storage area amount) is secured in the data storage area. If the storage area management unit 17 fails to secure the area, the analysis processing control unit 12 either returns an area reservation error response to the upper module or issues a conventional analysis request to the structured data analysis unit 14 so that analysis continues under the conventional memory management scheme. “Continuing the analysis” again has the same meaning as in the case where the pre-analysis processing unit 13 fails to generate analysis result information.
When the storage area management unit 17 succeeds in securing the area, the analysis processing control unit 12 issues an analysis request to the structured data analysis unit 14 so that the structured data is analyzed using the secured memory area, and returns the analysis result to the upper module.
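The control sequence described above (pre-analysis, area determination, area reservation, then analysis, with a choice at each failure between an error response and conventional analysis) can be sketched in Python as follows. The function `run_analysis`, the convention that a stage returns `None` on failure, the `fall_back` flag, and the stand-in stage functions are all assumptions made for illustration. They are not taken from the publication.

```python
def run_analysis(data, pre_analyze, decide_size, reserve, analyze,
                 analyze_conventionally, fall_back=True):
    """Run the stages in order. A stage signals failure by returning None."""
    if data is None:
        return ("content_acquisition_error", None)

    def failed(error_name):
        # Either continue with conventional memory management or report the error.
        if fall_back:
            return ("conventional", analyze_conventionally(data))
        return (error_name, None)

    stats = pre_analyze(data)
    if stats is None:
        return failed("pre_analysis_error")
    size = decide_size(stats)
    if size is None:
        return failed("area_determination_error")
    region = reserve(size)
    if region is None:
        return failed("area_reservation_error")
    return ("ok", analyze(data, region))


# Stand-in stages with made-up behavior, for illustration only.
stages = dict(
    pre_analyze=lambda d: {"Element": d.count("<") // 2},
    decide_size=lambda stats: stats["Element"] * 48,
    reserve=lambda size: bytearray(size),
    analyze=lambda d, region: ("parsed", len(region)),
    analyze_conventionally=lambda d: ("parsed", None),
)
result = run_analysis("<a>x</a>", **stages)
```

Passing `fall_back=False` models the other branch in the description, where the control unit returns the corresponding error response to the upper module instead of continuing.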

The pre-analysis processing unit 13 is a module that receives a pre-analysis request from the analysis processing control unit 12 and pre-analyzes the structured data corresponding to the request. The structured data to be analyzed is either obtained by accessing the structured data storage unit 11 or received from an external device over a network. The analysis result information produced by the pre-analysis is used as a parameter for determining the storage area size (storage area amount) needed to create the internal data objects that make up the internal data format when the structured data is analyzed.
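One possible form of pre-analysis is a single cheap pass that counts object occurrences without building any tree. The Python sketch below assumes that the analysis result information is a pair of occurrence counts and uses a deliberately rough regular-expression scan rather than a real HTML or XML parser. The function name `pre_analyze` and the sample markup (assembled from the tags named for FIG. 4(a)) are hypothetical.

```python
import re

TAG = re.compile(r"<[^>]+>")


def pre_analyze(text):
    """Count start tags and non-empty text runs in one pass over the markup."""
    counts = {"Element": 0, "Text": 0}
    for tag in TAG.findall(text):
        # End tags, comments, and declarations do not create Element objects.
        if not tag.startswith(("</", "<!", "<?")):
            counts["Element"] += 1
    for run in TAG.split(text):
        if run.strip():
            counts["Text"] += 1
    return counts


# Hypothetical markup; the actual content of FIG. 4(a) is not reproduced here.
sample = "<html><title>sample1</title><body><b>sample</b><br></body></html>"
analysis_result = pre_analyze(sample)
```

A scan like this ignores attributes, CDATA sections, and malformed input, so it is only a starting point for the kind of parameter extraction the description has in mind.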

The storage area determination unit 15 is a module that receives an area determination request from the analysis processing control unit 12 and, based on the analysis result information included in the request, determines the storage area size (storage area amount) needed to create the internal data objects that make up the internal data when the structured data is expressed in the internal data format. The storage area estimation information storage unit 16, which the storage area determination unit 15 accesses when determining the storage area size, stores the storage area estimation information needed for that determination in association with each of a plurality of parameters for estimating the storage area amount. As needed, the storage area determination unit 15 acquires from the storage area estimation information storage unit 16 the storage area estimation information corresponding to the parameter that matches the analysis result information, and carries out the storage area determination process that determines the storage area amount.
The storage area amount may be determined as a single collective size without distinguishing objects, or per object, by determining how much area each kind of object requires. For example, when the structured data is an XML document as described above, Element objects and Text objects exist, so either one storage area amount is determined without distinguishing them, or a storage area amount is determined for each kind of object. To obtain a storage area amount for each kind of object, the pre-analysis processing unit 13 must generate analysis result information for each object. In that case, information specifying the objects for which analysis result information is to be generated, and an instruction to generate analysis result information for each specified object, must be input from the input unit 18.
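The lookup of estimation information by parameter, and the choice between one collective size and per-object sizes, can be sketched in Python as below. The table contents, the use of a bytes-per-object figure as the estimation information, and the function name `decide_storage` are assumptions for illustration only.

```python
# Hypothetical estimation table: parameter name -> assumed bytes per object.
ESTIMATION_TABLE = {"Element": 48, "Text": 32}


def decide_storage(analysis_result, table, per_object=False):
    """Return a total size, a per-object size map, or None if no entry matches."""
    sizes = {}
    for parameter, count in analysis_result.items():
        if parameter not in table:
            return None  # no matching estimation information: determination fails
        sizes[parameter] = count * table[parameter]
    return sizes if per_object else sum(sizes.values())


analysis_result = {"Element": 5, "Text": 2}
collective = decide_storage(analysis_result, ESTIMATION_TABLE)
separate = decide_storage(analysis_result, ESTIMATION_TABLE, per_object=True)
```

Returning `None` for an unmatched parameter is one way to model the case in which area determination does not end normally.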

The storage area management unit 17 receives an area securing request from the analysis processing control unit 12 and secures, in the data storage area, a storage area of the requested storage area size (storage area amount). In addition, while the structured data analysis unit 14 is performing the analysis process that converts the structured data into the internal data format, the storage area management unit 17 responds to each request for the storage needed to hold the converted internal data by returning information for referring to an appropriate storage area. If the storage area secured in the data storage area is exhausted at that point, it returns an area shortage response to the structured data analysis unit 14.
When the storage area determination unit 15 has determined the storage area size (storage area amount) as a single collective size without distinguishing between objects, the storage area management unit 17 secures one storage area of that size. When the storage area size (storage area amount) has been determined per object, the storage area management unit 17 independently secures, for each object, a storage area of the amount determined for that object.
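The two securing modes and the area shortage response described above can be sketched as follows. This is a minimal illustration, not the device's implementation: the class name `StorageAreaManager`, the methods `reserve` and `allocate`, the use of byte offsets as the "information for referring to a storage area", and `None` as the shortage response are all assumptions introduced here.

```python
class StorageAreaManager:
    """Hypothetical stand-in for the storage area management unit 17."""

    def __init__(self):
        # pool name -> [capacity, bytes already handed out]
        self._pools = {}

    def reserve(self, sizes):
        """Secure one area per key.

        Pass a single key for a collective size, or one key per object
        type when the sizes were determined per object.
        """
        for name, size in sizes.items():
            self._pools[name] = [size, 0]

    def allocate(self, name, size):
        """Return an offset into the secured area, or None on shortage."""
        capacity, used = self._pools[name]
        if used + size > capacity:
            return None  # stands in for the area shortage response
        self._pools[name][1] = used + size
        return used


# Per-object securing: Element and Text areas are independent.
manager = StorageAreaManager()
manager.reserve({"Element": 64, "Text": 16})
first = manager.allocate("Element", 48)
second = manager.allocate("Element", 16)
shortage = manager.allocate("Element", 1)
```

With a collective size, the same sketch would be used with a single pool, for example `manager.reserve({"all": 80})`.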

When the storage area management unit 17 has successfully secured the storage area, the structured data analysis unit 14 analyzes the structured data corresponding to the analysis request from the analysis processing control unit 12. Whenever storage is needed to analyze the structured data and express it as internal data, the structured data analysis unit 14 requests a storage area of the required size (storage area amount) from the storage area management unit 17, which guarantees its use by the structured data analysis unit 14 by prohibiting any other use of that storage area. The storage area management unit 17 then returns to the structured data analysis unit 14 a response notifying it that a storage area of a size matching the request has been secured, and also notifies it of the information needed to use the secured storage area. If the memory area is insufficient at that time, the storage area management unit 17 sends an error notification to the structured data analysis unit 14.

The input unit 18 receives analysis range information 100, which specifies the range of the structured data to be analyzed, and notifies the analysis processing control unit 12 of it. The analysis range information 100 is, for example, information designating all of the structured data as the analysis target, or information designating a part of the structured data as the analysis target. Specific information contained in the structured data can also be made the analysis target by including that information in the analysis range information 100. The specific contents of the analysis range information 100 are described later.
The above is an outline of the elements of the structured data memory management device 10 in FIG. 1.

Next, the appearance of the structured data memory management device 10 is described. The structured data memory management device 10 is assumed to have a device configuration such as that shown in FIG. 2. In FIG. 2, the structured data memory management device 10 includes hardware resources such as a client device 909, a server device 910, a display device 901 having a CRT (Cathode Ray Tube) or LCD (liquid crystal) display screen, a keyboard 902 (K/B), a mouse 903, an FDD 904 (Flexible Disk Drive), and a compact disk device 905 (CDD), which are connected by cables and signal lines.
The server device 910 and the client device 909 are computers and are connected to each other by a cable. The client device 909 can access information stored in the database 908 via the server device 910. The database 908 corresponds to, for example, the structured data storage unit 11, and the client device 909 acquires the structured data stored in the database 908 via the server device 910. The server device 910 is connected to the database 908 by a cable, and is connected to the Internet 940 via a local area network 942 (LAN) and a gateway 941. Another computer system is connected on the right side of the Internet 940. The client device 909 can access information stored in a storage device of that other computer system via the server device 910, the LAN 942, the gateway 941, and the Internet 940.

In FIG. 3, the structured data memory management device 10 includes a CPU 911 (Central Processing Unit, also referred to as a central processing device, a processing device, or an arithmetic device) that executes programs. The CPU 911 is connected via a bus 912 to the ROM 913, the RAM 914, the communication board 915, the display device 901, the keyboard 902, the mouse 903, the FDD 904, the CDD 905, and the magnetic disk device 920, and controls these hardware devices. The CPU 911 controls the operations of the analysis processing control unit 12, the pre-analysis processing unit 13, the storage area determination unit 15, the storage area management unit 17, the structured data analysis unit 14, and the input unit 18. A storage device such as an optical disk device or a memory card read/write device may be used instead of the magnetic disk device 920.
The RAM 914 is an example of a volatile memory. The storage media of the ROM 913, the FDD 904, the CDD 905, and the magnetic disk device 920 are examples of a nonvolatile memory. They are also examples of a storage device or a storage unit, and provide the data storage area that stores information converted into a data format that an information processing device can process (the internal data format).
The communication board 915, the keyboard 902, the FDD 904, and the like are examples of an input unit and an input device.
The communication board 915 and the display device 901 are examples of an output unit and an output device.

The communication board 915 is connected to the LAN 942. The communication board 915 is not limited to the LAN 942 and may instead be connected to the Internet 940 or to a WAN (wide area network) such as ISDN. When it is connected to the Internet 940 or to a WAN such as ISDN, the gateway 941 is unnecessary.
The magnetic disk device 920 stores an operating system 921 (OS), a window system 922, a program group 923, and a file group 924. The programs in the program group 923 are executed by the CPU 911, the operating system 921, and the window system 922.

The program group 923 stores programs that execute the functions described in the embodiment as the "~ unit" elements of the structured data memory management device 10 and as the "~ step" operations performed by the structured data memory management device 10. The programs are read and executed by the CPU 911.
The file group 924 stores, as items of a "~ file" or a "~ database", the data, signal values, variable values, and parameters that are described in the embodiment as "analysis result information", "storage area amount", "analysis range information", "analysis result of ~", "result determined by ~", "determination result of ~", "calculation result of ~", and "processing result of ~".
The arrows in the flowcharts described in the embodiment mainly indicate the input and output of data and signals. The data and signal values are recorded on recording media such as the memory of the RAM 914, the flexible disk of the FDD 904, the compact disk of the CDD 905, the magnetic disk of the magnetic disk device 920, and other optical disks, mini disks, and DVDs (Digital Versatile Disks). Data and signals are transmitted online via the bus 912, signal lines, cables, and other transmission media.

What is described as a "~ unit" in the embodiment may be realized by firmware stored in the ROM 913. Alternatively, it may be implemented by software alone, by hardware alone, by a combination of software and hardware, or by a combination that further includes firmware. Firmware and software are stored as programs on a recording medium such as a magnetic disk, a flexible disk, an optical disk, a compact disk, a mini disk, or a DVD. A program is read and executed by the CPU 911. That is, the program causes a computer to function as a "~ unit", or causes a computer to execute the procedures of the "~ unit" and "~ step" elements described below.

A specific example of the processing of the pre-analysis processing unit 13 is described below. In this embodiment, the pre-analysis processing unit 13 performs a pre-analysis of the structured data and generates analysis result information, and the storage area determination unit 15 obtains from that analysis result information the storage area amount of the data storage area needed to express the structured data in the internal data format.
As a specific example of the pre-analysis process, a case is described in which the analysis targets feature data that characterizes the data-specific structure of the structured data. For example, information that characterizes the structure of the structured data is extracted from it as feature data, and the number of occurrences is used as the pre-analysis result.
When the structured data is XML, examples of feature data include "<", the tag start symbol, "/>", the tag end symbol, and "=", which indicates the assignment of a value to an attribute.

In the description of FIG. 1, it was explained that the input unit 18 specifies, as the analysis range information 100, for example, information designating all of the structured data as the analysis target, information designating a part of the structured data as the analysis target, or specific information contained in the data to be analyzed, so that only that specific information is analyzed. The case described here is one in which the input unit 18 designates all of the structured data as the analysis range and only specific information as the analysis target: the pre-analysis processing unit 13 analyzes the feature data "<" (the tag start symbol), "/>" (the tag end symbol), and "=" (the assignment of a value to an attribute), and generates their number of occurrences as the analysis result information. That is, the analysis range information 100 is information specifying that the number of occurrences of "<", "/>", and "=" is to be analyzed over all of the structured data. FIG. 6 is a processing flow diagram of the pre-analysis processing unit 13 for the case where the structured data is XML and only the tag start symbol is used as feature data.

In the sequence of FIG. 5, the structured data acquisition step S5 is the start of the analysis process. First, the analysis processing control unit 12 receives information specifying the structured data to be analyzed. For example, when a plurality of structured data items are stored in the structured data storage unit 11 in advance and are distinguished by file names that identify them, a file name is input through the input unit 18. When the structured data to be analyzed is not stored in the structured data memory management device 10, the structured data itself is input through the input unit 18. The input unit 18 is, for example, the communication board 915, the keyboard 902, or the FDD 904. In the case of the communication board 915, the analysis processing control unit 12 receives the file name or the structured data itself that the communication board 915 received via the network. In the case of the keyboard 902, the analysis processing control unit 12 receives the file name entered on the keyboard 902. In the case of the FDD 904, the analysis processing control unit 12 uses the CPU to access the flexible disk in the FDD 904 and receives the file name stored on it, or the data to be analyzed itself. When a file name has been input, the analysis processing control unit 12 uses the CPU to retrieve the structured data identified by that file name from the structured data storage unit 11.

Once the analysis processing control unit 12 has acquired the structured data to be analyzed, it next receives the analysis range information 100. In this embodiment, as described above, the number of occurrences of "<", "/>", and "=" is analyzed over all of the structured data, so the analysis range information 100 contains information specifying that all of the structured data is the analysis target and information specifying that the number of occurrences of "<", "/>", and "=" is to be analyzed. As described above, the input unit 18 is, for example, the communication board 915, the keyboard 902, or the FDD 904. In the case of the communication board 915, the analysis processing control unit 12 receives the analysis range information 100 that the communication board 915 received via the network. In the case of the keyboard 902, the analysis processing control unit 12 receives the analysis range information 100 entered on the keyboard 902. In the case of the FDD 904, the analysis processing control unit 12 uses the CPU to access and retrieve the analysis range information 100 stored on the flexible disk in the FDD 904.
Here, the designation (or input) of the structured data to be analyzed and the designation (or input) of the analysis range information 100 together constitute the processing start request that causes the analysis processing control unit 12 to start operating.
On receiving this processing start request, the analysis processing control unit 12 uses the CPU to store in the storage unit, as a pre-analysis request, an instruction to start the pre-analysis process, the structured data to be analyzed, and the analysis range information 100, and then activates the pre-analysis processing unit 13. In other words, it outputs the instruction to start the pre-analysis process, the structured data to be analyzed, and the analysis range information 100 to the pre-analysis processing unit 13 as a pre-analysis request. In the following description, when the analysis processing control unit 12 is said to transmit, notify, or output information to another "~ unit", this means that it stores that information in the storage unit by the CPU and activates the other "~ unit". Likewise, when another "~ unit" is said to receive, input, or acquire information from the analysis processing control unit 12, this means that it reads the information that the analysis processing control unit 12 stored in the storage unit by the CPU. That is, information is exchanged between the analysis processing control unit 12 and the other "~ unit" elements via the storage unit.

In the analysis processing step S1, when the pre-analysis processing unit 13 receives the pre-analysis request from the analysis processing control unit 12, it uses the CPU to interpret the analysis range information 100 and determines that the range to be analyzed is all of the data. It further determines from the analysis range information 100 that the analysis to be performed is to obtain the number of occurrences of "<", "/>", and "=" in the structured data. The pre-analysis processing unit 13 then starts obtaining that number of occurrences by the CPU in accordance with the flow diagram of FIG. 6. In FIG. 6, the variable t_count, which stores the number of tag start symbols, is first initialized to 0. Because this happens immediately before the pre-analysis starts, it means that zero feature data items have been extracted so far. The first character of the structured data is also set as the current character.
Next, it is determined whether the current character is the terminal character. This determines whether the pre-analysis of all of the structured data has finished: a terminal character means that the entire pre-analysis is complete, and any other character means that data still remains to be pre-analyzed. If the current character is the terminal character, the pre-analysis ends, and the value of t_count at that point is the analysis result information, that is, the result of the pre-analysis of the structured data.
If the current character is not the terminal character, processing moves on to the next step, which determines whether the current character is one of "<", "/>", and "=". This step determines whether the current character matches the feature data. If it matches, 1 is added to the variable t_count to record that one feature data item has been found, and processing moves on. If it does not match, processing moves on without changing t_count.
The next step advances the current character by one character from the current position, moving from the character whose pre-analysis is complete to the adjacent character that has not yet been pre-analyzed. After this step, processing repeats from the determination of whether the current character is the terminal character.
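The scan loop just described can be sketched as follows. This is an illustrative sketch rather than the flow of FIG. 6 itself: the function name `count_feature_symbols` is hypothetical, reaching the end of the string stands in for the terminal character, and the two-character symbol "/>" is matched by looking at the text starting at the current position, which is an assumption made here because the description compares a single current character.

```python
FEATURE_SYMBOLS = ("<", "/>", "=")  # feature data named in the description


def count_feature_symbols(text, symbols=FEATURE_SYMBOLS):
    """Count positions in text where any of the feature symbols starts."""
    t_count = 0  # number of feature data items found so far
    pos = 0      # index of the current character
    while pos < len(text):  # end of string plays the role of the terminal character
        if any(text.startswith(symbol, pos) for symbol in symbols):
            t_count += 1
        pos += 1  # advance the current character by one
    return t_count


# One "<", one "=" and one "/>" are present in this input.
total = count_feature_symbols('<a x="1"/>')

# Restricting the feature data to the tag start symbol only, as in FIG. 6.
tag_starts = count_feature_symbols("<a><b/></a>", symbols=("<",))
```

Passing a different `symbols` tuple corresponds to specifying different feature data in the analysis range information 100.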

By having the pre-analysis processing unit 13 perform the above processing by the CPU, it is possible to determine how many tag start symbols, tag end symbols "/>", and attribute value assignments "=" the structured data contains. Using this occurrence count, it is possible to estimate how many objects are needed when XML content, which is structured data, is expressed in the DOM internal data format.
However, because this method pre-analyzes the total number of tag start symbols, tag end symbols, and attribute value assignments in the structured data, the analysis result information is the total count of different kinds of objects. The size of a single object differs by object type. The operation of the storage area determination unit 15 is described later, but suppose, for example, that it sums the single-object sizes of the different object types, takes the average of that sum, and uses the average size to obtain the storage area amount. If attribute value assignments account for 80% of the total count, and the single-object size for an attribute value assignment differs greatly from the single-object sizes of the other objects, the average size is smaller than the single-object size for an attribute value assignment. The storage area amount estimated with the average size is therefore expected to be smaller than the amount actually required.
Alternatively, the analysis range information 100 can specify that only the number of occurrences of the tag start symbol in the structured data is to be analyzed. With this method the number of tags can be estimated, but the number of other elements cannot. Consequently, as above, when the single-object size for an attribute value assignment differs greatly from the single-object sizes of the other objects and there are more attribute value assignment objects than tag start symbol objects, obtaining the storage area amount from the number of occurrences of the tag start symbol yields an amount smaller than the amount actually required.
An example of a pre-analysis processing flow that analyzes the number of tags, the number of attribute values, and the number of text characters separately is therefore described next. In this case the analysis range information 100, instead of specifying that the number of occurrences of "<", "/>", and "=" is to be analyzed, specifies that the number of tags, the number of attribute values, and the number of text characters are each to be analyzed separately.
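The underestimate that arises when a single average object size is applied to a combined count can be illustrated with a small calculation. All numbers below are hypothetical and chosen only to match the scenario in the text, in which attribute value assignments make up 80% of the count and their objects are much larger than the others. The names `SIZES`, `estimate_with_average`, and `actual_requirement` are likewise introduced here for illustration.

```python
# Hypothetical single-object sizes in bytes (not taken from the source).
SIZES = {"tag_start": 16, "tag_end": 16, "attr": 64}


def estimate_with_average(total_count, sizes=SIZES):
    """Estimate storage from a combined count and the mean object size."""
    average_size = sum(sizes.values()) / len(sizes)
    return total_count * average_size


def actual_requirement(counts, sizes=SIZES):
    """Storage needed when each object type is sized individually."""
    return sum(counts[kind] * sizes[kind] for kind in counts)


# 100 feature symbols in total, 80% of them attribute value assignments.
counts = {"tag_start": 10, "tag_end": 10, "attr": 80}
estimated = estimate_with_average(sum(counts.values()))
actual = actual_requirement(counts)
shortfall = actual - estimated  # positive: the averaged estimate is too small
```

Under these assumed sizes the averaged estimate falls short of the per-type total, which is the motivation for counting each kind of feature data separately.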

FIG. 7 is a flow diagram of feature extraction processing for the case where the structured data is XML and the tag start and end symbols, the attribute setting symbol, and the number of text characters are each used as feature data. The processing in which the pre-analysis processing unit 13 determines the contents of the analysis range information 100 after receiving the pre-analysis request from the analysis processing control unit 12 is the same as before and is omitted. The following describes how the pre-analysis processing unit 13 analyzes, over all of the structured data, the number of occurrences of the tag start and end symbols and the attribute setting symbol and the number of text characters, and generates the analysis result information.
(1) The variables t_count, a_count, and s_count are initialized to 0. These three variables hold the estimated number of tags, attributes, and text characters, respectively, in the structured data being pre-analyzed. At the same time, the first character of the structured data is set as the current character.
(2) Next, it is determined whether the current character is the terminal character. If it is, all of the structured data has been pre-analyzed, so the pre-analysis ends, and the values of t_count, a_count, and s_count at that point represent the number of tags, the number of attributes, and the number of text characters, respectively. If it is not the terminal character, processing moves on to the next step.
(3) This step determines whether the current character is "<". If it is, the current character is a tag start symbol, so 1 is added to t_count to reflect the result, the current character is advanced by one character, and processing moves on to (4). If it is not "<", the character is text, so 1 is added to s_count, the current character is advanced by one character, and processing returns to (2).
(4) This step pre-analyzes the inside of a tag, so it determines whether the current character is "=", which indicates the assignment of a value to an attribute. If it is "=", an attribute is present, so 1 is added to a_count, the current character is advanced by one character, and step (4) is repeated. If it is not "=", processing moves on to the next step.
(5) This step determines whether the tag has ended. It checks whether the current character is "/" and the character after it is ">", that is, whether the tag end symbol is present. If this condition is not satisfied, the current position is not a tag end symbol, so the current character is advanced by one character and processing returns to (4). If the condition is satisfied, the tag has ended, so the current character is advanced by two characters and processing returns to (2).
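Steps (1) to (5) above can be sketched as a small state loop. This is a sketch under stated assumptions rather than the flow of FIG. 7 itself: the function name `pre_analyze` and the returned dictionary are hypothetical, the end of the string stands in for the terminal character, and an end-of-input check is added inside the tag loop, which the steps as written do not mention, so that input lacking a "/>" cannot run past the end. As in the description, only "/>" is treated as ending a tag.

```python
def pre_analyze(text):
    """Count tags, attribute assignments and text characters separately."""
    t_count = a_count = s_count = 0  # step (1): initialize the three counters
    pos = 0                          # step (1): current character is the first one
    end = len(text)
    while pos < end:                 # step (2): stop at the end of the data
        if text[pos] != "<":         # step (3): not a tag start, so it is text
            s_count += 1
            pos += 1
            continue
        t_count += 1                 # step (3): tag start symbol found
        pos += 1
        while pos < end:             # added guard; see the assumptions above
            if text[pos] == "=":     # step (4): attribute value assignment
                a_count += 1
                pos += 1
            elif text.startswith("/>", pos):  # step (5): tag end symbol
                pos += 2
                break
            else:                    # step (5): still inside the tag
                pos += 1
    return {"tags": t_count, "attributes": a_count, "text_chars": s_count}


result = pre_analyze('<a x="1" y="2"/>hi')
```

For this input the sketch reports one tag, two attribute assignments, and two text characters.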

以上の処理を行う事によって、構造化データに存在するタグ数、属性数、テキスト文字数をそれぞれ個別に推定する事が可能である。これらの情報を用いる事によって、構造化データであるＸＭＬコンテンツをＤＯＭの内部データ形式で表現する際に、Ｅｌｅｍｅｎｔオブジェクト（タグに対応）、Ａｔｔｒオブジェクト（属性に対応）が幾つ必要となるか、そしてテキスト領域として何文字分確保する必要があるかを個別に推定する事が可能となる。
前述した例では、前解析処理部１３は解析した”＜”，”＜”，”／＞”，”＝”の出現回数や、タグの開始・終了記号、属性設定記号、テキスト文字数のそれぞれの出現回数を、記憶部に解析結果情報として記憶し、前解析処理部１３が解析処理の終了を解析処理制御部１２に通知する。そして、解析処理制御部１２は、解析処理の終了通知を受けて、記憶部から解析結果情報を取り出して、記憶領域決定部１５に対して解析結果情報とともに、領域決定要求を出力する。 By performing the above processing, the number of tags, the number of attributes, and the number of text characters in the structured data can each be estimated individually. Using this information, it is possible to estimate individually how many Element objects (corresponding to tags) and Attr objects (corresponding to attributes) are required, and how many characters must be reserved as a text area, when the XML content that is the structured data is expressed in the DOM internal data format.
In the above-described example, the pre-analysis processing unit 13 stores in the storage unit, as analysis result information, the number of occurrences of “<”, “<”, “/>”, and “=” that it analyzed, or the respective occurrence counts of tag start/end symbols, attribute setting symbols, and text characters, and then notifies the analysis processing control unit 12 that the analysis processing has ended. On receiving this end notification, the analysis processing control unit 12 retrieves the analysis result information from the storage unit and outputs an area determination request, together with the analysis result information, to the storage area determination unit 15.

以上の方法では、属性数の推定に属性への値代入記号である”＝”を利用しているため、属性へ値を代入していない場合には実際の属性数と異なる推定値が求まる事となる。しかし、本技術では各オブジェクトの個数を正確に求める事が目的ではなく、いかに簡単な方法で各オブジェクトの個数の推定値を求め、その推定値からより誤差の少ない記憶領域量の決定を行う事が目的であるため、これは本質的な問題ではない。もし、より正確な個数を前解析で求めたいのであれば、実際に構造化データの有する全ての要素に対する前解析を行う事によって求める事が可能である。事実、この前解析で全ての要素に対して正確な個数を求めてしまっても構わない。 In the above method, the value assignment symbol “=” is used to estimate the number of attributes, so when an attribute is not assigned a value, the estimate differs from the actual number of attributes. However, the purpose of this technique is not to determine the number of each object exactly, but to obtain an estimate of the number of each object by as simple a method as possible and to determine from that estimate a storage area amount with less error, so this is not an essential problem. If a more accurate count is wanted from the pre-analysis, it can be obtained by actually pre-analyzing all elements of the structured data. Indeed, the pre-analysis may determine exact counts for all elements.

前解析で、前述の様な構造化データから特徴データを抽出した解析結果に加えて、構造化データ全体のデータサイズやファイルサイズ、ファイルの作成日時情報、更新情報、作成者情報、更新者情報等の構造化データの基本情報を解析結果情報に含めても構わない。また、前述の様な構造化データから特徴データを抽出した解析結果に替えて、構造化データ全体のデータサイズやファイルサイズ、ファイルの作成日時情報、更新情報、作成者情報、更新者情報等の構造化データの基本情報を解析結果情報としても構わない。この場合、入力部１８により入力する解析範囲情報１００に、基本情報を解析結果情報とする、或いは、基本情報を解析結果情報に含めるという指示を示す情報を含める。 In addition to the analysis results obtained by extracting the feature data from the structured data as described above in the previous analysis, the data size and file size of the entire structured data, file creation date / time information, update information, creator information, and updater information Basic information of structured data such as the above may be included in the analysis result information. Also, instead of the analysis results obtained by extracting the feature data from the structured data as described above, the data size and file size of the entire structured data, file creation date / time information, update information, creator information, updater information, etc. Basic information of structured data may be used as analysis result information. In this case, the analysis range information 100 input by the input unit 18 includes information indicating an instruction to use basic information as analysis result information or include basic information in analysis result information.

以下では、前解析処理部１３が生成した解析結果情報と記憶領域推定情報格納部１６に記憶した記憶領域推定情報とを元に、記憶領域決定部１５が必要となる記憶領域のサイズ（記憶領域量）を決定するＳ２の記憶領域決定ステップの動作を、具体的な例を挙げて説明する。
記憶領域推定情報格納部１６は、前解析を行った構造化データを内部データ形式で表現した場合に、各内部データオブジェクトを生成するのに必要となるメモリ量（記憶領域量）を決定するための情報を予め格納する。ここで言うメモリ量（記憶領域量）は、実際に必要となる物理メモリの領域サイズでもよいし、必要となる各内部データオブジェクト数でもよい。
図８は、記憶領域推定情報格納部１６に格納されている情報の具体例の一つであり、（ａ）はパラメータが所定の範囲の値を示す例であり、（ｂ）はパラメータが１つの値を示す例である。パラメータは解析結果情報と対応し、複数のパラメータのそれぞれに対応して記憶領域量を推定するための記憶領域推定情報を記憶する。パラメータは、図８（ａ）のように、所定の範囲の値を設定してもいいし、図８（ｂ）のように１つの値を設定してもよい。前解析処理部１３が異なる構造化データを解析した場合、解析する内容が同じであっても、構造化データの有するデータの内容により異なる解析結果が求まる。このため、図８のように、パラメータに所定の範囲の値を設定することと、１つの値を設定することとを可能にしている。また、同じ構造化データを解析しても、解析する内容が異なれば、異なる解析結果が求まる。このため、パラメータは異なる解析内容に対応して複数種類の設定を行うことを可能にする。例えば、図８は前述した”＜”，”＜”，”／＞”，”＝”の出現回数を解析する場合に使用するパラメータと記憶領域推定情報との組であり、後で説明する図１０は、前述したタグの開始・終了記号、属性設定記号、テキスト文字数のそれぞれの出現回数について解析する場合に使用するパラメータと記憶領域推定情報との組である。記憶領域推定情報格納部１６は、図８（ａ），（ｂ）と図１０（ａ），（ｂ），（ｃ）とをそれぞれ異なるテーブルとして記憶する。構造化データメモリ管理装置１０を使用する利用者は、どのような構造化データを解析対象として、それぞれの構造化データをどのような解析内容で解析を行うのかを予め予想して、解析結果情報として求められると予想される値をパラメータとして記憶領域推定情報格納部１６に予め設定するとともに、そのパラメータの値に対応させて、記憶領域量を求める計算式や、関数、或いは、記憶領域量そのものを予め設定する。また、計算式や、関数、或いは、記憶領域量そのものは、過去に解析を行って実際に必要となって確保した記憶領域量の統計を取り、統計の結果から計算式や、関数を求めてもいいし、或いは、実際に確保した記憶領域量そのものを設定してもよい。 In the following, based on the analysis result information generated by the pre-analysis processing unit 13 and the storage area estimation information stored in the storage area estimation information storage unit 16, the size of the storage area (storage area required by the storage area determination unit 15) The operation of the storage area determination step of S2 for determining (quantity) will be described with a specific example.
The storage area estimation information storage unit 16 stores in advance information for determining the memory amount (storage area amount) required to generate each internal data object when the pre-analyzed structured data is expressed in the internal data format. The memory amount (storage area amount) referred to here may be the size of the physical memory area actually required, or the number of each internal data object required.
FIG. 8 is one specific example of the information stored in the storage area estimation information storage unit 16; (a) is an example in which a parameter indicates a range of values, and (b) is an example in which a parameter indicates a single value. A parameter corresponds to analysis result information, and storage area estimation information for estimating the storage area amount is stored for each of the multiple parameters. A parameter may be set to a predetermined range of values, as in FIG. 8(a), or to a single value, as in FIG. 8(b). When the pre-analysis processing unit 13 analyzes different structured data, different analysis results are obtained depending on the content of the data, even if the analysis performed is the same. For this reason, as shown in FIG. 8, a parameter can be set either to a predetermined range of values or to a single value. Likewise, even when the same structured data is analyzed, different analysis results are obtained if the analysis performed differs. For this reason, multiple kinds of parameter settings can be made, corresponding to different analysis contents. For example, FIG. 8 is a set of parameters and storage area estimation information used when analyzing the number of occurrences of “<”, “<”, “/>”, and “=” described above, and FIG. 10, described later, is a set of parameters and storage area estimation information used when analyzing the respective occurrence counts of the tag start/end symbols, attribute setting symbols, and text characters described above. The storage area estimation information storage unit 16 stores FIGS. 8(a) and 8(b) and FIGS. 10(a), 10(b), and 10(c) as separate tables.
A user of the structured data memory management device 10 predicts in advance what structured data will be analyzed and what analysis will be performed on each, sets in the storage area estimation information storage unit 16, as parameters, the values expected to be obtained as analysis result information, and sets in advance, for each parameter value, a calculation formula or function for obtaining the storage area amount, or the storage area amount itself. The calculation formula or function may be derived from statistics of the storage area amounts that were actually needed and secured in past analyses, or the storage area amount actually secured may itself be set.

図８（ａ）は、解析結果情報が第一列（パラメータ）のどの要素の範囲内に収まるかで、記憶領域量がいくつになるかを示した対応情報である。この例では、特徴データとして”＜”，”＜”，”／＞”，”＝”を指定し、それらの合計の出現回数を解析することを仮定している。記憶領域決定部１５は、前解析処理部１３が生成した解析結果情報が、記憶領域推定情報格納部１６の有するテーブルの第一列のどの範囲に適合するかを検索し、適合したパラメータに対応する記憶領域推定情報を取得する。記憶領域推定情報の「Ｓ」は複数のオブジェクトの平均記憶領域サイズを示し、記憶領域推定情報は平均記憶領域サイズにパラメータに示した値の範囲のうち上限の値を掛ける計算式を、記憶領域量を推定する計算式とすることを示している。 FIG. 8(a) is correspondence information showing what the storage area amount will be depending on which range in the first column (parameter) the analysis result information falls within. In this example, it is assumed that “<”, “<”, “/>”, and “=” are designated as feature data and that their total number of occurrences is analyzed. The storage area determination unit 15 searches for the range in the first column of the table held by the storage area estimation information storage unit 16 that the analysis result information generated by the pre-analysis processing unit 13 matches, and acquires the storage area estimation information corresponding to the matching parameter. “S” in the storage area estimation information denotes the average storage area size of the multiple objects, and the storage area estimation information indicates that the formula for estimating the storage area amount multiplies this average storage area size by the upper limit of the range of values given in the parameter.
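The lookup against a range table of the kind described for FIG. 8(a) might be sketched as follows. The range boundaries, the value passed for S, and the names `RANGE_TABLE` and `estimate_storage` are hypothetical; the actual contents of FIG. 8 are not reproduced here.

```python
from typing import List, Tuple

# Hypothetical parameter column: inclusive (lower, upper) ranges for the
# total occurrence count of the feature symbols.
RANGE_TABLE: List[Tuple[int, int]] = [(0, 10), (11, 20), (21, 30)]


def estimate_storage(total_count: int, avg_object_size: int,
                     ranges: List[Tuple[int, int]] = RANGE_TABLE) -> int:
    """Estimate the storage area amount as S * (upper limit of the range).

    total_count:     analysis result information (total occurrence count)
    avg_object_size: "S", the average storage area size of the objects
    """
    for lower, upper in ranges:
        if lower <= total_count <= upper:
            # The formula described in the text: the average size
            # multiplied by the upper limit of the matching range.
            return avg_object_size * upper
    raise LookupError(f"no parameter range matches count {total_count}")
```

With these assumed ranges, a total count of 15 and S = 32 falls in the range 11 to 20, so the estimate is 32 * 20 = 640. Rounding up to the range's upper limit means the estimate never falls below S times the measured count.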

ここで、記憶領域推定情報格納部１６へパラメータと記憶領域推定情報とを設定する手順を説明する。図９は、記憶領域推定情報格納部１６へパラメータと記憶領域推定情報とを設定する手順のフローチャート図である。構造化データメモリ管理装置１０の入力部１８は、パラメータを入力する（Ｓ１１）。ここで入力部１８は前述したように、キーボード９０２や通信ボード９１５やＦＤＤ９０４等である。続いて、入力部１８は、パラメータに対応する記憶領域推定情報を入力する（Ｓ１２）。そして、入力部１８は、入力したパラメータと記憶領域推定情報とをＣＰＵにより記憶領域推定情報格納部１６に記憶する（Ｓ１３）。全パラメータについての入力が終了するまで、Ｓ１１〜Ｓ１３を繰り返す（Ｓ１４）。Ｓ１１〜Ｓ１４は記憶領域推定情報記憶ステップＳ１である。
記憶するパラメータと記憶領域推定情報とは、過去に構造化データを解析した結果を分析して、パラメータとなる値とパラメータに対応する記憶領域推定情報とをあらかじめ決めておき、それをユーザが入力部１８より入力する。あるいは、過去に解析した構造化データとその解析した結果からパラメータと記憶領域推定情報とを導き出して、記憶領域推定情報格納部１６に書き込みを行う処理をコンピュータに実行させるプログラムをあらかじめユーザが作成して、プログラム群９２３に記憶させておき、そのプログラムを実行するようにしてもかまわない。また、パラメータと記憶領域推定情報とは、構造化データ毎にそれぞれ記憶領域推定情報格納部１６に記憶してもかまわない。また、パラメータと記憶領域推定情報とは、前解析処理部１３が行う解析内容毎にそれぞれ記憶領域推定情報格納部１６に記憶してもかまわない。 Here, the procedure for setting parameters and storage area estimation information in the storage area estimation information storage unit 16 is described. FIG. 9 is a flowchart of this procedure. The input unit 18 of the structured data memory management device 10 inputs a parameter (S11). As described above, the input unit 18 is the keyboard 902, the communication board 915, the FDD 904, or the like. Next, the input unit 18 inputs the storage area estimation information corresponding to the parameter (S12). The input unit 18 then uses the CPU to store the input parameter and storage area estimation information in the storage area estimation information storage unit 16 (S13). S11 to S13 are repeated until all parameters have been input (S14). S11 to S14 constitute the storage area estimation information storage step S1.
The parameters and storage area estimation information to be stored are determined in advance by analyzing the results of past analyses of structured data and fixing the parameter values and the storage area estimation information corresponding to each parameter; the user then inputs them through the input unit 18. Alternatively, the user may create in advance a program that causes the computer to derive parameters and storage area estimation information from previously analyzed structured data and its analysis results and to write them to the storage area estimation information storage unit 16, store that program in the program group 923, and execute it. The parameters and storage area estimation information may be stored in the storage area estimation information storage unit 16 separately for each structured data, or separately for each analysis content performed by the pre-analysis processing unit 13.

図１０は、（ａ）はタグ数、（ｂ）は属性数、（ｃ）は文字数それぞれについてパラメータと記憶領域推定情報とを記憶する記憶領域推定情報格納部１６の例を示す図である。
前解析処理部１３がタグの開始・終了記号、属性設定記号、テキスト文字数のそれぞれの出現回数を解析した場合は、３つの解析結果情報を生成する。記憶領域決定部１５は、図１０（ａ）の対応表より、タグの開始・終了記号の出現回数の解析結果情報に適合する記憶領域推定情報を取得する。次に、図１０（ｂ）の対応表より属性設定記号の出現回数の解析結果情報に適合する記憶領域推定情報を取得する。次に、図１０（ｃ）の対応表より、テキスト文字数の出現回数の解析結果情報に適合する記憶領域推定情報を取得する。
また、前解析処理部１３が１つの解析内容について、例えば、タグの開始・終了記号の出現回数を解析した場合は、１つの解析結果情報を生成する。記憶領域決定部１５は、図１０（ａ）の対応表より、タグの開始・終了記号の出現回数の解析結果情報に適合する記憶領域推定情報を取得する。そして、属性設定記号の記憶領域推定情報と、テキスト文字数の記憶領域推定情報とは、タグの開始・終了記号の出現回数から求める。これは、図１０（ｂ）の対応表と図１０（ｃ）の対応表の前解析結果が、タグの開始・終了記号の出現回数であり、タグの開始・終了記号の出現回数から属性設定記号の記憶領域推定情報と、テキスト文字数の記憶領域推定情報とを推測する場合に適用できる。記憶領域決定部１５は、図１０（ａ）の対応表からタグの開始・終了記号の出現回数の解析結果情報に適合する記憶領域推定情報を取得したあと、図１０（ｂ）の対応表よりタグの開始・終了記号の出現回数に適合する属性設定記号の記憶領域推定情報を取得する。さらに、図１０（ｃ）の対応表より、タグの開始・終了記号の出現回数に適合するテキスト文字数の記憶領域推定情報を取得する。これは、タグの開始・終了記号の出現回数と、属性設定記号やテキスト文字数との出現回数の関連性を分析して、関連性がわかっている場合に有効である。このように、タグの開始・終了記号の出現回数に対応させて属性やテキストの記憶領域推定情報を記憶領域推定情報格納部１６に記憶させておくと、実際には解析しなかった属性やテキストの記憶領域推定情報を取得できる。
例えば、記憶領域決定部１５は、この構造化データを解析し内部データ形式で表現した場合、必要となるオブジェクト数はそれぞれタグの開始・終了記号（Ｅｌｅｍｅｎｔオブジェクト）が２４個だった場合、Ａｔｔｒオブジェクトは図１０（ｂ）の属性数の対応情報より３２個、テキスト文字数は図１０（ｃ）の文字数の対応情報より、１２４文字と決定する。なお、図１０の「ｘ」、「ｙ」、「ｚ」はそれぞれのオブジェクトの単一の記憶領域サイズを示している。このように記憶領域推定情報格納部１６は、一つの解析結果情報で全ての内部データオブジェクトの個数との対応付けを行ってもよいし、内部データオブジェクト毎に一つの対応情報を作成しても構わない。オブジェクト毎に対応情報を作成する事で、前解析処理部１３の解析内容をオブジェクト単位で指定する事が可能となる。 FIG. 10 is a diagram illustrating an example of the storage area estimation information storage unit 16 that stores parameters and storage area estimation information for (a) the number of tags, (b) the number of attributes, and (c) the number of characters.
When the pre-analysis processing unit 13 analyzes the respective occurrence counts of tag start/end symbols, attribute setting symbols, and text characters, it generates three pieces of analysis result information. The storage area determination unit 15 acquires, from the correspondence table of FIG. 10(a), the storage area estimation information that matches the analysis result information for the occurrence count of tag start/end symbols. Next, it acquires, from the correspondence table of FIG. 10(b), the storage area estimation information that matches the analysis result information for the occurrence count of attribute setting symbols. Then it acquires, from the correspondence table of FIG. 10(c), the storage area estimation information that matches the analysis result information for the number of text characters.
When the pre-analysis processing unit 13 analyzes only one analysis content, for example the occurrence count of tag start/end symbols, it generates one piece of analysis result information. The storage area determination unit 15 acquires, from the correspondence table of FIG. 10(a), the storage area estimation information that matches the analysis result information for the occurrence count of tag start/end symbols. The storage area estimation information for attribute setting symbols and for the number of text characters is then obtained from the occurrence count of tag start/end symbols. This applies when the pre-analysis result used in the correspondence tables of FIG. 10(b) and FIG. 10(c) is the occurrence count of tag start/end symbols, and the storage area estimation information for attribute setting symbols and for text characters is inferred from that count. After acquiring the matching storage area estimation information from the correspondence table of FIG. 10(a), the storage area determination unit 15 acquires, from the correspondence table of FIG. 10(b), the storage area estimation information for attribute setting symbols that matches the occurrence count of tag start/end symbols. It further acquires, from the correspondence table of FIG. 10(c), the storage area estimation information for the number of text characters that matches the occurrence count of tag start/end symbols.
This is effective when the relationship between the occurrence count of tag start/end symbols and the occurrence counts of attribute setting symbols and text characters has been analyzed and is known. If the storage area estimation information for attributes and text is stored in the storage area estimation information storage unit 16 in association with the occurrence count of tag start/end symbols in this way, storage area estimation information can be acquired for attributes and text that were not actually analyzed.
For example, when the structured data is analyzed and expressed in the internal data format, if the number of tag start/end symbols (Element objects) is 24, the storage area determination unit 15 determines that 32 Attr objects are required according to the attribute-count correspondence information of FIG. 10(b), and that 124 text characters are required according to the character-count correspondence information of FIG. 10(c). Note that “x”, “y”, and “z” in FIG. 10 indicate the storage area size of a single object of each type. In this way, the storage area estimation information storage unit 16 may associate one piece of analysis result information with the numbers of all internal data objects, or one piece of correspondence information may be created for each internal data object. Creating correspondence information for each object makes it possible to specify the analysis content of the pre-analysis processing unit 13 on a per-object basis.
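A rough sketch of deriving the attribute and text estimates from the tag count alone, in the spirit of FIG. 10, is given below. Only the pairing of 24 tags with 32 Attr objects and 124 text characters comes from the text; every other table entry, the element estimates, the unit sizes, and all names are hypothetical.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, int, int]  # (lower tag count, upper tag count, estimate)

# Hypothetical tables keyed by the tag start/end symbol count.
ELEMENT_TABLE: List[Row] = [(1, 10, 10), (11, 20, 20), (21, 30, 30)]
ATTR_TABLE: List[Row] = [(1, 10, 12), (11, 20, 22), (21, 30, 32)]
TEXT_TABLE: List[Row] = [(1, 10, 40), (11, 20, 80), (21, 30, 124)]


def lookup(table: List[Row], tag_count: int) -> int:
    """Return the estimate whose tag-count range contains tag_count."""
    for lower, upper, estimate in table:
        if lower <= tag_count <= upper:
            return estimate
    raise LookupError(f"no table row matches tag count {tag_count}")


def estimate_from_tag_count(tag_count: int, x: int, y: int,
                            z: int) -> Dict[str, int]:
    """Estimate object counts and total size from the tag count only.

    x, y, z are the storage sizes of one Element object, one Attr
    object, and one text character respectively.
    """
    elements = lookup(ELEMENT_TABLE, tag_count)
    attrs = lookup(ATTR_TABLE, tag_count)
    chars = lookup(TEXT_TABLE, tag_count)
    return {
        "elements": elements,
        "attrs": attrs,
        "text_chars": chars,
        "total_size": elements * x + attrs * y + chars * z,
    }
```

Because all three tables are keyed by the same tag count, the pre-analysis only has to count tags, which is the case the text describes as useful when the relationship between the counts is already known.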

また、前解析の範囲指定は、範囲で表現せずに離散値の集合でも、単独の値で表現しても構わない。単独で表現した場合の例は、図８である。離散地で表現した場合の例は図１０と図１１である。図１０の前解析結果の間隔は１０間隔であったが、図１１は、前解析結果の間隔が属性の場合は１０間隔であるが、文字数の場合は４間隔としている。このように、間隔を変えることで、実際にそのオブジェクトで必要とする記憶領域量に近い記憶領域量の推定情報を設定できる。なお、図１１では、「属性数」、「文字数」に対する記憶領域推定情報は、記憶領域量ではなくオブジェクトの数を記憶領域推定情報としているため、「＊ｙ」、「＊ｚ」の表示がない。
また図１０、図１１では３つのオブジェクトを例に説明したが、実際にはこれ以上に多くのオブジェクトが必要となる場合がある。この場合は、それらのオブジェクトと記憶領域推定情報との対応情報を作成することになる。また、この例では前解析の結果と各オブジェクト数のみの対応情報となっているが、前解析の結果と、ある特定のオブジェクト数の組からその他のオブジェクトの必要数の対応情報の様に複数の検索キーとなる情報から一つの結果を導き出す対応情報でもよい。 In addition, the range specification for the pre-analysis may be expressed not as a range but as a set of discrete values or as a single value. FIG. 8 shows an example of expression as a single value. FIGS. 10 and 11 show examples of expression as discrete values. In FIG. 10 the pre-analysis results are spaced at intervals of 10, whereas in FIG. 11 the interval is 10 for attributes but 4 for the number of characters. By changing the interval in this way, estimation information can be set that is close to the storage area amount the object actually requires. Note that in FIG. 11 the storage area estimation information for “number of attributes” and “number of characters” is the number of objects rather than the storage area amount, so “*y” and “*z” do not appear.
FIGS. 10 and 11 have been described using three objects as an example, but in practice more objects may be required. In that case, correspondence information between those objects and the storage area estimation information is created. In addition, in this example the correspondence information relates only the pre-analysis result to the number of each object, but it may instead be correspondence information that derives one result from multiple search keys, such as correspondence information giving the required number of other objects from the combination of the pre-analysis result and the number of one particular object.

記憶領域決定部１５は、取得した記憶領域推定情報を記憶部に記憶する。この時、記憶領域推定情報が図８や図９、図１０のように記憶領域量を計算する式である場合には、これを計算した結果を記憶部に記憶する。オブジェクトごとの単一のサイズを示す変数の値は、記憶領域推定情報格納部１６に変数用のテーブルとして記憶されているものとする。記憶領域決定部１５は、この変数用のテーブルから記憶領域量の計算に使用する変数を取得し、ＣＰＵにより記憶領域量を計算して記憶部に記憶する。そして、解析処理制御部１２に処理の終了を通知する。以上の処理が、図５のＳ２の記憶領域決定ステップである。 The storage area determination unit 15 stores the acquired storage area estimation information in the storage unit. At this time, if the storage area estimation information is an expression for calculating the storage area amount as shown in FIG. 8, FIG. 9, or FIG. 10, the calculation result is stored in the storage unit. It is assumed that a variable value indicating a single size for each object is stored in the storage area estimation information storage unit 16 as a variable table. The storage area determination unit 15 acquires a variable used for calculation of the storage area amount from the variable table, calculates the storage area amount by the CPU, and stores it in the storage unit. Then, the analysis process control unit 12 is notified of the end of the process. The above processing is the storage area determination step in S2 of FIG.

Ｓ３の記憶領域管理ステップにおいて、解析処理制御部１２は、記憶領域決定部１５が記憶した記憶領域量をＣＰＵにより記憶部より取り出して、記憶領域管理部１７に出力する。記憶領域管理部１７は、記憶領域量を入力して、記憶領域量に相当するデータ記憶領域を、記憶部に確保する。ここで記憶部とは、前述したように、例えばＲＯＭ９１３、ＦＤＤ９０４、ＣＤＤ９０５、磁気ディスク装置９２０の不揮発性の記憶媒体である。確保されたデータ記憶領域が他からアクセスされないように、記憶領域管理部１７は、ＣＰＵにより排他制御をかける。記憶領域管理部１７は、領域の確保が行えたら、そのことを解析処理制御部１２にＣＰＵにより通知する。 In the storage area management step of S 3, the analysis processing control unit 12 takes out the storage area amount stored by the storage area determination unit 15 from the storage unit by the CPU and outputs it to the storage area management unit 17. The storage area management unit 17 inputs the storage area amount, and secures a data storage area corresponding to the storage area amount in the storage unit. Here, as described above, the storage unit is a non-volatile storage medium such as the ROM 913, the FDD 904, the CDD 905, and the magnetic disk device 920, for example. The storage area management unit 17 performs exclusive control by the CPU so that the secured data storage area is not accessed from others. When the storage area management unit 17 can secure the area, the storage area management unit 17 notifies the analysis processing control unit 12 of the fact by the CPU.

解析処理制御部１２は、領域の確保が行えたことを記憶領域管理部１７より通知されると、Ｓ４の実解析処理ステップにおいて、構造化データ解析部１４に対して、構造化データの解析要求と、解析する構造化データとをＣＰＵにより出力する。「構造化データを出力する」とは、構造化データそのものを出力するのではなく、例えば構造化データが記憶されている記憶部のアドレスを渡すことである。構造化データ解析部１４は、解析要求と構造化データを入力すると、構造化データを解析する処理を開始する。このとき、記憶領域管理部１７が確保したデータ記憶領域を使用する。
構造化データ解析部１４の動作は従来の構造化データ処理部と同様である。 When the storage area management unit 17 is notified that the area has been secured, the analysis processing control unit 12 requests the structured data analysis unit 14 to analyze the structured data in the actual analysis processing step of S4. And structured data to be analyzed are output by the CPU. “Output structured data” is not to output the structured data itself, but to pass, for example, the address of the storage unit in which the structured data is stored. When the structured data analysis unit 14 inputs the analysis request and the structured data, the structured data analyzing unit 14 starts a process of analyzing the structured data. At this time, the data storage area secured by the storage area management unit 17 is used.
The operation of the structured data analysis unit 14 is the same as that of the conventional structured data processing unit.

この実施の形態では、以下の手段を備えた構造化データメモリ管理方式の一例を説明した。
一つまたは複数の構造化データを一時的または永続的に格納・管理する構造化データ格納手段（構造化データ格納部１１）
構造化データを前解析し、内部データ形式に変換するために必要な記憶領域量を推定するための前解析パラメータを導出する前解析処理手段（前解析処理部１３）
記憶領域の推定基準となる記憶領域推定情報を格納する記憶領域推定情報格納手段（記憶領域推定情報格納部１６）
前解析パラメータと記憶領域推定情報を元に、前解析パラメータを導出するのに利用した構造化データを内部データ形式に変換するために必要な記憶領域量を推定する記憶領域決定手段（記憶領域決定部１５）
記憶領域決定手段が推定した記憶領域量に基づき、記憶領域を確保・管理する記憶領域管理手段（記憶領域管理部１７）。 In this embodiment, an example of a structured data memory management system including the following means has been described.
Structured data storage means (structured data storage unit 11) for temporarily or permanently storing and managing one or more pieces of structured data
Pre-analysis processing means (pre-analysis processing unit 13) for pre-analyzing structured data and deriving a pre-analysis parameter for estimating a storage area amount necessary for conversion into an internal data format
Storage area estimation information storage means (storage area estimation information storage unit 16) for storing storage area estimation information which is a storage area estimation criterion
Storage area determination means (storage area determination unit 15) for estimating, based on the pre-analysis parameter and the storage area estimation information, the storage area amount necessary to convert the structured data used to derive the pre-analysis parameter into the internal data format
Storage area management means (storage area management unit 17) for securing and managing the storage area based on the storage area amount estimated by the storage area determination means.

また、前解析処理手段（前解析処理部１３）で、構造化データの全てを前解析することを説明した。 In addition, it has been described that the preanalysis processing means (preanalysis processing unit 13) preanalyzes all the structured data.

また、前解析処理手段（前解析処理部１３）で、構造化データから構造を特徴付ける構造化データ特徴データを検出する事で前解析を行うことを説明した。例えば、ＸＭＬで記述された構造化データの場合、”＜”（タグの開始記号）、”＝”（属性値の代入記号）、＜−−（コメント開始記号）等を、構造を特徴付けるデータと言い、その個数などが前解析結果となることを説明した。 Further, it has been described that the pre-analysis processing means (pre-analysis processing unit 13) performs the pre-analysis by detecting the structured data feature data that characterizes the structure from the structured data. For example, in the case of structured data described in XML, “<” (tag start symbol), “=” (attribute value substitution symbol), <-(comment start symbol), and the like are data that characterizes the structure. It was explained that the number of the results is the result of the previous analysis.

また、記憶領域推定情報格納部は、前解析処理手段（前解析処理部１３）から得られる前解析結果の値または、前解析結果の値の集合と、記憶領域推定量が対応付けられた対応表であることを説明した。例えば、記憶領域推定情報が離散的な取り扱いがなされている事が特徴。精度が劣る反面、推定量を求める処理が高速であることを説明した。 It has also been described that the storage area estimation information storage unit is a correspondence table in which a pre-analysis result value, or a set of pre-analysis result values, obtained from the pre-analysis processing means (pre-analysis processing unit 13) is associated with an estimated storage area amount. For example, a characteristic is that the storage area estimation information is handled discretely; although accuracy is lower, the processing for obtaining the estimate is fast.

また、前解析処理手段（前解析処理部１３）、記憶領域推定情報格納手段（記憶領域推定情報格納部１６）、構造化データ格納手段（構造化データ格納部１１）、記憶領域決定手段（記憶領域決定部１５）を有する装置が、解析手段（構造化データ解析部）、記憶領域管理手段（記憶領域管理部１７）を有する装置と同一あることを説明した。例えば、メモリ量（記憶領域量）推定を行う装置が、メモリを実際に確保する装置と同一であることを説明した。例えば、構造化データメモリ管理装置１０を、Ｎｅｔｗｏｒｋシステムに例えると、ＸＭＬコンテンツを取得するクライアントにあたる。 Further, the pre-analysis processing means (pre-analysis processing section 13), the storage area estimation information storage means (storage area estimation information storage section 16), the structured data storage means (structured data storage section 11), the storage area determination means (memory) It has been described that the apparatus having the area determination unit 15) is the same as the apparatus having the analysis unit (structured data analysis unit) and the storage area management unit (storage area management unit 17). For example, it has been described that the device that estimates the memory amount (storage area amount) is the same as the device that actually secures the memory. For example, when the structured data memory management device 10 is compared to a network system, it corresponds to a client that acquires XML content.

この実施の形態の記憶領域管理方式を実行する記憶領域管理装置の一例である構造化データメモリ管理装置１０は、構造化データの特徴データ、例えばオブジェクトの出現回数を解析結果情報とし、記憶領域推定情報はオブジェクトの出現回数から取得するようにした。また、記憶領域推定情報は、オブジェクトの単一サイズにそのオブジェクトの出現回数を掛けたり、複数のオブジェクトの平均サイズを複数のオブジェクトの出現回数を合計した数に掛けた。また、ある１つのオブジェクトの出現回数から他のオブジェクトの出現回数を予測して記憶領域推定情報を取得するようにした。このため、構造化データの内部構造が複雑であっても、記憶領域推定情報を取得する処理時間を高速にできる効果がある。また、構造化データの特徴データ、例えばオブジェクトの出現回数から記憶領域量を推定するので、推定した記憶領域量と実際に必要となる記憶領域量との差を小さくできる効果がある。また、オブジェクト出現回数をパラメータとして、そのパラメータに記憶領域推定情報を対応させた。このため、記憶領域推定情報を取得する処理は複雑でないため、高速に行える効果がある。 The structured data memory management device 10, which is an example of a storage area management apparatus that executes the storage area management method of this embodiment, uses feature data of the structured data, for example the number of appearances of objects, as analysis result information, and acquires the storage area estimation information from those appearance counts. The storage area estimation information is obtained by multiplying the size of a single object by the number of appearances of that object, or by multiplying the average size of multiple objects by the total number of appearances of those objects. The storage area estimation information is also acquired by predicting the number of appearances of other objects from the number of appearances of one object. As a result, even if the internal structure of the structured data is complicated, the storage area estimation information can be acquired quickly. In addition, because the storage area amount is estimated from feature data of the structured data, for example the number of appearances of objects, the difference between the estimated storage area amount and the storage area amount actually required can be kept small. Further, the number of appearances of objects is used as a parameter, and the storage area estimation information is associated with that parameter. 
For this reason, the process of acquiring the storage area estimation information is not complicated and can therefore be performed at high speed.

また、この実施の形態では、記憶領域決定部が決定する構造化データを内部データ形式に変換したものを記憶するでーた記憶領域の記憶領域量と、実際に使用する記憶領域量との差が少なくなるので、コンピュータの有限なメモリ資源を有効に活用したいという課題に対して、コンピュータのメモリの使用効率が向上するという効果が得られる。すなわち、この実施の形態で説明した記憶領域管理方式は、前述したコンピュータのハードウェア資源のうち特にメモリ等に確保する記憶領域量を、実際に使用する記憶領域量に近い量とする処理をする記憶領域決定部を用いている点が特徴である。 Further, in this embodiment, the difference between the storage area amount that the storage area determination unit determines for the data storage area holding the structured data converted into the internal data format and the storage area amount actually used becomes small. With respect to the problem of making effective use of a computer's finite memory resources, this improves the usage efficiency of the computer's memory. That is, the storage area management method described in this embodiment is characterized by using a storage area determination unit that brings the storage area amount secured among the computer hardware resources described above, in particular in memory, close to the storage area amount actually used.

実施の形態２．
この実施の形態では、上記実施の形態１とは別の記憶領域推定情報を記憶領域推定情報格納部１６が記憶する一例を説明する。
図１２は、過去に行った前解析処理部１３の解析結果情報に基づいて記憶領域決定部１５が推定した記憶領域量と、実際に構造化データ解析部１４が使用した記憶領域量とを利用して、メモリ使用推定量関数を求めるグラフを示す図である。
ここで「推定関数」とは、過去に行った前解析結果の解析結果情報に基づいて記憶領域決定部１５が推定した記憶領域量と、実際に構造化データ解析部１４が使用したメモリ量（記憶領域量）とを利用して、メモリ使用量推定関数を数学的に求めたものである。図１２のグラフはその一例である。このグラフは、インターネット上に存在するＨＴＭＬコンテンツを前解析した解析結果情報（図１２ではＨＴＭＬコンテンツに含まれるタグを解析して、そのタグの出現回数を解析結果情報とする）をｘ軸とし、実際に使用する事になったメモリサイズ（記憶領域量）をｙ軸としたグラフである。グラフ上のひし形の小さい点は、実際の解析結果情報と使用メモリサイズ（使用記憶領域量）のデータであり、プロット情報の一例である。またグラフ上に表示されている線分が、ひし形の点を元に推定関数として求めた近似関数（「推定関数」のことである）である。この近似関数を利用して、記憶領域量を推定する。図１２の例では、グラフ右上に示した「ｙ＝０．０９ｘ＋１．２３４５」が推定関数である。ｘが解析結果情報であるので、記憶領域決定部１５は、前解析処理部１３が生成した解析結果情報をこの推定関数のｘに当てはめて、記憶領域量（ｙ）を求める。このように、推定関数は、得られたプロット情報からの線形近似や対数近似など、数学的な近似関数が具体的な推定方法となる。また、Ｒの２乗（精度情報）は、推定関数で求めた値ｙがどのぐらい正しいかを示すものであり、１に近いほど精度が高いことを示している。このため、推定関数と精度情報とを記憶領域推定情報として記憶しておくと、推定関数が複数あるとき、制度情報の高い順に推定関数を採用することができる。
また、推定関数を記憶領域推定情報とする場合、パラメータはオブジェクトの種類や、コンテンツの種類（例えば、ＨＴＭＬコンテンツであるか、ＸＭＬコンテンツであるかなどである）となる。この場合、前解析処理部１３は、解析結果情報に解析したオブジェクトの種類を含める。また、コンテンツの種類を解析して、解析できたコンテンツの種類を含める。 Embodiment 2.
In this embodiment, an example will be described in which the storage area estimation information storage unit 16 stores storage area estimation information different from that of the first embodiment.
FIG. 12 is a diagram showing a graph for obtaining a memory usage estimation function from the storage area amount estimated by the storage area determination unit 15 based on the analysis result information of pre-analyses performed in the past by the pre-analysis processing unit 13 and the storage area amount actually used by the structured data analysis unit 14.
Here, the “estimation function” is a memory usage estimation function obtained mathematically from the storage area amount estimated by the storage area determination unit 15 based on the analysis result information of pre-analyses performed in the past and the memory amount (storage area amount) actually used by the structured data analysis unit 14. The graph of FIG. 12 is an example. In this graph, the x-axis is the analysis result information obtained by pre-analyzing HTML content on the Internet (in FIG. 12, the tags in the HTML content are analyzed and the number of tag occurrences is the analysis result information), and the y-axis is the memory size (storage area amount) actually used. The small diamond-shaped points on the graph are the actual data of analysis result information and used memory size (used storage area amount), and are an example of plot information. The line drawn on the graph is the approximation function (that is, the “estimation function”) obtained from the diamond points. The storage area amount is estimated using this approximation function. In the example of FIG. 12, “y = 0.09x + 1.2345” shown at the upper right of the graph is the estimation function. Since x is the analysis result information, the storage area determination unit 15 substitutes the analysis result information generated by the pre-analysis processing unit 13 for x in this estimation function to obtain the storage area amount (y). Thus, the concrete estimation method is a mathematical approximation function, such as a linear or logarithmic approximation, fitted to the obtained plot information. The square of R (accuracy information) indicates how accurate the value y obtained by the estimation function is; the closer it is to 1, the higher the accuracy.
For this reason, if the estimation function and the accuracy information are stored together as the storage area estimation information, then when there are a plurality of estimation functions, they can be adopted in descending order of accuracy information.
Further, when the estimation function is storage area estimation information, the parameters are the type of object and the type of content (for example, whether it is HTML content or XML content). In this case, the pre-analysis processing unit 13 includes the analyzed object type in the analysis result information. Also, the content type is analyzed, and the analyzed content type is included.
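The derivation of an estimation function and its accuracy information from plot information can be illustrated with a minimal sketch. It assumes an ordinary least-squares line fit and the usual coefficient of determination for R squared; the patent text does not specify the fitting procedure, and the names `EstimationFunction`, `fit_linear`, and `pick_most_accurate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EstimationFunction:
    """y = slope * x + intercept, plus its accuracy information (R squared)."""
    slope: float
    intercept: float
    r_squared: float  # closer to 1 means the line fits the plot information better

    def estimate(self, x: float) -> float:
        """Estimate the storage area amount for analysis result information x."""
        return self.slope * x + self.intercept


def fit_linear(points: list[tuple[float, float]]) -> EstimationFunction:
    """Least-squares fit over (analysis result, used memory) pairs.

    Assumes at least two points with distinct x values.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    return EstimationFunction(slope, intercept, r_squared)


def pick_most_accurate(functions: list[EstimationFunction]) -> EstimationFunction:
    """Adopt the estimation function whose accuracy information is highest."""
    return max(functions, key=lambda f: f.r_squared)
```

Storing `r_squared` next to each function is what allows several candidate functions, for example a linear and a logarithmic fit, to be ranked when more than one is available.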

実施の形態１で説明した図１０は、解析結果情報と各オブジェクト数の離散的な対応情報を構成していた。しかし、図１３のように解析結果情報を入力とし、各オブジェクトの出現回数を出力とする、連続近似関数を記憶領域推定情報としてもかまわない。図１３（ａ）はタグの出現回数が解析結果情報である場合の連続近似関数ｆ（ｘ）を示し、（ｂ）は属性の出現回数が解析結果情報である場合の連続近似関数ｆ（ｙ）を示し、（ｃ）は文字の出現回数が解析結果情報である場合の連続近似関数ｆ（ｚ）を示す。単純に近似関数を記憶領域推定情報として取り扱ってもよく、その近似関数を求めるための近似パラメータを記憶領域推定情報として取り扱ってもよい。この場合の記憶領域推定情報に対応するパラメータはタグや属性や文字数等のオブジェクトの種類であり、前解析処理部１３は、解析結果情報に解析したオブジェクトの種類を含める。一例を図１４、図１５に示す。図１４は、図１３の連続近似関数に対応する記憶領域推定情報格納部１６の一例を示し、図１５は、解析結果情報の値の範囲によって対応する連続近似関数が異なるとともに、オブジェクト別に解析結果情報と記憶領域推定情報との対応表を設けた例を示す図である。図１４の「混合」は複数のオブジェクトを区別することなくまとめて出現回数を解析した場合を示している。図１５の「ｘ１」、「ｘ２」・・・「ｘｎ」は、解析結果情報の値が適合する範囲によって、記憶領域量を計算する関数が異なることを示している。また、「ｙ１」、「ｙ２」・・・「ｙｎ」及び「ｚ１」、「ｚ２」・・・「ｚｎ」ついても同様である。
記憶領域決定部１５は、各オブジェクトに対応する近似関数に解析結果情報を入力する事で、内部データ表現に変換した場合に必要となる特定のオブジェクトの近似数を求めることが出来る。そして、求まったオブジェクトの近似数にそのオブジェクトの単一のサイズを掛けて、記憶領域量を求める。図１０の記憶領域推定情報から求められる記憶領域量と、図１２の記憶領域推定情報から求められる記憶領域量との差異は、図１０の場合にはある特定の範囲の前解析結果情報では、求められるオブジェクト数が同一となり、記憶領域量が同一となるが、近似関数の場合には図１０よりも細かいオブジェクト数の近似が可能となるので、このようなオブジェクト数から計算する記憶領域量は実際に必要とする記憶領域量に近い量となる。 FIG. 10 described in the first embodiment configures discrete correspondence information between analysis result information and the number of objects. However, as shown in FIG. 13, a continuous approximation function that receives the analysis result information and outputs the number of appearances of each object may be used as the storage area estimation information. FIG. 13A shows the continuous approximation function f (x) when the number of appearances of the tag is analysis result information, and FIG. 13B shows the continuous approximation function f (y) when the number of appearances of the attribute is analysis result information. (C) shows a continuous approximation function f (z) when the number of appearances of characters is analysis result information. An approximate function may simply be handled as storage area estimation information, or an approximate parameter for obtaining the approximate function may be handled as storage area estimation information. The parameters corresponding to the storage area estimation information in this case are the object types such as tags, attributes, and the number of characters, and the pre-analysis processing unit 13 includes the analyzed object types in the analysis result information. An example is shown in FIGS. FIG. 14 shows an example of the storage area estimation information storage unit 16 corresponding to the continuous approximation function of FIG. 13, and FIG. 
15 shows an example in which the corresponding continuous approximation function differs depending on the value range of the analysis result information and a correspondence table between analysis result information and storage area estimation information is provided for each object. “Mixed” in FIG. 14 indicates the case where the number of occurrences is analyzed collectively without distinguishing between a plurality of objects. “x1”, “x2”, ... “xn” in FIG. 15 indicate that the function for calculating the storage area amount differs depending on the range into which the value of the analysis result information falls. The same applies to “y1”, “y2”, ... “yn” and “z1”, “z2”, ... “zn”.
By inputting the analysis result information into the approximation function corresponding to each object, the storage area determination unit 15 can obtain the approximate number of objects of that type required after conversion into the internal data representation. It then multiplies this approximate number by the size of a single object of that type to obtain the storage area amount. The storage area amount obtained from the storage area estimation information of FIG. 10 differs from that obtained from the storage area estimation information of FIG. 12 as follows. In the case of FIG. 10, all pre-analysis result information within a given range yields the same number of objects, and therefore the same storage area amount. With an approximation function, the number of objects can be approximated more finely than in FIG. 10, so the storage area amount calculated from that number is closer to the amount actually required.
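The calculation described above, a range-dependent approximation function per object type followed by multiplication by the single-object size, might look like the following sketch. The range boundaries, coefficients, and byte sizes are invented for illustration, and `approx_object_count` and `storage_amount` are hypothetical names; only the overall procedure comes from the text.

```python
import math
from typing import Callable

# One piece of a range-dependent function: (exclusive upper bound, function).
Piece = tuple[float, Callable[[float], float]]

# Hypothetical storage area estimation information in the spirit of FIG. 15:
# for each object type, the function used depends on the range of the input.
TABLES: dict[str, list[Piece]] = {
    "tag": [(100, lambda x: 1.0 * x), (math.inf, lambda x: 0.5 * x + 50)],
    "attribute": [(math.inf, lambda y: 0.5 * y)],
}

# Hypothetical size in bytes of a single internal data object of each type.
UNIT_SIZES: dict[str, int] = {"tag": 32, "attribute": 16}


def approx_object_count(pieces: list[Piece], x: float) -> float:
    """Apply the approximation function whose range contains x."""
    for upper, func in pieces:
        if x < upper:
            return func(x)
    raise ValueError(f"no range covers {x}")


def storage_amount(analysis: dict[str, float]) -> int:
    """Sum, over object types, approximate count times single-object size."""
    total = 0
    for kind, value in analysis.items():
        count = math.ceil(approx_object_count(TABLES[kind], value))
        total += count * UNIT_SIZES[kind]
    return total
```

Rounding the approximate count up with `math.ceil` is a design choice made here so that a fractional estimate never reserves less than one whole object; the text does not say how fractions are handled.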

この実施の形態では、記憶領域推定情報が、前解析結果および、実際に必要となったメモリ量（記憶領域量）を軸とした平面にプロットされたＮ個の事前に前解析結果および実際に必要となったメモリ量（記憶領域量）の結果を元に推定された推定関数である事を特徴とする記憶領域管理方式を実行する構造化データメモリ管理装置１０、および、記憶領域管理装置の一例である構造化データメモリ管理装置１０について説明した。
例えば、記憶領域推定情報が連続的な取り扱いがなされている事を説明した。 This embodiment has described the structured data memory management device 10, an example of a storage area management device, which executes a storage area management method characterized in that the storage area estimation information is an estimation function derived from N previously obtained pairs of a pre-analysis result and the memory amount (storage area amount) actually required, plotted on a plane whose axes are the pre-analysis result and the memory amount actually required.
For example, it has been explained that the storage area estimation information is handled in a continuous manner.

この実施の形態の構造化データメモリ管理装置１０は、記憶領域推定情報が連続的な取り扱いがなされている。このため、記憶領域決定部１５が決定する記憶領域量の精度は、離散的な取り扱いよりも高くなるという効果がある。 In the structured data memory management device 10 of this embodiment, the storage area estimation information is continuously handled. For this reason, there is an effect that the accuracy of the storage area amount determined by the storage area determination unit 15 is higher than that in discrete handling.

実施の形態３．
この実施の形態では、構造化データに関する基本情報を解析結果とする場合の、記憶領域量の決定について一例を説明する。
図１６は、記憶領域推定情報格納部１６に格納されている付加情報の具体例の一つを示す図である。
図１７は、図１６の付加情報を利用した記憶領域決定ステップ（Ｓ２）のフローチャート図である。
図１６に示す情報は、過去に解析を行った事のある構造化データに関する付加情報を記録する記憶領域推定情報格納部１６が記憶するテーブルの一例である。記憶領域推定情報格納部１６は、パラメータに対応するものとして、過去解析を行った構造化データの識別情報としてのＵＲＩと、その構造化データを過去に解析した時の更新日時情報とを記憶する。そして、その構造化データを過去に解析した時に必要となった内部データオブジェクトの実際の個数をオブジェクト別に要素数１６ｘ、属性数１６ｙ、文字数１６ｚとして保持している。なお、記憶領域推定情報格納部１６は、図１６に示した情報の他に、実施の形態１〜２で説明した例えば図１０に示した情報を図１６とは別のテーブルとして記憶しているものとする。
以下で、この情報を利用した記憶領域決定部１５による記憶領域決定ステップを説明する。
解析処理制御部１２は記憶領域決定部１５に対して処理要求を出力する際に、処理要求に少なくとも、構造化データ、構造化データの解析結果情報を含める。また、その解析結果情報の一つとして、解析を行った構造化データの更新日時情報を含める。ＵＲＩは、構造化データ自身が自身の識別情報として有している。また、更新日時情報は、構造化データ格納部１１に構造化データに対する付加情報として記憶されている。更新日時情報は前解析処理部１３或いは、解析処理制御部１２が、構造化データ格納部１１より取得する。また、構造化データ自身に自身の識別情報としてのＵＲＩが含まれていない場合には、構造化データ格納部１１に付加情報の１つとして記憶されているので、前解析処理部１３或いは、解析処理制御部１２が、構造化データ格納部１１より取得する。そして解析処理制御部１２が、解析結果情報の一つに、構造化データに対するＵＲＩを含める。 Embodiment 3.
In this embodiment, an example of the determination of the storage area amount in the case where basic information about structured data is used as an analysis result will be described.
FIG. 16 is a diagram showing one specific example of the additional information stored in the storage area estimation information storage unit 16.
FIG. 17 is a flowchart of the storage area determination step (S2) using the additional information of FIG. 16.
The information shown in FIG. 16 is an example of a table stored in the storage area estimation information storage unit 16, which records additional information about structured data that has been analyzed in the past. As items corresponding to the parameters, the storage area estimation information storage unit 16 stores a URI as identification information of the structured data analyzed in the past and the update date/time information at the time that structured data was analyzed. It also holds, per object type, the actual number of internal data objects that were required when the structured data was analyzed in the past, as the number of elements 16x, the number of attributes 16y, and the number of characters 16z. In addition to the information shown in FIG. 16, the storage area estimation information storage unit 16 is assumed to store the information described in the first and second embodiments, for example the information shown in FIG. 10, as a table separate from that of FIG. 16.
Hereinafter, the storage area determination step by the storage area determination unit 15 using this information will be described.
When the analysis processing control unit 12 outputs a processing request to the storage area determination unit 15, the processing request includes at least the structured data and the analysis result information of the structured data. The update date/time information of the analyzed structured data is included as one item of the analysis result information. The structured data itself holds the URI as its own identification information. The update date/time information is stored in the structured data storage unit 11 as additional information for the structured data, and is acquired from the structured data storage unit 11 by the pre-analysis processing unit 13 or the analysis processing control unit 12. When the structured data itself does not contain a URI as its identification information, the URI is stored in the structured data storage unit 11 as one item of the additional information, so the pre-analysis processing unit 13 or the analysis processing control unit 12 acquires it from the structured data storage unit 11. The analysis processing control unit 12 then includes the URI of the structured data as one item of the analysis result information.

以下は、図１７のフローチャート図に従い説明する。
解析処理制御部１２から処理要求を受けた記憶領域決定部１５は、処理要求に含まれている構造化データの識別情報としてＵＲＩを取り出す。記憶領域決定部１５は、このＵＲＩと、記憶領域推定情報格納部１６が記憶する図１６に示したテーブルから構造化データのＵＲＩの値を比較し、一致する値が存在するか検索する。検索の結果、一致するＵＲＩが存在しない場合には、記憶領域決定部１５は、図１０に示したような解析結果情報とオブジェクト毎の記憶領域量の対応表や、図１５のような記憶領域推定情報に近似関数を用いた、実施の形態１や実施の形態２で説明した決定方法を実施して、記憶領域量を決定する。
逆に一致するＵＲＩが存在した場合には、記憶領域決定部１５は、次に処理要求に含まれる更新日時情報と図１６のテーブルの更新日時の値を比較する。比較の結果、双方の更新日時が一致するかチェックする。チェックの結果、一致した場合、前解析処理部１３が前解析を行った構造化データは、過去に解析した結果と同一構造を有する構造化データであると判明した。このため、記憶領域決定部１５は、過去のオブジェクト数をテーブルから取得し、取得したオブジェクト数を記憶領域量として利用する。一致しなかった場合には、ＵＲＩが存在しない場合と同様に、記憶領域決定部１５は、図１０に示したような解析結果情報とオブジェクト毎の記憶領域量の対応表や、図１５のような記憶領域推定情報に近似関数を用いた、実施の形態１や実施の形態２で説明した決定方法を実施する。 The following will be described with reference to the flowchart of FIG.
On receiving the processing request from the analysis processing control unit 12, the storage area determination unit 15 extracts the URI included in the request as the identification information of the structured data. It compares this URI with the URI values of structured data in the table of FIG. 16 stored in the storage area estimation information storage unit 16 and searches for a match. If no matching URI is found, the storage area determination unit 15 determines the storage area amount by the method described in the first or second embodiment, using a correspondence table between analysis result information and per-object storage area amounts as in FIG. 10, or storage area estimation information based on approximation functions as in FIG. 15.
Conversely, if a matching URI is found, the storage area determination unit 15 next compares the update date/time information included in the processing request with the update date/time value in the table of FIG. 16 and checks whether the two match. If they match, the structured data pre-analyzed by the pre-analysis processing unit 13 is known to have the same structure as the data analyzed in the past. The storage area determination unit 15 therefore acquires the past object counts from the table and uses them to determine the storage area amount. If they do not match, the storage area determination unit 15 applies the determination method described in the first or second embodiment, using the correspondence table of FIG. 10 or the approximation functions of FIG. 15, just as when no matching URI exists.

図１６では、一つの構造化データに対して、一つの更新日時、オブジェクト数の組が対応付けられているが、一つの構造化データに対して更新日時、オブジェクト数の組を複数対応付ける事も可能である。その場合には、同一ＵＲＩがあった場合には、そのＵＲＩに対応付けられている全ての組の更新日時と解析結果情報に含まれた更新日時を比較し、同一日時の情報がないか検索する事になる。 In FIG. 16, one set of update date/time and object counts is associated with each piece of structured data, but a plurality of such sets may also be associated with one piece of structured data. In that case, when the same URI is found, the update dates/times of all the sets associated with that URI are compared with the update date/time included in the analysis result information to search for an entry with the same date/time.
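The lookup of FIG. 17, including the variant with several update date/time entries per URI, can be sketched as a two-level dictionary. This is only an illustration under assumed data shapes: `ObjectCounts`, `History`, `lookup_past_counts`, and `determine_counts` are hypothetical names, and the `fallback` callable stands in for the estimation methods of the first and second embodiments.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ObjectCounts:
    """Object counts recorded when structured data was analyzed in the past."""
    elements: int
    attributes: int
    characters: int


# URI -> {update date/time -> counts}; one URI may hold several entries.
History = dict[str, dict[str, ObjectCounts]]


def lookup_past_counts(history: History, uri: str, updated: str) -> Optional[ObjectCounts]:
    """Return past counts only if both the URI and the update date/time match."""
    return history.get(uri, {}).get(updated)


def determine_counts(
    history: History,
    uri: str,
    updated: str,
    fallback: Callable[[], ObjectCounts],
) -> ObjectCounts:
    """Reuse past counts on an exact match, otherwise estimate via fallback."""
    past = lookup_past_counts(history, uri, updated)
    return past if past is not None else fallback()
```

A missing URI and a mismatched update date/time take the same branch here, which mirrors the text: both cases fall back to the table or approximation-function method.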

また、図１６では、要素数、属性数、文字数とに対してそれぞれ別々にオブジェクト数を設定した。このため、記憶領域決定部１５は、オブジェクト毎の単一サイズを、それぞれのオブジェクト数に掛けて、オブジェクト毎の記憶領域量を決定する。これは図１０や図１５の記憶領域推定情報格納部１６を使用して記憶領域量を決定した場合も同様である。このように、オブジェクト毎に記憶領域量を決定すると、解析処理制御部１２は、記憶領域決定部１５が決定したオブジェクト毎の記憶領域量を、記憶領域管理部１７に出力する。オブジェクト毎の記憶領域量を入力した記憶領域管理部１７は、データ記憶領域にオブジェクト毎の記憶領域量の記憶領域を確保する。 In FIG. 16, the number of objects is set separately for the number of elements, the number of attributes, and the number of characters. Therefore, the storage area determining unit 15 determines the storage area amount for each object by multiplying the single size for each object by the number of objects. The same applies to the case where the storage area amount is determined using the storage area estimation information storage unit 16 of FIGS. 10 and 15. As described above, when the storage area amount is determined for each object, the analysis processing control unit 12 outputs the storage area amount for each object determined by the storage area determination unit 15 to the storage area management unit 17. The storage area management unit 17 that has input the storage area amount for each object secures a storage area of the storage area amount for each object in the data storage area.

以上までで説明した実施の形態１〜３では、記憶領域量決定の具体的な例を挙げてきたが、以上の様な処理で求まる、構造化データの解析前に必要となる各内部データオブジェクト数の推定値や、全体で必要となるメモリサイズ（記憶領域量）の推定値を利用して、記憶領域管理部１７がデータ記憶領域の確保を行い、その確保した領域を構造化データ解析部１４が使用できるように、確保した領域を参照するための情報を、記憶領域管理部１７が解析処理制御部１２に返す。記憶領域管理部１７がデータ記憶領域の確保を行う場合には、記憶領域決定部１５が記憶領域推定情報に従い記憶領域量を求めることを説明した。しかし、内部データオブジェクト数の推定値を利用する場合には、記憶領域管理部１７がその推定値に従って各オブジェクトの配列を作成してもよいし、記憶領域管理部１７が全オブジェクトで必要となるメモリ量（記憶領域量）を算出して、データ記憶領域を一括確保してもよい。 In the first to third embodiments described above, specific examples of determining the storage area amount have been given. However, each internal data object required before the analysis of the structured data obtained by the above processing The storage area management unit 17 secures a data storage area using the estimated value of the number and the estimated value of the memory size (storage area amount) required as a whole, and the secured area is used as the structured data analysis unit. The storage area management unit 17 returns information for referring to the reserved area to the analysis processing control unit 12 so that the area 14 can be used. It has been described that when the storage area management unit 17 secures a data storage area, the storage area determination unit 15 obtains the storage area amount according to the storage area estimation information. However, when the estimated value of the number of internal data objects is used, the storage area management unit 17 may create an array of each object according to the estimated value, or the storage area management unit 17 is necessary for all objects. The memory amount (storage area amount) may be calculated to secure the data storage area at once.

この実施の形態では、前解析処理部１３で、前解析を行う構造化データに関する基本情報を前解析の結果の全てまたは一部とする記憶領域管理方式を実行する記憶領域管理装置の一例の構造化データメモリ管理装置１０について説明した。
また、構造化データの基本情報は、例えばある構造化データの出所情報（ＵＲＬ等）、構造化データの作成日時情報、更新日時情報として説明した。 In this embodiment, the pre-analysis processing unit 13 is a structure of an example of a storage area management apparatus that executes a storage area management method in which basic information related to structured data to be pre-analyzed is all or part of the result of the pre-analysis. The structured data memory management device 10 has been described.
The basic information of structured data has been described as, for example, source information (URL or the like) of certain structured data, creation date / time information of structured data, and update date / time information.

この実施の形態の構造化データメモリ管理装置１０は、過去に解析を行った構造化データと、今回解析する構造化データの構造が一致する場合、過去に解析した際の記憶領域量を今回の記憶領域量として使用するため、実際に使用する記憶領域量を確保できるので、記憶部の使用効率を向上できる効果がある。 When the structured data analyzed in the past and the structure of the structured data analyzed this time match, the structured data memory management device 10 of this embodiment determines the storage area amount when analyzed in the past. Since it is used as the amount of storage area, the amount of storage area to be actually used can be secured, so that the use efficiency of the storage unit can be improved.

実施の形態４．
実施の形態１〜３では、構造化データの全データを解析していた。この実施の形態では、構造化データのデータのうち、一部のデータを解析する例を説明する。
前解析処理部１３は、構造化データの全てを解析するのではなく、先頭からある特定の条件が満たされる箇所までを前解析の対象データとして前解析を行うようにしてもよい。
例えば前解析を行う前に、前解析を行う構造化データのファイルサイズを取得し、そのファイルサイズの特定の割合までを解析処理対象とする方法がある。この方法を利用すると、例えばＸＭＬコンテンツなどであれば、構造化データの最後のあたりでは終了タグの記述が多くなり、実質その部分はＤＯＭオブジェクトとして生成されないため、解析対象からはずした方が、メモリ使用量の推定は精度が高くなる可能性がある。このため、その最後の領域を削除する事が可能となる。このように、先頭からある特定の条件が満たされる箇所までを前解析の対象データとする場合は、解析対象とするデータサイズを、解析範囲情報１００として指定し、入力部１８よりデータサイズを指定した解析範囲情報１００を入力する。 Embodiment 4.
In the first to third embodiments, all the structured data is analyzed. In this embodiment, an example in which a part of the structured data is analyzed will be described.
The pre-analysis processing unit 13 does not analyze all of the structured data, but may perform pre-analysis using the data up to a point where a specific condition is satisfied as the target data of the pre-analysis.
For example, one method is to acquire the file size of the structured data before the pre-analysis and to treat only the portion up to a specific ratio of that file size as the analysis target. With XML content, for example, end tags become frequent near the end of the structured data, and since end tags are not generated as DOM objects, excluding that portion from the analysis may improve the accuracy of the memory usage estimate. This method makes it possible to exclude that final region. When the data from the beginning up to the point where a specific condition is satisfied is used as the target of the pre-analysis in this way, the data size to be analyzed is specified in the analysis range information 100, and the analysis range information 100 specifying that data size is input from the input unit 18.

また、コンテンツサイズを利用した方法以外には、構造化データの構造自身を利用した方法が考えられる。
この方法の場合には、例えばＨＴＭＬコンテンツであれば、＜ＨＥＡＤ＞の終了タグ＜／ＨＥＡＤ＞もしくは＜ＢＯＤＹ＞タグが出現する前までを前解析の対象であるとして、前解析の終了箇所を指定する事により、それ以降にメモリサイズの推定にあまり貢献できない記述が並ぶ場合には、その箇所以降を前解析の対象からはずす事が可能となる効果がある。このように、前解析の終了箇所を指定する場合には、その終了する箇所を判定するための情報を解析範囲情報１００として指定し、入力部１８より終了する箇所を判定するための情報を指定した解析範囲情報１００を入力する。 Besides the method using the content size, a method using the structure of the structured data itself can be considered.
With this method, in the case of HTML content for example, the end point of the pre-analysis is specified so that only the portion before the end tag </HEAD> of <HEAD>, or before the <BODY> tag, is pre-analyzed. If the subsequent portion consists of descriptions that contribute little to the memory size estimate, this has the effect of excluding everything after that point from the pre-analysis. When the end point of the pre-analysis is specified in this way, the information for determining the end point is specified in the analysis range information 100, and the analysis range information 100 is input from the input unit 18.

また、構造化データの途中の一部分を解析の対象として指定するようにしてもよい。
例えば、構造化データのファイルサイズを取得し、そのファイルサイズの特定の割合までを前解析の対象からはずし、それ以降からのデータから前解析を開始する方法がある。この場合には、指定したファイルサイズに相当する構造化データの部分は、解析対象としないことを、解析範囲情報１００して指定して、入力部１８より入力する。 A part of the structured data may be designated as an analysis target.
For example, one method is to acquire the file size of the structured data, exclude the portion up to a specific ratio of that file size from the pre-analysis, and start the pre-analysis from the data after that point. In this case, the analysis range information 100 specifies that the portion of the structured data corresponding to the designated file size is not to be analyzed, and is input from the input unit 18.

その他に、構造化データの構造自身を利用した方法も考えられる。
この場合には、例えばＨＴＭＬコンテンツであれば＜ＢＯＤＹ＞タグ以降を前解析の対象とすることによって、それ以前のコンテンツ内容を前解析からはずすと言った方法が考えられる。この場合、＜ＢＯＤＹ＞タグと、＜ＢＯＤＹ＞タグ以降を前解析の対象とすることを解析範囲情報１００に指定して、入力部１８より入力する。
このように先頭から数えて一部分を前解析の対象からはずす事によって、先頭にある不要なヘッダ情報などを前解析の対象からはずす事が可能となり、より精度よく記憶領域量の推定が可能となる効果がある。 In addition, a method using the structure of structured data itself can be considered.
In this case, for HTML content for example, the portion from the <BODY> tag onward can be made the target of the pre-analysis so that the content before it is excluded. The analysis range information 100 then specifies the <BODY> tag and that the portion from the <BODY> tag onward is the target of the pre-analysis, and is input from the input unit 18.
Excluding a portion counted from the beginning in this way removes unnecessary header information at the beginning from the pre-analysis, which makes it possible to estimate the storage area amount more accurately.
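The range selections described in this embodiment (a leading fraction of the data, everything before a marker, or everything from a marker onward) can be illustrated with three small helpers. The function names are hypothetical, the size ratio is applied to the character count rather than a byte count for simplicity, and markers are matched case-insensitively as an assumption about HTML input.

```python
import re


def prefix_by_ratio(data: str, ratio: float) -> str:
    """Keep only the leading fraction of the data, e.g. 0.8 for the first 80%."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    return data[: int(len(data) * ratio)]


def until_marker(data: str, markers: list[str]) -> str:
    """Keep the data before the first occurrence of any marker."""
    if not markers:
        return data
    pattern = "|".join(re.escape(m) for m in markers)
    match = re.search(pattern, data, re.IGNORECASE)
    return data[: match.start()] if match else data


def from_marker(data: str, marker: str) -> str:
    """Keep the data from the first occurrence of the marker onward."""
    match = re.search(re.escape(marker), data, re.IGNORECASE)
    return data[match.start():] if match else data
```

In terms of the text, the arguments to these helpers play the role of the analysis range information 100: a data size ratio, an end marker such as `</HEAD>` or `<BODY>`, or a start marker such as `<BODY>`.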

また、ＨＴＭＬの場合を例に挙げると、例えば＜ＦＯＲＭ＞タグ以下の全要素はそれ以外のタグの場合と比較して、メモリ使用量が統計的に多い事が分かっているとする。その場合には、前解析を開始し、＜ＦＯＲＭ＞タグを検出するまでは通常の前解析を行い、＜ＦＯＲＭ＞タグ以降の要素を前解析する場合には、通常の前解析よりも、メモリ量（記憶領域量）が多くなるような結果を出力する前解析処理に切り替え、前解析を継続する。そして＜ＦＯＲＭ＞タグ以下の前解析が終了（＜／ＦＯＲＭ＞タグを検出）した後は、通常の前解析に戻る。最終的に全ての前解析が完了した後で、二つの前解析の結果を統合する事で、一つの前解析結果とする。図１８は、前解析処理部１３が第１と第２の前解析処理部を備えて、記憶領域管理方式を実行する構造化データメモリ管理装置の機能ブロック図である。前解析処理部１３は、図１８のように、第１の解析処理部１３１と第２の解析処理部１３２とを備えて、前述した通常の前解析は、第１の解析処理部１３１により実行し、前述のメモリ量（記憶領域量）が多くなるような結果を出力する前解析処理は、第２の解析処理部１３２により実行する。 Further, taking the case of HTML as an example, it is assumed that, for example, it is known that all elements below the <FORM> tag have a statistically large memory usage compared to other tags. In that case, the pre-analysis is started, the normal pre-analysis is performed until the <FORM> tag is detected, and when the elements after the <FORM> tag are pre-analyzed, the memory is more effective than the normal pre-analysis. Switch to pre-analysis processing that outputs a result that increases the amount (storage area amount), and continue the pre-analysis. After the pre-analysis below the <FORM> tag is completed (the </ FORM> tag is detected), the normal pre-analysis is returned. After all the previous analyzes are finally completed, the results of the two previous analyzes are integrated into a single previous analysis result. FIG. 18 is a functional block diagram of a structured data memory management apparatus in which the pre-analysis processing unit 13 includes the first and second pre-analysis processing units and executes the storage area management method. As shown in FIG. 18, the pre-analysis processing unit 13 includes a first analysis processing unit 131 and a second analysis processing unit 132, and the normal pre-analysis described above is executed by the first analysis processing unit 131. The second analysis processing unit 132 executes the pre-analysis process for outputting a result that increases the memory amount (storage area amount) described above.

前述のように、前解析処理部１３は、構造化データ全てを同一の前解析アルゴリズム（第１の解析処理部１３１）で前解析するのではなく、一部を別の前解析アルゴリズム（第２の解析処理部１３２）で処理するとしてもよい。
例えば、ＨＴＭＬコンテンツを前解析する場合、＜ＦＯＲＭ＞タグの全要素は、通常の前解析アルゴリズムを適用するよりも、他の前解析アルゴリズムを適用した方が精度が高くなることが分かっている場合、前解析で＜ＦＯＲＭ＞タグを発見した場合には、その終了タグが発見されるまでは別の前解析アルゴリズムを適用し、終了タグ以降はまた通常の前解析アルゴリズムを適用して、全てを前解析してもよい。このようにアルゴリズムを変更する場合には、構造化データの特別な構造に関しては、他の構造と比較して明らかに異なるメモリ消費が分かっている場合には、有効な前解析方法となる。 As described above, the pre-analysis processing unit 13 does not pre-analyze all the structured data with the same pre-analysis algorithm (first analysis processing unit 131), but partially analyzes it with another pre-analysis algorithm (second May be processed by the analysis processing unit 132).
For example, when pre-analyzing HTML content, suppose it is known that applying a different pre-analysis algorithm to all elements under the <FORM> tag gives higher accuracy than applying the normal pre-analysis algorithm. In that case, when a <FORM> tag is found during the pre-analysis, the other pre-analysis algorithm is applied until its end tag is found, and the normal pre-analysis algorithm is applied again after the end tag, so that the whole of the data is pre-analyzed. Changing the algorithm in this way is an effective pre-analysis method when a particular structure in the structured data is known to consume memory in a clearly different way from other structures.
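Switching between two pre-analysis algorithms and integrating their results might be sketched as follows. The sketch assumes that both algorithms simply count start tags and that the second one weights its count more heavily; the weight of 2.0, the regular expression, and the name `pre_analyze` are illustrative assumptions, and nested forms are not handled.

```python
import re

# Matches the start of a tag: optional "/" for an end tag, then the tag name.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)")


def pre_analyze(html: str, special: str = "form", special_weight: float = 2.0) -> float:
    """Count start tags, weighting those inside the special element more heavily.

    Tags outside <form>...</form> go through the "normal" counter, tags inside
    go through the "special" counter, and the two results are combined at the end.
    """
    normal_count = 0
    special_count = 0
    inside = False
    for match in TAG.finditer(html):
        is_end_tag = match.group(1) == "/"
        name = match.group(2).lower()
        if name == special:
            inside = not is_end_tag
            if not is_end_tag:
                special_count += 1  # the opening tag of the special element
            continue
        if is_end_tag:
            continue  # end tags do not become objects, so they are not counted
        if inside:
            special_count += 1
        else:
            normal_count += 1
    # Integrate the two partial results into a single pre-analysis result.
    return normal_count + special_weight * special_count
```

Keeping the two counters separate until the end follows the description of merging the results of the first and second analysis processing units into one pre-analysis result.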

また、ＨＴＭＬコンテンツを前解析する場合、例えば＜ＦＯＲＭ＞タグは、＜ＦＯＲＭ＞タグの終了を示す＜／ＦＯＲＭ＞タグと対で記述される。このため、＜ＦＯＲＭ＞タグから＜／ＦＯＲＭ＞タグまでを、解析対象とすることも可能である。この場合、＜ＦＯＲＭ＞タグを特定情報として解析範囲情報１００に指定して、入力部１８より入力すると、前解析処理部１３は、特定情報である＜ＦＯＲＭ＞タグに対応する終了情報が＜／ＦＯＲＭ＞タグであることを判断して、＜ＦＯＲＭ＞タグから＜／ＦＯＲＭ＞タグまでを、解析対象とする。なお、＜／ＦＯＲＭ＞タグを＜ＦＯＲＭ＞タグに対応する情報として、解析範囲情報１００に指定してもいいし、予め構造化データメモリ管理装置１０の備えるファイル群９２４に特定情報と特定情報に対応する終了情報とを対にして記憶させておき、これを前解析処理部１３が参照するようにしてもかまわない。 Further, when pre-analyzing HTML content, for example, a <FORM> tag is described in a pair with a </ FORM> tag indicating the end of the <FORM> tag. For this reason, the <FORM> tag to the </ FORM> tag can be analyzed. In this case, when the <FORM> tag is specified as the specific information in the analysis range information 100 and input from the input unit 18, the pre-analysis processing unit 13 displays the end information corresponding to the <FORM> tag as the specific information as </ The FORM> tag is determined, and the <FORM> tag to the </ FORM> tag are analyzed. Note that the </ FORM> tag may be specified in the analysis range information 100 as information corresponding to the <FORM> tag, or the file group 924 included in the structured data memory management device 10 may be specified in advance as specific information and specific information. The corresponding end information may be stored as a pair and referred to by the pre-analysis processing unit 13.

この実施の形態では、前解析処理部１３で、構造化データの先頭から一部分までを前解析する事を特徴とする記憶領域管理方式を実行する記憶領域管理装置の一例の構造化データメモリ管理装置１０を説明した。
また、前解析処理部１３で、構造化データのある特定箇所から前解析を開始し、構造化データの一部分を前解析する事を特徴とする記憶領域管理方式を実行する記憶領域管理装置の一例の構造化データメモリ管理装置１０を説明した。
また、前解析処理部１３で、前解析処理中に特定情報が検出された場合に、その特定情報に対応する特定終了情報が検出されるまで前解析処理をすることなく、構造化データを読み飛ばす事を特徴とする記憶領域管理方式を実行する記憶領域管理装置の一例の構造化データメモリ管理装置１０を説明した。 In this embodiment, the pre-analysis processing unit 13 performs a pre-analysis from the beginning to a part of the structured data, and a structured data memory management device as an example of a storage region management device that executes a storage region management method. 10 explained.
Also, an example of a storage area management apparatus that executes a storage area management system characterized in that the pre-analysis processing unit 13 starts pre-analysis from a specific location of structured data and pre-analyzes a part of the structured data. The structured data memory management apparatus 10 has been described.
Further, when the pre-analysis processing unit 13 detects specific information during the pre-analysis processing, the structured data is read without performing the pre-analysis processing until specific end information corresponding to the specific information is detected. The structured data memory management apparatus 10 as an example of the storage area management apparatus that executes the storage area management system characterized by skipping has been described.

また、前解析処理部１３で、前解析処理中に特定情報が検出された場合に、その特定情報に対応する特定終了情報が検出されるまで前解析処理を異なる処理体系とする事を特徴とする記憶領域管理方式を実行する記憶領域管理装置の一例の構造化データメモリ管理装置１０を説明した。 Further, when specific information is detected during the pre-analysis process in the pre-analysis processing unit 13, the pre-analysis process is set to a different processing system until specific end information corresponding to the specific information is detected. The structured data memory management apparatus 10 as an example of the storage area management apparatus that executes the storage area management method is described.

この実施の形態の構造化データメモリ管理装置１０は、解析対象の範囲を指定することによって、構造化データを構成するデータの特徴を、記憶領域量の推定に反映させることが出来るので、精度よく記憶領域量の推定が可能となる効果がある。 The structured data memory management device 10 of this embodiment can reflect the characteristics of the data constituting the structured data in the estimation of the storage area amount by specifying the range to be analyzed. There is an effect that the amount of storage area can be estimated.

実施の形態５．
この実施の形態では、構造化データに含まれる特徴データの密度を解析結果情報の１つにすることを説明する。
解析結果情報として構造化データが含む特徴データの個数だけではなく、その密度を利用してもよい。密度とは、ある特定の領域のメモリサイズで、その特定の領域に含まれる各構造化データの特徴データの個数を割った値（割合）である。密度を利用する場合としては、全体としての個数を利用する場合よりも、特定領域の密度を利用した方が精度のよい推定が行える場合等が考えられる。 Embodiment 5.
In this embodiment, it will be described that the density of feature data included in structured data is one analysis result information.
As the analysis result information, not only the number of feature data included in the structured data but also its density may be used. The density is a value (ratio) obtained by dividing the number of feature data of each structured data included in the specific area by the memory size of the specific area. As a case where the density is used, it is conceivable that the estimation can be performed with higher accuracy by using the density of the specific region than when using the total number.

For example, in XML the "feature data of structured data" are the data that characterize the structure, such as "<" (the tag start symbol), "=" (the attribute value assignment symbol), and "<!--" (the comment start symbol). The "density of the feature data of structured data" is the value obtained by dividing the number of occurrences of each kind of feature data, such as "<", "=", and "<!--", by the total number of characters in the content. Instead of the total number of characters, the divisor may be the number of characters in the specific region, described in Embodiment 4, that forms the part of the structured data to be analyzed.
This value is useful when examining the density or distribution within a specific region yields a more accurate estimate than counting how many feature items exist overall.
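The density calculation described here can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `feature_density`, the marker table, and the use of character offsets to delimit the specific region are all assumptions, and the comment marker is written as the standard XML opener `<!--`.

```python
from typing import Dict, Optional

# Hypothetical marker table based on the XML examples in the text.
# Note: a plain count of "<" also includes closing tags and comment openers.
FEATURE_MARKERS = {
    "tag_start": "<",
    "attr_assign": "=",
    "comment_start": "<!--",
}


def feature_density(text: str, start: int = 0,
                    end: Optional[int] = None) -> Dict[str, float]:
    """Return (marker count) / (character count) for each marker in text[start:end]."""
    region = text[start:end]
    if not region:
        # An empty region has no characters to divide by.
        return {name: 0.0 for name in FEATURE_MARKERS}
    return {name: region.count(marker) / len(region)
            for name, marker in FEATURE_MARKERS.items()}
```

Calling `feature_density(doc)` gives the density over the whole content, while passing `start` and `end` restricts it to a specific region in the manner of Embodiment 4.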

The pre-analysis processing unit 13 uses the density described above as the analysis result information. Alternatively, as in Embodiments 1 to 3, it uses the density together with the number of objects. When the analysis result information contains both the number of objects and the density, the storage area determination unit 15 determines a storage area amount from each of the two analysis results and outputs either the larger or the smaller of the two to the analysis processing control unit 12. Which one is output depends on the system policy: whether to secure only the minimum required storage area amount or to secure it with a margin. For this reason, the storage area determination unit 15 displays both storage area amounts on the display device 901 so that the user can select one of them through the input unit 18.

This embodiment has described the structured data memory management apparatus 10 as an example of a storage area management apparatus that executes a storage area management method in which the pre-analysis processing unit 13 uses the density of the feature data of the structured data as analysis result information.

The structured data memory management apparatus 10 of this embodiment is effective when examining the density or distribution within a specific region gives a more accurate estimate than counting how many feature items exist overall.

Embodiment 6.
This embodiment describes an example of the structured data memory management apparatus 10 in which the storage area determination unit 15 determines, as the storage area amount, one of the following: the maximum storage area amount, the minimum storage area amount, or the rejectable storage area amount, for which statistics assure that the secured storage area amount will not run short.
FIG. 19 shows an example of the storage area estimation information storage unit 16, which stores the maximum, minimum, and rejectable storage area amounts for each of a plurality of object types: (a) corresponds to the Element object, (b) to the Attr object, and (c) to the Text object.

As shown in FIG. 19, the storage area estimation information storage unit 16 stores the maximum storage area amount, the minimum storage area amount, and the rejectable storage area amount. When the storage area determination unit 15 has determined the storage area amount for structured data, it compares the determined amount with the maximum storage area amount held in the storage area estimation information storage unit 16; if the determined amount is greater than the stored maximum, it updates the stored maximum with the determined amount. Likewise, it compares the determined amount with the stored minimum storage area amount; if the determined amount is less than the stored minimum, it updates the stored minimum with the determined amount.
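The max/min update rule can be illustrated with a small sketch. The class name `EstimateEntry` and its field names are hypothetical; the text only specifies that one row per object type holds the three amounts and that the maximum and minimum are overwritten when a newly determined amount falls outside them.

```python
from dataclasses import dataclass


@dataclass
class EstimateEntry:
    """One row of the estimation table for a single object type (illustrative names)."""
    max_amount: int
    min_amount: int
    rejectable_amount: int

    def record(self, determined_amount: int) -> None:
        """Widen the stored max/min when the determined amount lies outside them."""
        if determined_amount > self.max_amount:
            self.max_amount = determined_amount
        if determined_amount < self.min_amount:
            self.min_amount = determined_amount
```

The rejectable amount is deliberately left untouched by `record`, since the text derives it by a separate statistical procedure.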

The rejectable storage area amount is obtained statistically as follows.
Suppose that pre-analysis of some structured data A yields the analysis result Res(A). In the correspondence table stored in the storage area estimation information storage unit 16, which associates a value of the analysis result information (or a set of such values) with a storage area amount, Res(A) is classified into a category Pattern(k). Consider the set whose elements are pairs of a pre-analysis result belonging to Pattern(k) and the memory size actually used for it:
S = {(x, y) | x ∈ Pattern(k), y is the actual memory size used for x}
Statistically testing the y values of S yields a value y' that marks the rejection region, and this y' is used as "the amount at which the occurrence of a memory shortage can be rejected for Pattern(k)".
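The text does not say which statistical test is applied to the y values. As one possible reading, the sketch below assumes the observed memory sizes are roughly normally distributed and takes y' as the one-sided upper bound mean + z·stdev at significance level `alpha`. The function name, the `alpha` parameter, and the representation of each sample as a `(pattern_index, actual_size)` pair are all assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean, stdev
from typing import Iterable, Tuple


def rejectable_amount(samples: Iterable[Tuple[int, int]],
                      pattern_k: int, alpha: float = 0.05) -> int:
    """Estimate y' for Pattern(k) under a normality assumption (not from the source).

    samples: (pattern index, actually used memory size) pairs from past analyses.
    """
    ys = [y for k, y in samples if k == pattern_k]
    if len(ys) < 2:
        raise ValueError("need at least two observations for the pattern")
    # Quantile of the standard normal distribution for a one-sided bound.
    z = NormalDist().inv_cdf(1.0 - alpha)
    return math.ceil(mean(ys) + z * stdev(ys))
```

A smaller `alpha` gives a larger y', trading memory for a lower chance of shortage; an empirical quantile of the observed sizes would be an equally plausible choice.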

When the storage area determination unit 15 is to determine a storage area amount with a margin, it obtains the maximum storage area amount from the storage area estimation information storage unit 16 and outputs it to the analysis processing control unit 12. When it is to determine the minimum required storage area amount, it obtains the minimum storage area amount and outputs it to the analysis processing control unit 12. When it is to determine a storage area amount just large enough to avoid a shortage, it obtains the rejectable storage area amount and outputs it to the analysis processing control unit 12.
Which of these three approaches applies is set in advance by the user on the structured data memory management apparatus 10 through the input unit 18 and stored in the file group 924. The storage area determination unit 15 refers to the information stored in the file group 924 to decide which of the maximum, minimum, and rejectable storage area amounts to obtain from the storage area estimation information storage unit 16.
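The policy-driven choice among the three stored amounts might look like the following sketch. The `Policy` enum, its member names, and the dictionary keys are hypothetical stand-ins for the user setting kept in the file group 924 and for a row of the estimation table.

```python
from enum import Enum
from typing import Dict


class Policy(Enum):
    """User-selected allocation policy (names are illustrative)."""
    WITH_MARGIN = "max"            # avoid shortages as far as possible
    MINIMUM = "min"                # reserve as little as possible
    AVOID_SHORTAGE = "rejectable"  # statistically unlikely to run short


def select_amount(entry: Dict[str, int], policy: Policy) -> int:
    """Pick the stored amount that matches the configured policy."""
    return entry[policy.value]
```

For example, `select_amount({"max": 120, "min": 40, "rejectable": 95}, Policy.AVOID_SHORTAGE)` returns the rejectable amount.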

The maximum, minimum, and rejectable storage area amounts can also be set together with additional information such as that shown in FIG. 16.
FIG. 20 shows an example in which the maximum, minimum, and rejectable storage area amounts are set in the additional information stored in the storage area estimation information storage unit 16.
When the analysis result information contains basic information, or consists of basic information, the storage area determination unit 15 refers to the information stored in the file group 924 to decide which of the maximum, minimum, and rejectable storage area amounts to obtain from the storage area estimation information storage unit 16.

This embodiment has described the structured data memory management apparatus 10 as an example of a storage area management apparatus that executes a storage area management method in which the storage area determination unit 15 takes, as the storage area amount derived from the storage area estimation information, the largest amount of memory (storage area amount) actually required among the multiple pieces of structured data that yielded the same analysis result information in the past.
This is effective when the policy is to avoid memory shortages as far as possible.

This embodiment has also described the structured data memory management apparatus 10 as an example of a storage area management apparatus that executes a storage area management method in which the storage area determination unit 15 takes, as the storage area amount derived from the storage area estimation information, the smallest amount of memory (storage area amount) actually required among the multiple pieces of structured data that yielded the same analysis result information in the past.
This is effective when the policy is to reserve as little memory as possible.

This embodiment has further described the structured data memory management apparatus 10 as an example of a storage area management apparatus that executes a storage area management method in which the storage area determination unit 15 selects, as the storage area amount derived from the storage area estimation information, an amount at which a memory shortage can be statistically rejected, based on the multiple pieces of structured data that yielded the same analysis result information in the past.
This is effective when the policy is to reduce the amount of memory reserved while still keeping the rate of memory shortages low.

The structured data memory management apparatus 10 of this embodiment is therefore effective under any of three policies: avoiding memory shortages as far as possible, reserving as little memory as possible, or reducing the amount of memory reserved while keeping the rate of memory shortages low.

Embodiment 7.
The storage area amount obtained by the storage area determination unit 15 in Embodiments 1 to 6 described above also depends on the policy of the system as a whole. Specifically, for a situation classified from the analysis result information, the storage area amount computed from the same analysis result information differs between a policy under which no structured data may run short of storage area (policy 1) and a policy under which only the minimum storage area amount needed by all structured data is secured (policy 2). This difference in policy is reflected in the storage area estimation information stored in the storage area estimation information storage unit 16 and in the algorithm of the storage area determination unit 15.
Next, the structured data analysis processing performed by the structured data analysis unit 14 is described.
After securing a data storage area in the storage unit, the storage area management unit 17 accepts requests from the structured data analysis unit 14 for the storage area amount it will use within the secured data storage area, that is, the amount needed to store data converted into the internal data format. If the requested amount remains unused in the secured data storage area, the storage area management unit 17 outputs a notification permitting use to the structured data analysis unit 14 and also notifies it of the address information and the like that it needs in order to use the permitted region. If the requested amount does not remain unused, the storage area management unit 17 notifies the structured data analysis unit 14 of the shortage of data storage area and also notifies the analysis processing control unit 12. In this way, the storage area management unit 17 manages the remaining amount of the secured data storage area.
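The bookkeeping performed by the storage area management unit 17 resembles a simple bump allocator over a pre-reserved region. The sketch below is an assumption-laden illustration: the class name, the use of an integer offset as the "address information", and returning `None` as the shortage notification are not taken from the source.

```python
from typing import Optional


class StorageAreaManager:
    """Tracks the unused remainder of a pre-reserved data storage area."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity  # amount decided by the determination unit
        self.used = 0

    @property
    def remaining(self) -> int:
        return self.capacity - self.used

    def request(self, size: int) -> Optional[int]:
        """Grant `size` units and return their offset, or None on shortage."""
        if size <= self.remaining:
            offset = self.used
            self.used += size
            return offset
        return None  # caller treats this as the shortage notification
```

In a fuller model the `None` result would also be forwarded to the analysis processing control unit 12, which the text describes as receiving the same shortage notification.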

The analysis processing control unit 12 outputs to the structured data analysis unit 14 an analysis processing request containing the structured data to be analyzed and the information for referring to the storage area that the storage area management unit 17 has secured as the data storage area. On receiving the request, the structured data analysis unit 14 starts analyzing the structured data. Whenever it needs storage area to store data converted into the internal data format, it requests the necessary amount from the storage area management unit 17. On receiving a notification permitting use, it stores the converted data in the data storage area indicated by the address information notified by the storage area management unit 17. When analysis finishes, it outputs to the analysis processing control unit 12 a reference to the root element of the generated internal data format, which in the case of DOM is the Document object.

Next, a concrete example is described of how to handle the case where, during analysis by the structured data analysis unit 14, the storage area secured by the storage area management unit 17 is not enough to represent all of the structured data in the internal data format.
FIG. 21 shows an example of the processing sequence performed when the secured storage area runs short.
In the shortage notification step S20, the storage area management unit 17, having detected a storage area shortage, sends a storage area shortage notification to the structured data analysis unit 14. In response, the structured data analysis unit 14 notifies the storage area management unit 17 of analysis position information indicating how far into the structured data analysis has progressed, for example a reference to the data position at which analysis should next start, the size of the data already analyzed, or the size of the data not yet analyzed. The storage area management unit 17 then sends a storage area shortage notification, which includes this analysis position information, to the analysis processing control unit 12.
On receiving the storage area shortage notification, the analysis processing control unit 12 issues to the pre-analysis processing unit 13 a repeat pre-analysis request for securing additional storage area. This request includes the analysis position information described above.

In the re-analysis processing step S21, the pre-analysis processing unit 13, having received the repeat pre-analysis request, determines from the analysis position information how far the structured data has been analyzed and extracts the part that has not yet been analyzed. It then pre-analyzes that extracted part and obtains analysis result information. Here it may perform processing equivalent to the analysis processing step S1 of FIG. 5, or it may use a different pre-analysis scheme. For example, it may obtain the proportion of the whole structured data that is still unanalyzed, multiply the analysis result information obtained by the same analysis processing as step S1 of FIG. 5 by that proportion, and use the product as the analysis result information of the repeat pre-analysis. Alternatively, it may apply a bias to the same pre-analysis processing according to the analysis position information. "Applying a bias" here means replacing the multiplication in the method just described with an addition: a value determined by the proportion of unanalyzed data (generally a negative value) is added to the value of the analysis result information obtained by the same analysis processing as step S1 of FIG. 5. After the repeat pre-analysis finishes, the pre-analysis processing unit 13 returns the resulting analysis result information to the analysis processing control unit 12.
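The two shortcut variants for the repeat pre-analysis, scaling by the unanalyzed proportion and adding a bias, can be compared in a short sketch. The function names and the `bias_for` callback are hypothetical; the text only states that the bias is a value determined by the proportion of unanalyzed data and is generally negative.

```python
from typing import Callable


def scaled_result(full_result: float, unanalyzed: int, total: int) -> float:
    """Multiply the whole-document analysis result by the unanalyzed proportion."""
    return full_result * (unanalyzed / total)


def biased_result(full_result: float, unanalyzed: int, total: int,
                  bias_for: Callable[[float], float]) -> float:
    """Add a bias (generally negative) chosen from the unanalyzed proportion."""
    return full_result + bias_for(unanalyzed / total)
```

With a whole-document result of 200 objects and 250 of 1000 bytes still unanalyzed, `scaled_result(200, 250, 1000)` yields 50.0, and a bias function such as `lambda r: -(1 - r) * 200` (an arbitrary choice for this example) makes `biased_result` yield the same figure.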

In the storage area redetermination step S22, the analysis processing control unit 12, having received the analysis result information obtained in S21, issues a redetermination request together with that result to the storage area determination unit 15.
On receiving the redetermination request, the storage area determination unit 15 uses the analysis result information obtained in S21 to determine the additional storage area amount to secure. It may use the same storage area amount determination processing as in the storage area determination step S2 of FIG. 5, and it may obtain the same storage area amount determination information from the storage area estimation information storage unit 16. Conversely, a determination scheme dedicated to redetermination, or different storage area amount determination information, may be prepared and used at redetermination time. When the determination finishes, the unit returns the determined storage area amount to the analysis processing control unit 12.

In the storage area re-management step S23, the analysis processing control unit 12, having received the storage area amount, issues a re-allocation request together with that amount to the storage area management unit 17.
On receiving the re-allocation request, the storage area management unit 17 secures an additional data storage area based on the storage area amount and returns the information for referring to that area to the analysis processing control unit 12.

In the actual analysis processing step S24, the analysis processing control unit 12, having received the information for referring to the additional storage area, passes that information to the structured data analysis unit 14. Receipt of this information triggers the structured data analysis unit 14 to resume analysis using the data storage area it refers to.

In the storage area redetermination step S22 described above, if the storage area management unit 17 manages the storage area as an array for each object type, the structured data analysis unit 14 may notify the storage area management unit 17 of which object type ran short, and the storage area management unit 17 may pass this on to the analysis processing control unit 12. The analysis processing control unit 12 may then use this information to control the pre-analysis processing unit 13, the storage area determination unit 15, and the storage area management unit 17 so that they process only that object type.

When the range to be pre-analyzed has been specified by analysis range information, part of the structured data remains that has not been pre-analyzed. On receiving a data area shortage notification, the analysis processing control unit 12 may request the pre-analysis processing unit 13 to analyze the part of the structured data that was excluded from the previous pre-analysis. On receiving this request, the pre-analysis processing unit 13 determines from the analysis range information 100 which part of the structured data was excluded and starts re-analysis.

This embodiment has described the structured data memory management apparatus 10 as an example of a storage area management apparatus that executes a storage area management method with the following behavior. The storage area management unit 17 secures a data storage area based on the storage area amount obtained by the storage area determination unit 15. If that data storage area runs short of memory while the structured data is being analyzed, the pre-analysis processing unit 13 again pre-analyzes the part of the structured data that the structured data analysis unit 14 has not yet analyzed, the storage area determination unit 15 determines from the resulting analysis result information the storage area amount to secure again, the storage area management unit 17 secures a data storage area again based on that amount, and the structured data analysis unit 14 resumes analysis using the newly secured data storage area.

Even if the data storage area, once secured, runs short during analysis of the structured data, the structured data memory management apparatus 10 of this embodiment secures an additional data storage area, which prevents analysis of the structured data from failing because of a storage area shortage.
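The overall recovery sequence of steps S20 to S24 can be condensed into a single loop for illustration. This sketch simplifies heavily: the units are collapsed into one function, the reserved storage is modeled as one running total rather than separately referenced areas, and `estimate_remaining` is a hypothetical callback standing in for the repeat pre-analysis (S21) and redetermination (S22).

```python
from typing import Callable, Sequence, Tuple


def analyze_with_regrowth(node_sizes: Sequence[int], initial_amount: int,
                          estimate_remaining: Callable[[int], int]) -> Tuple[int, int]:
    """Consume node_sizes in order, reserving more storage whenever it runs short.

    Returns (total amount reserved, number of additional reservations).
    """
    reserved = initial_amount
    used = 0
    regrowths = 0
    for position, size in enumerate(node_sizes):
        while used + size > reserved:             # S20: shortage detected
            extra = estimate_remaining(position)  # S21-S22: re-estimate from here
            if extra <= 0:
                raise MemoryError("estimator returned no additional storage")
            reserved += extra                     # S23: additional area secured
            regrowths += 1
        used += size                              # S24: analysis resumes
    return reserved, regrowths
```

Passing `position` to the estimator mirrors the analysis position information in the text, which tells the pre-analysis where the unanalyzed part begins.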

Embodiment 8.
This embodiment describes an example in which the storage area amount to be secured again in the data storage area, for storing data converted into the internal data format, is obtained without repeating the pre-analysis.
When a storage area shortage occurs, the storage area management unit 17 again secures in the storage unit a data storage area of the same amount as that secured in the storage area management step S3 of FIG. 5. That is, in the shortage notification step S20 of FIG. 21, the storage area management unit 17 does not notify the analysis processing control unit 12 of the shortage; instead it re-secures a data storage area of the same amount as was secured previously and notifies the analysis processing control unit 12 of the information for referring to the re-secured area. The actual analysis processing step S24 then starts. In this case, the storage area determination unit 15 stores the previous storage area amount in the storage unit or in the file group 924, and the storage area management unit 17 retrieves it from there.

The structured data memory management apparatus 10 may also include an additional storage amount storage unit that stores in advance the storage area amount to be secured again.
FIG. 22 is a block diagram of the structured data memory management apparatus 10 including the additional storage amount storage unit.
In FIG. 22, the additional storage amount storage unit 19 stores in advance the storage area amount to be re-secured when a data storage area is secured again. The other elements are the same as in FIG. 1.
On being notified by the storage area management unit 17 of a shortage of data storage area, the analysis processing control unit 12 requests the storage area determination unit 15 to determine the storage area amount to secure again. That is, the re-analysis processing step S21 performed by the pre-analysis processing unit 13 in FIG. 21 is skipped, and the storage area determination unit 15 obtains the storage area amount to secure again from the additional storage amount storage unit 19.

前述のように、図５の記憶領域管理ステップＳ３で確保した記憶領域量と同じ記憶領域量のデータ記憶領域を確保したり、或いは、追加記憶量記憶部１９から再度確保する記憶領域量を取得することにより、前解析処理部１３による前解析処理が不要となるため、構造化データ解析部１４による処理を早く再開できる効果がある。 As described above, a data storage area having the same storage area amount as that secured in the storage area management step S3 in FIG. 5 is secured, or a storage area amount to be secured again is obtained from the additional storage quantity storage unit 19. This eliminates the need for the pre-analysis processing by the pre-analysis processing unit 13, and thus has an effect of resuming the processing by the structured data analysis unit 14 quickly.
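The two re-reservation policies described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name `StorageAreaManager`, its methods, and the use of `bytearray` as the data storage area are all assumptions made for the example. The optional `additional_amount` stands in for the value held by the additional storage amount storage unit 19.

```python
from typing import Optional


class StorageAreaManager:
    """Hypothetical stand-in for the storage area management unit 17."""

    def __init__(self, additional_amount: Optional[int] = None) -> None:
        # additional_amount plays the role of the additional storage amount
        # storage unit 19; None means "reuse the previously secured amount".
        self.additional_amount = additional_amount
        self.last_amount = 0
        self.areas = []

    def reserve(self, amount: int) -> bytearray:
        """Initial reservation (step S3): remember the amount for later reuse."""
        self.last_amount = amount
        area = bytearray(amount)
        self.areas.append(area)
        return area

    def reserve_again(self) -> bytearray:
        """Called on a shortage; no pre-analysis is run to size the new area."""
        amount = self.additional_amount
        if amount is None:
            amount = self.last_amount
        area = bytearray(amount)
        self.areas.append(area)
        return area
```

Because neither path invokes the pre-analysis, the cost of a shortage is limited to one extra allocation, which is the effect the text attributes to this embodiment.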

また、実施の形態７に記載した記憶領域量を再度決定するための、再決定専用の記憶領域量決定処理方式や異なる記憶領域量決定情報を、追加記憶量記憶部１９に記憶するようにしてもかまわない。
この場合、データ記憶領域量の不足が発生した際の処理手順は、図２１の処理手順となる。ただし、記憶領域決定部１５は、記憶領域再決定ステップＳ２２で、記憶領域推定情報格納部１６をアクセスする代わりに、追加記憶量記憶部１９をアクセスする。 In addition, a storage area amount determination processing method dedicated to redetermination, or different storage area amount determination information, for determining the storage area amount again as described in the seventh embodiment may be stored in the additional storage amount storage unit 19.
In this case, the processing procedure when a shortage of the data storage area amount occurs is the processing procedure of FIG. 21. However, in the storage area redetermination step S22, the storage area determination unit 15 accesses the additional storage amount storage unit 19 instead of the storage area estimation information storage unit 16.

この実施の形態では、データ記憶領域に不足が発生した後、記憶領域決定部１５が使用する記憶領域推定情報は初回に使用した前記記憶領域推定情報と同一であるとして、初回に確保した記憶領域量と同じ記憶領域量を再度確保することを特徴とする記憶領域管理方式を実行する記憶領域管理装置の一例の構造化データメモリ管理装置１０を説明した。 In this embodiment, after a shortage occurs in the data storage area, the storage area estimation information used by the storage area determination unit 15 is the same as the storage area estimation information used for the first time, and the storage area secured for the first time The structured data memory management apparatus 10 as an example of the storage area management apparatus that executes the storage area management system characterized by re-allocating the same storage area amount as the amount has been described.

また、データ記憶領域に不足が発生した後、記憶領域決定部１５が使用する記憶領域推定情報は初回に使用した記憶領域推定情報とは異なる物であることを特徴とする記憶領域管理方式を実行する記憶領域管理装置の一例の構造化データメモリ管理装置１０を説明した。例えば、データ記憶領域が不足した場合には、再度確保する記憶領域量を予め記憶する追加記憶量記憶部を備える。また、不足時用の推定情報を別に記憶する追加記憶量記憶部を備える。 In addition, the structured data memory management device 10 has been described as an example of a storage area management apparatus that executes a storage area management method in which, after a shortage occurs in the data storage area, the storage area estimation information used by the storage area determination unit 15 is different from the storage area estimation information used the first time. For example, the device includes an additional storage amount storage unit that stores in advance the storage area amount to be secured again when the data storage area is insufficient. Alternatively, the device includes an additional storage amount storage unit that separately stores estimation information for use at the time of a shortage.

この実施の形態の構造化データメモリ管理装置１０は、データ記憶領域に不足が発生した場合、前解析処理部１３による前解析処理が不要となるため、構造化データ解析部１４による処理を早く再開できる効果がある。 In the structured data memory management device 10 according to this embodiment, the pre-analysis processing by the pre-analysis processing unit 13 is not required when the data storage area is insufficient, so the processing by the structured data analysis unit 14 can be resumed quickly.

実施の形態９．
この実施の形態では、構造化データ解析部１４による解析処理で実際に必要となった記憶領域量を、記憶領域量の推定に反映する一例を説明する。
図２３は、推定情報補正部２０を備える構造化データメモリ管理装置１０を示すブロック図である。
図２３の構造化データメモリ管理装置１０は、記憶領域推定情報格納部１６が記憶する記憶領域推定情報を補正する推定情報補正部２０を備える。他の要素は、図１と同様である。
推定情報補正部２０は、構造化データ解析部１４による解析処理終了後に、解析前に決定した領域サイズと実際に必要となった領域サイズの誤差を記憶領域推定情報に反映する。補正する手順は、はじめに、解析処理制御部１２が、構造化データ解析部１４から解析処理の終了通知を受信する。そして、解析処理制御部１２は、記憶領域管理部１７に対して、構造化データ解析部１４が実際に使用したデータ記憶領域に確保した内部データ形式に変換したデータを記憶する記憶領域の記憶領域量を記憶領域推定情報に反映させることを要求する。記憶領域管理部１７は、実際に使用した記憶領域量をデータ記憶領域を調査して、実際に使用した記憶領域量を取得する。さらに、記憶領域管理部１７は、図５の記憶領域管理ステップＳ３で確保した記憶領域量、及び、図２１の記憶領域再管理ステップＳ２３で再度確保した記憶領域量とを合わせてた推定した記憶領域量を求めて、推定した記憶領域量と、実際に使用した記憶領域量とを、解析処理制御部１２に通知する。通知を受けた解析処理制御部１２は、推定した記憶領域量と、実際に使用した記憶領域量とを推定情報補正部２０に通知するとともに、記憶領域推定情報の補正を要求する。推定情報補正部２０は、推定した記憶領域量と、実際に使用した記憶領域量との差を算出し、その算出した値を利用して、次回以降の解析前の記憶領域量確保の精度を上げるために、記憶領域推定情報格納部１６が記憶する記憶領域推定情報の更新を行う。 Embodiment 9.
In this embodiment, an example in which the storage area amount actually required in the analysis processing by the structured data analysis unit 14 is reflected in the estimation of the storage area amount will be described.
FIG. 23 is a block diagram illustrating the structured data memory management device 10 including the estimated information correction unit 20.
The structured data memory management device 10 of FIG. 23 includes an estimated information correction unit 20 that corrects the storage area estimation information stored in the storage area estimation information storage unit 16. Other elements are the same as in FIG. 1.
After the analysis processing by the structured data analysis unit 14 is completed, the estimated information correction unit 20 reflects the error between the area size determined before the analysis and the area size actually required in the storage area estimation information. The correction proceeds as follows. First, the analysis processing control unit 12 receives an analysis processing end notification from the structured data analysis unit 14. The analysis processing control unit 12 then requests the storage area management unit 17 to have the storage area amount actually used by the structured data analysis unit 14, that is, the amount of the storage area holding the data converted into the internal data format, reflected in the storage area estimation information. The storage area management unit 17 examines the data storage area and acquires the storage area amount actually used. Further, the storage area management unit 17 obtains the estimated storage area amount by adding the storage area amount secured in the storage area management step S3 of FIG. 5 to the storage area amount secured again in the storage area re-management step S23 of FIG. 21, and notifies the analysis processing control unit 12 of both the estimated amount and the actually used amount. Upon receiving the notification, the analysis processing control unit 12 passes the estimated storage area amount and the actually used storage area amount to the estimated information correction unit 20 and requests correction of the storage area estimation information. The estimated information correction unit 20 calculates the difference between the estimated storage area amount and the actually used storage area amount and, using the calculated value, updates the storage area estimation information stored in the storage area estimation information storage unit 16 so as to improve the accuracy of the storage area amount secured before subsequent analyses.

例えば、記憶領域推定情報格納部１６が記憶領域推定情報として近似関数を記憶している場合、関数は数学的、統計的に推定されるので、この近似関数をどのように変更するかは、その数学上のアルゴリズムに依存する。しかし、数学的に定まったアルゴリズムになっているので、入力となるｘ軸の解析結果情報と、出力となるｙ軸の実際の使用量が決定され、その組を近似に利用するデータ集合に加えて再近似処理を行えば、新たな近似関数が決定する。 For example, when the storage area estimation information storage unit 16 stores an approximate function as the storage area estimation information, the function is estimated mathematically and statistically, so how to change this approximate function is Depends on mathematical algorithm. However, since the algorithm is mathematically determined, the x-axis analysis result information to be input and the actual usage of the y-axis to be output are determined, and the set is added to the data set used for approximation. If re-approximation processing is performed, a new approximation function is determined.
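The re-approximation step can be illustrated with a small sketch. The text does not say which approximation algorithm is used, so an ordinary least-squares straight line is assumed here purely for illustration. The names `fit_line` and `ApproximateEstimator` are hypothetical. Here `x` is the analysis result information and `y` the storage area amount actually used.

```python
def fit_line(samples):
    """Least-squares fit of y = a * x + b over (x, y) samples."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    if var_x == 0:
        # All x values identical: fall back to a constant estimate.
        return 0.0, mean_y
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    a = cov_xy / var_x
    return a, mean_y - a * mean_x


class ApproximateEstimator:
    """Hypothetical holder of the approximate function and its data set."""

    def __init__(self, samples):
        self.samples = list(samples)
        self.a, self.b = fit_line(self.samples)

    def estimate(self, x):
        return self.a * x + self.b

    def correct(self, x, actual_amount):
        """Add the observed (x, actual) pair and refit; return the error."""
        error = actual_amount - self.estimate(x)
        self.samples.append((x, actual_amount))
        self.a, self.b = fit_line(self.samples)
        return error
```

Each call to `correct` corresponds to one completed analysis: the observed pair joins the data set and the function is refitted, so later estimates follow the actual usage seen in that environment.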

この実施の形態では、構造化データ解析部１４による解析終了後に実際に必要となったデータ記憶領域量と記憶領域決定部１５が求めた記憶領域量との差を用いて、記憶領域推定情報を補正する推定情報補正部２０を有する事を特徴とする記憶領域管理方式を実行する記憶領域管理装置の一例の構造化データメモリ管理装置１０を説明した。 In this embodiment, the storage area estimation information is obtained by using the difference between the data storage area amount actually required after the analysis by the structured data analysis unit 14 and the storage area amount obtained by the storage area determination unit 15. The structured data memory management apparatus 10 as an example of the storage area management apparatus that executes the storage area management system characterized by including the estimated information correction unit 20 to be corrected has been described.

この実施の形態の構造化データメモリ管理装置１０は、記憶領域推定情報が静的ではなく、動的に更新される事で個別の環境に適応可能とする効果がある。 The structured data memory management device 10 of this embodiment has an effect that the storage area estimation information is not static but can be adapted to individual environments by being dynamically updated.

実施の形態１０．
以上までで説明した実施の形態の具体例は、説明した全ての処理が単独の情報端末内で実施される事を前提としている。すなわち、構造化データメモリ管理装置１０が全ての「〜部」を備えていた。しかし、実際には、一部の処理部をネットワーク上のコンテンツサーバやプロキシサーバに実装してもよい。図２４は、記憶領域管理システムの一例を示すブロック図である。例えば、図２４はプロキシサーバである記憶領域推定装置１０１に前解析処理部１３、制御部１０４（解析処理制御部１２の機能の一部を有する）、記憶領域決定部１５および記憶領域推定情報格納部１６を実装し、解析処理制御部１２、構造化データ格納部１１、記憶領域管理部１７および構造化データ解析部１４を、利用者が使用する情報端末である記憶領域管理装置１０３に実装する。記憶領域推定装置１０１と記憶領域管理装置１０３とは、ＬＡＮ９４２、ゲートウェイ９４１を介してインターネット９４０に接続され、２つの装置は例えばインターネットを介して接続されている。また、記憶領域推定装置１０１と記憶領域管理装置１０３とは、通信部１０１１と通信部１０３１とを介して、情報の通信を行う。通信部１０１１、通信部１０３１とは、通信ボード９１５により通信を行う。 Embodiment 10.
The specific examples of the embodiments described so far assume that all of the described processing is performed within a single information terminal. That is, the structured data memory management device 10 includes all of the units. In practice, however, some of the processing units may be implemented on a content server or a proxy server on the network. FIG. 24 is a block diagram illustrating an example of a storage area management system. In FIG. 24, for example, the pre-analysis processing unit 13, a control unit 104 (having part of the functions of the analysis processing control unit 12), the storage area determination unit 15, and the storage area estimation information storage unit 16 are implemented on a storage area estimation apparatus 101, which is a proxy server, while the analysis processing control unit 12, the structured data storage unit 11, the storage area management unit 17, and the structured data analysis unit 14 are implemented on a storage area management apparatus 103, which is an information terminal used by the user. The storage area estimation apparatus 101 and the storage area management apparatus 103 are each connected to the Internet 940 via a LAN 942 and a gateway 941, so that the two apparatuses are connected to each other via, for example, the Internet. The two apparatuses exchange information via the communication unit 1011 and the communication unit 1031, which communicate by means of the communication board 915.

構造化データ格納部１１は、記憶領域管理装置１０３が備えている。このため、記憶領域管理装置１０３の解析処理制御部１２は、構造化データ解析部１４が解析しようとする構造化データを構造化データ格納部１１より取得して、通信部１０３１により構造化データを記憶領域推定装置１０１に送信する。記憶領域推定装置１０１は、通信部１０１１により記憶領域管理装置１０３が送信した構造化データを受信して、制御部１０４に出力する。制御部１０４は、入力した構造化データを前解析処理部１３に出力するとともに、前解析処理の要求を通知する。この後の処理は、図５のＳ１，Ｓ２と同じである。記憶領域決定部１５が決定した記憶領域量は、制御部１０４を介して通信部１０１１により記憶領域管理装置１０３へ送信する。通信部１０３１は、記憶領域推定装置１０１が送信した記憶領域量を受信して、解析処理制御部１２に出力する。この後の処理は、図５のＳ３，Ｓ４の処理と同様である。 The structured data storage unit 11 is provided in the storage area management apparatus 103. The analysis processing control unit 12 of the storage area management apparatus 103 therefore acquires the structured data to be analyzed by the structured data analysis unit 14 from the structured data storage unit 11 and transmits it to the storage area estimation apparatus 101 via the communication unit 1031. The storage area estimation apparatus 101 receives the structured data via the communication unit 1011 and outputs it to the control unit 104. The control unit 104 outputs the structured data to the pre-analysis processing unit 13 and issues a pre-analysis processing request. The subsequent processing is the same as S1 and S2 in FIG. 5. The storage area amount determined by the storage area determination unit 15 is transmitted to the storage area management apparatus 103 by the communication unit 1011 via the control unit 104. The communication unit 1031 receives the storage area amount transmitted by the storage area estimation apparatus 101 and outputs it to the analysis processing control unit 12. The subsequent processing is the same as S3 and S4 in FIG. 5.

また、構造化データ解析部１４による解析処理中にデータ記憶領域の不足が発生すると、解析処理制御部１２から通信部１０３１を介して、記憶領域推定装置１０１にデータ記憶領域の不足を通知する。この後の記憶領域推定装置１０１における処理は、図２１のＳ２０〜Ｓ２２までと同じである。再確保する記憶領域量が決定すると、記憶領域推定装置１０１から通信部１０１１を介して、記憶領域管理装置１０３へ再確保する記憶領域量を送信する。記憶領域管理装置１０３は、通信部１０３１により再確保する記憶領域量を受信して、図２１のＳ２３，Ｓ２４の処理を行う。 Further, when a shortage of the data storage area occurs during the analysis processing by the structured data analysis unit 14, the analysis processing control unit 12 notifies the storage area estimation apparatus 101 of the shortage via the communication unit 1031. The subsequent processing in the storage area estimation apparatus 101 is the same as S20 to S22 in FIG. 21. When the storage area amount to be re-secured is determined, it is transmitted from the storage area estimation apparatus 101 to the storage area management apparatus 103 via the communication unit 1011. The storage area management apparatus 103 receives the storage area amount to be re-secured via the communication unit 1031 and performs the processes of S23 and S24 in FIG. 21.

この実施の形態では、前解析処理部１３、記憶領域推定情報格納部１６、記憶領域決定部１５を有する装置が、構造化データ解析部１４、記憶領域管理部１７を有する装置とは異なりことの一例を説明した。また、構造化データ解析部１４を有する記憶領域管理装置１０３および前解析処理部１３を有する記憶領域推定装置１０１は、互いに通信するための通信部１０１１，１０３１を有し、記憶領域管理装置１０３は記憶領域推定装置１０１が推定した記憶領域量を、通信部１０１１，１０３１を介して取得することを特徴とする記憶領域管理システムの一例を説明した。 In this embodiment, the apparatus having the pre-analysis processing unit 13, the storage area estimation information storage unit 16, and the storage area determination unit 15 is different from the apparatus having the structured data analysis unit 14 and the storage area management unit 17. An example was described. The storage area management apparatus 103 having the structured data analysis unit 14 and the storage area estimation apparatus 101 having the pre-analysis processing unit 13 have communication units 1011 and 1031 for communicating with each other. An example of a storage area management system has been described in which the storage area amount estimated by the storage area estimation apparatus 101 is acquired via the communication units 1011 and 1031.

この実施の形態の記憶領域管理システムは、記憶領域量を推定する処理と、構造化データを実際に解析する処理とを別々の装置により実現する。このため、構造化データを実際に解析する処理を行う情報端末（記憶領域管理装置１０３）が複数あった場合に、その情報端末（記憶領域管理装置１０３）を推定装置（記憶領域推定装置１０１）にネットワークを介して接続して、複数の推定装置（記憶領域推定装置１０１）を設置するコストを省ける効果がある。また、記憶領域推定情報を一元管理できる効果がある。 The storage area management system of this embodiment realizes the process of estimating the storage area amount and the process of actually analyzing the structured data by separate apparatuses. For this reason, when there are a plurality of information terminals (storage area management apparatuses 103) that actually analyze structured data, those terminals can be connected via a network to the estimation apparatus (storage area estimation apparatus 101), which saves the cost of installing a plurality of estimation apparatuses (storage area estimation apparatuses 101). In addition, the storage area estimation information can be managed centrally.

実施の形態１１．
この実施の形態では、解析する構造化データが、記憶領域管理装置１０３と記憶領域推定装置１０１以外の他の装置に記憶されている記憶領域管理システムの一例を説明する。
図２５は、記憶領域管理システムの図２４とは別の一例を示すブロック図である。図２５の記憶領域管理システムは、情報端末（記憶領域管理装置１０３）がプロキシサーバ（記憶領域推定装置１０１）を経由して、コンテンツサーバである構造化データ記憶装置１０２に存在する構造化データを通信して取得し、解析処理を行う場合の一つの具体例である。構造化データがコンテンツサーバ（構造化データ記憶装置１０２）に記憶されている点以外は、図２５は図２４の記憶領域管理システムは同じ装置構成をしており、それぞれの装置は図５及び図２１の各処理ステップを、実施の形態１０の説明と同じように、いずれかの装置でそれぞれ実行する。
プロキシサーバ（記憶領域推定装置１０１）はコンテンツサーバ（構造化データ記憶装置１０２）と情報端末（記憶領域管理装置１０３）と通信するための通信部１０５を備え、情報端末（記憶領域管理装置１０３）はプロキシサーバ（記憶領域推定装置１０１）との通信を行うための通信部１０３１を備える。記憶領域推定装置１０１と記憶領域管理装置１０３とコンテンツサーバ（構造化データ記憶装置１０２）とは、ＬＡＮ９４２、ゲートウェイ９４１を介してインターネット９４０に接続され、３つの装置は例えばインターネットを介して接続されている（図２５の第１のネットワーク１０７と第２のネットワーク１０８とは、例えば、ＬＡＮ９４２、ゲートウェイ９４１、インターネット９４０とする）。通信部１０５、通信部１０３１とは、通信ボードにより通信を行う。また、通信部１０５とコンテンツサーバ（構造化データ記憶装置１０２）とが通信する情報、及び、通信部１０５と通信部１０３１とが通信する情報とは、ヘッダ部とボディ部とを有するものとする。 Embodiment 11.
In this embodiment, an example of a storage area management system in which structured data to be analyzed is stored in a device other than the storage area management apparatus 103 and the storage area estimation apparatus 101 will be described.
FIG. 25 is a block diagram showing an example of the storage area management system different from that of FIG. 24. The storage area management system of FIG. 25 is one specific example in which the information terminal (storage area management apparatus 103) acquires, via the proxy server (storage area estimation apparatus 101), structured data held in the structured data storage device 102, which is a content server, and then performs analysis processing. Except that the structured data is stored in the content server (structured data storage device 102), the system of FIG. 25 has the same apparatus configuration as the storage area management system of FIG. 24, and each processing step of FIG. 5 and FIG. 21 is executed by one of the apparatuses in the same way as described in the tenth embodiment.
The proxy server (storage area estimation apparatus 101) includes a communication unit 105 for communicating with the content server (structured data storage device 102) and the information terminal (storage area management apparatus 103), and the information terminal (storage area management apparatus 103) includes a communication unit 1031 for communicating with the proxy server (storage area estimation apparatus 101). The storage area estimation apparatus 101, the storage area management apparatus 103, and the content server (structured data storage device 102) are each connected to the Internet 940 via a LAN 942 and a gateway 941, so that the three apparatuses are connected via, for example, the Internet (the first network 107 and the second network 108 in FIG. 25 are, for example, the LAN 942, the gateway 941, and the Internet 940). The communication unit 105 and the communication unit 1031 communicate by means of a communication board. The information exchanged between the communication unit 105 and the content server (structured data storage device 102), and the information exchanged between the communication unit 105 and the communication unit 1031, each have a header part and a body part.

情報端末（記憶領域管理装置１０３）は、入力部１８よりコンテンツサーバ（構造化データ記憶装置１０２）の構造化データを取得するコンテンツ取得リクエストを入力して、入力したコンテンツ取得リクエストをプロキシサーバ（記憶領域推定装置１０１）を経由して、コンテンツサーバ（構造化データ記憶装置１０２）へ送信して、構造化データを取得する事が可能である。情報端末（記憶領域管理装置１０３）から通信部１０３１と通信部１０５とを介してコンテンツ取得リクエストを受けたプロキシサーバ（記憶領域推定装置１０１）は、リクエストしているコンテンツサーバ（構造化データ記憶装置１０２）に通信部１０５を介してそのリクエストを転送する。そして、プロキシサーバ（記憶領域推定装置１０１）は、コンテンツサーバ（構造化データ記憶装置１０２）からの応答が到着するまで待機する。コンテンツサーバ（構造化データ記憶装置１０２）からの応答を通信部１０５により受信すると、プロキシサーバ（記憶領域推定装置１０１）は、その応答からボディ部が構造化データを有する場合には、ボディを取り出し、前解析処理部１３に前解析処理を実行させ、生成した解析結果情報から記憶領域決定部１５に必要となる記憶領域量を算出させる。算出された記憶領域量をコンテンツサーバ（構造化データ記憶装置１０２）から受け取った応答のヘッダに追加し、記憶領域量を追加した応答を通信部１０５を介して情報端末（記憶領域管理装置１０３）に送信する。 The information terminal (storage area management apparatus 103) can acquire structured data by accepting, from the input unit 18, a content acquisition request for structured data held by the content server (structured data storage device 102) and transmitting that request to the content server (structured data storage device 102) via the proxy server (storage area estimation apparatus 101). On receiving the content acquisition request from the information terminal (storage area management apparatus 103) via the communication unit 1031 and the communication unit 105, the proxy server (storage area estimation apparatus 101) forwards the request via the communication unit 105 to the content server (structured data storage device 102) to which it is addressed. The proxy server (storage area estimation apparatus 101) then waits until a response from the content server (structured data storage device 102) arrives. When the communication unit 105 receives the response from the content server (structured data storage device 102) and the body part of the response contains structured data, the proxy server (storage area estimation apparatus 101) extracts the body, causes the pre-analysis processing unit 13 to execute the pre-analysis processing, and causes the storage area determination unit 15 to calculate the required storage area amount from the generated analysis result information. The calculated storage area amount is added to the header of the response received from the content server (structured data storage device 102), and the response with the storage area amount added is transmitted to the information terminal (storage area management apparatus 103) via the communication unit 105.

プロキシサーバ（記憶領域推定装置１０１）から応答を受信した情報端末（記憶領域管理装置１０３）の通信部１０３１は、受信した応答を解析処理制御部１２に渡す。応答を受け取った解析処理制御部１２は、応答のボディ部から構造化データを取り出し、構造化データ格納部１１に格納する。次に応答のヘッダ部から記憶領域量を取得し、その情報を記憶領域管理部１７に渡し、記憶領域管理部１７は記憶領域量のデータ記憶領域を確保する。そして、記憶領域管理部１７は確保したことを解析処理制御部１２に通知する。解析処理制御部１２は、その後、構造化データ解析部１４に対して、構造化データの解析要求を発行し、構造化データ解析部１４は記憶領域管理部１７が確保した記憶領域を使用して、構造化データ格納部１１に記憶した構造化データの解析処理を実行する。 The communication unit 1031 of the information terminal (storage area management apparatus 103) that has received the response from the proxy server (storage area estimation apparatus 101) passes the received response to the analysis processing control unit 12. Upon receiving the response, the analysis processing control unit 12 extracts structured data from the body part of the response and stores it in the structured data storage unit 11. Next, the storage area amount is acquired from the header portion of the response, and the information is passed to the storage area management unit 17, and the storage area management unit 17 secures the data storage area of the storage area amount. Then, the storage area management unit 17 notifies the analysis processing control unit 12 that it has been secured. Thereafter, the analysis processing control unit 12 issues a structured data analysis request to the structured data analysis unit 14, and the structured data analysis unit 14 uses the storage area secured by the storage area management unit 17. Then, analysis processing of the structured data stored in the structured data storage unit 11 is executed.
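The header-based exchange described above might look roughly like the following sketch. Everything concrete in it is an assumption made for illustration: the header name `X-Storage-Area-Amount`, the `Content-Type` check used to decide whether the body is structured data, the per-symbol byte cost, and the representation of a response as a dictionary with `headers` and `body`. The text fixes none of these.

```python
HEADER = "X-Storage-Area-Amount"  # hypothetical header name
BYTES_PER_TAG_SYMBOL = 32         # hypothetical estimation parameter


def estimate_amount(structured_data: str) -> int:
    """Toy pre-analysis plus determination: count tag start symbols."""
    return structured_data.count("<") * BYTES_PER_TAG_SYMBOL


def proxy_annotate(response: dict) -> dict:
    """Proxy side: add the estimated amount to the response header."""
    headers = dict(response["headers"])
    body = response["body"]
    # Assumption: an XML content type signals that the body is structured data.
    if headers.get("Content-Type", "").endswith("xml"):
        headers[HEADER] = str(estimate_amount(body))
    return {"headers": headers, "body": body}


def terminal_receive(response: dict):
    """Terminal side: read the amount from the header and reserve the area."""
    amount = int(response["headers"].get(HEADER, "0"))
    area = bytearray(amount)  # stands in for the secured data storage area
    return response["body"], area
```

The terminal never runs the pre-analysis itself. It only parses one header value before reserving memory, which is the load the proxy is meant to take over.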

このようなプロキシサーバ（記憶領域推定装置１０１）を利用した場合、情報端末（記憶領域管理装置１０３）にマシンパワーがなく、前解析処理や記憶領域量決定処理が大きな負荷となってしまう場合には有効であり、その負荷をプロキシサーバ（記憶領域推定装置１０１）で代行する事が可能である。
また、コンテンツサーバ（構造化データ記憶装置１０２）が前述した具体例のプロキシサーバ（記憶領域推定装置１０１）が保持する機能を備えて同等の機能を果たす事も可能である。その場合には、コンテンツサーバ（記憶領域推定装置１０１）が情報端末（記憶領域管理装置１０３）からの構造化データリクエストを受信した事をトリガーとして、前解析処理、記憶領域量決定処理を実行し、その結果を応答のヘッダ部に付加して情報端末（記憶領域管理装置１０３）に送信する。この場合の情報端末（記憶領域管理装置１０３）の処理はプロキシサーバ（記憶領域推定装置１０１）を利用した場合と同等である。
この方法の場合、同一コンテンツに対して複数のリクエストがきた場合に、それぞれの場合で、コンテンツサーバ（記憶領域推定装置１０１）側で前解析処理および記憶領域量決定処理を実行するのは非効率であるため、一旦算出された結果をキャッシュしておく機構を持たせてもよい。 When such a proxy server (storage area estimation apparatus 101) is used, the information terminal (storage area management apparatus 103) does not have machine power, and the pre-analysis process and the storage area amount determination process become a heavy load. Is effective, and the proxy server (storage area estimation apparatus 101) can substitute the load.
In addition, the content server (structured data storage device 102) can be provided with the function held by the proxy server (storage area estimation device 101) of the specific example described above, and can perform an equivalent function. In that case, the pre-analysis process and the storage area amount determination process are executed with the content server (storage area estimation apparatus 101) receiving a structured data request from the information terminal (storage area management apparatus 103) as a trigger. The result is added to the header portion of the response and transmitted to the information terminal (storage area management device 103). The processing of the information terminal (storage area management apparatus 103) in this case is equivalent to the case where the proxy server (storage area estimation apparatus 101) is used.
In the case of this method, when a plurality of requests are made for the same content, it is inefficient to execute the pre-analysis process and the storage area amount determination process on the content server (storage area estimation apparatus 101) side in each case. Therefore, a mechanism for caching the result once calculated may be provided.
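One way to realise the caching mechanism mentioned above is sketched below. The text does not specify how cached results are keyed. Hashing the content with SHA-256 is an assumption made here, and the class name `EstimateCache` and its `misses` counter are hypothetical.

```python
import hashlib


class EstimateCache:
    """Caches storage area amounts so identical content is analysed once."""

    def __init__(self, estimate):
        self._estimate = estimate  # callable: structured data -> amount
        self._cache = {}
        self.misses = 0            # number of times the estimate was computed

    def amount_for(self, structured_data: str) -> int:
        key = hashlib.sha256(structured_data.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._estimate(structured_data)
        return self._cache[key]
```

Keying on a digest of the content rather than on its URL means that updated content at the same address is re-analysed, at the cost of hashing every response.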

また、記憶領域量は、最初に情報端末（記憶領域管理装置１０３）が送信した構造化データリクエストに対する応答のヘッダ部に含めて、最終的に情報端末（記憶領域管理装置１０３）へ通知された。しかし、記憶領域量を情報端末（記憶領域管理装置１０３）に通知するための専用の情報を設けて、その専用の情報に含めて通知するようにしてもかまわない。
また、プロキシサーバ（記憶領域推定装置１０１）は、情報端末（記憶領域管理装置１０３）からコンテンツ取得リクエストを受けたことを、前解析処理の指示と判断していたが、情報端末（記憶領域管理装置１０３）からコンテンツ取得リクエストを要求する情報とは別に、前解析処理の指示を行う情報を情報端末（記憶領域管理装置１０３）からプロキシサーバ（記憶領域推定装置１０１）へ送信するようにしてもかまわない。 In addition, the storage area amount is included in the header part of the response to the structured data request first transmitted by the information terminal (storage area management apparatus 103), and finally notified to the information terminal (storage area management apparatus 103). . However, dedicated information for notifying the amount of storage area to the information terminal (storage area management apparatus 103) may be provided and notified in the dedicated information.
Further, the proxy server (storage area estimation apparatus 101) treats receipt of a content acquisition request from the information terminal (storage area management apparatus 103) as an instruction to perform the pre-analysis process. However, information instructing the pre-analysis process may instead be transmitted from the information terminal (storage area management apparatus 103) to the proxy server (storage area estimation apparatus 101) separately from the content acquisition request.
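The variant in which estimation is requested explicitly can be sketched as a flag carried with the request. The header names `X-Estimate-Storage-Area` and `X-Storage-Area-Amount`, and the function names, are hypothetical. The text only says that such an instruction may be sent and does not define its format.

```python
REQUEST_FLAG = "X-Estimate-Storage-Area"   # hypothetical request header
AMOUNT_HEADER = "X-Storage-Area-Amount"    # hypothetical response header


def wants_estimate(request_headers: dict) -> bool:
    """True when the terminal explicitly asked for a storage area estimate."""
    return request_headers.get(REQUEST_FLAG, "").lower() == "true"


def handle(request_headers: dict, response: dict, estimate) -> dict:
    """Proxy side: estimate only on request, otherwise pass through."""
    headers = dict(response["headers"])
    if wants_estimate(request_headers):
        headers[AMOUNT_HEADER] = str(estimate(response["body"]))
    return {"headers": headers, "body": response["body"]}
```

A terminal that does not understand the amount header simply omits the flag and receives the response unchanged.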

この実施の形態では、前解析処理部１３、記憶領域推定情報格納部１６、記憶領域決定部１５を有するのが、構造化データ解析部１４、記憶領域管理部１７を有する装置とは異なることを説明した。また、構造化データ解析部１４を有する管理装置（記憶領域管理装置１０３）および前解析処理部１３を有する推定装置（記憶領域推定装置１０１）は、通信するための通信部１０５，１０３１を有し、管理装置（記憶領域管理装置１０３）は解析する構造化データを構造化データ格納装置から通信部１０５を介して取得する際に、管理装置（記憶領域管理装置１０３）と構造化データ格納装置の通信経路の間に、推定装置（記憶領域推定装置１０１）を配し、管理装置（記憶領域管理装置１０３）が構造化データ格納装置から構造化データを取得する通信中に推定装置（記憶領域推定装置１０１）が構造化データ格納装置からの応答を解析し、記憶領域量を求め、管理装置（記憶領域管理装置１０３）に通信手段を介して転送する事を特徴とする記憶領域管理システムの一例を説明した。 In this embodiment, it has been described that the apparatus having the pre-analysis processing unit 13, the storage area estimation information storage unit 16, and the storage area determination unit 15 is different from the apparatus having the structured data analysis unit 14 and the storage area management unit 17. An example of a storage area management system has also been described in which the management apparatus (storage area management apparatus 103) having the structured data analysis unit 14 and the estimation apparatus (storage area estimation apparatus 101) having the pre-analysis processing unit 13 have communication units 105 and 1031 for communicating with each other. In this system, the estimation apparatus (storage area estimation apparatus 101) is placed on the communication path between the management apparatus (storage area management apparatus 103) and the structured data storage apparatus from which the management apparatus acquires, via the communication unit 105, the structured data to be analyzed. While the management apparatus (storage area management apparatus 103) is acquiring the structured data, the estimation apparatus (storage area estimation apparatus 101) analyzes the response from the structured data storage apparatus, obtains the storage area amount, and transfers it to the management apparatus (storage area management apparatus 103) via the communication means.

また、推定装置（記憶領域推定装置１０１）が求めた記憶領域量を応答のヘッダ情報として付加することにより管理装置（記憶領域管理装置１０３）に転送する事を特徴とする記憶領域管理システムの一例を説明した。 In addition, an example of a storage area management system has been described in which the storage area amount obtained by the estimation apparatus (storage area estimation apparatus 101) is transferred to the management apparatus (storage area management apparatus 103) by adding it as header information of the response.

また、推定装置（記憶領域推定装置１０１）が求めた記憶領域量を専用の応答情報として管理装置（記憶領域管理装置１０３）に転送する事を特徴とする記憶領域管理システムの一例を説明した。 In addition, an example of a storage area management system has been described in which the storage area amount obtained by the estimation apparatus (storage area estimation apparatus 101) is transferred to the management apparatus (storage area management apparatus 103) as dedicated response information.

また、推定装置（記憶領域推定装置１０１）は管理装置（記憶領域管理装置１０３）から解析要求を、通信部を介して転送された場合にのみ推定処理を行い、推定結果を返す事を特徴とする。 Further, the estimation apparatus (storage area estimation apparatus 101) is characterized in that it performs the estimation process and returns the estimation result only when an analysis request is transferred to it from the management apparatus (storage area management apparatus 103) via the communication unit.

また、構造化データの転送要求のヘッダ情報に記憶領域量を推定することを要求する情報を付加する事を特徴とする記憶領域管理システムの一例を説明した。 In addition, an example of a storage area management system has been described in which information requesting to estimate the storage area amount is added to the header information of the structured data transfer request.

また、記憶領域量を推定することを要求する専用の情報を、通信部を介して推定装置に転送する事を特徴とする。 Further, it is characterized in that dedicated information for requesting estimation of the storage area amount is transferred to the estimation device via the communication unit.

この実施の形態の記憶領域管理システムは、構造化データを一元管理するコンテンツサーバ（構造化データ記憶装置１０２）を備えたので、複数の情報端末（記憶領域管理装置１０３）で重複して同じ構造化データを管理することがないので、情報端末（記憶領域管理装置１０３）の記憶部を有効に使用できる効果がある。 Since the storage area management system of this embodiment includes a content server (structured data storage device 102) that centrally manages structured data, a plurality of information terminals (storage area management apparatuses 103) do not each hold a duplicate copy of the same structured data. The storage unit of each information terminal (storage area management apparatus 103) can therefore be used effectively.

また、記憶領域量を通知する情報は、コンテンツの要求に対する応答に含めたので、記憶領域推定装置１０１と記憶領域管理装置１０３との間の通信量を抑える効果がある。 In addition, since the information for notifying the storage area amount is included in the response to the content request, there is an effect of suppressing the communication amount between the storage area estimation apparatus 101 and the storage area management apparatus 103.

また、記憶領域量を算出することを要求する情報は、コンテンツを取得する要求と兼用としたので、記憶領域推定装置１０１と記憶領域管理装置１０３との間の通信量を抑える効果がある。 Further, since the information for requesting calculation of the storage area amount is also used as the request for acquiring the content, there is an effect of suppressing the communication amount between the storage area estimation apparatus 101 and the storage area management apparatus 103.

また、記憶領域量を算出することを要求する情報は、コンテンツの要求に対する応答とは別の専用の情報としたので、コンテンツだけを取得したい場合に対処できる効果がある。 Further, since the information requesting to calculate the storage area amount is dedicated information different from the response to the content request, there is an effect that it is possible to cope with the case where only the content is desired to be acquired.

In addition, because the information notifying the storage area amount is dedicated information separate from the response to the content request, the storage area amount is not sent unnecessarily to a device that wants only the content. This prevents an information terminal that does not support receiving the storage area amount from receiving it unnecessarily and malfunctioning.
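The two notification styles described above (piggybacking on the content exchange versus a dedicated message) can be sketched as follows. This is only an illustration under stated assumptions: the header names `X-Estimate-Storage` and `X-Storage-Amount`, the dictionary-based messages, and the placeholder estimator are all hypothetical and do not come from the embodiment.

```python
# Hypothetical header names; the embodiment does not name them.
EST_REQUEST_HEADER = "X-Estimate-Storage"
EST_RESPONSE_HEADER = "X-Storage-Amount"


def estimate_storage(content: str) -> int:
    """Placeholder estimator (assumption): 64 bytes per '<' in the content."""
    return 64 * content.count("<")


def handle_content_request(headers: dict, content: str) -> dict:
    """Serve content; attach the estimate only when the request asked for it."""
    response = {"headers": {}, "body": content}
    if headers.get(EST_REQUEST_HEADER) == "1":
        # Piggyback style: no extra round trip is needed.
        response["headers"][EST_RESPONSE_HEADER] = str(estimate_storage(content))
    return response


def handle_estimate_request(content: str) -> dict:
    """Dedicated style: return only the estimate, without the content."""
    return {"storage_amount": estimate_storage(content)}
```

A terminal that does not understand the estimate simply omits the request header and receives an ordinary response, which matches the effect described for the dedicated-information variant.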

FIG. 1 is a functional block diagram of a structured data memory management device that executes the storage area management method.
FIG. 2 is a diagram showing a system configuration that includes the structured data memory management device implementing the storage area management method of FIG. 1.
FIG. 3 is a diagram showing an example of the hardware resources of the system that includes the structured data memory management device of FIG. 2.
FIG. 4(a) is a diagram showing HTML content as an example of structured data, and FIG. 4(b) is an example block diagram of the HTML content of (a) expressed in an internal data format conforming to DOM.
FIG. 5 is a diagram showing the processing sequence centered on the analysis processing control unit 12 of Embodiment 1.
FIG. 6 is a processing flow diagram of the pre-analysis processing unit when the structured data is XML and only the tag start/end symbols and the attribute setting symbol are used as feature data.
FIG. 7 is a feature extraction processing flow diagram when the structured data is XML and the tag start/end symbols, the attribute setting symbol, and the number of text characters are each used as feature data.
FIG. 8 is a diagram showing specific examples of the information stored in the storage area estimation information storage unit 16; (a) is an example in which each parameter indicates a predetermined range of values, and (b) is an example in which each parameter indicates a single value.
FIG. 9 is a flowchart of the procedure for setting parameters and storage area estimation information in the storage area estimation information storage unit 16.
FIG. 10 is a diagram showing examples of the storage area estimation information storage unit 16 storing parameters and storage area estimation information for (a) the number of tags, (b) the number of attributes, and (c) the number of characters.
FIG. 11(a) shows an example in which the pre-analysis result for attributes is divided at intervals of 10, and FIG. 11(b) shows an example in which the pre-analysis result for the number of characters is divided at intervals of 4.
FIG. 12 is a diagram showing a graph for obtaining an estimated memory usage function from the storage area amounts that the storage area determination unit 15 estimated based on past analysis result information of the pre-analysis processing unit 13 and the storage area amounts that the structured data analysis unit 14 actually used.
FIG. 13(a) shows the continuous approximation function f(x) when the number of tag occurrences is the analysis result information, FIG. 13(b) shows the continuous approximation function f(y) when the number of attribute occurrences is the analysis result information, and FIG. 13(c) shows the continuous approximation function f(z) when the number of character occurrences is the analysis result information.
FIG. 14 is a diagram showing an example of the storage area estimation information storage unit 16 corresponding to the continuous approximation functions of FIG. 13.
FIG. 15 is a diagram showing an example in which the corresponding continuous approximation function differs depending on the range of the analysis result information value, and a correspondence table between analysis result information and storage area estimation information is provided for each object.
FIG. 16 is a diagram showing one specific example of the additional information stored in the storage area estimation information storage unit 16.
FIG. 17 is a flowchart of the storage area determination step (S2) using the additional information of FIG. 16.
FIG. 18 is a functional block diagram of a structured data memory management device that includes first and second pre-analysis processing units and executes the storage area management method.
FIG. 19 is an example of the storage area estimation information storage unit 16 storing the maximum storage area amount, the minimum storage area amount, and the rejectable storage area amount for each of a plurality of objects; (a) corresponds to the Element object, (b) to the Attr object, and (c) to the Text object.
FIG. 20 is a diagram showing an example in which the maximum storage area amount, the minimum storage area amount, and the rejectable storage area amount are set in the additional information stored in the storage area estimation information storage unit 16.
FIG. 21 is a diagram showing an example of the processing sequence of the handling performed when the secured storage area runs short.
FIG. 22 is a block diagram showing the structured data memory management device 10 including the additional storage amount storage unit 19.
FIG. 23 is a block diagram showing the structured data memory management device 10 including the estimation information correction unit 20.
FIG. 24 is a block diagram showing an example of the storage area management system.
FIG. 25 is a block diagram showing an example of the storage area management system different from that of FIG. 24.

Explanation of symbols

10 structured data memory management device, 11 structured data storage unit, 12 analysis processing control unit, 13 pre-analysis processing unit, 14 structured data analysis unit, 15 storage area determination unit, 16 storage area estimation information storage unit, 16x number of elements, 16y number of attributes, 16z number of characters, 17 storage area management unit, 18 input unit, 19 additional storage amount storage unit, 20 estimation information correction unit, 104 control unit, 105 communication unit, 100 analysis range information, 101 storage area estimation device, 102 structured data storage device, 103 storage area management device, 107 first network, 108 second network, 131 first analysis processing unit, 132 second analysis processing unit, 161 maximum, 162 minimum, 163 rejectable, 901 display device, 902 keyboard, 903 mouse, 904 FDD, 905 CDD, 908 database, 909 client device, 910 server device, 911 CPU, 912 bus, 913 ROM, 914 RAM, 915 communication board, 920 magnetic disk device, 921 OS, 922 window system, 923 program group, 924 file group, 940 Internet, 941 gateway, 942 LAN, 1011, 1031 communication units.

Claims

A storage area management system that, when structured data representing a data-specific structure and relationships between data is converted into information in a data format that an information processing device can process, obtains in advance the storage area amount of a data storage area for storing the information in the converted data format, the storage area management system comprising:
a central processing unit (CPU) that executes processing;
a storage unit that stores results of the processing executed by the CPU;
an analysis processing unit that inputs structured data, generates, by the CPU, analysis result information serving as a parameter for estimating the storage area amount, and stores the generated analysis result information in the storage unit by the CPU;
a storage area estimation information storage unit that stores in advance a plurality of parameters for estimating the storage area amount and, in correspondence with each of the plurality of parameters, storage area estimation information for estimating the storage area amount;
a storage area determination unit that inputs, from the storage unit, the analysis result information generated by the analysis processing unit, acquires from the storage area estimation information storage unit the storage area estimation information corresponding to the parameter that matches the input analysis result information, determines the storage area amount based on the acquired storage area estimation information, and stores the determined storage area amount in the storage unit by the CPU; and
a storage area management unit that inputs, from the storage unit, the storage area amount determined by the storage area determination unit, and secures, by the CPU, a storage area corresponding to the input storage area amount as the data storage area for storing the information in the converted data format.
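The flow recited in this claim (light pre-analysis, table lookup, then reservation) can be illustrated with a minimal Python sketch. The table values, the tag-counting regular expression, and all function names are assumptions made for illustration; the claim does not prescribe them.

```python
import re
from bisect import bisect_left

# Hypothetical estimation table: (upper bound of tag count, bytes to reserve).
ESTIMATION_TABLE = [(10, 4_096), (100, 32_768), (1_000, 262_144)]


def pre_analyze(xml_text: str) -> int:
    """Analysis result information: count opening tags without building a tree."""
    return len(re.findall(r"<[A-Za-z_]", xml_text))


def determine_storage_amount(tag_count: int) -> int:
    """Select the entry whose parameter range contains the analysis result."""
    bounds = [upper for upper, _ in ESTIMATION_TABLE]
    index = min(bisect_left(bounds, tag_count), len(ESTIMATION_TABLE) - 1)
    return ESTIMATION_TABLE[index][1]


def reserve(amount: int) -> bytearray:
    """Secure the data storage area before the full conversion runs."""
    return bytearray(amount)


# Usage: estimate first, reserve once, then run the full parser into the area.
area = reserve(determine_storage_amount(pre_analyze("<a x='1'><b/>text</a>")))
```

Counts above the largest bound are clamped to the last entry here; a real table would need an explicit policy for that case.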

The storage area management system according to claim 1, further comprising an input unit that inputs, by an input device, analysis range information designating the range of data to be analyzed within the structured data analyzed by the analysis processing unit, and stores the input analysis range information in the storage unit by the CPU,
wherein the analysis processing unit inputs, from the storage unit, the analysis range information input by the input unit, and analyzes, by the CPU, the structured data in the range designated by the input analysis range information.

The storage area management system according to claim 2, wherein the analysis range information input by the input unit is any one of:
information designating that all of the structured data is to be analyzed;
information designating that the structured data is to be analyzed from its beginning up to a partway point; and
information designating that the structured data is to be analyzed from a specific portion up to a partway point.

The storage area management system according to claim 2, wherein the structured data has specific information characterizing the structure of the data and specific end information corresponding to the specific information,
the input unit inputs, as the analysis range information, information indicating the specific information included in the structured data, and stores the information in the storage unit by the CPU, and
the analysis processing unit inputs, from the storage unit, the analysis range information input by the input unit, analyzes the structured data from its first data until the specific information is detected, skips the structured data, once the specific information is detected, until the specific end information corresponding to the detected specific information is detected, and, once the specific end information is detected, analyzes by the CPU from the structured data following the detected specific end information to the last structured data.
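The skip behaviour in this claim can be sketched as a small helper that removes the span between the specific information and its end information before pre-analysis. The function name and the choice of `<script>` as the marker are hypothetical, and the handling of a missing end marker is an assumption not stated in the claim.

```python
def analyzable_text(data: str, start_marker: str, end_marker: str) -> str:
    """Return data with the span from start_marker through end_marker removed."""
    start = data.find(start_marker)
    if start == -1:
        return data  # marker absent: everything is analyzed
    end = data.find(end_marker, start + len(start_marker))
    if end == -1:
        # Assumption: with no end marker, nothing after the start is analyzed.
        return data[:start]
    # Analysis resumes with the data that follows the end marker.
    return data[:start] + data[end + len(end_marker):]


html = "<p>a</p><script>x<y</script><p>b</p>"
visible = analyzable_text(html, "<script>", "</script>")
```

Only the first occurrence of the marker pair is skipped in this sketch; repeated pairs would need a loop.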

The storage area management system according to claim 2, wherein the structured data has specific information characterizing the structure of the data and specific end information corresponding to the specific information,
the input unit inputs, as the analysis range information, information indicating the specific information included in the structured data, and stores the information in the storage unit by the CPU, and
the analysis processing unit includes a first analysis processing unit that analyzes structured data by a predetermined procedure and a second analysis processing unit that analyzes structured data by a procedure different from that of the first analysis processing unit, inputs, from the storage unit, the analysis range information input by the input unit, analyzes by the first analysis processing unit the structured data from its first data until the specific information is detected, analyzes by the second analysis processing unit, once the specific information is detected, the structured data until the specific end information corresponding to the detected specific information is detected, and, once the specific end information is detected, analyzes by the first analysis processing unit from the structured data following the detected specific end information to the last structured data.

The storage area management system according to claim 1, wherein the structured data has feature data characterizing the structure unique to the data, and
the analysis processing unit detects the feature data from the structured data and generates the analysis result information from the detected result.

The storage area management system according to claim 1, wherein the structured data has basic information about the structured data, and
the analysis processing unit either sets the basic information of the structured data as the analysis result information or includes the basic information of the structured data in the analysis result information.

The storage area management system according to claim 7, wherein the basic information is at least one of origin information indicating the origin of the structured data, creation date information indicating the date on which the structured data was created, and update date information indicating the date on which the structured data was updated.

The storage area management system according to claim 6, wherein the analysis processing unit obtains, for the result of detecting the feature data from the structured data, the ratio of the detected result to the structured data subjected to analysis, and either sets the obtained ratio as the analysis result information or includes the obtained ratio in the analysis result information.

The storage area management system according to claim 1, wherein each of the plurality of parameters stored in the storage area estimation information storage unit is either a single value or a value indicating a predetermined range.

The storage area management system according to claim 1, wherein the storage area estimation information stored in the storage area estimation information storage unit is estimation function information indicating a function (N is an integer, N ≥ 1) obtained by plotting in advance, on a coordinate plane whose axes are the analysis result information and the storage area amount, analysis result information generated in the past and the storage area amounts obtained based on that analysis result information, and
the storage area determination unit inputs, from the storage unit, the analysis result information generated by the analysis processing unit, acquires from the storage area estimation information storage unit the estimation function information corresponding to the parameter that matches the input analysis result information, and determines the storage area amount based on the acquired estimation function information.
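One way to obtain such an estimation function from past (analysis result, storage amount) points is an ordinary least-squares fit. The sketch below assumes the simplest case, a straight line, and uses made-up sample values; the claim itself does not fix the fitting method or the form of the function.

```python
def fit_line(samples):
    """Least-squares line through (analysis_result, bytes_used) points.

    Requires at least two samples with distinct analysis_result values.
    """
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate(tag_count, slope, intercept):
    """Evaluate the fitted function and round to the nearest byte."""
    return int(slope * tag_count + intercept + 0.5)


# Hypothetical history: tag counts and the bytes the parser actually used.
history = [(10, 1200), (20, 2200), (30, 3200)]
slope, intercept = fit_line(history)
needed = estimate(25, slope, intercept)
```

Separate functions could be fitted per feature (tags, attributes, characters) and their estimates summed, in the spirit of f(x), f(y), and f(z) in FIG. 13.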

The storage area management system according to claim 1, wherein the storage area estimation information storage unit stores in advance, included in the storage area estimation information with the same analysis result information as the parameter, a maximum storage area amount, being the largest storage area amount actually used among a plurality of structured data for which the same analysis result information was generated in the past, and
the storage area determination unit, when the acquired storage area estimation information includes the maximum storage area amount, stores the maximum storage area amount in the storage unit by the CPU as the determined storage area amount.

The storage area management system according to claim 1, wherein the storage area estimation information storage unit stores in advance, included in the storage area estimation information with the same analysis result information as the parameter, a minimum storage area amount, being the smallest storage area amount actually used among a plurality of structured data for which the same analysis result information was generated in the past, and
the storage area determination unit, when the acquired storage area estimation information includes the minimum storage area amount, stores the minimum storage area amount in the storage unit by the CPU as the determined storage area amount.

The storage area management system according to claim 1, wherein the storage area estimation information storage unit stores in advance, included in the storage area estimation information, a storage area amount that does not cause a storage area shortage, determined based on statistics of analysis result information generated in the past and the storage area amounts actually used by the structured data from which that analysis result information was generated, and
the storage area determination unit, when the acquired storage area estimation information includes the storage area amount that does not cause a storage area shortage, stores that storage area amount in the storage unit by the CPU as the determined storage area amount.

The storage area management system according to claim 3, wherein the storage area management unit monitors the usage of the data storage area for storing the information in the converted data format, detects whether the secured storage area amount is insufficient, and, when a shortage is detected, notifies the analysis processing unit of the storage area shortage,
the analysis processing unit, on receiving notification from the storage area management unit that the storage area amount is insufficient, analyzes by the CPU the structured data outside the range designated by the analysis range information input by the input unit, generates the analysis result information, and stores it in the storage unit by the CPU,
the storage area determination unit inputs, from the storage unit, the analysis result information generated by the analysis processing unit, acquires from the storage area estimation information storage unit the storage area estimation information corresponding to the parameter that matches the input analysis result information, determines, based on the acquired storage area estimation information, the storage area amount of a storage area to be newly secured, and stores the determined storage area amount in the storage unit by the CPU, and
the storage area management unit inputs, from the storage unit, the storage area amount determined by the storage area determination unit, and additionally secures a storage area corresponding to the input storage area amount on top of the storage area already secured.

The storage area management system according to claim 1, wherein the storage area management unit monitors the usage of the data storage area for storing the information in the converted data format, detects whether the secured storage area amount is insufficient, and, when a shortage is detected, inputs from the storage unit the storage area amount determined by the storage area determination unit and additionally secures a storage area corresponding to the input storage area amount on top of the storage area already secured.

The storage area management system according to claim 1, further comprising an additional storage area amount storage unit that stores in advance additional storage area amount information indicating the storage area amount to be secured additionally on top of the storage area amount already secured,
wherein the storage area management unit monitors the usage of the data storage area for storing the information in the converted data format, detects whether the secured storage area amount is insufficient, and, when a shortage is detected, acquires by the CPU the additional storage area amount information stored in the additional storage area amount storage unit and additionally secures a storage area corresponding to the acquired additional storage area amount information on top of the storage area already secured.
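The shortage handling with a pre-stored additional amount can be sketched as a small class that tracks usage and grows its capacity in fixed increments. The class name, its fields, and the numbers in the usage lines are hypothetical; only the grow-by-a-stored-increment behaviour reflects the claim.

```python
class ManagedArea:
    """Tracks usage of a reserved area and grows it by a pre-stored increment."""

    def __init__(self, reserved: int, additional: int):
        if additional <= 0:
            raise ValueError("additional amount must be positive")
        self.capacity = reserved      # amount secured from the estimate
        self.additional = additional  # pre-stored additional storage area amount
        self.used = 0

    def store(self, size: int) -> None:
        # Shortage detected: secure additional area until the data fits.
        while self.used + size > self.capacity:
            self.capacity += self.additional
        self.used += size


area = ManagedArea(reserved=100, additional=50)
area.store(80)   # fits within the initial reservation
area.store(60)   # triggers one additional allocation of 50
```

A fixed increment avoids re-running the analysis on shortage, at the cost of possibly several small extensions for a badly underestimated document.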

The storage area management system according to claim 1, wherein the storage area management unit, when the information in the data format obtained by converting the structured data is stored in the storage area, obtains the actual usage amount by managing the amount of the storage area actually used, and stores the actual usage amount in the storage unit by the CPU,
the storage area management system further comprising:
an estimation information correction unit that inputs, from the storage unit, the actual usage amount obtained by the storage area management unit and the storage area amount determined by the storage area determination unit, obtains the difference between the input actual usage amount and the input storage area amount, and corrects, based on the obtained difference, the storage area estimation information stored in the storage area estimation information storage unit.
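A correction of this kind can be sketched as moving the stored estimate part of the way toward the observed usage. The blending factor `rate`, the dictionary used as the estimation table, and the key `"tags<=10"` are assumptions for illustration; the claim only requires that the correction be based on the difference.

```python
def correct_estimate(table: dict, key, estimated: int, actual: int,
                     rate: float = 0.5) -> int:
    """Shift the stored estimate toward the actual usage by a fraction of the gap."""
    difference = actual - estimated
    table[key] = estimated + round(rate * difference)
    return table[key]


# Hypothetical table entry: documents with up to 10 tags were given 4096 bytes.
estimates = {"tags<=10": 4096}
updated = correct_estimate(estimates, "tags<=10", estimated=4096, actual=5000)
```

With `rate=1.0` the entry would simply track the most recent document; smaller values smooth out outliers.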

A storage area management device comprising the CPU, the storage unit, the analysis processing unit, the storage area estimation information storage unit, the storage area determination unit, and the storage area management unit according to claim 1.

A storage area management system in which a storage area estimation device comprising the CPU, the storage unit, the analysis processing unit, the storage area estimation information storage unit, and the storage area determination unit according to claim 1
is connected via a network to a storage area management device comprising the CPU, the storage unit, and the storage area management unit according to claim 1,
wherein each of the storage area estimation device and the storage area management device includes a communication unit that communicates information via the network by a communication device.

A storage area management system comprising: a structured data storage device having a structured data storage unit that stores structured data representing a data-specific structure and relationships between data;
a storage area estimation device that is connected to the structured data storage device via a first network, acquires via the first network the structured data stored in the structured data storage unit of the structured data storage device, obtains, for the case where the acquired structured data is converted into information in a data format that an information processing device can process, the storage area amount of a data storage area for storing the information in the converted data format, and transmits the obtained storage area amount; and
a storage area management device that is connected to the storage area estimation device via a second network and receives, via the second network, the storage area amount transmitted by the storage area estimation device,
wherein the storage area management device includes:
a central processing unit (CPU) that executes processing;
a storage unit that stores results of the processing executed by the CPU;
a communication unit that transmits, by a communication device, an instruction to acquire the structured data stored in the structured data storage unit of the structured data storage device, receives, by the communication device, the storage area amount transmitted by the storage area estimation device, and stores the received storage area amount in the storage unit by the CPU; and
a storage area management unit that inputs, from the storage unit, the storage area amount received by the communication unit, and secures, by the CPU, a storage area corresponding to the input storage area amount as the data storage area for storing the information in the converted data format, and
wherein the storage area estimation device includes:
a central processing unit (CPU) that executes processing;
a storage unit that stores results of the processing executed by the CPU;
a communication unit that receives, by a communication device, the instruction to acquire structured data transmitted by the storage area management device, receives, by the communication device, the structured data from the structured data storage unit of the structured data storage device based on the received instruction, stores the structured data in the storage unit by the CPU, and transmits the storage area amount by the communication device;
an analysis processing unit that analyzes, by the CPU, the structured data input by the communication unit to generate analysis result information serving as a parameter for estimating the storage area amount, and stores the generated analysis result information in the storage unit by the CPU;
a storage area estimation information storage unit that stores in advance a plurality of parameters for estimating the storage area amount and, in correspondence with each of the plurality of parameters, storage area estimation information for estimating the storage area amount; and
a storage area determination unit that inputs, from the storage unit, the analysis result information generated by the analysis processing unit, acquires from the storage area estimation information storage unit the storage area estimation information corresponding to the parameter that matches the input analysis result information, determines the storage area amount based on the acquired storage area estimation information, and stores the determined storage area amount in the storage unit by the CPU.

The storage area management system according to claim 21, wherein the communication unit of the storage area estimation device includes the storage area amount in the header information of the response information to the instruction to acquire structured data transmitted by the storage area management device, and transmits the response information to the storage area management device.

The storage area management system according to claim 21, wherein the communication unit of the storage area estimation device includes the storage area amount in information separate from the response information to the instruction to acquire structured data transmitted by the storage area management device, and transmits that information to the storage area management device.

The storage area management system according to claim 21, wherein the storage area estimation device performs the processing of determining the storage area amount when it receives the instruction to acquire structured data transmitted by the storage area management device.

The storage area management system according to claim 21, wherein the communication unit of the storage area management device includes information indicating the instruction in the header information of the request information instructing acquisition of structured data, and transmits the request information.

24. The communication unit of the storage area management apparatus transmits dedicated request information including information indicating the instruction using dedicated request information for instructing acquisition of structured data. Storage area management system.

A storage area management method for obtaining, when structured data representing a data-specific structure and relationships between data is converted into information in a data format that an information processing device can process, the storage area amount of a data storage area for storing the information in the converted data format, the storage area management method comprising:
an analysis processing step in which an analysis processing unit inputs structured data, analyzes the input structured data by a CPU to generate analysis result information serving as a parameter for estimating the storage area amount, and stores the generated analysis result information in a storage unit by the CPU;
a storage area estimation information storing step in which an input unit inputs, from an input device, a plurality of parameters for estimating the storage area amount and storage area estimation information for estimating the storage area amount corresponding to each of the plurality of parameters, and stores in advance, in a storage area estimation information storage unit, the plurality of parameters and the storage area estimation information corresponding to each of them;
a storage area determination step in which a storage area determination unit inputs, from the storage unit, the analysis result information generated by the analysis processing unit, acquires from the storage area estimation information storage unit the storage area estimation information corresponding to the parameter that matches the input analysis result information, determines the storage area amount based on the acquired storage area estimation information, and stores the determined storage area amount in the storage unit by the CPU; and
a storage area management step in which a storage area management unit secures a storage area based on the storage area amount determined by the storage area determination unit as the data storage area for storing the information in the converted data format.

The storage area management method according to claim 27, wherein the structured data has feature data characterizing the structure unique to the data, and
in the analysis processing step, the analysis processing unit detects the feature data from the structured data and generates the analysis result information from the detected result.