JP4951407B2

JP4951407B2 - Content parts retrieval method and apparatus

Info

Publication number: JP4951407B2
Application number: JP2007128824A
Authority: JP
Inventors: 祐一小川; 克志八高; 謙一茶谷
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2007-05-15
Filing date: 2007-05-15
Publication date: 2012-06-13
Anticipated expiration: 2027-05-15
Also published as: JP2008287311A

Description

The present invention relates to a computer-based information retrieval technique.

One such information retrieval technique is disclosed in Patent Document 1 (Japanese Patent Laid-Open No. 2003-15872). That technique provides an object search device that efficiently searches, based on a search key, the objects of software developed in an object-oriented manner.
Patent Document 1: JP 2003-15872 A

In some cases, it is necessary to design screen content that has text and input controls (for example, text boxes, radio buttons, or check boxes). An electronic application system, for example, requires the design of many input/output screens as such screen content (thousands of input/output screens if the system is large). Designing such screen content manually from scratch places a heavy burden on the developer.

One conceivable way to support the design of such screen content is to enter desired text as a search key, search for existing screen content containing text that matches the search key, and reuse that existing screen content. However, it is desirable to reduce the developer's burden further than this method does.

Accordingly, an object of the present invention is to reduce the burden on developers of screen content having text and input controls more than a method that searches for and reuses existing screen content using text as a search key.

Other objects of the present invention will become apparent from the description below.

A storage unit stores one or more content component information groups for each screen content. A content component information group represents a content component that includes text and an input control, and contains one or more information packages. An information package contains a text information element representing the displayed text and an input control type information element representing the type of the input control. A search device receives a search condition that includes a combination of a text information element and an input control type information element, and calculates the similarity between the search condition and each content component information group in the storage unit, based on the text information elements and input control type information elements in the search condition and those in the content component information group. The search device then displays information on content component candidates for the search condition on a display screen, based on the calculated similarity.
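The stored data described above can be pictured with a minimal sketch. The class names, field names, and the `content_id` key are assumptions introduced here for illustration; the source only specifies that a group holds one or more information packages, each pairing a text information element with an input control type information element.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InformationPackage:
    """One text / input-control-type pair (hypothetical representation)."""
    text: Optional[str]          # text information element (the displayed text)
    control_type: Optional[str]  # input control type information element


@dataclass
class ContentComponentInfoGroup:
    """Represents one content component of a screen content."""
    content_id: str  # hypothetical key linking the group to its screen content
    packages: List[InformationPackage] = field(default_factory=list)


@dataclass
class SearchCondition:
    """A query made of one or more text / control-type combinations."""
    packages: List[InformationPackage]


# An "address" component taken from an existing input form (made-up data).
address = ContentComponentInfoGroup(
    content_id="form-001",
    packages=[
        InformationPackage("Postal code", "textbox"),
        InformationPackage("Prefecture", "select"),
    ],
)

# A search condition asking for a text box labeled "Postal code".
query = SearchCondition([InformationPackage("Postal code", "textbox")])
```

Keeping the query in the same shape as the stored packages is a design choice of this sketch: it lets a similarity function compare like with like.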

In one embodiment, a content component search device is constructed by providing the search device with a search condition input unit, a component search unit, and a component candidate display unit. The search condition input unit accepts input of a search condition that includes a combination of a text information element representing text and an input control type information element representing the type of an input control. The component search unit calculates the similarity between the input search condition and each content component information group in the storage unit (for example, a storage resource), based on the text information elements and input control type information elements in the search condition and those in the content component information group. The component candidate display unit displays information on content component candidates for the search condition on the display screen, based on the calculated similarity.

Here, a content component has text or an input control as its minimum unit, and may be a single combination of text and an input control, a group containing a plurality of such combinations, or the screen content itself. Specifically, for screen content having a hierarchical structure, for example, the content component at the highest hierarchy level is the screen content itself; at the lowest hierarchy level it is one text, one input control, or one combination of the two; and at a level that is neither the highest nor the lowest it may be a group having a plurality of combinations of text and input controls.

The "display screen" may be one or more hardware display screens belonging to one or more display devices, or a software display screen displayed on one or more hardware display screens.

Various kinds of information can be used as the "information on content component candidates", such as the identifier of a content component, the similarity corresponding to that content component, or a text information element contained in a certain information package of the content component information group corresponding to that content component.

"Displaying information on content component candidates based on the calculated similarity" may mean, for example, displaying the information in descending order of similarity, or displaying information on content component candidates whose similarity is equal to or greater than a predetermined threshold.
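Both display policies can be sketched in a few lines. The function name `rank_candidates`, the dictionary of scores, and the example values are hypothetical; the source does not prescribe how the ordering or the threshold filter is implemented.

```python
from typing import Dict, List, Optional, Tuple


def rank_candidates(
    scores: Dict[str, float], threshold: Optional[float] = None
) -> List[Tuple[str, float]]:
    """Return (component_id, similarity) pairs in descending order of similarity.

    If a threshold is given, only candidates whose similarity is equal to or
    greater than the threshold are kept.
    """
    kept = [
        (component_id, score)
        for component_id, score in scores.items()
        if threshold is None or score >= threshold
    ]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up similarity values for three stored components.
scores = {"A": 0.42, "B": 0.91, "C": 0.77}

ranked = rank_candidates(scores)                    # descending order only
filtered = rank_candidates(scores, threshold=0.75)  # threshold, then order
```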

In one embodiment, the similarity can be a value calculated from a first similarity, between the one or more text information elements in the search condition and the one or more text information elements in the content component information group, and a second similarity, between the one or more input control type information elements in the search condition and the one or more input control type information elements in the content component information group (for example, a value calculated by substituting the first and second similarities into a predetermined formula).
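The source leaves the "predetermined formula" open. As one possible illustration only, the sketch below assumes a Jaccard coefficient for the first (text) similarity, a multiset overlap ratio for the second (input control type) similarity, and a weighted sum to combine them. None of these choices, nor the `text_weight` parameter, comes from the source.

```python
from collections import Counter
from typing import Iterable


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Assumed first similarity: shared texts divided by all distinct texts."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def multiset_overlap(a: Iterable[str], b: Iterable[str]) -> float:
    """Assumed second similarity: overlap of control types, counting repeats."""
    count_a, count_b = Counter(a), Counter(b)
    total = sum((count_a | count_b).values())  # multiset union size
    if total == 0:
        return 1.0
    return sum((count_a & count_b).values()) / total  # multiset intersection


def combined_similarity(q_texts, q_types, g_texts, g_types, text_weight=0.5):
    """Assumed formula: weighted sum of the first and second similarities."""
    first = jaccard(q_texts, g_texts)
    second = multiset_overlap(q_types, g_types)
    return text_weight * first + (1.0 - text_weight) * second


score = combined_similarity(
    ["Name", "Address"], ["textbox", "textbox"],            # search condition
    ["Name", "Phone"], ["textbox", "textbox", "checkbox"],  # stored group
)
# first = 1/3, second = 2/3, so score is approximately 0.5
```

Raising `text_weight` would favor components with matching labels, while lowering it would favor components with a matching mix of controls.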

In one embodiment, the similarity can be the similarity between one or more sets of a text information element and an input control type information element in the search condition and one or more sets of a text information element and an input control type information element in the content component information group.

In one embodiment, the content component search device can further include a content analysis unit, a grouping unit, a package group generation unit, a text assignment unit, and a component registration unit. The content analysis unit can identify text information elements, text arrangement positions, input control type information elements, and input control arrangement positions by analyzing a content file that represents screen content (for example, an input form). Specifically, for example, the content analysis unit analyzes the content file to identify text/position combinations, each a combination of a text information element and a text arrangement position information element, and input control/position combinations, each a combination of an input control type information element and an input control arrangement position information element. It can then store the identified text/position combinations in text management information and the identified input control/position combinations in input control management information. The grouping unit can identify the positional relationships among the input controls based on the result of analyzing the content file (for example, by referring to the input control management information) and, when an identified positional relationship satisfies a predetermined first condition, extract the input controls that satisfy the condition as a group. The package group generation unit can generate a package group having a plurality of information packages that contain the input control type information elements corresponding to the input controls constituting the extracted group. The text assignment unit can, for example, acquire the text information element corresponding to the input control type information element in an information package based on the result of analyzing the content file (for example, from the text management information) and include the acquired text information element in that information package (for example, as a label). The component registration unit can store a content information group, which is a package group having a plurality of information packages each containing a text information element, in the storage unit in association with the analyzed content file.

In one embodiment, the predetermined first condition can be that the distance between the arrangement positions of the input controls is equal to or less than a predetermined value.
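A distance-based first condition can be sketched as follows. This assumes two-dimensional coordinates, Euclidean distance, and transitive grouping (a control joins a group if it is close enough to any member), implemented with a small union-find. The function name, coordinates, and threshold are hypothetical, and the source does not state whether grouping is transitive.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def group_controls(positions: Dict[str, Point], max_distance: float) -> List[List[str]]:
    """Group input controls whose arrangement positions lie within max_distance.

    Grouping is transitive in this sketch. Controls with no close neighbor
    are left out of the result.
    """
    names = list(positions)
    parent = {name: name for name in names}

    def find(name: str) -> str:
        # Follow parent links to the group representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(positions[a], positions[b]) <= max_distance:
                parent[find(a)] = find(b)  # merge the two groups

    groups: Dict[str, List[str]] = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return [sorted(members) for members in groups.values() if len(members) > 1]


# Made-up layout: three address fields stacked 30 units apart, one far away.
controls = {
    "zip": (100.0, 40.0),
    "prefecture": (100.0, 70.0),
    "city": (100.0, 100.0),
    "newsletter": (100.0, 400.0),
}
groups = group_controls(controls, max_distance=35.0)
```

Here `zip` and `city` are 60 units apart, yet they end up in the same group because each is within range of `prefecture`.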

In one embodiment, the text information element included in an information package can be an information element representing text whose positional relationship with the input control corresponding to the input control type information element in that information package satisfies a predetermined second condition.

In one embodiment, the predetermined second condition can be that the distance between the arrangement position of the text and the arrangement position of the input control is equal to or less than a predetermined value.
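The second condition can be illustrated with a label lookup. The sketch assumes Euclidean distance and, as a tie-breaking rule not stated in the source, picks the closest text among those within the limit. The function name and sample coordinates are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


def nearest_label(
    control_pos: Point, texts: Dict[str, Point], max_distance: float
) -> Optional[str]:
    """Return the text closest to the control, if it lies within max_distance.

    Choosing the closest text is an assumption of this sketch; the condition
    itself only requires the distance to be at most the predetermined value.
    """
    best_text: Optional[str] = None
    best_distance = max_distance
    for text, pos in texts.items():
        distance = math.dist(control_pos, pos)
        if distance <= best_distance:
            best_text, best_distance = text, distance
    return best_text


# Made-up layout: labels sit to the left of their controls.
texts = {"Postal code": (20.0, 40.0), "Prefecture": (20.0, 70.0)}

label = nearest_label((100.0, 40.0), texts, max_distance=90.0)
no_label = nearest_label((100.0, 400.0), texts, max_distance=90.0)
```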

In one embodiment, the content information group can be an information package tree in which the information packages corresponding to the ends of the lowest hierarchy level are leaves and the information packages corresponding to the heads of hierarchy levels other than the lowest are branches. A content component information group can be the content information group itself or a subset of it.
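An information package tree of this kind can be sketched with a small node class. The class name, the use of `None` as the control type of a branch, and the choice to treat every subtree as a possible "subset" are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class PackageNode:
    """An information package in the tree (hypothetical representation)."""
    text: Optional[str]
    control_type: Optional[str] = None  # assumed to be None for a branch
    children: List["PackageNode"] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        return not self.children


def subtrees(root: PackageNode) -> Iterator[PackageNode]:
    """Yield the tree itself and every subtree, in pre-order."""
    yield root
    for child in root.children:
        yield from subtrees(child)


# Made-up form: one branch grouping two text boxes, plus a stand-alone check box.
form = PackageNode("Application form", children=[
    PackageNode("Applicant", children=[
        PackageNode("Name", "textbox"),
        PackageNode("Address", "textbox"),
    ]),
    PackageNode("Agree to terms", "checkbox"),
])

labels = [node.text for node in subtrees(form)]
leaf_labels = [node.text for node in subtrees(form) if node.is_leaf]
```

Under these assumptions, the whole form, the "Applicant" branch, and each leaf are all candidates for reuse as a content component.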

In one embodiment, the information on a content component candidate may include the text information element in the information package at the top of the content component information group.

In one embodiment, a content component candidate can be a content component corresponding to a content component information group whose calculated similarity is equal to or greater than a predetermined threshold.

In one embodiment, the component candidate display unit can further display the number of content component candidates.

In one embodiment, when a plurality of search conditions have been input and those search conditions are in a hierarchical relationship (for example, a parent-child relationship), the component candidate display unit can preferentially display information on content component candidates for the higher-level (for example, parent) search condition.

In one embodiment, the component registration unit can create component groups, each a collection of identical or similar content component information groups. The component search unit can calculate the similarity between the input search condition and each component group. The component candidate display unit can display information on component groups as the information on content component candidates.

Here, the similarity between the search condition and a component group may be the similarity between the search condition and one content component information group belonging to that component group, or a value calculated by a predetermined calculation from the similarities between the search condition and a plurality of content component information groups belonging to that component group (for example, the average of those similarities).
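The two options for a group-level similarity can be sketched as follows. The function name, the `mode` parameter, and the choice of the first member as the representative are assumptions; the source says only that one member's similarity or a computed value such as the average may be used.

```python
from statistics import mean
from typing import Sequence


def group_similarity(member_scores: Sequence[float], mode: str = "mean") -> float:
    """Similarity between a search condition and a component group.

    member_scores holds the similarity of each member of the group to the
    search condition. "mean" averages them; "representative" uses a single
    member (the first one, as an assumption of this sketch).
    """
    if not member_scores:
        raise ValueError("a component group needs at least one member")
    if mode == "mean":
        return mean(member_scores)
    if mode == "representative":
        return member_scores[0]
    raise ValueError(f"unknown mode: {mode}")


# Made-up member similarities for one component group.
member_scores = [0.9, 0.7, 0.8]

average = group_similarity(member_scores)                           # about 0.8
representative = group_similarity(member_scores, "representative")  # 0.9
```

Using a representative costs one similarity calculation per group, whereas the average requires one per member, which may matter when groups are large.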

The component group similar to a content component information group can be one of the component groups whose similarity exceeds a predetermined threshold (for example, the component group with the highest similarity).

In one embodiment, the component registration unit can further create a plurality of style groups for one component group. A style group can be a collection of content component information groups that have the same or a similar style. The component candidate display unit can display information on style groups in addition to information on component groups.

Here, the style of text can be defined by one or more text style attributes, such as the font size, the font type (for example, Mincho or Gothic), the font style (for example, regular, bold, or italic), the font color, or the length of the text.

Styles can be regarded as similar when a predetermined proportion (for example, 80%) or more of a plurality of style-related elements (for example, the text, the order of the text, the font size of the text, and the font type) match.
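The proportion rule can be sketched by comparing two attribute dictionaries. The attribute names, the function name, and the decision to count an attribute missing from one side as a mismatch are assumptions of this sketch.

```python
from typing import Dict


def styles_similar(a: Dict[str, object], b: Dict[str, object], min_ratio: float = 0.8) -> bool:
    """Return True when at least min_ratio of the style elements match.

    An element present in only one of the two styles counts as a mismatch
    (an assumption of this sketch).
    """
    keys = set(a) | set(b)
    if not keys:
        return True
    matches = sum(1 for key in keys if key in a and key in b and a[key] == b[key])
    return matches / len(keys) >= min_ratio


# Made-up styles with five elements each.
base = {
    "font_size": 12,
    "font_type": "Gothic",
    "font_style": "bold",
    "color": "#000000",
    "text_order": ("Name", "Address"),
}
one_change = dict(base, color="#333333")                 # 4 of 5 match (80%)
two_changes = dict(base, color="#333333", font_size=10)  # 3 of 5 match (60%)

similar = styles_similar(base, one_change)
not_similar = styles_similar(base, two_changes)
```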

In one embodiment, when registering a new content file in the storage unit, the component registration unit can classify the content component information group for that new content file into a component group that is the same as or similar to that content component information group.

In one embodiment, when an input indicates that search performance is to be prioritized over registration performance, the component registration unit can, when registering a new content file in the storage unit, classify the content component information group for that new content file into a component group that is the same as or similar to that content component information group.

In one embodiment, the component search unit can calculate the similarity between the search condition and a content component information group that has not been classified into any component group. When the content component corresponding to such an unclassified content component information group becomes a content component candidate, the component registration unit can classify that content component information group into the applicable component group.

In one embodiment, when an input indicates that registration performance is to be prioritized over search performance, the component registration unit can, when the content component corresponding to a content component information group not classified into any component group becomes a content component candidate, classify that content component information group into the applicable component group.

In one embodiment, a plurality of content component information groups are classified into two or more component groups, the component search unit calculates the similarity between the input search condition and each component group, and the component candidate display unit can display information on component groups as the information on content component candidates.

In one embodiment, one component group has a plurality of style groups, each style group is a collection of content component information groups having the same or a similar style, and the component candidate display unit can display information on style groups in addition to information on component groups.

In one embodiment, if an information package in an existing component subtree belonging to a first component group also belongs to a second component group different from the first, the second component group is a component group included in the first component group, and the component candidate display unit can display information on the second component group included in the first component group.

In one embodiment, the component candidate display unit can display the number of content component information groups corresponding to a style group.

In one embodiment, the search condition can be the definition information of screen content having a hierarchical structure, or a subset of that definition information.

Two or more of the embodiments described above can be combined. Each of the units described above (for example, the search condition input unit, the component search unit, the component candidate display unit, the content analysis unit, the grouping unit, the package group generation unit, the text assignment unit, and the component registration unit) may also be referred to as a means. Each unit can be realized by hardware (for example, a circuit), by a computer program, or by a combination of the two (for example, part executed by a computer program and part by a hardware circuit). Each computer program can be read from a storage resource (for example, a memory) provided in a computer. The program can be installed on that storage resource from a recording medium such as a CD-ROM or DVD (Digital Versatile Disk), or downloaded via a communication network such as the Internet or a LAN.

Several embodiments of the present invention are described in detail below with reference to the drawings.

<First Embodiment>

Web content can be taken as an example of screen content. The Web content can be digital content consisting mainly of text, excluding images and video, such as HTML (HyperText Markup Language) content, Web content in a proprietary format, or an electronic document created with a document creation application. The first embodiment takes an HTML input form as an example of Web content. In the following description, Web content such as an input form is simply called "content".

The processing performed in the first embodiment can be broadly divided into a component registration step and a component search step.

The component registration step comprises: a grouping step of grouping the elements (text and input controls) contained in existing content by using layout information such as the position information of the elements; a labeling step of assigning a label to each group and element; a content configuration tree generation step of generating, from the results of the grouping step and the labeling step, a tree representing the structure of the content (hereinafter called a content configuration tree); and a content registration step of registering the content configuration tree in a repository in association with the content.

The component search step comprises: a component search step of calculating the similarity between a subtree of the definition information of new content having a hierarchical structure (hereinafter called a query tree) and each subtree of each content configuration tree stored in the repository (hereinafter called an existing component subtree); and a component candidate output step of outputting existing component subtrees with high similarity to the query tree (for example, equal to or greater than a predetermined threshold) as component candidates for the query tree.
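The search step can be sketched end to end under simplifying assumptions. Trees are plain tuples, similarity is a Jaccard coefficient over the (label, control type) pairs of the leaves, and branch labels are ignored. These choices, the function names, and the sample repository are hypothetical and are not the similarity measure of the embodiment.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

# A node is (label, control_type_or_None, children); leaves have no children.
Node = Tuple[str, Optional[str], list]


def leaf_pairs(node: Node) -> Set[Tuple[str, Optional[str]]]:
    """Collect the (label, control type) pairs of all leaves under a node."""
    label, control_type, children = node
    if not children:
        return {(label, control_type)}
    pairs: Set[Tuple[str, Optional[str]]] = set()
    for child in children:
        pairs |= leaf_pairs(child)
    return pairs


def subtrees(node: Node) -> Iterator[Node]:
    """Yield the node and every subtree below it (existing component subtrees)."""
    yield node
    for child in node[2]:
        yield from subtrees(child)


def search(query: Node, repository: Dict[str, Node], threshold: float) -> List[Tuple[str, str, float]]:
    """Return (content_id, subtree_label, similarity) for subtrees at or above
    the threshold, in descending order of similarity."""
    query_pairs = leaf_pairs(query)
    hits = []
    for content_id, tree in repository.items():
        for sub in subtrees(tree):
            sub_pairs = leaf_pairs(sub)
            score = len(query_pairs & sub_pairs) / len(query_pairs | sub_pairs)
            if score >= threshold:
                hits.append((content_id, sub[0], score))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Made-up repository holding one content configuration tree.
repository = {
    "form-001": ("Application", None, [
        ("Applicant", None, [("Name", "textbox", []), ("Address", "textbox", [])]),
        ("Agree", "checkbox", []),
    ]),
}
# Made-up query tree: a group with a name field and an address field.
query = ("Applicant info", None, [("Name", "textbox", []), ("Address", "textbox", [])])

hits = search(query, repository, threshold=0.6)
```

With these inputs, the "Applicant" subtree matches the query exactly and ranks first, followed by the whole "Application" tree, which contains one extra leaf.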

A more detailed description follows.

FIG. 1 shows an example of the overall configuration of a content component search system according to the first embodiment of the present invention.

The content component search system can be composed of one or more computers. The content component search system includes, for example, a CPU 100, a magnetic disk device 101, a main memory 102, a floppy disk drive (hereinafter called FDD 103), and a bus 104 connecting them. The content component search system can also be connected to other devices via a network 105.

The magnetic disk device 101 is a secondary storage device and stores content files 160, each containing the source of content, and content configuration trees 161. Information stored on a floppy disk 106 is read into the magnetic disk device 101 or the main memory 102 via the FDD 103.

The main memory 102 stores a system control processing unit 110, a component registration control processing unit 111, a component search control processing unit 112, a content acquisition processing unit 121, an in-content information acquisition processing unit 122, a content configuration tree generation processing unit 123, a grouping processing unit 130, a labeling processing unit 131, a component registration processing unit 126, an input/output definition information acquisition processing unit 140, an input/output definition information analysis processing unit 141, a component search processing unit 142, and a component candidate output processing unit 143, and a work area 150 is also reserved in it.
In this embodiment, the processes of system control, component registration control, component search control, content acquisition, in-content information acquisition, content configuration tree generation, grouping, labeling, component registration, input/output definition information acquisition, input/output definition information analysis, component search, and component candidate output are realized by the CPU 100 executing the corresponding processing units 110, 111, 112, 121, 122, 123, 130, 131, 126, 140, 141, 142, and 143. They can also be realized in hardware, for example as integrated circuits that perform each process. To simplify the description, each processing unit, realized by the CPU 100 executing the corresponding program, is described below as the subject of its process. When a processing unit is realized in hardware, that processing unit itself performs the process.

The system control processing unit 110 controls the component registration control processing unit 111 and the component search control processing unit 112.

The component registration control processing unit 111 controls the content acquisition processing unit 121, the in-content information acquisition processing unit 122, the content configuration tree generation processing unit 123, the grouping processing unit 130, the labeling processing unit 131, and the component registration processing unit 126.

The content configuration tree generation processing unit 123 controls the grouping processing unit 130 and the labeling processing unit 131.

The component search control processing unit 112 controls the input/output definition information acquisition processing unit 140, the input/output definition information analysis processing unit 141, the component search processing unit 142, and the component candidate output processing unit 143.

In this embodiment, these processing units are stored in the main memory 102. However, they can also be stored on a storage medium such as the magnetic disk device 101, the floppy disk 106, an MO, a CD-ROM, or a DVD (not shown in FIG. 1), read into the main memory 102 via a drive device, and executed by the CPU 100.

In this embodiment, the content file 160 and the content configuration tree 161 are stored in the magnetic disk device 101. They may instead be stored on a storage medium such as the floppy disk (registered trademark) 106, an MO (magneto-optical disk), a CD-ROM, or a DVD (not shown in FIG. 1), or in a storage device (not shown in FIG. 1) connected to another system via the network 105. They may also be stored on a storage medium (not shown in FIG. 1) connected directly to the network 105.

以下、本実施形態に係るコンテンツ部品検索システムで行われる処理の概要を説明する。 Hereinafter, an outline of processing performed in the content component search system according to the present embodiment will be described.

The system control processing unit 110 analyzes a command input from the keyboard 101 (another type of input device may be used). If the command is determined to be a registration command, the system control processing unit 110 activates the component registration control processing unit 111, which registers the content file 160 and the content configuration tree 161 in the magnetic disk device 101. If the command is determined to be a search command, the system control processing unit 110 activates the component search control processing unit 112, which performs the component search based on the specified query tree 171 (see FIG. 11).
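
The dispatch described above can be sketched as follows. This is only an illustration: the command names `register` and `search` and the function names are assumptions, since the text does not give the command syntax.

```python
from typing import Callable, Dict


def register_parts() -> str:
    # Stand-in for the component registration control processing unit 111.
    return "registered"


def search_parts() -> str:
    # Stand-in for the component search control processing unit 112.
    return "searched"


# Hypothetical command names; the description does not specify them.
HANDLERS: Dict[str, Callable[[], str]] = {
    "register": register_parts,
    "search": search_parts,
}


def system_control(command: str) -> str:
    """Analyze a command and start the matching control unit."""
    handler = HANDLERS.get(command.strip().lower())
    if handler is None:
        raise ValueError(f"unknown command: {command!r}")
    return handler()
```

A table of handlers keeps the command analysis separate from the two control units, which matches the division of roles in the text.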

以下、部品検索制御処理部１１２が行う処理の流れを説明する。 Hereinafter, the flow of processing performed by the component search control processing unit 112 will be described.

First, the component search control processing unit 112 activates the input/output definition information acquisition processing unit 140, which acquires the definition information of the content to be newly created (hereinafter, "input/output definition information") 170 (see FIG. 11) and stores it in the work area 150.

Next, the component search control processing unit 112 activates the input/output definition information analysis processing unit 141, which obtains from the input/output definition information 170 the query tree 171 specified by the user (for example, a developer) and stores it in the work area 150.

Next, the component search control processing unit 112 activates the component search processing unit 142, which calculates, for every component contained in each content file 160 (hereinafter, "existing component subtree 174"), its similarity to the query tree 171 and stores the results in the work area 150 as the evaluation result 183 (see FIG. 12).

Finally, the component search control processing unit 112 activates the component candidate output processing unit 143, which outputs the existing component subtrees 174 that have a high similarity to the query tree 171 as component candidates for the query tree 171.

The input/output definition information acquisition processing unit 140 stores the input/output definition information 170 entered by the user in the work area 150. The input/output definition information 170 can be expressed, for example, in XML (Extensible Markup Language) format as shown in FIG. 11: the structure of the input items of the new content is expressed by the XML structure, the label of each input item (a label is essentially the input item name) by the XML tag name, and the type of input control by the type attribute of each element. Although this embodiment uses XML, any data format may be used as long as it can define the structure of the input items, their labels, and the types of input controls. Likewise, although the input/output definition information 170 is described as being entered by the user, it may be read from a storage medium such as the floppy disk 106, an MO, a CD-ROM, or a DVD (not shown in FIG. 1) via a drive such as the FDD 103, from a storage device (not shown in FIG. 1) connected to another system via the network 105, or from a storage medium (not shown in FIG. 1) connected directly to the network 105.
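
As an illustration of the XML representation, the sketch below reads a small definition and lists each input item with its depth, label (tag name), and control type (type attribute). The element names and the sample document are invented for this example and are not taken from FIG. 11.

```python
import xml.etree.ElementTree as ET

# Hypothetical input/output definition: tag names act as labels and the
# "type" attribute names the input control.
DEFINITION = """
<application>
  <applicant>
    <name type="text"/>
    <address type="text"/>
  </applicant>
  <delivery type="radio"/>
</application>
"""


def list_items(xml_text):
    """Return (depth, label, control type) for every element, in document order."""
    root = ET.fromstring(xml_text.strip())
    items = []

    def walk(element, depth):
        # Grouping elements have no type attribute, so their type is None.
        items.append((depth, element.tag, element.get("type")))
        for child in element:
            walk(child, depth + 1)

    walk(root, 0)
    return items
```

Here `list_items(DEFINITION)` yields five entries; elements without a type attribute, such as `applicant`, act as groups of input items.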

The input/output definition information analysis processing unit 141 obtains, from the input/output definition information 170 acquired by the input/output definition information acquisition processing unit 140, the query tree 171 corresponding to the structure information specified by the user (hereinafter, "user-specified structure information") 172 (see FIG. 11), and stores the query tree 171 in the work area 150. The query tree 171 is a subset of the input/output definition information 170.
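
One way to picture the query tree as a subset of the definition is to select a subtree by its label. The sketch below assumes the user designates the structure by a tag name; how the designation is actually made is not stated in the text, and the sample document is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical definition document (not taken from FIG. 11).
DEFINITION = """<application>
  <applicant>
    <name type="text"/>
    <address type="text"/>
  </applicant>
  <delivery type="radio"/>
</application>"""


def extract_query_tree(xml_text, label):
    """Return the first element whose tag equals `label`, or None."""
    root = ET.fromstring(xml_text)
    if root.tag == label:
        return root
    return root.find(f".//{label}")


def labels(element):
    """List the labels in a subtree, starting with its root."""
    return [e.tag for e in element.iter()]
```

For example, `labels(extract_query_tree(DEFINITION, "applicant"))` lists `applicant`, `name`, and `address`, leaving out the rest of the definition.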

The flow of processing performed by the component search processing unit 142 will be described with reference to FIGS. 10 and 12.

まず、部品検索処理部１４２は、入出力定義情報解析処理部１４１で取得されたすべてのクエリツリー１７１に対して、ステップ２０１０〜２０１１を繰り返す（ステップ２０００）。 First, the component search processing unit 142 repeats Steps 2010 to 2011 for all the query trees 171 acquired by the input / output definition information analysis processing unit 141 (Step 2000).

Next, the component search processing unit 142 acquires the in-component information 177A (information including the label information 175A and the input control information 176A) from the query tree 171 and stores it in the work area 150 (step 2010).

次に、部品検索処理部１４２は、磁気ディスク装置１０１に登録されているすべてのコンテンツファイル１６０に対して、ステップ２０１２を繰り返す（ステップ２０１１）。 Next, the component search processing unit 142 repeats Step 2012 for all the content files 160 registered in the magnetic disk device 101 (Step 2011).

Next, the component search processing unit 142 repeats steps 2013 to 2015 for every existing component subtree 174 included in the content configuration tree 161 corresponding to the content file 160 (step 2012).

Next, the component search processing unit 142 acquires the in-component information 177B (information including the label information 175B and the input control information 176B) of the existing component subtree 174 and stores it in the work area 150 (step 2013).

Next, from the in-component information 177A of the query tree 171 and the in-component information 177B of the existing component subtree 174, the component search processing unit 142 generates vector information 182, a table that represents the vector space and the vectors used for the similarity calculation (it includes the label information 178, the input control information 179, the query tree vector information 180, and the existing component subtree vector information 181), and stores the generated vector information 182 in the work area 150 (step 2014).

Finally, the component search processing unit 142 uses the vector information 182 to calculate the similarity of the existing component subtree 174 to the query tree 171 by (Equation 1) (step 2015), and stores the result in the work area 150 as the evaluation result 183 of each existing component subtree 174 for the query tree 171 (information including the content ID information 1831, the node number information 1832, and the similarity information 1833). The similarity is calculated from the vector information 182 as the cosine measure of the vector space model.
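
Steps 2014 and 2015 can be illustrated with a small sketch. It assumes that each label and each input control type is one dimension and that the weights are binary (1 if present, 0 otherwise). The text does not state the weighting, and (Equation 1) itself is not reproduced in this section, so the standard cosine measure is used here.

```python
import math


def build_vectors(query_items, part_items):
    """Build the shared vector space and one binary vector per tree.

    Each item is a (kind, value) pair such as ("label", "name") or
    ("control", "text"). Binary weights are an assumption.
    """
    dimensions = sorted(set(query_items) | set(part_items))
    query_vec = [1 if d in query_items else 0 for d in dimensions]
    part_vec = [1 if d in part_items else 0 for d in dimensions]
    return dimensions, query_vec, part_vec


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

With a query tree holding the labels `name` and `address` and a text control, and an existing subtree holding `name`, `phone`, and a text control, the space has four dimensions, two of them shared, so the similarity is 2/3.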

In step 2015, the vector space used in the similarity formula is the space (set of dimensions) that contains only the query tree and the existing component subtree being compared. Alternatively, the similarity may be calculated in a common space (set of dimensions) covering all the content 160 stored in the magnetic disk device 101 and the query tree.

First, based on the evaluation result 183 calculated by the component search processing unit 142, the component candidate output processing unit 143 displays, for each query tree 171, a list of information on the candidate components corresponding to the existing component subtrees 174 in descending order of similarity (hereinafter, "component candidate list") on the component candidate list display screen 187 (see FIG. 13). For each component candidate, the list contains a component ID (for example, a component identifier of the form "content ID_node number in the content configuration tree"), the similarity, and the label stored in the root node of the existing component subtree.

Next, for the component candidate that the user selects from the component candidate list, the component candidate output processing unit 143 obtains the code corresponding to the selected component from the content file 160 and stores it in the work area 150.

最後に、部品候補出力処理部１４３は、上記で取得されたコードを用いて、ユーザによって選択された部品を、プレビュー画面１８４（図１３参照）に表示する。 Finally, the component candidate output processing unit 143 displays the component selected by the user on the preview screen 184 (see FIG. 13) using the code acquired above.

以上が、部品候補出力処理部１４３の処理手順である。 The processing procedure of the component candidate output processing unit 143 has been described above.

In the above, a component candidate is displayed on the preview screen 184 when the user selects it from the component candidate list. Alternatively, the component candidate output processing unit 143 may, without any user operation, display on the preview screen 184 the component corresponding to the existing component subtree with the highest similarity to the query tree 171.

When the user specifies more than one query tree, the relationships among the query trees may be displayed as information on the query trees (hereinafter, "query tree information") on the query tree information display screen 186 (see FIG. 13).

An existing component subtree whose similarity does not reach a certain threshold may be omitted from the component candidate list. In that case, the number of components found for each query tree (the number of existing component subtrees) may be displayed on the query tree information display screen 186.
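
The construction of the component candidate list, with descending order of similarity and an optional threshold, might look like the sketch below. The tuple layout and the default threshold are assumptions; the component ID follows the "content ID_node number" form mentioned earlier.

```python
def candidate_list(evaluation_results, threshold=0.0):
    """Build the candidate list from (content_id, node_number, similarity) tuples.

    Entries below `threshold` are dropped; the rest are sorted by
    similarity, highest first.
    """
    kept = [r for r in evaluation_results if r[2] >= threshold]
    kept.sort(key=lambda r: r[2], reverse=True)
    return [(f"{cid}_{node}", sim) for cid, node, sim in kept]
```

Calling `candidate_list(results, 0.3)` on three results with similarities 0.4, 0.9, and 0.1 keeps the first two and places the 0.9 entry first.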

When a hierarchical relationship (for example, a parent-child relationship) exists among the query trees specified by the user (including the case where every subtree in the input/output definition information 170 is treated as a query tree 171), the component candidates of a higher-level (for example, parent) query tree may be displayed preferentially on the component candidate list display screen 187. Here, preferential display means, for example, that the component candidates of the higher-level query tree are placed above those of a lower-level query tree in the list even if their similarity is lower.
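
The preferential ordering can be expressed as a two-part sort key: first the depth of the query tree, then the similarity. The depth field is a hypothetical encoding of the hierarchical relationship (0 for the top-level query tree, larger values for lower levels).

```python
def order_candidates(candidates):
    """Order (query_depth, component_id, similarity) tuples for display.

    Candidates of shallower (higher-level) query trees come first;
    within one depth, higher similarity comes first.
    """
    return sorted(candidates, key=lambda c: (c[0], -c[2]))
```

A candidate of a child query tree with similarity 0.9 is therefore listed after a candidate of the parent query tree with similarity 0.5.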

To make it easy to see which query tree 171 the currently displayed component candidate list belongs to, the component candidate output processing unit 143 may highlight (for example, shade) the portion corresponding to the query tree 171 in the input/output definition information displayed on the input/output definition information display screen 185.

以上が、部品検索制御処理部１１２が行う処理についての説明である。 This completes the description of the processing performed by the component search control processing unit 112.

次に、部品登録制御処理部１１１が行う処理について説明する。 Next, processing performed by the component registration control processing unit 111 will be described.

まず、部品登録制御処理部１１１は、コンテンツ取得処理部１２１を起動し、コンテンツ取得処理部１２１が、ＦＤＤ１０４を介してフロッピーディスク１０８に格納されているコンテンツファイル１６０を読み込む。 First, the component registration control processing unit 111 activates the content acquisition processing unit 121, and the content acquisition processing unit 121 reads the content file 160 stored in the floppy disk 108 via the FDD 104.

Next, the component registration control processing unit 111 activates the in-content information acquisition processing unit 122, which acquires the text and input controls (for example, text boxes and radio buttons) contained in the source of the content file 160, together with the position information and size of these in-content elements (for example, the font size for text and the area size for an input control).

Next, the component registration control processing unit 111 activates the content configuration tree generation processing unit 123, which generates the content configuration tree 161 by grouping the in-content elements extracted by the in-content information acquisition processing unit 122 and assigning labels to the elements and groups.

最後に、部品登録制御処理部１１１は、部品登録処理部１２６を起動し、部品登録処理部１２６が、コンテンツファイル１６０とコンテンツファイル１６０に対応するコンテンツ構成ツリー１６１を磁気ディスク装置１０１へ登録する。 Finally, the component registration control processing unit 111 activates the component registration processing unit 126, and the component registration processing unit 126 registers the content file 160 and the content configuration tree 161 corresponding to the content file 160 in the magnetic disk device 101.

コンテンツ取得処理部１２１は、ＦＤＤ１０４を介してフロッピーディスク１０８に格納されているコンテンツファイル１６０を読み込み、ワークエリア１５０に格納する。なお、コンテンツファイル１６０はフロッピーディスク１０８に格納されているものとしたが、ＭＯ、ＣＤ−ＲＯＭ、ＤＶＤ等の記憶媒体（図１には示していない）に格納されるものとしてもよいし、ネットワーク１０５を介して、他のシステムに接続された記憶媒体（図１には示していない）に格納されるものとしてもよい。 The content acquisition processing unit 121 reads the content file 160 stored in the floppy disk 108 via the FDD 104 and stores it in the work area 150. Although the content file 160 is stored in the floppy disk 108, it may be stored in a storage medium (not shown in FIG. 1) such as MO, CD-ROM, DVD, or network. The data may be stored in a storage medium (not shown in FIG. 1) connected to another system via 105.

The in-content information acquisition processing unit 122 first extracts text information elements and input control type information elements (hereinafter sometimes called "in-content information") from the content file 160 acquired by the content acquisition processing unit 121, based on rules stored in the magnetic disk device 101 or the main memory 102. Text list information 162, which records all the extracted text information elements, and input control list information 163, which records all the extracted input control type information elements, are stored in the work area 150. Here, "text" means, when the content file is for example an HTML file, a character string actually displayed on the screen, as opposed to tags and other markup that are not displayed. "Input control" means, when the content file is for example an HTML file, an element that accepts input from the user, such as one expressed by an INPUT, SELECT, or TEXTAREA tag.
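
For an HTML content file, the extraction of displayed text and input controls can be sketched with Python's standard `html.parser` module. This is a simplified illustration: the class and function names are hypothetical, the extraction rules are hard-coded rather than read from storage, and position and size information is not handled.

```python
from html.parser import HTMLParser

# Tags treated as input controls, following the INPUT, SELECT, and
# TEXTAREA examples in the description.
INPUT_TAGS = {"input", "select", "textarea"}


class ContentInfoExtractor(HTMLParser):
    """Collect displayed text and input control types in document order."""

    def __init__(self):
        super().__init__()
        self.texts = []
        self.controls = []

    def handle_starttag(self, tag, attrs):
        if tag in INPUT_TAGS:
            if tag == "input":
                # An INPUT tag without a type attribute is treated as text.
                kind = dict(attrs).get("type") or "text"
            else:
                kind = tag
            self.controls.append(kind)

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.texts.append(text)


def extract(html):
    """Return (texts, controls) found in an HTML string."""
    parser = ContentInfoExtractor()
    parser.feed(html)
    parser.close()
    return parser.texts, parser.controls
```

A fuller version would also skip the contents of script and style elements, which this sketch would record as text.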

なお、テキストリスト情報１６２や入力コントロールリスト情報１６３は、ワークエリア１５０に格納されるが、磁気ディスク装置１０１に格納されてもよい。コンテンツファイルの中のどのようなものをテキストとして抽出するかを示す情報や、コンテンツファイルの中のどのようなものを入力コントロールとして抽出するかを示す情報は、主メモリ１０２または磁気ディスク装置１０１に格納されている。 The text list information 162 and the input control list information 163 are stored in the work area 150, but may be stored in the magnetic disk device 101. Information indicating what is extracted from the content file as text and information indicating what is extracted from the content file as input control are stored in the main memory 102 or the magnetic disk device 101. Stored.

Next, the in-content information acquisition processing unit 122 extracts the layout position information of the extracted in-content elements from the content file. Layout position information defines where an in-content element is located in the content and the range it occupies. Specifically, for example, the layout position information of text includes reference position information and font size information, and that of an input control includes reference position information and area size information. The reference position and font size of text are recorded in the text list information 162, and the reference position and area size of an input control are recorded in the input control list information 163. As illustrated in FIG. 5, the text list information 162 is, for example, a table that records, for each text detected in the content file, a text number 700 (for example, a serial number), a text information element 710, reference position information 720 (for example, the coordinates of the reference position when a predetermined position in the content is taken as the origin), and a font size 730. As illustrated in FIG. 6, the input control list information 163 is, for example, a table that records, for each input control detected in the content file, an input control number 800 (for example, a serial number), an input control type information element 810 (for example, the information together with its tag), reference position information 820, and an area size 830 (for example, the width and height relative to the reference position).

The content configuration tree generation processing unit 123 first activates the grouping processing unit 130, which uses the input control list information 163 extracted by the in-content information acquisition processing unit 122 to group the input controls contained in the content file 160 and stores the grouping result (see, for example, FIG. 8) in the work area 150.

Next, the content configuration tree generation processing unit 123 activates the label addition processing unit 131. Using the layout position information stored in the text list information 162 and the input control list information 163, the label addition processing unit 131 assigns to the grouping result generated by the grouping processing unit 130 the name of each input control and the name of each group produced by the grouping (hereinafter collectively, "labels"), thereby generating the final content configuration tree 161, which it stores in the work area 150.

Each node of the content configuration tree 161 has, as node attribute information, a node number, a label, a reference point, a parent node number (the number of the node immediately above it), and a control type. FIG. 9 shows a conceptual diagram of the content configuration tree and an example of how the attribute information of each node is stored. FIG. 9 shows the upper part of the content configuration tree; the lower part is omitted. The content configuration tree 161 is stored in the work area 150, but may instead be stored in the magnetic disk device 101.
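
Because each node stores only its parent node number, a subtree such as an existing component subtree can be recovered by inverting that relation. The sketch below uses hypothetical field names for the five node attributes listed above and assumes the root node has no parent number.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    number: int
    label: Optional[str]
    reference_point: Optional[Tuple[int, int]]
    parent: Optional[int]          # None for the root (an assumption)
    control_type: Optional[str]    # None for group nodes (an assumption)


def subtree_numbers(nodes: Dict[int, Node], root: int) -> List[int]:
    """Return the node numbers of the subtree rooted at `root`, in preorder."""
    children: Dict[int, List[int]] = {}
    for node in nodes.values():
        if node.parent is not None:
            children.setdefault(node.parent, []).append(node.number)
    result: List[int] = []
    stack = [root]
    while stack:
        current = stack.pop()
        result.append(current)
        # Push children in reverse so the smallest number is visited first.
        stack.extend(sorted(children.get(current, []), reverse=True))
    return result
```

Storing the parent number in each node keeps the tree in a flat table, as in the attribute storage example of FIG. 9, at the cost of rebuilding the child lists when a subtree is needed.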

部品登録処理部１２６は、まず、各コンテンツファイル１６０にコンテンツIDを付与する。コンテンツIDは、主メモリ１０２または磁気ディスク装置１０１に格納されているルールに基づいて生成され付与される。 The component registration processing unit 126 first assigns a content ID to each content file 160. The content ID is generated and assigned based on the rules stored in the main memory 102 or the magnetic disk device 101.

次に、部品登録処理部１２６は、同コンテンツにおけるコンテンツファイル１６０とコンテンツ構成ツリー１６１を関連付けた上で、それぞれを磁気ディスク装置１０１へ登録する。 Next, the component registration processing unit 126 associates the content file 160 and the content configuration tree 161 in the same content, and registers them in the magnetic disk device 101.

図２は、グルーピング処理部１３０が行う処理の流れの一例を示す。 FIG. 2 shows an example of the flow of processing performed by the grouping processing unit 130.

The grouping processing unit 130 first repeats steps 301 to 302 for n = 1 to (N-1) over the N input controls contained in the input control list information 163 extracted by the in-content information acquisition processing unit 122 (step 300).

Next, based on the value M stored in the magnetic disk device 101 or the main memory 102, the grouping processing unit 130 repeats step 302 for m = 2 to (M-1) over the M hierarchy levels (step 301). Hierarchy level 1 represents the entire content; the larger m is, the finer the grouping; and at the lowest hierarchy level (that is, m = M-1) each group contains exactly one input control.

Next, for hierarchy level m, the grouping processing unit 130 determines, according to the grouping condition for that level, whether the nth and (n+1)th input controls belong to the same group, and stores the result in the work area 150 (step 302). If, at hierarchy level m, the nth and (n+1)th input controls are judged to be in the same group and the (n+1)th and (n+2)th input controls are also judged to be in the same group, then the nth through (n+2)th input controls are regarded as one group at hierarchy level m. In other words, when several input control pairs have been judged to be in the same group and one pair shares an input control with another pair, all the input controls in those pairs are regarded as belonging to the same group. The nth input control is the input control that comes nth in input order in the content (for example, n is the input control number 800 in FIG. 6).
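
The chaining of adjacent pairs into groups at one hierarchy level can be sketched as follows. The list `same_group` is a hypothetical representation of the stored results of step 302, with one entry per adjacent pair of input controls.

```python
def merge_adjacent(same_group):
    """Turn per-pair results into groups of 1-based input control numbers.

    same_group[i] is True when input controls i+1 and i+2 were judged to
    belong to the same group. At least one input control is assumed.
    """
    groups = [[1]]
    for i, linked in enumerate(same_group):
        nxt = i + 2
        if linked:
            groups[-1].append(nxt)   # extend the current chain
        else:
            groups.append([nxt])     # start a new group
    return groups
```

For five input controls with pair results `[True, True, False, True]`, controls 1 to 3 form one group and controls 4 and 5 another.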

最後に、グルーピング処理部１３０は、ステップ３０２によって階層レベル別にグルーピングされた結果をツリー構造で管理するために、コンテンツ構成ツリー１６１における各ノードに属性情報として親ノード番号を格納する（ステップ３０３）。なお、コンテンツ構成ツリー１６１は、各ノードの属性情報として、ノード番号、ラベル、コンテンツ内で該当する領域（入力コントロールが占める範囲）の基準点、親ノードのノード番号、および、入力コントロール種類（コントロール種類）を有するが、グルーピング処理部１３０の処理完了時点では、ラベルおよび基準点は格納されない。 Finally, the grouping processing unit 130 stores the parent node number as attribute information in each node in the content configuration tree 161 in order to manage the result of grouping by hierarchical level in step 302 in a tree structure (step 303). The content configuration tree 161 includes, as attribute information of each node, a node number, a label, a reference point of a corresponding area (a range occupied by the input control) in the content, a node number of the parent node, and an input control type (control). However, the label and the reference point are not stored when the processing of the grouping processing unit 130 is completed.

ここで、ステップ３０２で行なわれるｎ番目とｎ＋１番目の入力コントロール間のグルーピング方法について詳しく説明する。 Here, the grouping method between the nth and n + 1th input controls performed in step 302 will be described in detail.

ｎ番目とｎ＋１番目の入力コントロール間のグルーピング方法については、ｎ番目とｎ＋１番目の入力コントロール間の配置位置関係によって、以下の８つのケースで判定ルール（以下、グルーピング条件と呼ぶ）を決めて判定を行なう。なお、以下の説明で出てくる基準点は、入力コントロールの左上を表す。また、最右端、最左端、最上端および最下端は、基準点と領域サイズ８３０から定まる、入力コントロールの最も端の点を表す。 The grouping method between the nth and n + 1th input controls is determined by determining the determination rule (hereinafter referred to as a grouping condition) in the following eight cases depending on the arrangement positional relationship between the nth and n + 1th input controls. To do. The reference point that appears in the following description represents the upper left of the input control. Further, the rightmost end, the leftmost end, the uppermost end, and the lowermost end represent the endmost points of the input control determined from the reference point and the region size 830.

First, when the reference point of the (n+1)th input control lies directly to the right of the reference point of the nth input control (hereinafter, the first case), the nth and (n+1)th input controls are regarded as belonging to the same group at hierarchy level m if the horizontal distance between the rightmost end of the nth input control and the leftmost end of the (n+1)th input control satisfies the grouping condition for hierarchy level m in the first case.

ｎ番目の入力コントロールの基準点に対してｎ＋１番目の入力コントロールの基準点が真下に位置する場合（以下、第２のケースと呼ぶ）、ｎ番目の入力コントロールの最下端とｎ＋１番目の入力コントロールの最上端との垂直距離が、第２のケースにおける階層レベルｍでのグルーピング条件を満たすとき、ｎ番目とｎ＋１番目の入力コントロールは、階層レベルｍで同じグループに属するとみなされ。 When the reference point of the (n + 1) th input control is located directly below the reference point of the nth input control (hereinafter referred to as the second case), the lowest end of the nth input control and the (n + 1) th input control When the vertical distance from the topmost edge satisfies the grouping condition at hierarchical level m in the second case, the nth and n + 1th input controls are considered to belong to the same group at hierarchical level m.

ｎ番目の入力コントロールの基準点に対してｎ＋１番目の入力コントロールの基準点が右下に位置する場合（以下、第３のケースと呼ぶ）、ｎ番目の入力コントロールの最右端とｎ＋１番目の入力コントロールの最左端との水平距離が第３のケースにおける階層レベルｍでのグルーピング条件を満たし、かつｎ番目の入力コントロールの最下端とｎ＋１番目の入力コントロールの最上端との垂直距離が第３のケースにおける階層レベルｍでのグルーピング条件を満たすとき、ｎ番目とｎ＋１番目の入力コントロールは、階層レベルｍで同じグループに属するとみなされる。 When the reference point of the (n + 1) th input control is located in the lower right with respect to the reference point of the nth input control (hereinafter referred to as the third case), the rightmost end of the nth input control and the (n + 1) th input The horizontal distance from the leftmost edge of the control satisfies the grouping condition at the hierarchical level m in the third case, and the vertical distance between the lowest edge of the nth input control and the highest edge of the (n + 1) th input control is 3rd. When the grouping condition at hierarchical level m in the case is met, the nth and n + 1th input controls are considered to belong to the same group at hierarchical level m.

When the reference point of the (n+1)-th input control lies to the upper right of the reference point of the n-th input control (hereinafter, the fourth case), the n-th and (n+1)-th input controls are regarded as belonging to the same group at hierarchy level m if both the horizontal distance between the right edge of the n-th input control and the left edge of the (n+1)-th input control and the vertical distance between the top edge of the n-th input control and the bottom edge of the (n+1)-th input control satisfy the grouping conditions for the fourth case at hierarchy level m.

When the reference point of the (n+1)-th input control lies to the lower left of the reference point of the n-th input control (hereinafter, the fifth case), the n-th and (n+1)-th input controls are regarded as belonging to the same group at hierarchy level m if the vertical distance between the bottom edge of the n-th input control and the top edge of the (n+1)-th input control satisfies the grouping condition for the fifth case at hierarchy level m.

In the remaining cases, namely when the reference point of the (n+1)-th input control lies directly to the left of the reference point of the n-th input control (hereinafter, the sixth case), directly above it (hereinafter, the seventh case), or to its upper left (hereinafter, the eighth case), the n-th and (n+1)-th input controls are regarded as not belonging to the same group at any hierarchy level other than hierarchy level 1.
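The eight cases can be sketched as a small predicate. This is only an illustration under stated assumptions: the `Control` class, the function names, and the use of the top-left corner as the reference point are hypothetical, and a single horizontal limit (100 px) and vertical limit (20 px) stand in for the per-case, per-level grouping conditions, using the example values given elsewhere in this description.

```python
from dataclasses import dataclass


@dataclass
class Control:
    x: int       # left edge in px; (x, y) is assumed to be the top-left reference point
    y: int       # top edge in px (screen coordinates, y grows downward)
    width: int
    height: int


def relation(a: Control, b: Control) -> str:
    """Position of b's reference point relative to a's reference point."""
    dx, dy = b.x - a.x, b.y - a.y
    vert = "below" if dy > 0 else "above" if dy < 0 else ""
    horiz = "right" if dx > 0 else "left" if dx < 0 else ""
    return "-".join(part for part in (vert, horiz) if part) or "same"


def same_group(a: Control, b: Control, max_h: int = 100, max_v: int = 20) -> bool:
    """Apply the case-specific condition for one hierarchy level (assumed limits)."""
    rel = relation(a, b)
    h_gap = b.x - (a.x + a.width)        # right edge of a to left edge of b
    v_down = b.y - (a.y + a.height)      # bottom edge of a to top edge of b
    v_up = a.y - (b.y + b.height)        # top edge of a to bottom edge of b
    if rel == "right":                   # first case
        return h_gap < max_h
    if rel == "below":                   # second case
        return v_down < max_v
    if rel == "below-right":             # third case
        return h_gap < max_h and v_down < max_v
    if rel == "above-right":             # fourth case
        return h_gap < max_h and v_up < max_v
    if rel == "below-left":              # fifth case
        return v_down < max_v
    return False                         # sixth to eighth cases (left, above, above-left)


# Hypothetical coordinates chosen to give gaps of 52 px, 15 px and 108 px.
first = Control(315, 161, 73, 20)
second = Control(440, 161, 100, 20)
third = Control(315, 196, 73, 20)
fourth = Control(315, 324, 73, 20)
```

With these coordinates, the first and second controls fall under the first case, the second and third under the fifth case, and the third and fourth under the second case with a gap too large to group them.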

FIG. 7 shows an example of grouping condition list information 900, which stores grouping conditions 930. The grouping condition list information 900 is, for example, a table that holds, for each case, a case number 910, information indicating the position of the (n+1)-th input control relative to the n-th input control (hereinafter, "positional relationship information") 920, and a grouping condition 930. For the first case, for example, the case number 910 holds "1", the positional relationship information 920 holds "directly right", and the grouping condition 930 holds "the distance between the right edge of the n-th control and the left edge of the (n+1)-th control is less than 100 px".

The grouping condition 930 is set for each hierarchy level. Accordingly, when, for example, one piece of grouping condition list information 900 covers a single hierarchy level, separate grouping condition list information 900 is prepared for each hierarchy level. The grouping condition list information 900 is stored, for example, in the main memory 102 or the magnetic disk device 101.

The grouping condition 930 shown in FIG. 7 may, for example, use thresholds on the relative distance between the n-th and (n+1)-th input controls that were set on the basis of typical input forms (content). The thresholds are not limited to such fixed values, however. For example, the relative distances between the n-th and (n+1)-th input controls may actually be extracted from the content file 160, and the distance threshold in the grouping condition 930 may be set dynamically from their average. Specifically, for the second case, the grouping condition may be "the vertical distance between the bottom edge of the n-th control and the top edge of the (n+1)-th control is within [the average vertical spacing between adjacent input controls]". Such dynamic threshold setting (condition setting) makes it easier to set suitable grouping conditions and absorbs differences between individual developers in how they lay out content.
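A minimal sketch of such a dynamic threshold follows. It assumes each control is given as a hypothetical `(top, height)` pair in input order and that only pairs in which the next control lies below the previous one contribute to the average; neither detail is specified in the description.

```python
from statistics import mean


def average_vertical_gap(controls):
    """Mean gap between the bottom of one control and the top of the next.

    controls: list of (top, height) tuples in input order (assumed layout data).
    Returns None when no consecutive pair is stacked vertically.
    """
    gaps = []
    for (top0, height0), (top1, _height1) in zip(controls, controls[1:]):
        gap = top1 - (top0 + height0)
        if gap >= 0:                      # keep only pairs where the next control is below
            gaps.append(gap)
    return mean(gaps) if gaps else None


def within_dynamic_threshold(gap, controls):
    """Second-case style condition: gap is within the average vertical spacing."""
    limit = average_vertical_gap(controls)
    return limit is not None and gap <= limit


# Hypothetical layout with vertical gaps of 15 px, 15 px and 108 px.
layout = [(100, 20), (135, 20), (170, 20), (298, 20)]
```

For this layout the average spacing is 46 px, so a 15 px gap satisfies the condition while the 108 px gap marks a group boundary.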

In step 302, grouping is performed using the position information of the input controls. Alternatively, the font size information in the text list information 162 generated by the in-content information acquisition processing unit 122 may be used: the position of a text whose font size is at or above a certain threshold may be regarded as a group boundary between the n-th and (n+1)-th input controls at hierarchy level m.

Similarly, document structure information such as H1 in the text list information 162 generated by the in-content information acquisition processing unit 122 may be used to regard the position between the n-th and (n+1)-th input controls at hierarchy level m as a group boundary. Performing the grouping process with font size and document structure information in combination with position information improves the grouping accuracy.

FIG. 3 shows an example of the flow of processing performed by the label assignment processing unit 131.

The label assignment processing unit 131 first repeats steps 410 to 411 and step 415 for the M hierarchy levels of the content structure tree 161, from m = M down to m = 1 (step 400).

Next, the label assignment processing unit 131 determines whether the current hierarchy level is the lowest hierarchy level, that is, whether m = M (step 410).

If the determination in step 410 finds that the hierarchy level is the lowest hierarchy level (m = M), the label assignment processing unit 131 repeats steps 412 to 414 for the N input controls contained in the whole content, for n = 1 to N (step 411).

Next, the label assignment processing unit 131 sets the reference point of the n-th input control to the upper left of its area and stores it in the work area 150 (step 412).

Next, the label assignment processing unit 131 extracts, from the texts in the content stored in the text list information 162, the text closest to the reference point of the n-th input control in the upward and leftward directions (when the distances are equal, above takes priority over left) as the label of the n-th input control, and stores that label in the work area 150 as attribute information of the n-th node at hierarchy level M of the content structure tree 161 (step 413).

Next, the label assignment processing unit 131 sets the reference point of the n-th input control, now including its label, to the upper left of the label area, and stores that reference point in the work area 150 as attribute information of the n-th node at hierarchy level M of the content structure tree 161 (step 414). If no label could be extracted in step 413, the label assignment processing unit 131 uses the upper left of the input control set in step 412 as the reference point.

If the determination in step 410 finds that the hierarchy level is not the lowest hierarchy level, the label assignment processing unit 131 repeats steps 415 to 418 for the N groups contained in hierarchy level m, for n = 1 to N (step 415).

Next, among the elements (child nodes) contained in the n-th group, the label assignment processing unit 131 takes the reference point of the element whose reference point is furthest to the upper left (for example, when the distances are equal, above takes priority over left) as the reference point of the n-th group, and stores it in the work area 150 (step 416).

Next, for the reference point of the n-th group set in step 416, the label assignment processing unit 131 extracts, from the texts in the content stored in the text list information 162, the text that is closest in the upward and leftward directions (when the distances are equal, above takes priority over left) and has not yet been extracted as a label, as the label of the n-th group, and stores that label in the work area 150 as attribute information of the corresponding node in the content structure tree 161 (step 417).

If the text not yet extracted as a label would belong to a group other than the n-th group, or if no text remains that has not yet been extracted as a label, the label assignment processing unit 131 uses the label of the node at the reference point of the n-th group as the label of the n-th group.

Next, the label assignment processing unit 131 sets the reference position of the n-th group, now including its label, to the upper left of the label area, and stores that reference position in the work area 150 as attribute information of the corresponding node in the content structure tree 161 (step 418).

In this way, the label assignment processing unit 131 can assign labels sequentially in a bottom-up manner, from the leaves to the root of the node tree that represents the grouping result.
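The bottom-up flow of steps 400 to 418 can be sketched as follows. The `Node` class, the function names, and the sample coordinates are hypothetical. The sketch also makes simplifying assumptions: "closest" is the Euclidean distance between reference points, "upper left" candidates are texts that are neither to the right of nor below the reference point, ties prefer the text more directly above, and the check that a text belongs to another group is omitted.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    ref: tuple = None                 # (x, y) reference point
    label: str = None
    children: list = field(default_factory=list)


def nearest_text(ref, texts, used, unused_only):
    """Index of the closest text above/left of ref, or None if there is none."""
    best_key, best_idx = None, None
    for idx, (_text, tx, ty) in enumerate(texts):
        if tx > ref[0] or ty > ref[1] or (tx, ty) == ref:
            continue                  # not in the upward/leftward direction
        if unused_only and idx in used:
            continue                  # groups only take texts not yet used as labels
        key = (math.hypot(ref[0] - tx, ref[1] - ty), ref[0] - tx)
        if best_key is None or key < best_key:
            best_key, best_idx = key, idx
    return best_idx


def label_tree(root, texts):
    """Assign labels level by level, from the deepest level up to the root."""
    levels, frontier = [], [root]
    while frontier:
        levels.append(frontier)
        frontier = [child for node in frontier for child in node.children]
    used = set()
    for level in reversed(levels):    # m = M down to 1
        for node in level:
            if node.children:         # group: start from the upper-left child (step 416)
                anchor = min(node.children,
                             key=lambda c: (math.hypot(*c.ref), c.ref[1]))
                node.ref, node.label = anchor.ref, anchor.label
            idx = nearest_text(node.ref, texts, used, unused_only=bool(node.children))
            if idx is not None:       # steps 413/417, then 414/418
                text, tx, ty = texts[idx]
                node.label, node.ref = text, (tx, ty)
                used.add(idx)
    return root


# Hypothetical texts as (text, x, y) and two input controls forming one group.
texts = [("Applicant Information", 50, 100), ("Name", 50, 130), ("Address", 50, 160)]
group = label_tree(Node(children=[Node(ref=(200, 130)), Node(ref=(200, 160))]), texts)
```

When no unused text is available, the group keeps the label of its upper-left child, which corresponds to the fallback described for step 417.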

In step 413, the label of the n-th input control is taken to be the text, among the texts in the content stored in the text list information 162, that is closest to the reference point of the input control in the upward and leftward directions (when the distances are equal, above takes priority over left). For radio buttons and check boxes, however, the label may instead be the text closest to the reference point of the input control in the rightward and downward directions (when the distances are equal, right takes priority over below).

A group in which adjacent input controls are arranged with no space between them is regarded as a table structure of n rows by m columns, and the label assigned to each column of the first row is also assigned to the same column in the second and subsequent rows. For example, the label assigned to the input control in the first column of the first row is also used as the label of the input control in the first column of the second row.
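As a small illustration of this table handling, the sketch below copies the first-row labels down each column. The function name and the row-of-lists representation are assumptions made for the example, and the labels are invented.

```python
def propagate_header_labels(rows):
    """Give every row the column labels of the first row.

    rows: list of rows, each a list with one label per column
    (None where no label was found). Assumes a rectangular n x m grid.
    """
    if not rows:
        return []
    header = rows[0]
    return [list(header) for _ in rows]


# Hypothetical 3 x 2 grid: only the first row received labels.
grid = [["Item", "Quantity"], [None, None], [None, None]]
labelled = propagate_header_labels(grid)
```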

In the group label extraction method of this embodiment, the text closest to the reference position of the group (node) in the upward and leftward directions is extracted as the label from the texts in the content stored in the text list information 162. A distance threshold may additionally be set as an extraction condition. This reduces incorrect label assignment.

Instead of extracting the closest text above and to the left of the reference point of the group (node), the label may be chosen as a representative value from among the labels of the elements (child nodes) contained in the group. Alternatively, training data built from other content may be used, and the label assigned in the training data to a group similar to the group to be labeled may be adopted. In this way, a label can be assigned to each in-content element or group even when the content contains no text that corresponds to a label.

The above describes the processing performed by the component registration control processing unit 111.

The processing performed by the component search control processing unit 112 and the component registration control processing unit 111 is described below in more detail. The specific processing of the component search control processing unit 112 is described first.

FIG. 11 illustrates the specific processing of the input/output definition information analysis processing unit 141.

The input/output definition information analysis processing unit 141 extracts, from the input/output definition information 170 acquired by the input/output definition information acquisition processing unit 140, the subtree corresponding to the user-specified structure information 172 specified by the user, and thereby obtains a query tree 171. In the example of this figure, "Applicant Information" is specified as the user-specified structure information 172, and the "Applicant Information" subtree of the input/output definition information 170 is obtained as the query tree 171.
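The subtree extraction can be sketched as a depth-first search. The nested-dictionary representation, the `find_subtree` and `leaves` helpers, the root name, and the exact shape of the sample tree are assumptions for illustration; only the item names come from the examples in this description.

```python
def find_subtree(tree, name):
    """Return the first node named `name` (depth-first), or None if absent."""
    if tree["name"] == name:
        return tree
    for child in tree.get("children", []):
        hit = find_subtree(child, name)
        if hit is not None:
            return hit
    return None


def leaves(tree):
    """List the names of the leaf items below a node."""
    children = tree.get("children", [])
    if not children:
        return [tree["name"]]
    return [name for child in children for name in leaves(child)]


# Hypothetical input/output definition information as nested dictionaries.
io_definition = {
    "name": "Conference Form",
    "children": [
        {"name": "Applicant Information", "children": [
            {"name": "Name"}, {"name": "Address"},
            {"name": "Telephone Number"}, {"name": "Company Name"},
            {"name": "Date of Birth", "children": [
                {"name": "Year"}, {"name": "Month"}, {"name": "Day"}]},
        ]},
        {"name": "Service"},
    ],
}

query_tree = find_subtree(io_definition, "Applicant Information")
```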

FIG. 12 illustrates the specific processing of the component search processing unit 142.

The component search processing unit 142 calculates the similarity to one query tree 171 for every existing component subtree 174 contained in every content file 160 registered in the magnetic disk device 101. An existing component subtree 174 is a subset of a content structure tree 161 and, as the smallest unit, may be a single leaf of the content structure tree 161.

The example of FIG. 12 shows in detail the similarity calculation between the query tree 171 for "Applicant Information" and an existing component subtree 174 (the subtree at node number 2 of the content structure tree 1100 shown in FIG. 9) of the content structure tree 161 of the content 500 shown in FIG. 4 (the content represented by the content file 160).

First, the component search processing unit 142 obtains in-component information 177A (information containing label information 175A and input control information 174A) from the query tree 171 and, likewise, in-component information 177B (information containing label information 175B and input control information 174B) from the existing component subtree 174. In the example of this figure, seven label information elements (text information elements) with their input control type information elements are obtained from the query tree 171, and eight label information elements with their input control type information elements are obtained from the existing component subtree 174.

Next, to build the vector space and the vectors needed for the similarity calculation from the in-component information 177, the component search processing unit 142 generates, as vector information 182, label information 178, input control information 179, a query tree vector 180, and an existing component subtree vector 181. In the example of this figure, the dimensions of the vector space, that is, the label information 178 and the input control information 179, are generated from the in-component information 177A and 177B as (Name·text, Address·text, Telephone Number·text, Company Name·text, Year·select, Month·select, Day·select, First·text, Last·text, Email Address·text), where each entry has the form "label information element·input control type information element". In this vector space, the query tree vector 180 is (1, 1, 1, 1, 1, 1, 1, 0, 0, 0) and the existing component subtree vector 181 is (0, 1, 1, 0, 1, 1, 1, 1, 1, 1). Each vector element is "1" if the corresponding combination, for example the label "Name" with the input control type "text", is present, and "0" if it is absent.

Next, from the generated vector information 182, the component search processing unit 142 calculates the similarity of the existing component subtree 174 to the query tree 171 using (Equation 1), generates an evaluation result 183 containing content ID information 1831, node number information 1832, and similarity information 1833, and stores it in the work area 150.

In the example of this figure, the similarity of the existing component subtree 174 to the query tree 171 is calculated as "0.67", and the evaluation result 183 stores "0001" as the content ID information 1831, "2" as the node number information 1832, and "0.67" as the similarity information 1833.
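The example can be reproduced with a short sketch. (Equation 1) is not reproduced in this part of the description, so the sketch assumes it is the cosine similarity of the two binary vectors; this assumption is consistent with the value 0.67 reported above. The function names are hypothetical.

```python
import math


def build_space(query_pairs, part_pairs):
    """Ordered union of (label, control type) pairs: the vector-space dimensions."""
    dims = []
    for pair in list(query_pairs) + list(part_pairs):
        if pair not in dims:
            dims.append(pair)
    return dims


def to_vector(pairs, dims):
    """1 where the (label, control type) pair is present, 0 otherwise."""
    present = set(pairs)
    return [1 if dim in present else 0 for dim in dims]


def cosine(u, v):
    """Cosine similarity (assumed form of Equation 1)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


query_pairs = [("Name", "text"), ("Address", "text"), ("Telephone Number", "text"),
               ("Company Name", "text"), ("Year", "select"), ("Month", "select"),
               ("Day", "select")]
part_pairs = [("Address", "text"), ("Telephone Number", "text"), ("Year", "select"),
              ("Month", "select"), ("Day", "select"), ("First", "text"),
              ("Last", "text"), ("Email Address", "text")]

dims = build_space(query_pairs, part_pairs)
query_vec = to_vector(query_pairs, dims)     # [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
part_vec = to_vector(part_pairs, dims)       # [0, 1, 1, 0, 1, 1, 1, 1, 1, 1]
similarity = cosine(query_vec, part_vec)     # 5 / sqrt(7 * 8), about 0.668
```

Five dimensions are shared, the query vector has seven non-zero elements, and the subtree vector has eight, which gives 5 / sqrt(56) and rounds to 0.67.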

FIG. 13 illustrates the specific processing of the component candidate output processing unit 143.

The component candidate output processing unit 143 outputs a component candidate list display screen 187 from the similarity calculation results (evaluation results 183) generated by the component search processing unit 142. In the example of this figure, for the components contained in the content with content IDs 0001 to 0003, the component candidates for the query tree 171 are listed on the component candidate list display screen 187 in descending order of similarity. According to the list, the top-ranked component candidate is the component with component ID "0001_2", similarity "0.67", and label "Applicant Information".

In the example of this figure, a preview button for previewing a component is also displayed for each component on the component candidate list display screen 187. When the user presses the preview button of the top-ranked component candidate (component ID: 0001_2), the component candidate output processing unit 143 obtains the code corresponding to that component candidate from the content file of the content that contains it and, based on the obtained code, displays an image of the component on the preview screen 184.

In the example of this figure, an input/output definition information display screen 185 is also displayed, on which the part of the input/output definition information 170 that corresponds to the query tree 171 of the currently displayed component candidate list is shown shaded.

In the example of this figure, "Applicant Information", "Date of Birth", and "Service" are specified as multiple query trees 171, and the relationships between these query trees (for example, hierarchical relationships) are displayed on the query tree information display screen 186. The query tree information display screen 186 shows that there is a parent-child relationship in which the query tree "Applicant Information" is the parent and the query tree "Date of Birth" is the child. Among these three query trees 171, the component candidates for "Applicant Information", the query tree that has component candidates and is at the highest hierarchy level, are displayed preferentially on the component candidate list display screen 187.

In the example of this figure, the similarity threshold is set to "0.50", and the component candidate list display screen 187 shows the four components whose similarity is 0.50 or higher as component candidates. In other words, the component candidate output processing unit 143 does not display components whose similarity is below the threshold of 0.50. Instead of, or in addition to, the similarity threshold, only a predetermined number of components with the highest similarity may be displayed.
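The filtering and ordering described here can be sketched as follows. The function name and the dictionary keys are hypothetical, the component ID is assumed to be the content ID and node number joined by an underscore (as in "0001_2"), and all sample results other than the 0.67 entry are invented for illustration.

```python
def rank_candidates(results, threshold=0.50, top_k=None):
    """Keep results at or above the threshold, sorted by descending similarity.

    results: list of dicts with keys "content_id", "node", "similarity"
    (an assumed shape for the evaluation results).
    top_k: optional cap on the number of candidates returned.
    """
    kept = [r for r in results if r["similarity"] >= threshold]
    kept.sort(key=lambda r: r["similarity"], reverse=True)
    if top_k is not None:
        kept = kept[:top_k]
    return [(f'{r["content_id"]}_{r["node"]}', r["similarity"]) for r in kept]


# Only the first entry comes from the example in the text; the others are made up.
evaluation_results = [
    {"content_id": "0002", "node": 5, "similarity": 0.41},
    {"content_id": "0001", "node": 2, "similarity": 0.67},
    {"content_id": "0003", "node": 3, "similarity": 0.50},
]
candidates = rank_candidates(evaluation_results)
```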

The query tree information display screen 186 also shows the number of component candidates for each query tree 171. The screen 186 shows that there are four component candidates for the query tree "Applicant Information" and zero, that is, no component candidates, for the query tree "Service".

In FIG. 13, a single display screen is composed of the combination of the input/output definition information display screen 185, the preview screen 184, the query tree information display screen 186, and the component candidate list display screen 187; however, these screens may be separate from one another or may overlap.

The above is the specific processing of the component search control processing unit 112. From the component candidate list, the developer selects a desired component candidate and creates new content by using that candidate as is or by editing it. The developer can register a content file representing the new content in the magnetic disk device 101.

Next, the specific processing of the component registration control processing unit 111 is described.

Specific processing examples of the content acquisition processing unit 121 and the in-content information acquisition processing unit 122 are described with reference to FIGS. 4, 5, and 6.

FIG. 4 shows, as an example of the content 500, a conference application form that is HTML content accessed from a client terminal.

First, from the HTML source (content file 160) of the content 500 acquired by the content acquisition processing unit 121, the in-content information acquisition processing unit 122 extracts the text information elements and the layout information of those texts (for example, reference position information and font size), and records a text number 700, text information 710, reference position information 720, and font size 730 in the text list information 162. FIG. 5 shows a specific example of the text list information 162. In the example of FIG. 5, a total of 13 text information elements are recorded in the text list information 162 (for example, "○○ Conference Form" is recorded as the first text information element 710, "X: 177 px, Y: 35 px" as the reference position information 720, and "+4" as the font size 730).

Next, from the HTML source of the content 500 acquired by the content acquisition processing unit 121, the in-content information acquisition processing unit 122 extracts the INPUT and SELECT tags (information elements representing the type of input control) and the layout information of those input controls (for example, reference position information and area size), and stores an input control number 800, an input control type information element 810, reference position information 820, and an area size 830 in the input control list information 163. FIG. 6 shows a specific example of the input control list information 163. In the example of FIG. 6, a total of 15 input control type information elements are recorded in the input control list information 163 (for example, <INPUT type="text"> is recorded as the first input control type information element 810, "X: 315 px, Y: 161 px" as the reference position information 820, and "width: 73 px, height: 20 px" as the area size 830).
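The tag extraction part of this step can be sketched with Python's standard `html.parser` module. The class and function names and the sample markup are hypothetical. The sketch records only the control number and type; positions and sizes depend on how the page is rendered and cannot be derived from the HTML source alone, so they are left out.

```python
from html.parser import HTMLParser


class ControlCollector(HTMLParser):
    """Collect INPUT and SELECT tags in document order."""

    def __init__(self):
        super().__init__()
        self.controls = []            # (number, tag, type) tuples

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag == "input":
            control_type = dict(attrs).get("type", "text")
            self.controls.append((len(self.controls) + 1, "input", control_type))
        elif tag == "select":
            self.controls.append((len(self.controls) + 1, "select", None))


def extract_controls(html):
    collector = ControlCollector()
    collector.feed(html)
    return collector.controls


# Hypothetical fragment of a form.
sample = ('<form>Name <INPUT type="text" name="n">'
          '<SELECT name="y"><OPTION>2007</OPTION></SELECT>'
          '<input type="radio" name="s"></form>')
controls = extract_controls(sample)
```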

The specific processing flow of the content structure tree generation processing unit 123 when it generates a content structure tree with five hierarchy levels for the content 500 shown in FIG. 4 is described with reference to FIGS. 5 to 9.

The grouping processing unit 130 reads the input control list information 163 shown in FIG. 6 and, for each hierarchy level (2 to M−1), determines whether input controls that are adjacent in input order belong to the same group, based on the grouping condition list information 900 stored in the main memory 102 or the magnetic disk device 101.

FIG. 7 is an example of the grouping condition list information 900 for hierarchy level 2. In the example of FIGS. 5 and 6, the grouping process at hierarchy level 2 proceeds as follows. First, according to the input control list information 163 shown in FIG. 6, the second input control is positioned "directly right" of the first input control, so the first case in the grouping condition list information 900 shown in FIG. 7 applies.

From the reference position information and area sizes in the input control list information 163, the horizontal distance between the right edge of the first input control and the left edge of the second input control is 52 px. The positional relationship between the first and second input controls therefore satisfies the condition for the first case in the grouping condition list information 900, "the horizontal distance between the right edge of the n-th input control and the left edge of the (n+1)-th input control is less than 100 px". As a result, the first and second input controls are regarded as belonging to the same group at hierarchy level 2.

Next, according to the input control list information 163, the third input control lies to the "lower left" of the second input control, so the fifth case in the grouping condition list 900 applies. From the reference position information and area sizes in the input control list information 163, the vertical distance between the bottom edge of the second input control and the top edge of the third input control is 15 px. The second and third input controls therefore satisfy the condition of the fifth case: "the vertical distance between the bottom edge of the nth input control and the top edge of the (n+1)th input control is less than 20 px". As a result, the second and third input controls are regarded as belonging to the same group at hierarchical level 2.

Likewise, according to the input control list information 163, the tenth input control lies to the "lower left" of the ninth input control, so the fifth case in the grouping condition list 900 applies. From the reference point and area size information in the input control list information 163, the vertical distance between the bottom edge of the ninth input control and the top edge of the tenth input control is 108 px. The ninth and tenth input controls therefore do not satisfy the condition of the fifth case: "the vertical distance between the bottom edge of the nth input control and the top edge of the (n+1)th input control is less than 20 px". As a result, the ninth and tenth input controls are regarded as belonging to different groups at hierarchical level 2; that is, a group boundary lies between them.

Consequently, the first through ninth input controls are regarded as one group at hierarchical level 2, and the tenth input control becomes the head of the next group.
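The adjacency test described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Control` class, the function names, and the sample coordinates are all hypothetical (the coordinates were chosen only so that the gaps come out to the 52 px, 15 px, and 108 px quoted in the example), and only the first and fifth cases of the grouping condition list 900 are modeled because the other cases of FIG. 7 are not reproduced in this passage.

```python
from dataclasses import dataclass


@dataclass
class Control:
    order: int   # position in input order (1-based)
    x: int       # left edge of the control area, px
    y: int       # top edge of the control area, px
    width: int
    height: int


def same_group(prev: Control, cur: Control) -> bool:
    """Return True if two controls adjacent in input order share a group."""
    prev_right = prev.x + prev.width
    prev_bottom = prev.y + prev.height
    rows_overlap = cur.y < prev_bottom and cur.y + cur.height > prev.y
    # Case 1 ("directly right"): horizontal gap must be under 100 px.
    if cur.x >= prev_right and rows_overlap:
        return cur.x - prev_right < 100
    # Case 5 ("lower left"): vertical gap must be under 20 px.
    if cur.y >= prev_bottom and cur.x < prev.x:
        return cur.y - prev_bottom < 20
    # Remaining cases of the condition list are not modeled in this sketch.
    return False


def group_controls(controls: list) -> list:
    """Split controls (already sorted by input order) into groups."""
    if not controls:
        return []
    groups = [[controls[0].order]]
    for prev, cur in zip(controls, controls[1:]):
        if same_group(prev, cur):
            groups[-1].append(cur.order)
        else:
            groups.append([cur.order])   # a group boundary lies before cur
    return groups


# Hypothetical layout: gaps of 52 px (right), 15 px (below), 108 px (below).
CONTROLS = [
    Control(1, 285, 159, 133, 22),
    Control(2, 470, 159, 133, 22),
    Control(3, 285, 196, 300, 22),
    Control(4, 200, 326, 133, 22),
]
print(group_controls(CONTROLS))
```

Because each decision looks only at a pair of neighbors in input order, a single pass over the list is enough to find every group boundary at a given hierarchical level.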

In this way, the grouping determination is made at each hierarchical level for every pair of input controls adjacent in input order, and groups are generated. The grouping result is managed in a tree structure such as that shown in FIG. 8, in which each node has a node number, label, reference point, parent node number, and control type as attribute information. This yields the content structure tree 1000 as it stands when the grouping processing unit 130 finishes its processing.

In FIG. 8, the first through ninth input controls are the nodes with node numbers 5 to 9 and 13 to 17, and the group that bundles them is a branch node with node number 2. The three branch nodes with node numbers 2 to 4 are in turn bundled under the root node (node number 1) at hierarchical level 1, which represents the entire content. The data structure of the content structure tree can be formed, for example, by having each node hold a pointer to the data storage area of the node one level above it, proceeding from leaf to root. The tree can also be formed by having each node hold pointers to the data storage areas of the nodes one level below it, proceeding from root to leaf. A node may also hold pointers in both directions.
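As a rough illustration of a node that holds pointers in both directions, the following sketch uses a hypothetical `Node` class whose fields mirror the attributes named above (node number, label, reference point, parent, control type). The class and helper names are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    number: int
    control_type: Optional[str] = None   # None for root and branch nodes
    label: Optional[str] = None          # filled in later by label assignment
    ref_point: Optional[tuple] = None    # filled in later by label assignment
    parent: Optional["Node"] = field(default=None, repr=False)   # leaf-to-root
    children: list = field(default_factory=list, repr=False)     # root-to-leaf

    def add_child(self, child: "Node") -> "Node":
        """Link child under this node in both directions."""
        child.parent = self
        self.children.append(child)
        return child


def path_to_root(node: Node) -> list:
    """Follow parent pointers and return the node numbers visited."""
    numbers = []
    while node is not None:
        numbers.append(node.number)
        node = node.parent
    return numbers


root = Node(1)                      # entire content, hierarchical level 1
branch = root.add_child(Node(2))    # a group at hierarchical level 2
leaf = branch.add_child(Node(5, control_type="text"))
print(path_to_root(leaf))
```

Keeping only parent pointers is enough to walk from a control up to its enclosing groups; child pointers are what make it convenient to cut a subtree out as a part.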

Note that, as shown in FIG. 8, the label and reference point attributes of each node of the content structure tree 1000 are not yet stored when the grouping processing unit 130 completes its execution.

In this embodiment, only input controls are grouped. For Web content in general, however, text, images, and the like may also be subject to the grouping process in addition to input controls.

Also in this embodiment, the content structure tree generation processing unit 123 starts the grouping processing unit 130, and the grouping processing unit 130 generates both the groups and the content structure tree 1000. Alternatively, the grouping processing unit 130 may generate only the groups, and the content structure tree generation processing unit 123 may generate the content structure tree 1000.

The label assignment processing unit 131 extracts a label and sets a reference point for each node of the content structure tree 1000 shown in FIG. 8, working from hierarchical level M (the lowest level) up to hierarchical level 1.

For a leaf node at hierarchical level M (the lowest level), the label assignment processing unit 131 first takes the upper-left corner of each input control's area as its reference position. Using the text list information 162 and the input control list information 163, it then extracts, as the label of the nth input control, the text positioned closest to that control's reference point in the upward or leftward direction (when the distances are equal, above takes priority over left). Finally, it resets the reference point of the nth input control so that it covers the label as well.

In the example of FIGS. 5 and 6, the reference point of the third input control in the input control list information 163 is "X: 285 px, Y: 196 px", and the closest text is the sixth text in the text list information 162, whose reference point is "X: 100 px, Y: 192 px". The sixth text, "Address", is therefore extracted as the label of the third input control, and the result is stored as attribute information in the node with node number 6 of the content structure tree 1100 (see FIG. 9), which is the tree after the label assignment processing unit 131 has run. In addition, the reference position of the third input control (node number 6), now including its label, is reset to the reference position of the sixth text, "X: 100 px, Y: 192 px", and is likewise stored as attribute information in the node with node number 6 of the content structure tree 1100.

The same processing is performed for every input control contained in the content 500, so that the label name and reference position of each node at hierarchical level M (the lowest level) are stored as attribute information of the corresponding node of the content structure tree 1100.
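The leaf-level label search can be sketched as below. The passage does not define how "closest" is measured, so this example assumes Euclidean distance between reference points and treats a text as "left" when its horizontal offset exceeds its vertical offset; both are assumptions, as is the function name `nearest_label`. The coordinates for "Applicant Information", "First", and "Address" follow the example in the text, while "Phone" is a made-up entry below the control.

```python
import math


def nearest_label(control_ref, texts):
    """Pick the closest text located above or to the left of a control.

    control_ref: (x, y) reference point of the input control.
    texts: iterable of (text, (x, y)) pairs from the text list.
    Returns (text, (x, y)) or None if no text qualifies.
    """
    cx, cy = control_ref
    best = None
    for text, (tx, ty) in texts:
        dx, dy = cx - tx, cy - ty
        if dx < 0 or dy < 0:
            continue                      # text is right of or below the control
        distance = math.hypot(dx, dy)     # assumed distance measure
        is_left = dx > dy                 # assumed rule for "left" versus "above"
        key = (distance, is_left)         # on equal distance, "above" sorts first
        if best is None or key < best[0]:
            best = (key, text, (tx, ty))
    return None if best is None else (best[1], best[2])


TEXTS = [
    ("Applicant Information", (90, 118)),
    ("First", (100, 158)),
    ("Address", (100, 192)),
    ("Phone", (100, 230)),                # hypothetical text below the control
]
label, new_ref = nearest_label((285, 196), TEXTS)
print(label, new_ref)
```

The returned position is what the node's reference point would be reset to, so that later processing at the parent level measures distances from the label rather than from the control itself.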

In this specific example, the label of the nth input control is the text, among the texts in the content stored in the text list information 162, that is closest to the input control's reference point in the upward or leftward direction (when the distances are equal, above takes priority over left). For radio buttons and check boxes, however, the label may instead be the text closest to the input control's reference point in the rightward or downward direction (when the distances are equal, right takes priority over below).

Next, for a branch node at a hierarchical level other than level M (the lowest level), the label assignment processing unit 131 uses the text list information 162 and the input control list information 163 to extract, as the label of node number n, the text that is positioned closest to the reference point of node number n in the upward or leftward direction (when the distances are equal, above takes priority over left) and that has not yet been extracted as a label. It then resets the reference point of node number n so that it covers the label as well.

In the example of FIGS. 5 and 6, consider node number 2 of the content structure tree 1000 shown in FIG. 8. Among the elements (child nodes) belonging to node number 2, node number 5 has the upper-leftmost reference position (when the distances are equal, above takes priority over left), so its reference point, "X: 100 px, Y: 158 px", is set as the reference position of node number 2. The text closest to this reference point is the fourth text in the text list information 162, "First", but it has already been used as the label of node number 5. The second text in the text list information 162, whose reference position is "X: 90 px, Y: 118 px", is the closest text that has not yet been extracted as a label. The second text, "Applicant Information", is therefore extracted as the label of node number 2, and the result is stored as attribute information in the node with node number 2 of the content structure tree 1100. In addition, the reference position of node number 2, now including its label, is reset to the reference position of the label (the second text), "X: 90 px, Y: 118 px", and is stored as attribute information in the node with node number 2 of the content structure tree 1100 shown in FIG. 9.

The same processing is performed for every node at a hierarchical level other than level M (the lowest level), so that the label and reference position of each such node are stored as attribute information of the corresponding node of the content structure tree 1100 (as noted earlier, part of the content structure tree is omitted from FIG. 9).

In this specific example, the grouping processing unit 130 groups input controls using the reference position information of each input control. Alternatively, using the font size information in the text list information 162 shown in FIG. 5, the position of any text whose font size is at or above a certain threshold may be regarded as a group boundary between input controls at hierarchical level m. For example, suppose a grouping condition specifies that the position immediately before any text with a font size of "+3" or larger is a group boundary at hierarchical level 2. In the example of FIGS. 5 and 6, the second text "Applicant Information", the thirteenth text "Elective Conference", and the twenty-sixth text "Payment" in the text list information 162 have a font size of "+3". The gaps between the first and second, the ninth and tenth, and the twelfth and thirteenth input controls in the input control list information 163 are therefore regarded as group boundaries at hierarchical level 2, and input controls 1 to 9, 10 to 12, and 13 to 15 are each regarded as one group at hierarchical level 2.

This concludes the specific processing flow of the part registration control processing unit 111.

According to the first embodiment described above, content is divided into a plurality of parts, a plurality of existing part subtrees representing those parts are stored in the repository (for example, the magnetic disk device 101), and each existing part subtree is also a search target. Compared with searching only for content files themselves, this makes it possible to find parts that are closer to the content the user is trying to create, which reduces the burden on the user.

Furthermore, according to the first embodiment described above, the query tree and the existing part subtrees contain input control type information elements in addition to text information elements, and the similarity between the query tree and an existing part subtree is calculated from both kinds of element. Compared with a search method in which only text is entered as the search key, this makes it possible to find parts that are closer to the content the user is trying to create, which reduces the burden on the user.

Furthermore, according to the first embodiment described above, existing part subtrees representing content parts can be accumulated in the repository simply by registering content files in it, and a developer can search for a desired content part without meta information having to be attached to the accumulated existing part subtrees manually in advance.

Furthermore, according to the first embodiment described above, the design of content (for example, input forms) can be unified both within a single computer system and across different computer systems (a computer system being, for example, an electronic application system).

<Second embodiment>

Next, a second embodiment of the present invention will be described. The description below focuses on the differences from the first embodiment; points in common with the first embodiment are omitted or described only briefly.

In the second embodiment, all existing part subtrees registered in the repository are grouped by function and by style before any part search takes place (specifically, at part registration time), and the part candidate list presents information about part candidates by function and by style. As a result, information about a plurality of part candidates can be displayed in units of functions and in units of styles.

The second embodiment differs from the first in two respects, as shown in FIG. 14. First, a function-specific grouping processing unit 127 and a style-specific grouping processing unit 128 are added to the part registration processing unit 126. Second, the processing of the part search processing unit 142 and of the part candidate output processing unit 143 is partly changed. The modified part registration processing unit 126, part search processing unit 142, and part candidate output processing unit 143 are referred to as the part registration processing unit 126c, the part group search processing unit 142c, and the part candidate output processing unit 143c, respectively.

The processing performed in the second embodiment is described below, focusing on the portions that differ from the processing performed in the first embodiment.

FIG. 15 shows an example of the flow of processing performed by the function-specific grouping processing unit 127.

First, the function-specific grouping processing unit 127 repeats step 2511 and steps 2517 to 2519 for every content file 160 that is registered in the magnetic disk device 101 or the work area 150 and on which this processing has not yet been executed (step 2510).

Next, the function-specific grouping processing unit 127 repeats steps 2512 to 2513 for every subtree (existing part subtree 174) contained in the content structure tree 161 corresponding to the content file 160 (step 2511).

Next, the function-specific grouping processing unit 127 acquires the in-part information 177B of the existing part subtree (information including label information 175B and input control information 176B) and stores it in the work area 150 (step 2512).

Next, the function-specific grouping processing unit 127 repeats steps 2514 to 2516 for every function group 188 (step 2513). A function group is a collection of existing part subtrees corresponding to parts that have the same or similar functions.

Next, the function-specific grouping processing unit 127 acquires the in-part information 177C of the function group 188 (information including label information 175C and input control information 176C; see FIG. 19) and stores it in the work area 150 (step 2514).

Next, from the in-part information 177B of the existing part subtree 174 and the in-part information 177C of the function group 188, the function-specific grouping processing unit 127 generates vector information 182, a table that represents the vector space and the vectors used for similarity calculation, and stores it in the work area 150 (step 2515).

Next, the function-specific grouping processing unit 127 uses the vector information 182 to calculate the similarity of the function group to the existing part subtree. The similarity is calculated from the vector information 182 as the cosine measure of the vector space method (Equation 2). The per-function-group similarity calculation result 189 illustrated in FIG. 18 (information including function group ID information 1891 and similarity information 1892) is then stored in the work area 150 (step 2516).
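A cosine measure over label and control-type features might look like the sketch below. Equation 2 is not reproduced in this passage, so the standard cosine formula is assumed; the feature encoding (`label:` and `type:` prefixes with occurrence counts) and the function names are illustrative choices, not the layout of the vector information 182.

```python
import math
from collections import Counter


def to_vector(labels, control_types):
    """Build a sparse count vector from labels and input control types."""
    features = [f"label:{name}" for name in labels]
    features += [f"type:{kind}" for kind in control_types]
    return Counter(features)


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Standard cosine measure between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0                        # an empty vector is similar to nothing
    return dot / (norm_a * norm_b)


subtree = to_vector(["First", "Last"], ["text", "text"])
group = to_vector(["First", "Last", "Middle"], ["text", "text", "text"])
print(round(cosine_similarity(subtree, group), 4))
```

Including the control types in the vector is what lets two parts with the same labels but different widgets (say, a text box versus a drop-down) score below 1.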

Next, the function-specific grouping processing unit 127 determines whether any function group 188 has a similarity to the existing part subtree 174 that is at or above a threshold (step 2517).

If step 2517 finds a function group 188 with a similarity at or above the threshold, the function-specific grouping processing unit 127 makes the existing part subtree 174 a member of the function group 188 with the highest similarity and stores the function group ID in the node 195 (see FIG. 22) of the content structure tree 161 that corresponds to the existing part subtree 174 (step 2518).

If step 2517 finds no function group 188 with a similarity at or above the threshold, the function-specific grouping processing unit 127 creates a new function group with the existing part subtree as a member, assigns it a function group ID (step 2519), and registers the existing part subtree in the magnetic disk device 101 or the work area 150. The function group ID can be generated and assigned according to rules stored in the main memory 102 or the magnetic disk device 101. For each function group, the existing part subtree most recently made a member can serve as the representative existing part subtree. Which subtree is the representative can be identified, for example, by including a code indicating representative status (hereinafter, a representative code) in the root node of that existing part subtree. When the representative changes, the representative code can be deleted from the root node of the previous representative.
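The decision in steps 2517 to 2519 can be sketched as follows. To keep the block self-contained, a Jaccard overlap over feature sets stands in for the cosine measure, and the threshold value, the `F<n>` ID rule, and the function names are all hypothetical. The sketch also adopts the option mentioned above of treating the most recently added member as the group's representative.

```python
def jaccard(a: set, b: set) -> float:
    """Stand-in similarity for this sketch (the text uses a cosine measure)."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def assign_function_group(features: set, groups: dict,
                          threshold: float = 0.8, similarity=jaccard) -> str:
    """Place a subtree in the best-matching function group or open a new one.

    groups maps a function group ID to the features of its representative.
    """
    best_id, best_sim = None, 0.0
    for group_id, representative in groups.items():
        sim = similarity(features, representative)
        if best_id is None or sim > best_sim:
            best_id, best_sim = group_id, sim
    if best_id is not None and best_sim >= threshold:
        groups[best_id] = features        # newest member becomes representative
        return best_id
    new_id = f"F{len(groups) + 1}"        # hypothetical ID rule
    groups[new_id] = features
    return new_id


groups = {}
first = assign_function_group({"label:First", "type:text"}, groups)
second = assign_function_group({"label:First", "type:text"}, groups)
third = assign_function_group({"label:Card number", "type:text"}, groups)
print(first, second, third)
```

Comparing against one representative per group, rather than every member, keeps the cost of registering a new subtree proportional to the number of groups.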

FIG. 16 shows an example of the flow of processing performed by the style-specific grouping processing unit 128.

First, the style-specific grouping processing unit 128 repeats step 2611 for every function group 188 that is registered in the magnetic disk device 101 or the work area 150 and on which this processing has not yet been executed (step 2610).

Next, the style-specific grouping processing unit 128 repeats steps 2612 to 2613 for every existing part subtree contained in the function group (step 2611).

Next, the style-specific grouping processing unit 128 acquires, from the source of the content file 160 corresponding to the existing part subtree, the style information of the part corresponding to that subtree (step 2612), and stores it in the work area 150. When the parts being compared for similarity contain two or more texts, the style information 193 illustrated in FIG. 20 is acquired; it holds the two or more texts and their order of appearance in the content (in FIG. 20, the leftmost is first). When the parts being compared contain one text, the style information 194 illustrated in FIG. 21 is acquired; it includes label information 1941, type information 1942, size information 1943, font-size information 1944, color information 1945, and maxlength information 1946. The style information 193 and 194 can be built from, for example, attribute information in the HTML source (content file), but may also be built from other information elements that can be extracted from the source. The information elements in the style information 194 may be anything that represents an attribute of a single text. Even when the parts being compared contain two or more texts, style information 194 may be created for each text and the pieces of style information 194 compared with one another.

Next, the style-specific grouping processing unit 128 repeats steps 2614 to 2617 for every style group 190 (step 2613). A style group is a collection of existing part subtrees whose on-screen styles are the same or similar.

Next, the style-specific grouping processing unit 128 acquires the style information 193 or 194 from the style group and stores it in the work area 150 (step 2614).

Next, the style-specific grouping processing unit 128 compares the style information 193 or 194 of the existing part subtree with the style information 193 or 194 of the style group 190 and determines whether the two match exactly (step 2615).

If step 2615 finds that the two pieces of style information match exactly, the style-specific grouping processing unit 128 makes the part a member of the style group 190 and stores the style group ID in the node 195 (see FIG. 22) of the content structure tree 161 that corresponds to the existing part subtree 174 (step 2616).

If step 2615 finds that the two pieces of style information do not match exactly, the style-specific grouping processing unit 128 creates a new style group with the existing part subtree as a member, assigns it a style group ID (step 2617), and registers it in the magnetic disk device 101 or the work area 150. The style group ID can be generated and assigned according to rules stored in the main memory 102 or the magnetic disk device 101. For each style group, the existing part subtree most recently made a member can serve as the representative existing part subtree.

In the similarity determination of step 2615, the style information of the existing part subtree and that of the style group 190 are treated as similar only when they match exactly. More detailed conditions may be set, however, so that small differences are tolerated and the two are still treated as similar. Specifically, in a comparison between pieces of style information 193, the existing part subtree under comparison may be made a member of the style group under comparison if the texts match exactly even though their order differs, or if the order matches exactly even though a predetermined number of texts differ. Likewise, in a comparison between pieces of style information 194, the existing part subtree under comparison may be made a member of the style group under comparison if at least a predetermined proportion of the information elements match.
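The exact-match rule and its relaxed variant for style information 194 can be sketched with a single comparison function. The dictionary keys mirror the information elements named for FIG. 21, but the values, the `min_ratio` parameter, and the function name are assumptions made for this example.

```python
def styles_match(a: dict, b: dict, min_ratio: float = 1.0) -> bool:
    """Compare two per-text style records.

    With min_ratio=1.0 every attribute must be identical (exact match).
    A lower min_ratio accepts records in which at least that proportion
    of attributes agree.
    """
    keys = a.keys() | b.keys()
    if not keys:
        return True                       # two empty records are identical
    same = sum(1 for key in keys if a.get(key) == b.get(key))
    return same / len(keys) >= min_ratio


style_a = {"label": "First", "type": "text", "size": "20",
           "font-size": "12px", "color": "#000000", "maxlength": "40"}
style_b = dict(style_a, color="#333333")   # differs only in color

print(styles_match(style_a, style_b))        # exact match required
print(styles_match(style_a, style_b, 0.8))   # 5 of 6 attributes agree
```

Exposing the tolerance as one ratio keeps the default behavior identical to the exact-match rule while allowing the relaxed grouping to be tuned per repository.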

The processing results of the function-specific grouping processing unit 127 and the style-specific grouping processing unit 128 may take the form of the existing part grouping information 196 illustrated in FIG. 22, which contains function group ID information 1961, style group ID information 1962, part ID information 1963 (the IDs of the parts corresponding to the existing part subtrees in the style group), and occurrence count information 1964 (the number of existing part subtrees in the style group). The existing part grouping information 196 may be stored in the magnetic disk device 101 or the work area 150.

FIG. 17 shows an example of the flow of processing performed by the part group search processing unit 142c.

First, the part group search processing unit 142c repeats steps 2710 to 2711 for every query tree 171 acquired by the input/output definition information analysis processing unit 141 (step 2700).

次に、部品グループ検索処理部１４２ｃは、該クエリツリー１７１の部品内情報１７７Ａを取得し、取得された部品内情報１７７Ａをワークエリア１５０に格納する（ステップ２７１０）。 Next, the part group search processing unit 142c acquires the in-part information 177A of the query tree 171 and stores the acquired in-part information 177A in the work area 150 (step 2710).

次に、部品グループ検索処理部１４２ｃは、磁気ディスク装置１０１又はワークエリア１５０に登録されているすべての機能グループ１８８に対して、ステップ２７１２〜２７１３を繰り返す。 Next, the parts group search processing unit 142 c repeats steps 2712 to 2713 for all the function groups 188 registered in the magnetic disk device 101 or the work area 150.

次に、部品グループ検索処理部１４２ｃは、該機能グループ１８８の部品内情報１７７Ｃを取得し、取得された部品内情報１７７Ｃをワークエリア１５０に格納する（ステップ２７１２）。 Next, the part group search processing unit 142c acquires the in-part information 177C of the function group 188, and stores the acquired in-part information 177C in the work area 150 (step 2712).

最後に、部品グループ検索処理部１４２ｃは、該クエリツリー１７１に対する該機能グループ１８８の類似度を算出し、図２３に例示する機能グループ別類似度算出結果１８９（機能グループＩＤ情報１８９１と類似度情報１８９２を含んだ情報）をワークエリア１５０に格納する（ステップ２７１３）。このステップ２７１３では、機能別グルーピング処理部１２７の処理流れにおけるステップ２５１６と実質的に同じ処理を行う（例えば、（式２）で類似度を算出する）ことができる。 Finally, the part group search processing unit 142c calculates the similarity of the function group 188 to the query tree 171, and stores the function-group similarity calculation result 189 illustrated in FIG. 23 (information including function group ID information 1891 and similarity information 1892) in the work area 150 (step 2713). Step 2713 can perform substantially the same processing as step 2516 in the processing flow of the function-specific grouping processing unit 127 (for example, calculating the similarity with (Equation 2)).

部品候補出力処理部１４３ｃは、部品グループ検索処理部１４２ｃで生成された機能グループ別類似度算出処理結果１８９と、機能別グルーピング処理部１２７とスタイル別グルーピング処理部１２８によって生成された既存部品グルーピング情報１９６とを用いて、クエリツリー１７１に対する機能グループ１８８に関する情報および機能グループ１８８内におけるスタイルグループに関する情報の一覧（以下、部品グループ候補一覧）を、部品グループ候補一覧表示画面１９７に表示することができる。この画面１９７は、例えば、第一の実施形態における部品候補一覧表示画面１８７に代わりに用意された画面である。 The component candidate output processing unit 143c includes the function group similarity calculation processing result 189 generated by the component group search processing unit 142c, and the existing component grouping information generated by the function grouping processing unit 127 and the style grouping processing unit 128. 196 can be used to display a list of information related to the function group 188 for the query tree 171 and information related to the style groups in the function group 188 (hereinafter, a part group candidate list) on the part group candidate list display screen 197. . This screen 197 is, for example, a screen prepared instead of the component candidate list display screen 187 in the first embodiment.

部品グループ候補一覧には、機能グループのＩＤ、クエリツリー１７１に対する機能グループ１８８の類似度、該機能グループに含まれるサブ機能グループのＩＤ、該機能グループ１８８に含まれるスタイルグループのＩＤおよび該スタイルグループに含まれる部品数が含まれる。また、各スタイルグループの代表部品をプレビューするためのプレビューボタンが表示される。なお、該機能グループに含まれるサブ機能グループのＩＤは、該機能グループに含まれる既存部品サブツリー１７４内のノード１９５に記録されている機能グループＩＤである。すなわち、第一の機能グループに属する既存部品サブツリー内のノードに、第一の機能グループとは別の第二の機能グループのＩＤが記録されていれば、その第二の機能グループが、第一の機能グループのサブ機能グループとなる。 The part group candidate list includes the function group ID, the similarity of the function group 188 to the query tree 171, the ID of the sub-function group included in the function group, the ID of the style group included in the function group 188, and the style group. The number of parts included in is included. In addition, a preview button for previewing a representative part of each style group is displayed. The ID of the sub function group included in the function group is the function group ID recorded in the node 195 in the existing part subtree 174 included in the function group. That is, if the ID of the second function group different from the first function group is recorded in the node in the existing component subtree belonging to the first function group, the second function group is This is a sub-function group of the function group.

以下、部品登録処理部１２６ｃの具体的な処理を、図１８〜図２２を用いて説明する。 Hereinafter, specific processing of the component registration processing unit 126c will be described with reference to FIGS.

図１８は、機能別グルーピング処理部１２７の具体的な処理の説明図である。 FIG. 18 is an explanatory diagram of specific processing of the functional grouping processing unit 127.

機能別グルーピング処理部１２７は、磁気ディスク装置１０１に登録されているすべてのコンテンツファイル１６０に含まれるすべての既存部品サブツリー１７４について、登録されている機能グループ１８８との類似度を算出することで、各既存部品サブツリー１７４に対応する部品がどの機能グループに属すかを決定する。 The functional grouping processing unit 127 calculates the similarity with the registered function group 188 for all the existing component subtrees 174 included in all the content files 160 registered in the magnetic disk device 101. The functional group to which the part corresponding to each existing part subtree 174 belongs is determined.

図１８の例では、グルーピング対象の既存部品サブツリー１７４が、３つの機能グループ１８８（ＩＤ“００１”〜“００３”）の中のどの機能グループに属するかを決定するために、該既存部品サブツリー１７４に対する各機能グループ１８８の類似度が算出される。具体的には、機能グループＩＤ“００１”については類似度“０．５０”が算出され、機能グループＩＤ“００２”については類似度“０．６７”が算出され、機能グループＩＤ“００３”については類似度“０．４２”が算出され、それぞれ機能グループＩＤ情報１８９１と類似度情報１８９２が機能グループ別類似度算出結果１８９に格納される。この結果、例えば類似度の閾値が“０．５０”の場合、該既存部品サブツリー１７４は、類似度が最も高いＩＤ“００２”の機能グループのメンバに追加される。また、類似度の閾値が“０．７０”の場合、該既存部品サブツリー１７４の部品は、どの機能グループにも属さないものとして、該既存部品サブツリー１７４の部品をメンバとする新たな機能グループ（ＩＤ“００４”）が生成される。 In the example of FIG. 18, in order to determine which functional group of the three function groups 188 (ID “001” to “003”) the existing part subtree 174 to be grouped belongs to. The similarity of each function group 188 is calculated. Specifically, the similarity “0.50” is calculated for the function group ID “001”, the similarity “0.67” is calculated for the function group ID “002”, and the function group ID “003”. The similarity “0.42” is calculated, and the function group ID information 1891 and the similarity information 1892 are stored in the function group similarity calculation result 189, respectively. As a result, for example, when the similarity threshold is “0.50”, the existing component subtree 174 is added to the member of the function group with ID “002” having the highest similarity. Further, when the similarity threshold is “0.70”, it is assumed that the parts in the existing part subtree 174 do not belong to any function group, and a new function group (a member of the parts in the existing part subtree 174 as members) ID “004”) is generated.
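The decision described for FIG. 18 can be sketched as below. This is a minimal illustration, not the patented implementation: the function name and the way the new group ID is supplied are hypothetical, and treating a similarity equal to the threshold as sufficient is an assumption.

```python
def assign_function_group(similarities, threshold, new_group_id):
    """Pick the function group for one existing part subtree.

    similarities: dict of function group ID -> similarity to the subtree.
    Returns (group_id, created); created is True when no registered group
    reaches the threshold and a new group ID is used instead.
    """
    if similarities:
        best_id = max(similarities, key=similarities.get)
        # Assumption: a similarity equal to the threshold still qualifies.
        if similarities[best_id] >= threshold:
            return best_id, False
    return new_group_id, True


scores = {"001": 0.50, "002": 0.67, "003": 0.42}  # values from FIG. 18
print(assign_function_group(scores, 0.50, "004"))  # ('002', False)
print(assign_function_group(scores, 0.70, "004"))  # ('004', True)
```

With the threshold at 0.50 the subtree joins group "002"; with 0.70 no group qualifies, so a new group "004" is created, matching the two cases in the text.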

図１９は、機能別グルーピング処理部１２７における類似度算出の具体的な処理の説明図である。 FIG. 19 is an explanatory diagram of specific processing of similarity calculation in the functional grouping processing unit 127.

まず、機能別グルーピング処理部１２７は、既存部品サブツリー１７４から部品内情報１７７Ｂを取得し、機能グループ１８８から部品内情報１７７Ｃを取得する。図１９の例では、部品内情報１７７Ｂは、図１２の部品内情報１７７Ａと同じ内容であり、部品内情報１７７Ｃは、図１２の部品内情報１７７Ｂと同じ内容である。このため、これらの部品内情報１７７Ｂ及び１７７Ｃを基に作成されたベクトル情報１８２を参照して上記（式２）を利用して算出された類似度は、０．６７となる。機能別グルーピング処理部１２７は、機能グループ別類似度算出結果１８９に、機能グループＩＤ情報１８９１“００２”と、類似度情報１８９２“０．６７”とを記録する。 First, the functional grouping processing unit 127 acquires the in-part information 177B from the existing part subtree 174, and acquires the in-part information 177C from the function group 188. In the example of FIG. 19, the in-part information 177B has the same contents as the in-part information 177A in FIG. 12, and the in-part information 177C has the same contents as the in-part information 177B in FIG. For this reason, the similarity calculated using the above (Formula 2) with reference to the vector information 182 created based on the in-part information 177B and 177C is 0.67. The functional grouping processing unit 127 records the functional group ID information 1891 “002” and the similarity information 1892 “0.67” in the functional group similarity calculation result 189.

なお、グルーピング対象の既存部品サブツリー１７４と機能グループ１８８との類似度算出では、機能グループ１８８における代表の既存部品サブツリーとグルーピング対象の既存部品サブツリー１７４とが比較されても良い。或いは、グルーピング対象の既存部品サブツリー１７４は、機能グループ１８８における二以上の（例えば全ての）既存部品サブツリー１７４と比較され、算出された二以上の類似度の平均が、グルーピング対象の既存部品サブツリー１７４と機能グループ１８８との類似度とされても良い。 In the similarity calculation between the existing part subtree 174 to be grouped and the function group 188, the representative existing part subtree in the function group 188 and the existing part subtree 174 to be grouped may be compared. Alternatively, the grouping target existing part subtree 174 is compared with two or more (for example, all) existing part subtrees 174 in the function group 188, and the calculated average of the two or more similarities is the grouping target existing part subtree 174. And the function group 188 may be similar to each other.
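The two options above (compare against a representative, or average over several members) might look like this. The `similarity` argument stands for whatever measure is in use, for example (Equation 2); the `jaccard` function here is only a hypothetical stand-in so that the sketch runs on its own.

```python
from statistics import mean


def group_similarity(target, members, similarity, representative=None):
    """Similarity between a grouping target and a function group."""
    if representative is not None:
        # Option 1: compare only with the group's representative subtree.
        return similarity(target, representative)
    # Option 2: average the similarities to two or more (here all) members.
    return mean(similarity(target, member) for member in members)


def jaccard(a, b):
    """Stand-in similarity for the demo; not the measure used in the patent."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


target = ["name", "address"]
members = [["name", "address"], ["name"]]
print(group_similarity(target, members, jaccard))              # 0.75
print(group_similarity(target, members, jaccard, members[1]))  # 0.5
```

Averaging is more expensive but less sensitive to an unrepresentative representative.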

以上のように、機能グループとグルーピング対象の既存部品サブツリー１７４と As described above, the function group and the existing part subtree 174 to be grouped

図２０は、類似性を比較する部品間のテキスト数が二以上である場合のスタイル別グルーピング処理部１２８の具体的な処理の説明図である。 FIG. 20 is an explanatory diagram of specific processing of the style-specific grouping processing unit 128 when the number of texts between parts to be compared for similarity is two or more.

まず、スタイル別グルーピング処理部１２８は、コンテンツファイル１６０（ソース）から、グルーピング対象の既存部品サブツリーのスタイル情報１９３を取得する。スタイル情報１９３は、テキスト（例えば入力項目名）と順番を表現した情報であり、本図の例では、“氏名”、“住所”及び“電話番号”といった３つの入力項目名が、“氏名”、“住所”、“電話番号”の順でコンテンツ上にレイアウトされていることを示す。 First, the style-specific grouping processing unit 128 acquires the style information 193 of the existing part subtree to be grouped from the content file 160 (source). The style information 193 is information expressing the text (for example, input item names) and the order. In the example of this figure, three input item names such as “name”, “address”, and “phone number” are “name”. , “Address”, “phone number” in the order of content.

次に、スタイル別グルーピング処理部１２８は、スタイルグループ１９０についても、スタイルグループ１９０の代表既存部品ツリーからスタイル情報１９３を取得する。本図の例では、スタイルグループＩＤ“００１”について、スタイル情報１９３として、“性名住所電話番号”が取得され、スタイルグループＩＤ“００２”について、スタイル情報１９３として、“氏名電話番号住所”が取得され、スタイルグループＩＤ“００３”について、スタイル情報１９３として、“氏名住所電話番号”が取得される。 Next, the style-specific grouping processing unit 128 also acquires style information 193 for each style group 190 from the representative existing part tree of that style group. In the example of this figure, “sex, given name, address, telephone number” is acquired as the style information 193 for style group ID “001”, “name, telephone number, address” for style group ID “002”, and “name, address, telephone number” for style group ID “003”.

次に、スタイル別グルーピング処理部１２８は、グルーピング対象の既存部品サブツリー１７４のスタイル情報１９３と各スタイルグループ１９０の各スタイル情報１９３とを比較し、完全一致するかどうかを判定する。本図の例では、３つのスタイルグループ１９０のうち、スタイルグループＩＤ“００１”については入力項目名が異なり、スタイルグループＩＤ“００２”については入力項目名の順番が異なり、スタイルグループＩＤ“００３”に対応したスタイル情報１９３のみが、グルーピング対象の既存部品サブツリー１７４のスタイル情報１９３と完全一致することになる。このため、判定結果を表す情報であるスタイル別類似性判定結果１９２には、グルーピング対象の既存部品サブツリー１７４はスタイルグループＩＤ“００３”のスタイルグループ１９０のメンバとなることが示される。 Next, the style-specific grouping processing unit 128 compares the style information 193 of the grouping target existing part subtree 174 with the style information 193 of each style group 190 to determine whether or not they are completely matched. In the example of this figure, among the three style groups 190, the input item names are different for the style group ID “001”, the order of the input item names is different for the style group ID “002”, and the style group ID “003”. Only the style information 193 corresponding to is completely matched with the style information 193 of the existing part subtree 174 to be grouped. Therefore, the style-specific similarity determination result 192 that is information indicating the determination result indicates that the existing part subtree 174 to be grouped becomes a member of the style group 190 with the style group ID “003”.
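The exact-match comparison of FIG. 20 amounts to comparing ordered lists of texts. The following is a small sketch with a hypothetical function name and English renderings of the input item names; it assumes each style group is represented by the text list of its representative subtree.

```python
def find_style_group(target_texts, groups):
    """Return the ID of the style group whose representative has exactly the
    same texts in the same order as the target, or None if there is none."""
    for group_id, texts in groups.items():
        if list(texts) == list(target_texts):
            return group_id
    return None


groups = {
    "001": ["sex", "given name", "address", "telephone number"],  # names differ
    "002": ["name", "telephone number", "address"],               # order differs
    "003": ["name", "address", "telephone number"],               # exact match
}
print(find_style_group(["name", "address", "telephone number"], groups))  # 003
```

Because lists compare element by element, a different item name (group "001") and a different order (group "002") both count as a mismatch.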

図２１は、類似性を比較する部品間のテキスト数が１である場合のスタイル別グルーピング処理部１２８の具体的な処理の説明図である。 FIG. 21 is an explanatory diagram of specific processing of the style-specific grouping processing unit 128 when the number of texts between parts to be compared for similarity is 1.

まず、スタイル別グルーピング処理部１２８は、コンテンツファイル１６０（ソース）から、グルーピング対象の既存部品ツリー１７４のスタイル情報１９４を取得する。スタイル情報１９４には、一つの入力項目名（テキスト）のスタイルに関する属性を示す情報要素が含まれ、本図の例では、ラベル情報１９４１“氏名”、ｔｙｐｅ情報１９４２“text”、ｓｉｚｅ情報１９４３“１０”、ｆｏｎｔ−ｓｔｙｌｅ情報１９４４“normal”、ｆｏｎｔ−ｓｉｚｅ情報１９４５“medium”、ｃｏｌｏｒ情報１９４６“black”、およびｍａｘｌｅｎｇｔｈ情報１９４７“１０”が含まれる。 First, the style-specific grouping processing unit 128 acquires the style information 194 of the existing part tree 174 to be grouped from the content file 160 (source). The style information 194 includes information elements indicating attributes relating to the style of one input item name (text). In the example of this figure, label information 1941 “name”, type information 1942 “text”, and size information 1943 “ 10 ”, font-style information 1944“ normal ”, font-size information 1945“ medium ”, color information 1946“ black ”, and maxlength information 1947“ 10 ”are included.

スタイル別グルーピング処理部１２８は、各スタイルグループ１９０についても、スタイルグループ１９０の代表既存部品サブツリーからスタイル情報１９４を取得する。本図の例では、スタイルグループＩＤ“００１”に対応したスタイル情報１９４には、ラベル情報１９４１“名前”、ｔｙｐｅ情報１９４２“text”、ｓｉｚｅ情報１９４３“１０”、ｆｏｎｔ−ｓｔｙｌｅ情報１９４４“normal”、ｆｏｎｔ−ｓｉｚｅ情報１９４５“medium”、ｃｏｌｏｒ情報１９４６“black”およびｍａｘｌｅｎｇｔｈ情報１９４７“１０”が含まれる。以下、スタイルグループＩＤ“００２”と“００３”についても、同様にスタイル情報１９４が取得される。 The style-specific grouping processing unit 128 also acquires style information 194 for each style group 190 from the representative existing part subtree of that style group. In the example of this figure, the style information 194 for style group ID “001” contains label information 1941 “名前” (a different word for "name" from the “氏名” label of the grouping target), type information 1942 “text”, size information 1943 “10”, font-style information 1944 “normal”, font-size information 1945 “medium”, color information 1946 “black” and maxlength information 1947 “10”. Style information 194 is acquired in the same way for style group IDs “002” and “003”.

次に、スタイル別グルーピング処理部１２８は、グルーピング対象の既存部品サブツリー１７４のスタイル情報１９４とスタイルグループ１９０のスタイル情報１９４とを比較し、それらが互いに完全一致するかどうかを判定する。本図の例では、スタイルグループＩＤ“００１”はラベル情報１９４１が異なり、スタイルグループＩＤ“００２”はｆｏｎｔ−ｓｉｚｅ情報１９４４が異なり、スタイルグループＩＤ“００３”のスタイル情報１９４のみが、グルーピング対象の部品のスタイル情報１９４とすべての情報が一致している。スタイル別グルーピング処理部１２８は、そのことを検出し、スタイル別類似性判定結果１９２に、グルーピング対象の既存部品ツリー１７４はＩＤ“００３”のスタイルグループのメンバとなることを記録する。 Next, the grouping processing unit 128 by style compares the style information 194 of the existing part subtree 174 to be grouped with the style information 194 of the style group 190 and determines whether or not they completely match each other. In the example of this figure, the style group ID “001” has different label information 1941, the style group ID “002” has different font-size information 1944, and only the style information 194 of the style group ID “003” is the grouping target. The part style information 194 matches all the information. The style-specific grouping processing unit 128 detects this and records in the style-specific similarity determination result 192 that the existing part tree 174 to be grouped becomes a member of the style group with ID “003”.

なお、本図には示されていないが、仮にグルーピング対象の既存部品サブツリーのスタイル情報がどのスタイルグループのスタイル情報１９３或いは１９４とも一致しなかった場合は、グルーピング対象の既存部品サブツリーをメンバとする新たなスタイルグループを生成し、スタイルグループＩＤを付与した上で磁気ディスク装置１０１又はワークエリア１５０に登録することができる。また、この場合、グルーピング対象の既存部品サブツリーは新たに生成されたスタイルグループの代表既存部品サブツリーとなる。 Although not shown in this figure, if the style information of the existing part subtree to be grouped does not match the style information 193 or 194 of any style group, the existing part subtree to be grouped is a member. A new style group can be generated, and a style group ID can be assigned and registered in the magnetic disk device 101 or the work area 150. In this case, the existing part subtree to be grouped becomes the representative existing part subtree of the newly generated style group.
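The single-text comparison of FIG. 21, together with the fallback just described (creating a new style group when nothing matches), can be sketched as follows. The function name, the dictionary layout and the ID numbering scheme are hypothetical; the attribute values for group "002" are illustrative, since the text only says that its font-size differs.

```python
def assign_style_group(target_style, groups):
    """Return the ID of the style group whose representative style matches
    the target exactly; otherwise register the target as the representative
    of a new style group and return the new ID (groups is updated in place)."""
    for group_id, representative_style in groups.items():
        if representative_style == target_style:
            return group_id
    new_id = f"{len(groups) + 1:03d}"  # assumed ID scheme: next 3-digit number
    groups[new_id] = dict(target_style)
    return new_id


target = {"label": "氏名", "type": "text", "size": "10", "font-style": "normal",
          "font-size": "medium", "color": "black", "maxlength": "10"}
groups = {
    "001": {**target, "label": "名前"},      # label differs
    "002": {**target, "font-size": "large"},  # font-size differs (illustrative)
    "003": dict(target),                      # every element matches
}
print(assign_style_group(target, groups))                      # 003
print(assign_style_group({**target, "color": "red"}, groups))  # 004
```

After the second call, the new group "004" has the unmatched subtree's style as its representative style, so a later subtree with the same style joins it.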

また、機能別グルーピング処理部１２７とスタイル別グルーピング処理部１２８のグルーピング結果は、グルーピング対象の既存部品サブツリーに対応するコンテンツ構成ツリー１６１のノード１９５の中に機能グループＩＤとスタイルグループＩＤとして格納される。図２２の例では、コンテンツ構成ツリー１６１のノード１９５に、機能グループＩＤ“００２”とスタイルグループＩＤ“００３”がそれぞれ格納されている。 Further, the grouping results of the function-specific grouping processing unit 127 and the style-specific grouping processing unit 128 are stored as a function group ID and a style group ID in the node 195 of the content configuration tree 161 corresponding to the existing part subtree to be grouped. . In the example of FIG. 22, the function group ID “002” and the style group ID “003” are stored in the node 195 of the content configuration tree 161, respectively.

また、前述したように、機能別グルーピング処理部１２７とスタイル別グルーピング処理部１２８のグルーピング結果は、部品候補出力処理部１４３ｃにおける情報取得の高性能化のために、既存部品グルーピング情報１９６（機能グループＩＤ情報１９６１、スタイルグループＩＤ情報１９６２、部品ＩＤ情報１９６３及び出現数情報１９６４を含んだ情報）が磁気ディスク装置１０１又はワークエリア１５０に格納されてもよい。本図の例では、既存部品グルーピング情報１９６の２レコード目の情報として、機能グループＩＤ情報１９６１“００１”、スタイルグループＩＤ情報１９６２“００２”、部品ＩＤ情報１９６３“００３１_３、００６５_２”および出現数情報１９６４“２”がそれぞれ格納されている。また、部品ＩＤ情報１９６３で先頭に格納された（下線で示された）部品ＩＤについては、そのスタイルグループにおける代表既存部品サブツリーであることを示している。代表既存部品サブツリーがどれであるかを示す方法としては、代表既存部品サブツリーに対応した部品ＩＤを先頭に格納することに代えて、下線で示すなど、他の方法を採用することができる。 Further, as described above, the grouping results of the functional grouping processing unit 127 and the style-specific grouping processing unit 128 are obtained from the existing component grouping information 196 (functional group) in order to improve the information acquisition performance in the component candidate output processing unit 143c. ID information 1961, style group ID information 1962, component ID information 1963, and information including appearance number information 1964) may be stored in the magnetic disk device 101 or the work area 150. In the example of this figure, as the second record information of the existing component grouping information 196, function group ID information 1961 “001”, style group ID information 1962 “002”, component ID information 1963 “0031_3, 0065_2” and appearance number information 1964 “2” is stored. In addition, the component ID stored at the top of the component ID information 1963 (indicated by an underline) indicates that it is a representative existing component subtree in the style group. As a method of indicating the representative existing component subtree, other methods such as an underline can be adopted instead of storing the component ID corresponding to the representative existing component subtree at the head.
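One record of the existing part grouping information 196 could be modelled as below. This is a sketch only: the class and field names are hypothetical, and deriving the appearance count from the part ID list (rather than storing it as a separate field 1964) is a simplification made here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GroupingRecord:
    """One row of existing part grouping information 196 (hypothetical layout)."""
    function_group_id: str   # corresponds to function group ID information 1961
    style_group_id: str      # corresponds to style group ID information 1962
    part_ids: List[str]      # part ID information 1963; first entry = representative

    @property
    def appearance_count(self) -> int:
        # Appearance number information 1964: number of subtrees in the style group.
        return len(self.part_ids)

    @property
    def representative_part_id(self) -> str:
        # Convention from the text: the representative is stored at the head.
        return self.part_ids[0]


record = GroupingRecord("001", "002", ["0031_3", "0065_2"])  # second record in FIG. 22
print(record.appearance_count, record.representative_part_id)  # 2 0031_3
```

Keeping the representative at the head of the list avoids a separate flag, at the cost of making list order significant.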

以上が、部品登録処理部１２６ｃの具体的な処理である。 The above is the specific processing of the component registration processing unit 126c.

部品検索処理部１４２は、ユーザによって指定されたクエリツリー１７１に対して、磁気ディスク装置１０１又はワークエリア１５０に登録されているすべての機能グループ１８８との類似度を算出する。クエリツリー１７１に対する機能グループ１８８の類似度算出については、図１９で示したグルーピング対象の既存部品サブツリーから部品内情報１７７Ｂを取得する処理が、クエリツリー１７１から部品内情報１７７Ａを取得する処理に変わる以外は、図１９で示した処理例と同じである。 The part search processing unit 142 calculates, for the query tree 171 specified by the user, the similarity to every function group 188 registered in the magnetic disk device 101 or the work area 150. The calculation of the similarity of a function group 188 to the query tree 171 is the same as the processing example shown in FIG. 19, except that the process of acquiring the in-part information 177B from the grouping-target existing part subtree is replaced by a process of acquiring the in-part information 177A from the query tree 171.

図２３は、部品候補出力処理部１４３ｃの具体的な処理の説明図である。 FIG. 23 is an explanatory diagram of specific processing of the component candidate output processing unit 143c.

部品候補出力処理部１４３ｃは、部品グループ検索処理部１４２ｃで生成された機能グループ別類似度算出結果１８９と、機能別グルーピング処理部１２７とスタイル別グルーピング処理部１２８によって生成された既存部品グルーピング情報１９６から、クエリツリー１７１に対する機能グループ１８８に関する情報および機能グループ１８８内におけるスタイルグループ１９０に関する情報の一覧を部品グループ候補一覧表示画面１９７に表示する。図２３の例では、第１位の機能グループとして、機能グループＩＤ“００２”、クエリツリー１７１に対する機能グループ１８８の類似度“０．６７”、該機能グループに含まれるサブ機能グループのＩＤ“００４００５”が表示されている。サブ機能グループＩＤは、機能グループに含まれる既存部品サブツリー１７４におけるノード１９５内の、その機能グループのＩＤ以外の機能グループＩＤである。また、第１位の機能グループ（ＩＤ：００２）の中に含まれるスタイルグループの一覧として、スタイルグループＩＤ“００１”、“００２”および“００３”が表示されており、それぞれのスタイルグループに含まれる部品数（出現数）として“３０”、“２”および“１”がそれぞれ表示されている。 The component candidate output processing unit 143c includes a function group similarity calculation result 189 generated by the component group search processing unit 142c, and existing component grouping information 196 generated by the function grouping processing unit 127 and style grouping processing unit 128. Then, a list of information regarding the function group 188 for the query tree 171 and information regarding the style group 190 within the function group 188 is displayed on the part group candidate list display screen 197. In the example of FIG. 23, as the first function group, the function group ID “002”, the similarity “0.67” of the function group 188 with respect to the query tree 171, and the ID “004” of the sub function group included in the function group. 005 "is displayed. The sub function group ID is a function group ID other than the ID of the function group in the node 195 in the existing component subtree 174 included in the function group. In addition, style group IDs “001”, “002”, and “003” are displayed as a list of style groups included in the first functional group (ID: 002), and are included in each style group. “30”, “2”, and “1” are respectively displayed as the number of parts (number of appearances) to be displayed.
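Assembling the part group candidate list from the two inputs could be sketched as follows. The function name and record layout are hypothetical, and ordering the function groups by descending similarity is an assumption suggested by the "first-ranked function group" wording. The second group in the sample data is made up for illustration, and sub function group IDs are omitted for brevity.

```python
def build_candidate_list(group_similarities, grouping_records):
    """Combine per-function-group similarities with style group counts.

    group_similarities: dict of function group ID -> similarity to the query tree.
    grouping_records: list of dicts with function_group_id, style_group_id, count.
    """
    rows = []
    ranked = sorted(group_similarities.items(), key=lambda item: item[1], reverse=True)
    for group_id, similarity in ranked:
        style_groups = [
            (rec["style_group_id"], rec["count"])
            for rec in grouping_records
            if rec["function_group_id"] == group_id
        ]
        rows.append({
            "function_group_id": group_id,
            "similarity": similarity,
            "style_groups": style_groups,
        })
    return rows


similarities = {"001": 0.50, "002": 0.67}  # "001" is illustrative
records = [
    {"function_group_id": "002", "style_group_id": "001", "count": 30},
    {"function_group_id": "002", "style_group_id": "002", "count": 2},
    {"function_group_id": "002", "style_group_id": "003", "count": 1},
]
for row in build_candidate_list(similarities, records):
    print(row)
```

The first row printed corresponds to the first-ranked entry of FIG. 23: function group "002", similarity 0.67, with style groups "001", "002" and "003" containing 30, 2 and 1 parts.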

また、本図の例では、各スタイルグループの代表既存部品サブツリーが表す部品をプレビューするためのプレビューボタンが設けられている。部品候補出力処理部１４３ｃは、或るプレビューボタンが押下された場合、そのプレビューボタンに対応するスタイルグループ内の代表既存部品サブツリーが表すコードをソースから取得し、取得したコードを基に、代表既存部品サブツリーに対応した部品のイメージを表示することができる。 In the example of this figure, a preview button is provided for previewing the part represented by the representative existing part subtree of each style group. When a certain preview button is pressed, the component candidate output processing unit 143c acquires the code represented by the representative existing component subtree in the style group corresponding to the preview button from the source, and based on the acquired code, the representative existing An image of a part corresponding to the part subtree can be displayed.

以上が、本発明の第二の実施形態についての説明である。 The above is the description of the second embodiment of the present invention.

上述した第二の実施形態によれば、検索前に機能及びスタイルごとに既存部品サブツリーをグルーピングしておくことで、クエリツリーに対する部品候補を、機能およびスタイルの観点で重複することなく、異なる機能およびスタイルの部品ごとに一覧表示することができる。この結果、ユーザは効率よく所望の部品を部品候補一覧表示から探し出すことができる。 According to the second embodiment described above, by grouping the existing part subtree for each function and style before the search, the part candidates for the query tree can have different functions without duplication in terms of function and style. It is possible to display a list for each part of style and style. As a result, the user can efficiently search for a desired component from the component candidate list display.

＜第三の実施形態＞。 <Third embodiment>.

次に、本発明の第三の実施形態について説明する。 Next, a third embodiment of the present invention will be described.

第三の実施形態では、第一の実施形態における部品検索方法で既存部品サブツリーの集合を或る程度絞った上で、その集合における各既存部品サブツリーについて、第二の実施形態で説明した機能別およびスタイル別のグルーピング処理を行なう。 In the third embodiment, after narrowing a set of existing component subtrees to some extent by the component search method in the first embodiment, each existing component subtree in the set is classified according to the function described in the second embodiment. And grouping processing by style is performed.

本実施形態において、第一の実施形態と異なる点は、図２４に示すとおり、部品候補出力処理部１４３に機能別グルーピング処理部１２７、スタイル別グルーピング処理部１２８、部品グループ検索処理部１４２ｃおよび部品候補出力処理部１４３ｃを追加する点である（変更後の部品候補出力処理部１４３を部品候補出力制御処理部１４４と呼ぶ）。 This embodiment differs from the first embodiment in that, as shown in FIG. 24, the function-specific grouping processing unit 127, the style-specific grouping processing unit 128, the part group search processing unit 142c and the part candidate output processing unit 143c are added to the part candidate output processing unit 143 (the modified part candidate output processing unit 143 is referred to as the part candidate output control processing unit 144).

以下、第一の実施形態及び第二の実施形態と異なる処理を主に説明する。 Hereinafter, processing different from the first embodiment and the second embodiment will be mainly described.

部品候補出力制御処理部１４４は、まず、部品検索処理部１４２によって算出された各既存部品サブツリーの類似度に対して、磁気ディスク装置１０１又はワークエリア１５０に格納されている類似度の閾値と比較し、閾値以上の類似度を持つ既存部品サブツリーを抽出する。以下、抽出された既存部品サブツリーの集合を「一次検索結果部品集合」と呼ぶ。 The component candidate output control processing unit 144 first compares the similarity of each existing component subtree calculated by the component search processing unit 142 with the similarity threshold stored in the magnetic disk device 101 or the work area 150. Then, an existing part subtree having a similarity greater than or equal to the threshold is extracted. Hereinafter, the set of extracted existing component subtrees is referred to as a “primary search result component set”.
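The first step of the third embodiment is a plain threshold filter. Here is a minimal sketch, with a hypothetical function name and made-up part IDs and scores; the "at or above the threshold" comparison follows the text.

```python
def primary_search_results(similarities, threshold):
    """Keep the existing part subtrees whose similarity to the query tree is
    at or above the threshold (the "primary search result part set")."""
    return {part_id: s for part_id, s in similarities.items() if s >= threshold}


scores = {"0031_3": 0.82, "0065_2": 0.50, "0102_1": 0.31}  # illustrative values
print(primary_search_results(scores, 0.50))  # {'0031_3': 0.82, '0065_2': 0.5}
```

Only the subtrees that survive this filter are passed on to the function-specific and style-specific grouping.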

次に、部品候補出力制御処理部１４４は、機能別グルーピング処理部１２７を実行し、機能別グルーピング処理部１２７が、一次検索結果部品集合における各既存部品サブツリーについて、機能別にグルーピングを行なう。機能別グルーピング処理部１２７の処理手順は、処理対象が一次検索結果部品集合における既存部品サブツリーとなる他は、第二の実施形態と同様である。 Next, the component candidate output control processing unit 144 executes the function-specific grouping processing unit 127, and the function-specific grouping processing unit 127 performs grouping by function for each existing component subtree in the primary search result component set. The processing procedure of the functional grouping processing unit 127 is the same as that of the second embodiment, except that the processing target is an existing part subtree in the primary search result part set.

次に、部品候補出力制御処理部１４４は、スタイル別グルーピング処理部１２８を実行し、スタイル別グルーピング処理部１２８が、登録された全ての機能グループ１８８に対して、スタイル別にグルーピングを行なう。スタイル別グルーピング処理部１２８の処理手順は、処理対象が一次検索結果部品集合に対して機能別にグルーピングされた各機能グループ１８８となる他は、第二の実施形態と同様である。 Next, the component candidate output control processing unit 144 executes the style-specific grouping processing unit 128, and the style-specific grouping processing unit 128 groups all the registered function groups 188 by style. The processing procedure of the style grouping processing unit 128 is the same as that of the second embodiment except that the processing target is each function group 188 grouped by function with respect to the primary search result component set.

次に、部品グループ検索処理部１４２ｃを実行し、ステップ３４１０によって登録されたすべての機能グループ１８８に対して、クエリツリー１７１との類似度を算出する。部品グループ検索処理部１４２ｃの処理手順は、処理対象が一次検索結果部品集合に対して機能別にグルーピングされた各機能グループ１８８となる他は第二の実施形態と同様である。 Next, the part group search processing unit 142c is executed, and the similarity to the query tree 171 is calculated for all the function groups 188 registered in step 3410. The processing procedure of the part group search processing unit 142c is the same as in the second embodiment, except that the processing targets are the function groups 188 obtained by grouping the primary search result part set by function.

次に、部品候補出力制御処理部１４４は、部品候補出力処理部１４３ｃを実行し、部品候補出力処理部１４３ｃが、生成された機能グループ別類似度算出結果１８９と、機能別グルーピング処理部１２７とスタイル別グルーピング処理部１２８によって生成された既存部品グルーピング情報１９６から、クエリツリー１７１に対する部品グループ候補一覧を部品グループ候補一覧表示画面１９７に表示する。部品候補出力処理部１４３ｃの処理手順は、部品候補出力の対象が一次検索結果部品集合となる他は、第二の実施形態と同様である。 Next, the component candidate output control processing unit 144 executes the component candidate output processing unit 143c, and the component candidate output processing unit 143c performs the generated function group similarity calculation result 189, the function grouping processing unit 127, and the like. A part group candidate list for the query tree 171 is displayed on the part group candidate list display screen 197 from the existing part grouping information 196 generated by the style grouping processing unit 128. The processing procedure of the part candidate output processing unit 143c is the same as that of the second embodiment except that the part candidate output target is the primary search result part set.

第三の実施形態で行われる処理は、機能別およびスタイル別のグルーピングの処理対象が一次検索結果部品集合となる他は、第二の実施形態と略同様である。 The processing performed in the third embodiment is substantially the same as in the second embodiment, except that the processing target for grouping by function and style is the primary search result component set.

以上、第三の実施形態によれば、機能別およびスタイル別のグルーピングの処理対象を、第一の実施形態による部品検索方法によって、ある程度絞った部品集合とすることで、新規に作成しようとしているコンテンツの分野の中で使われている部品グループのみを出力することができる。この結果、ユーザは効率よく所望の部品を部品グループ候補一覧から探し出すことができる。 As described above, according to the third embodiment, the processing target of grouping by function and style is newly created by making a part set narrowed down to some extent by the part search method according to the first embodiment. Only parts groups used in the content field can be output. As a result, the user can efficiently search for a desired component from the component group candidate list.

＜第四の実施形態＞。 <Fourth embodiment>.

第四の実施形態では、テキストの類似度と入力コントロールの類似度とを別々に算出し、それらの類似度に基づいて、クエリツリーと既存部品サブツリー（或いは機能グループ）との類似度を算出することができる。具体的には、例えば、下記（式３）で算出することができる。 In the fourth embodiment, the similarity of the texts and the similarity of the input controls are calculated separately, and the similarity between the query tree and the existing part subtree (or function group) can be calculated from those two similarities. Specifically, for example, it can be calculated by (Equation 3) below.

クエリツリーと既存部品サブツリー（或いは機能グループ）との類似度＝α（テキストの類似度）＋β（入力コントロールの類似度）…（式３） Similarity between the query tree and the existing part subtree (or function group) = α (text similarity) + β (input control similarity) ... (Equation 3)

α及びβは、それぞれ、余弦尺度を用いて求めることができる。 α and β can each be obtained using the cosine measure.

例えば、クエリツリーに、４つのテキスト（入力項目名）として、氏名、住所、電話番号及びメールアドレスがあり、８つの入力コントロールとして、５つのテキストボックス、チェックリスト、ラジオボタン、プルダウンがあるとする。一方、比較対象の既存部品サブツリーに、７つのテキストとして、氏名、性、名、メールアドレス、職業、住所及び電話番号があり、６つの入力コントロールとして、５つのテキストボックス、プルダウンがあるとする。このため、クエリツリーと比較対象の既存部品サブツリーとの間で、テキストについては、氏名、住所、電話番号及びメールアドレスの４つが一致し、入力コントロールについては、５つのテキストボックスとプルダウンの６つが一致する。 For example, suppose the query tree has four texts (input item names), namely name, address, telephone number and e-mail address, and eight input controls, namely five text boxes, a checklist, a radio button and a pull-down. Suppose also that the existing part subtree being compared has seven texts, namely name, sex, given name, e-mail address, occupation, address and telephone number, and six input controls, namely five text boxes and a pull-down. In this case, between the query tree and the existing part subtree being compared, four texts match (name, address, telephone number and e-mail address) and six input controls match (the five text boxes and the pull-down).

また、テキスト及び入力コントロールの８つのセットうち、クエリツリーについては、一致するセットとして３つあり、比較対象の既存部品サブツリーについては一致するセットとして７つあり、クエリツリーと比較対象の既存部品サブツリーとの間で互いに一致するセットが３つあるとする。 Of the eight sets of text and input controls, the query tree has three matching sets, the comparison target existing part subtree has seven matching sets, and the query tree and the comparison target existing part subtree. , There are three sets that match each other.

この場合、第一乃至第三の実施形態によれば、類似度が０．５７として算出される。なぜなら、余弦尺度（（式１）及び（式２））において、分子が、１^２＋１^２＋１^２＝３となり、分母が、｛（１^２＋１^２＋１^２＋１^２）の平方根｝×｛（１^２＋１^２＋１^２＋１^２＋１^２＋１^２＋１^２）の平方根｝となるからである。 In this case, according to the first to third embodiments, the similarity is calculated as 0.57. This is because, in the cosine measure ((Equation 1) and (Equation 2)), the numerator is 1^2 + 1^2 + 1^2 = 3 and the denominator is {square root of (1^2 + 1^2 + 1^2 + 1^2)} × {square root of (1^2 + 1^2 + 1^2 + 1^2 + 1^2 + 1^2 + 1^2)}, that is, 3 / (2 × √7) ≈ 0.57.

しかし、第四の実施形態では、上記（式３）を用いて類似度が求められる。具体的には、例えば、αは０．７６であり、βは０．９６となり、そのため、類似度は、１．７２となる。αが０．７６となる理由は、余弦尺度において、分子が、１^２＋１^２＋１^２＋１^２＝４となり、分母が、｛（１^２＋１^２＋１^２＋１^２）の平方根｝×｛（１^２＋１^２＋１^２＋１^２＋１^２＋１^２＋１^２）の平方根｝となるからである。βが０．９６となる理由は、余弦尺度において、分子が、５^２＋１^２＝２６となり、分母が、｛（５^２＋１^２＋１^２＋１^２）の平方根｝×｛（５^２＋１^２）の平方根｝となるからである。 In the fourth embodiment, however, the similarity is obtained using (Equation 3) above. Specifically, for example, α is 0.76 and β is 0.96, so the similarity is 1.72. α is 0.76 because, in the cosine measure, the numerator is 1^2 + 1^2 + 1^2 + 1^2 = 4 and the denominator is {square root of (1^2 + 1^2 + 1^2 + 1^2)} × {square root of (1^2 + 1^2 + 1^2 + 1^2 + 1^2 + 1^2 + 1^2)}. β is 0.96 because, in the cosine measure, the numerator is 5^2 + 1^2 = 26 and the denominator is {square root of (5^2 + 1^2 + 1^2 + 1^2)} × {square root of (5^2 + 1^2)}.
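The figures quoted above can be checked with a short script. This is a sketch under assumptions: the `cosine` helper and the use of `Counter` vectors are choices made here, the English item names stand in for the Japanese ones, and the set labels `s1` to `s8` are hypothetical placeholders chosen only to reproduce the stated numerator (3) and denominators (√4 and √7) of the combined comparison.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine measure between two count vectors held as Counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Texts: 4 in the query tree, 7 in the existing part subtree, 4 in common.
query_texts = Counter(["name", "address", "telephone number", "e-mail address"])
part_texts = Counter(["name", "sex", "given name", "e-mail address",
                      "occupation", "address", "telephone number"])

# Input controls, counted per kind.
query_controls = Counter({"text box": 5, "checklist": 1, "radio button": 1, "pull-down": 1})
part_controls = Counter({"text box": 5, "pull-down": 1})

alpha = cosine(query_texts, part_texts)       # text similarity
beta = cosine(query_controls, part_controls)  # input control similarity
print(round(alpha, 2), round(beta, 2), round(alpha + beta, 2))  # 0.76 0.96 1.72

# First to third embodiments: 8 (text, control) sets overall, 4 in the query
# tree, 7 in the existing part subtree, 3 shared.
query_sets = Counter(["s1", "s2", "s3", "s4"])
part_sets = Counter(["s1", "s2", "s3", "s5", "s6", "s7", "s8"])
print(round(cosine(query_sets, part_sets), 2))  # 0.57
```

Scoring texts and controls separately rewards the shared item names and the shared control mix independently, which is why the combined value (1.72 on a 0 to 2 scale) is higher than the single 0.57.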

なお、α及びβは、余弦尺度により算出されることとしたが、他の方法により算出されても良い。 Although α and β are calculated here using the cosine measure, they may be calculated by other methods.

＜第五の実施形態＞。 <Fifth embodiment>.

第五の実施形態は、第二と第三の実施形態の組合せであり、且つ、登録性能を重視するか検索性能を重視するかの入力を受け付ける機能がコンテンツ部品検索システムに設けられる。コンテンツ部品検索システムでは、登録性能を重視することが開発者から入力された場合には、第三の実施形態で説明したように、一次検索結果部品集合について機能別及びスタイル別のグルーピング処理が行われ、検索性能を重視することが開発者から入力された場合は、第二の実施形態で説明したように、新規に登録されるコンテンツの各部品を表す各既存部品サブツリーについて、機能別及びスタイル別のグルーピング処理が行われる。 The fifth embodiment is a combination of the second and third embodiments, and a function for accepting an input indicating whether registration performance is important or search performance is important is provided in the content component search system. In the content parts search system, when a developer inputs importance on registration performance, as described in the third embodiment, a grouping process for each function and style is performed on the primary search result parts set. If importance is placed on search performance from the developer, as described in the second embodiment, for each existing part subtree representing each part of newly registered content, by function and style Another grouping process is performed.

Several embodiments of the present invention have been described above, but these embodiments are merely examples for explaining the present invention and are not intended to limit the scope of the present invention to those embodiments. The present invention can be implemented in various other modes without departing from its gist. For example, the storage destination of the existing part subtrees, the evaluation results, and the like is not limited to the main memory or the magnetic disk device and may be another type of storage resource.

FIG. 1 shows an example of the overall configuration of the content parts search system according to the first embodiment of the present invention.
FIG. 2 shows an example of the flow of processing performed by the grouping processing unit.
FIG. 3 shows an example of the flow of processing performed by the labeling processing unit.
FIG. 4 shows a display example of content.
FIG. 5 shows an example of text list information.
FIG. 6 shows an example of input control information.
FIG. 7 shows an example of the grouping condition list.
FIG. 8 shows an example of a content structure tree before labeling by the labeling processing unit.
FIG. 9 shows an example of a content structure tree after labeling by the labeling processing unit.
FIG. 10 shows an example of the flow of processing performed by the part search processing unit.
FIG. 11 is an explanatory diagram of obtaining a query tree from input/output definition information.
FIG. 12 is an explanatory diagram of an example of processing performed by the part search processing unit.
FIG. 13 is an explanatory diagram of an example of processing performed by the part candidate output processing unit.
FIG. 14 shows a configuration example of the part registration processing unit and the part search control processing unit in the second embodiment of the present invention.
FIG. 15 shows an example of the flow of processing performed by the functional grouping processing unit.
FIG. 16 shows an example of the flow of processing performed by the style-based grouping processing unit.
FIG. 17 shows an example of the flow of processing performed by the part group search processing unit in the second embodiment of the present invention.
FIG. 18 is an explanatory diagram outlining the processing performed by the functional grouping processing unit.
FIG. 19 is an explanatory diagram of the similarity calculation processing of the functional grouping processing unit.
FIG. 20 is an explanatory diagram of specific processing of the style-based grouping processing unit when the number of texts in the parts compared for similarity is two or more.
FIG. 21 is an explanatory diagram of specific processing of the style-based grouping processing unit when the number of texts in the parts compared for similarity is one.
FIG. 22 shows an example of nodes in the content structure tree and existing part grouping information.
FIG. 23 is an explanatory diagram of processing performed by the part candidate output processing unit in the second embodiment of the present invention.
FIG. 24 shows a configuration example of the part search control processing unit in the third embodiment of the present invention.

Explanation of symbols

100 … Central processing unit (CPU)
101 … Magnetic disk device
102 … Main memory
103 … Floppy disk drive (FDD)
104 … Bus
105 … Network
106 … Floppy disk
110 … System control processing unit
111 … Part registration control processing unit
112 … Part search control processing unit
112c … Part search control processing unit
121 … Content acquisition processing unit
122 … In-content element information acquisition processing unit
123 … Content structure tree generation processing unit
126 … Part registration processing unit
126c … Part registration processing unit
127 … Functional grouping processing unit
128 … Style-based grouping processing unit
130 … Grouping processing unit
131 … Labeling processing unit
140 … Input/output definition information acquisition processing unit
141 … Input/output definition information analysis processing unit
142 … Part search processing unit
142c … Part group search processing unit
143 … Part candidate output processing unit
143c … Part candidate output processing unit
144 … Part candidate output control processing unit
150 … Work area
160 … Content file
161 … Content structure tree

Claims

An apparatus for searching for a content part, as a part of Web content, for creating the Web content, the apparatus comprising:
a search condition input unit that receives an input of a search condition including a combination of text information representing text and input control type information representing the type of an input control;
a storage unit that stores a plurality of information packages, each of which includes a plurality of content parts composed of combinations of text corresponding to the text information and input controls corresponding to the input control type information, and includes a plurality of combinations of the text information and the input control type information;
a similarity calculation unit that searches the storage unit for the information packages using the text information and the input control type information included in the input search condition, and calculates the similarity of each information package retrieved under the search condition to the search condition; and
a part candidate display unit that displays the plurality of information packages retrieved under the search condition on a display screen based on each of the calculated similarities.

The content parts retrieval apparatus according to claim 1, wherein the similarity is a value calculated based on a first similarity between at least one piece of text information included in the search condition and at least one piece of text information included in the information package, and a second similarity between at least one piece of input control type information included in the search condition and at least one piece of input control type information included in the information package.

A content parts retrieval method in an apparatus for searching for a content part, as a part of Web content, for creating the Web content, the method comprising:
accepting an input of a search condition including a combination of text information representing text and input control type information representing the type of an input control;
searching a storage unit for information packages using the text information and the input control type information included in the input search condition, the storage unit storing a plurality of information packages each of which includes a plurality of content parts composed of combinations of text corresponding to the text information and input controls corresponding to the input control type information, and includes a plurality of combinations of the text information and the input control type information, and calculating the similarity of each information package retrieved under the search condition to the search condition; and
displaying the plurality of information packages retrieved under the search condition on a display screen based on each of the calculated similarities.

A content parts retrieval program for causing a computer to function as an apparatus for searching for a content part, as a part of Web content, for creating the Web content, the program causing the computer to execute:
a search condition input step of accepting an input of a search condition including a combination of text information representing text and input control type information representing the type of an input control;
a similarity calculation step of searching a storage unit for information packages using the text information and the input control type information included in the input search condition, the storage unit storing a plurality of information packages each of which includes a plurality of content parts composed of combinations of text corresponding to the text information and input controls corresponding to the input control type information, and includes a plurality of combinations of the text information and the input control type information, and calculating the similarity of each information package retrieved under the search condition to the search condition; and
a part candidate display step of displaying the plurality of information packages retrieved under the search condition on a display screen based on each of the calculated similarities.