JP2013152564A - Document processor and document processing method

Info

Publication number: JP2013152564A
Application number: JP2012012407A
Authority: JP
Inventors: Hidetomo Soma; 英智相馬
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2012-01-24
Filing date: 2012-01-24
Publication date: 2013-08-08

Abstract

PROBLEM TO BE SOLVED: To provide emphasis expression that lets a reader recognize the correspondence between an explanatory text and a diagram in a document and, when there are two or more corresponding items, emphasis display that identifies each correspondence individually.

SOLUTION: An object link function lets the reader move easily between an "object" and the "explanatory text of the object". To realize it, an operation component of the link function is placed on the "object" and on the "explanatory text of the object", which are the targets of the link function. When the "explanatory text of the object" is detected to refer to a specific portion within the "object", correspondence expression information to be displayed in coordination is generated and added to both the "explanatory text of the object" and the "object", so that a reader of the document can understand the correspondence.

Description

The present invention relates to a document processing apparatus and a document processing method, and more particularly to a technique suitable for processing electronic document data.

Conventionally, paper documents and electronic documents that contain “objects” (for example, photographs, drawings, line drawings, and tables) and “descriptions of objects” (sentences in the body text that explain or comment on the objects) have been widely used. Examples include documents with explanatory text such as academic papers, patents, instruction manuals, and product catalogs.

A “description of an object” is a passage in the body text, the main text of the document, that explains or comments on the “object” (in practice, the object is often a graphical rendering of what the passage says). The author of a document creates the “objects” and the “descriptions of the objects” together, intending the viewer of the document to use both.

For this reason, an expression such as “FIG. 15” is often used to relate the two. An expression like “FIG. 15” that associates an “object” with a “description of the object” is called an “anchor expression”. In addition, an explanatory passage describing the “object” is often placed near the “object” itself. This is called a “caption expression”, and it often contains the “anchor expression” as well.

FIG. 15(a) shows an example of this case. In FIG. 15, 1501 is the first page of the document and 1502 is the second page; 1511 is the body text of the document, and 1512 is an anchor expression contained in a description in the document. 1513 is a table, which is an object in the document; 1514 is the caption expression of that table object; and 1515 is the anchor expression within that caption expression.

A viewer of the document has to read on while keeping track of the mutual correspondence between each “object” and its “description of the object”. For example, on seeing the sentence “Table.1 is ...” in the body text, the viewer searches the document for the object corresponding to “Table.1”, checks it, and then returns to the original position in the body text to continue reading.

Conversely, on encountering an object that carries the anchor expression “Table.1”, the viewer searches the body text for the sentences that explain “Table.1”. After reading and checking the description, the viewer returns to the original page and reads on. In a multi-page document, finding the object that corresponds to “Table.1 is ...” in the body text, or finding the description in the body text that corresponds to the object indicated by “Table.1”, requires referring across pages, which hurts readability. In addition, descriptions in the body text are hard to find and may be written in several places, so it was sometimes difficult for the viewer to check all of them.

To address this, an electronic document is generated by converting the “figure”, which is the “object”, and the “figure number”, which is the “anchor expression”, into hypertext. This provides functions such as displaying the corresponding figure on the screen when a “figure number” in the body text is clicked with a mouse or the like. It is also conceivable to add this function to an electronic document obtained by optically reading a paper document, by analyzing that document.

In Patent Document 1, the “description of the object” containing the anchor expression is extracted, the “anchor expression” of an “object” such as a figure is also extracted, and the two are associated. This makes it possible to search for the “object” using the “description of the object” as search metadata.

Patent Document 1: JP 2000-331056 A

Using this, a function that allows the viewer to move easily between the two (hereinafter, the “object link function”) is added to the electronic document, its descriptions, and the like. As a result, even in a document that uses figures, the body text and the content of the figures can be read and checked easily.

That is, the object link function is created by extracting the relationship between the “description of the object” (or the “anchor expression” in it) and the “object”, together with information on where each appears. Items 1521 and 1522 in FIG. 15 show the document with this function added. An operation function 1521 is attached to the “anchor expression” 1512 in the description in the body text.

Clicking the operation function 1521 with a mouse or the like moves the displayed portion to page 1502, the second page, where the object 1513 is highlighted or otherwise marked to show that it is the corresponding portion. Similarly, clicking the operation function 1522 attached to the “object” 1513 moves the displayed portion to page 1501, the first page, where the “anchor expression” 1512 or the “description of the object” containing it is highlighted or otherwise marked to show that it is the corresponding portion.
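The link table behind such a function can be sketched as follows. This is a minimal illustration, not the method of the patent or of Patent Document 1: the regular expression, the `normalize` helper, the `build_links` function, and the identifier `"obj-1513"` are all assumptions made for the example. It finds the anchor expression in each caption, then records every position in the body text where the same anchor expression appears.

```python
import re

# Hypothetical pattern for anchor expressions such as "Table.1", "Table 1" or "FIG. 15".
ANCHOR_RE = re.compile(r"\b(Table|FIG|Fig)\.?\s*(\d+)")


def normalize(kind, number):
    """Map spelling variants onto one key, e.g. 'Fig' + '2' -> 'FIG.2'."""
    if kind.lower() == "fig":
        kind = "FIG"
    return f"{kind}.{number}"


def build_links(body_blocks, captions):
    """Build a two-way link table between objects and body-text mentions.

    body_blocks: list of (page, text)
    captions:    list of (page, object_id, caption_text)
    Returns {anchor: {"object": (page, object_id), "mentions": [(page, offset), ...]}}.
    """
    links = {}
    # First pass: the anchor expression inside each caption identifies the object.
    for page, object_id, caption in captions:
        match = ANCHOR_RE.search(caption)
        if match:
            key = normalize(*match.groups())
            links[key] = {"object": (page, object_id), "mentions": []}
    # Second pass: every occurrence in the body text becomes a link source.
    for page, text in body_blocks:
        for match in ANCHOR_RE.finditer(text):
            key = normalize(*match.groups())
            if key in links:
                links[key]["mentions"].append((page, match.start()))
    return links


body = [(1, "As Table.1 shows, sales rose.")]
captions = [(2, "obj-1513", "Table.1 Sales by year")]
links = build_links(body, captions)
```

A viewer could use `links["Table.1"]["object"]` to jump from the body text to the object, and the `mentions` list to jump back.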

However, a “description of the object” often explains a specific portion of the “object”, such as one part of it, rather than the “object” as a whole. That is, a figure often has a characteristic part, or a table has some part the author wants the reader to focus on, and the description addresses that part.

Consequently, even when the document has been converted into hypertext as described above, the viewer still has to search within the “object” reached by the jump for the portion indicated by the “description of the object”. Moreover, when the “description of the object” explains several portions of the “object” one by one, the viewer moves back and forth between the display of the “object” and the “description of the object” many times while reading the document.

Each time, the viewer may again have to search for the portion indicated by the “description of the object”. This tendency is especially pronounced when the “object” is a table or the like, because it also contains material other than the portion of interest.
Highlighting of this kind is generally created by the user who authors the document, using a document editing apparatus or the like.

As in Patent Document 1, some techniques automatically add emphasis expression to drawings and the like when the article format and rules governing their content are well defined. However, for ordinary documents, no technique added emphasis expression that shows the correspondence between a description in the body text and the relevant portion of a figure or table and that, when there are multiple descriptions, makes clear how the relevant portions differ.
In view of the above problems, an object of the present invention is to enable emphasis expression that makes the correspondence between descriptions in a document and items within figures and tables clearly understandable.

The document processing apparatus of the present invention creates, page by page, mutual links between objects and descriptions of the objects in a multi-page electronic document containing image data, and generates a multi-page electronic document. The apparatus comprises: region dividing means for dividing the image data to obtain divided regions; attribute information adding means for determining the attribute of each divided region obtained by the region dividing means and adding a character attribute region by region; character recognition means for recognizing characters in the regions to which the attribute information adding means has added a character attribute; link information generation processing means for examining the correspondence between an anchor expression accompanying the object and an anchor expression or description of the object in the body text of the electronic document, and generating link information that holds the correspondence between the object and the anchor expression or description of the object in the body text; correspondence expression adding means for generating correspondence expression information that emphasizes the correspondence, and adding it to the corresponding portions of a text expression within the object and the description of the object in the body text that are associated by the link information generated by the link information generation processing means; and format conversion means for converting the result into an electronic document that includes the link information and the correspondence expression information emphasizing the correspondence.

According to the present invention, when the mutual links are used, corresponding portions can be shown in an intuitive, easy-to-understand way. This saves the effort of searching for the corresponding portions and improves the readability of the document.

FIG. 1 is a block diagram illustrating an example of an image processing system according to the present invention.
FIG. 2 is a block diagram illustrating a detailed configuration example of the MFP in FIG. 1.
FIG. 3 is a block diagram illustrating a configuration example of the data processing unit in FIG. 2.
FIG. 4 is a diagram illustrating an example of image data used for explanation in the embodiments of the present invention.
FIG. 5 is a diagram in which the areas within FIG. 4 can be designated.
FIG. 6 is a diagram illustrating an example of electronic document data used in the present embodiment.
FIG. 7(a) is a detailed diagram of the link processing unit, and FIG. 7(b) is a detailed diagram of the correspondence expression adding unit.
FIG. 8 is a flowchart illustrating the overall processing procedure performed in the embodiment of the present invention.
FIG. 9 is a flowchart illustrating the processing that generates and stores correspondence expression information.
FIG. 10 is a diagram illustrating an example of extracted link information and extracted correspondence expressions.
FIG. 11 is a diagram illustrating the state in which correspondence expressions have been added and the link function is available.
FIG. 12 is a diagram illustrating a screen during document creation and its content divided into regions for explanation.
FIG. 13 is a diagram illustrating an example of a screen displaying completion candidates predicted from the input.
FIG. 14 is a flowchart illustrating the processing procedure in the second embodiment.
FIG. 15 is a diagram illustrating the background art of the present invention and a document with a link function.

(First Embodiment)
Hereinafter, the best mode for carrying out the present invention will be described with reference to the drawings.
FIG. 1 is a block diagram illustrating a configuration example of the document processing system according to the present embodiment.
In FIG. 1, an MFP (Multi Function Peripheral) 100, a multifunction machine that provides several kinds of functions (copying, printing, transmission, and so on), is connected to a LAN 102 built in office A.

The LAN 102 is also connected to an external network 104 via a proxy server 103. A client PC 101 receives data transmitted from the MFP 100 via the LAN 102 and uses the functions of the MFP 100. For example, the client PC 101 can send print data to the MFP 100 to have the MFP 100 produce a printout based on that data. The configuration in FIG. 1 is an example; a plurality of offices having the same components as office A may be connected to the network 104.

The network 104 is typically a communication network realized by the Internet, a LAN, a WAN, a telephone line, a dedicated digital line, an ATM or frame relay line, a communication satellite line, a cable television line, a wireless line for data broadcasting, or the like. Any network capable of transmitting and receiving data may be used. The terminals, namely the client PC 101 and the proxy server 103, each have the standard components of a general-purpose computer, for example a CPU, RAM, ROM, hard disk, external storage device, network interface, display, keyboard, and mouse.

FIG. 2 is a block diagram illustrating the detailed configuration of the MFP 100, which is the image processing apparatus of the present embodiment. The MFP 100 shown in FIG. 2 includes a scanner unit 201 that is an image input device, a printer unit 202 that is an image output device, a control unit 204 that includes a CPU 205 and other components, and an operation unit 203 that is a user interface.

The control unit 204 is a controller that is connected to the scanner unit 201, the printer unit 202, and the operation unit 203 on one side, and to a LAN 219 and a public line (WAN) 220, which is a general telephone network, on the other, and that inputs and outputs image information and device information. The CPU 205 controls each unit included in the control unit 204. A RAM 206 is the system work memory with which the CPU 205 operates, and also serves as an image memory for temporarily storing image data.

A ROM 210 is a boot ROM and stores programs such as the system boot program. A storage unit 211 is a hard disk drive and stores system control software and image data. An operation unit I/F 207 is the interface to the operation unit (UI) 203 and outputs to the operation unit 203 the image data to be displayed on it. The operation unit I/F 207 also conveys to the CPU 205 the information that the user of the image processing apparatus enters through the operation unit 203.

A network I/F 208 connects the image processing apparatus to the LAN 219 and inputs and outputs information in packet form. A modem 209 connects the image processing apparatus to the WAN 220 and inputs and outputs information by demodulating and modulating data. These devices are arranged on a system bus 221.

An image bus I/F 212 is a bus bridge that connects the system bus 221 to an image bus 222, which transfers image data at high speed, and converts the data structure. The image bus 222 is, for example, a PCI bus or IEEE 1394. The following devices are arranged on the image bus 222. A raster image processor (RIP) 213 performs so-called rendering: it analyzes PDL (page description language) code and expands it into a bitmap image of the specified resolution.

During this rendering, attribute information is added for each pixel or each region. This is called image area determination processing. It assigns, pixel by pixel or region by region, attribute information indicating the object type, such as character (text), line, graphics, or image. For example, the RIP 213 outputs an image area signal according to the type of each object described in the PDL code, and attribute information corresponding to the attribute indicated by the signal value is stored in association with the pixels or region corresponding to the object. The image data is therefore accompanied by its associated attribute information.

A device I/F 214 connects the scanner unit 201, the image input device, to the control unit 204 via a signal line 223, and the printer unit 202, the image output device, via a signal line 224, and converts image data between synchronous and asynchronous systems. A scanner image processing unit 215 corrects, processes, and edits input image data. A printer image processing unit 216 applies correction, resolution conversion, and similar processing suited to the printer unit 202 to the print output image data to be sent to the printer unit 202. An image rotation unit 217 rotates input image data so that it is upright and outputs it. A data processing unit 218 is described later.

Next, the data processing unit 218 in FIG. 2 is described in detail with reference to FIG. 3. The data processing unit 218 consists of a region dividing unit 301, an attribute information adding unit 302, a character recognition unit 303, a link processing unit 304, a correspondence expression adding unit 305, and a format conversion unit 306. When image data 300 scanned by the scanner unit 201 is input, the data processing unit 218 processes it in the processing units 301 to 306 and generates and outputs electronic document data 310.

The region dividing unit 301 receives image data scanned by the scanner unit 201 in FIG. 2, or previously input image data (a document image) stored in the storage unit 211. To extract the regions of the objects arranged on the page (object regions), such as characters, photographs, charts, and illustrations, it performs processing such as extracting and grouping pixels in the data.

A known method may be used as the region extraction method (region division method) performed by the region dividing unit 301.
As one example, the input image is first binarized to generate a binary image, and the resolution of the binary image is reduced to create a thinned image (reduced image). For example, to create a 1/(M×N) thinned image, the binary image is divided into blocks of M×N pixels; if a block contains a black pixel, the corresponding pixel after reduction is set to black, and otherwise it is set to white.
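The thinning step can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the image is a list of rows with 1 for black and 0 for white, M is taken as the block width and N as the block height, and partial blocks at the right and bottom edges are treated as smaller blocks. The text does not specify any of these details, and the function name `thin_binary_image` is hypothetical.

```python
def thin_binary_image(image, m, n):
    """Reduce a binary image by m x n blocks (1 = black, 0 = white).

    An output pixel is black if any pixel in the corresponding block is black.
    Assumption: m is the block width, n is the block height, and edge blocks
    that do not fill a whole m x n area are evaluated on the pixels they have.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    reduced = []
    for top in range(0, height, n):
        row = []
        for left in range(0, width, m):
            block_has_black = any(
                image[y][x]
                for y in range(top, min(top + n, height))
                for x in range(left, min(left + m, width))
            )
            row.append(1 if block_has_black else 0)
        reduced.append(row)
    return reduced


page = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
thinned = thin_binary_image(page, 2, 2)  # the single black pixel marks one block
```

Because a single black pixel is enough to blacken its block, neighboring strokes tend to fuse in the reduced image, which is what makes the later grouping into characters and lines practical.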

Next, portions of the thinned image where black pixels are connected (connected black pixels) are extracted, and a rectangle circumscribing each set of connected black pixels is created. When rectangles close to the character image size (rectangles of one character) are lined up, or when a rectangle whose height or width is close to the character image size (a rectangle of connected black pixels in which several characters are joined) has similar rectangles lined up near its short side, they are likely to be character images forming one character line. In that case the rectangles are combined to obtain a rectangle representing one character line. A set of such character-line rectangles whose short sides have almost the same length and which are arranged at almost equal intervals in the column direction is likely to be body text, so they are combined to extract a body text region. Photograph regions, figure regions, and table regions are extracted as connected black pixels larger than the character images.
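The two operations described above, circumscribing connected black pixels and joining neighboring rectangles into character lines, can be sketched as follows. This is a simplified illustration and not the known method the text refers to: 8-connectivity, horizontal writing, the greedy merge rule, and the `max_gap` parameter are all assumptions, and the function names are hypothetical.

```python
from collections import deque


def connected_black_rectangles(image):
    """Return inclusive bounding boxes (left, top, right, bottom) of 8-connected black pixels."""
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if not image[y0][x0] or seen[y0][x0]:
                continue
            left = right = x0
            top = bottom = y0
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            while queue:  # breadth-first flood fill over one component
                x, y = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes


def merge_into_lines(boxes, max_gap):
    """Greedily merge boxes into horizontal character lines.

    A box joins an existing line when the two overlap vertically and the box's
    left edge is at most max_gap pixels beyond the line's right edge.
    """
    lines = []
    for box in sorted(boxes):  # left-to-right order
        for i, (l, t, r, b) in enumerate(lines):
            if box[1] <= b and box[3] >= t and box[0] - r <= max_gap:
                lines[i] = (l, min(t, box[1]), max(r, box[2]), max(b, box[3]))
                break
        else:
            lines.append(box)
    return lines
```

A fuller implementation would also compare rectangle sizes against an estimated character size and handle vertical writing, as the text describes.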

The attribute information adding unit 302 adds an attribute to each divided region obtained by the region dividing unit 301. The processing is described using the input image data of FIG. 4 as an example. For ease of explanation, FIG. 5(a) is a diagram in which the areas within FIG. 4 can be designated.
The region 506 in FIG. 5(a) contains a substantial number of characters and lines on the page and has the form of text with character counts, line counts, paragraphs, and so on; it is therefore judged comprehensively and given the “body text” character attribute. For each remaining region, it is first determined whether the region contains rectangles close to the character image size.

In particular, in a region containing character images, the rectangles of the character images appear periodically within the region, so it can be determined whether a divided region contains characters. As a result, the region 501, the regions 511 to 518 in the table 502, the region 507, the region 504 within the region 503, and the region 505 are given the “text region” attribute as regions containing characters. However, because these regions do not have the form of text with character counts, line counts, paragraphs, and so on, the “body text” attribute is not added to them.

Any other region is judged to be “noise” if it is very small. For connected black pixels with low pixel density, the contours of the white pixels inside them are traced; if the circumscribed rectangles of those white-pixel contours are arranged in an orderly manner, the region is judged to be a “table”, and if not, a “line drawing”. Remaining regions with high pixel density are treated as pictures or photographs and given the “photograph” attribute. As a result, the region 502 is judged to be a “table” and the region 503 a “photograph”.
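The decision rules in this paragraph amount to a small decision tree, sketched below. The thresholds `noise_max_side` and `low_density` are hypothetical values chosen only for illustration, since the text gives no numbers. Whether the white-pixel contour rectangles are "arranged in an orderly manner" is passed in as a precomputed flag because the text does not define that test.

```python
def classify_non_text_region(width, height, black_density, white_rects_aligned,
                             noise_max_side=4, low_density=0.3):
    """Classify a region that was not judged to contain characters.

    width, height:        region size in pixels
    black_density:        fraction of black pixels in the region (0.0 to 1.0)
    white_rects_aligned:  True if the bounding rectangles of the inner white
                          contours are arranged in an orderly manner
    noise_max_side, low_density: hypothetical thresholds, not from the source
    """
    if max(width, height) <= noise_max_side:
        return "noise"
    if black_density < low_density:
        return "table" if white_rects_aligned else "line drawing"
    return "photograph"
```

With these assumed thresholds, a large sparse region with aligned inner rectangles is labeled a table, matching the outcome described for the region 502, and a large dense region is labeled a photograph, matching the region 503.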

Furthermore, for a table region, in-table attributes are added to each text region inside it based on the positional relationships and the regularity of arrangement of those text regions. As a result, the text regions 512 and 514 are given “column label”, the text regions 511 and 515 “row label”, and the text regions 513, 516, 517, and 518 “value”.

In addition, a character region judged not to be body text that lies near (above or below) a region given the “table”, “line drawing”, or “photograph” attribute is judged to be a character region describing that region, and is given the “caption” attribute. A region given the “caption” attribute is stored in association with the region it accompanies (“table”, “line drawing”, or “photograph”) so that the accompanied region can be identified. Thus the character region 507 is stored as the caption of the table 502, and the character region 505 as the caption of the photograph region 503.
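One way to express the "near, above or below" rule is a nearest-neighbor search over bounding boxes, sketched below. The function name `attach_captions`, the `max_distance` parameter, the requirement of horizontal overlap, and the coordinates in the usage example are all assumptions for illustration. The reference numerals 502, 503, 504, 505, and 507 are reused as ids only to mirror the example in the text.

```python
def attach_captions(objects, text_regions, max_distance):
    """Associate text regions with the object directly above or below them.

    objects, text_regions: dict of id -> (left, top, right, bottom) on one page,
                           with y growing downward.
    Returns {text_id: object_id} for every text region judged to be a caption.
    """
    captions = {}
    for text_id, (tl, tt, tr, tb) in text_regions.items():
        best = None  # (distance, object_id) of the closest candidate so far
        for obj_id, (ol, ot, o_right, ob) in objects.items():
            if not (tl <= o_right and tr >= ol):
                continue  # no horizontal overlap, so not above or below
            if tt >= ob:
                distance = tt - ob      # text lies below the object
            elif tb <= ot:
                distance = ot - tb      # text lies above the object
            else:
                continue                # text is inside the object's vertical span
            if distance <= max_distance and (best is None or distance < best[0]):
                best = (distance, obj_id)
        if best is not None:
            captions[text_id] = best[1]
    return captions


# Made-up coordinates: 502 is a table, 503 a photograph, 504 is text inside 503.
objects = {502: (10, 10, 200, 100), 503: (10, 150, 200, 300)}
texts = {507: (10, 105, 200, 120), 505: (10, 305, 200, 320), 504: (50, 200, 100, 220)}
caption_of = attach_captions(objects, texts, max_distance=20)
```

The text inside the photograph is left unattached here, which is consistent with the region 504 being handled separately as text within the photograph.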

A text region that is larger than the character images of the body text and located at a position different from the columns of the body text is given the “heading” attribute. A text region that is larger than the character images of the body text and located above a column of the body text is given the “subheading” attribute. A text region whose character images are no larger than those of the body text and that lies at the bottom or top edge of the page is given the “page” attribute (or “page header” / “page footer”). A region judged to be a text region that matches none of “body text”, “heading”, “subheading”, “caption”, or “page” keeps the “text” attribute.

The character recognition unit 303 performs known character recognition processing on the regions containing character images (the “text”, “body text”, “heading”, “subheading”, and “caption” regions), stores the resulting character code string as character information, and associates it with the target region.

The information extracted in this way by the region dividing unit 301, the attribute information adding unit 302, and the character recognition unit 303, namely the position and size of each region, region attribute information, page information, and the character information (character code information) resulting from character recognition, is stored in the storage unit 211 in FIG. 2.
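A per-region record holding the items listed above might look like the sketch below. The actual layout of the stored data is not given in the text, so the class name `RegionRecord`, its field names, the `related_to` field, and the sample coordinates and strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RegionRecord:
    region_id: int                    # reference numeral reused as an id for illustration
    page: int
    left: int
    top: int
    width: int
    height: int
    attribute: str                    # e.g. "body text", "table", "photograph", "caption"
    text: Optional[str] = None        # character recognition result, if any
    related_to: Optional[int] = None  # region a caption accompanies (assumed field)


def captions_of(records: List[RegionRecord], object_id: int) -> List[RegionRecord]:
    """Return the caption records stored in association with the given object."""
    return [r for r in records
            if r.attribute == "caption" and r.related_to == object_id]


# Sample data with made-up positions and recognized text.
records = [
    RegionRecord(502, 1, 10, 10, 190, 90, "table"),
    RegionRecord(507, 1, 10, 105, 190, 15, "caption", text="Table.1 ...", related_to=502),
    RegionRecord(503, 1, 10, 150, 190, 150, "photograph"),
    RegionRecord(505, 1, 10, 305, 190, 15, "caption", text="FIG. 1 ...", related_to=503),
]
```

Keeping the association on the caption record makes it cheap to look up the caption of an object, which the later link processing needs in order to find the anchor expression attached to each object.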

FIG. 5(b) shows an example of the information saved in the storage unit 211 of FIG. 2 when the example input image data of FIG. 4 is processed. Region 504 in FIG. 5 is a character-image region inside a photograph or figure, so the attribute "within photograph 503" is added to it, as shown at 504 in FIG. 5(b). For the inside of table 502, FIG. 5(c) shows the information saved in the storage unit 211 of FIG. 2 as the internal structure information of the table.

The link processing unit 304 generates the information needed to create links between the caption-accompanied objects detected by the attribute information adding unit 302 (for example, "table", "line drawing", and "photograph") and the explanatory expressions in the body text, and saves it in the storage unit 211 of FIG. 2. Details of the link processing unit 304 are described later.

Based on the link information created by the link processing unit 304, the correspondence expression adding unit 305 adds the expressions to be shown, when a link is operated, for the body-text portion and the figure or table portion associated by that link. It decides on a presentation method that emphasizes the correspondence between body text that explains something using a figure or table and the region or portion within that figure or table to which the text draws attention. It then generates correspondence expression information, associated with the link information, as the content to be displayed when the link is selected or operated, and saves it in the storage unit 211 of FIG. 2. Details of the correspondence expression adding unit 305 are described later.

The format conversion unit 306 uses the input image data 300 and the information obtained from the region dividing unit 301, the attribute information adding unit 302, the character recognition unit 303, and the link processing unit 304 (for example, page information, region positions and sizes, attributes, character information, and metadata), and converts them into a predetermined electronic document format (for example, PDF, SVG, XPS, or Office Open XML). The electronic document generated by format conversion contains page display information expressed with graphics and the like (display images and so on) and content information expressed as semantic descriptions such as characters (metadata and so on).

The format conversion unit 306 performs two main kinds of processing. One is to apply filter processing such as flattening, smoothing, edge enhancement, color quantization, and binarization to image regions so that the image data (for example, the image of the portion corresponding to a region given the "line drawing" attribute) can be stored in the predetermined electronic document format.

In practice, the image data is converted into graphics data in vector path description (vector data) or graphics data in bitmap description (for example, JPEG data). Known vectorization techniques can be used for the conversion to vector data. For "character" regions, binary image cut-out processing and pixel erasure from the image data 300 are performed, and graphics data for character drawing is created using the character codes obtained by character recognition.

In particular, for "character" regions with special fonts, character decorations, or large characters, where the fidelity of the character image matters, graphics data in vector path description or bitmap description is created in the same way as for image data. Graphics descriptions (vector path descriptions) such as the frames displayed when a link is used or operated, or when a search result is identified or highlighted during an object search, are also generated. An electronic document in the predetermined electronic document format is then created from these together with the region information (position, size, attribute) and the in-region character information held in the storage unit 211 of FIG. 2.

FIG. 6 shows an example of the electronic document data 310 generated in this way. FIG. 6 is an example described in SVG (Scalable Vector Graphics) format, based on data such as that of FIG. 5(a) and FIG. 5(b) saved in the storage unit 211 of FIG. 2 when the example image data 400 of FIG. 4 is processed.

In the electronic document data 600 of FIG. 6, descriptions 601 to 606 are the graphics descriptions for regions 501 to 506 of FIG. 5(a), respectively. Here, 601, 604, 605, and 606 are examples of character drawing descriptions using character codes.

Description 602 corresponds to table 502 and describes the table. It contains examples of character drawing descriptions using the character codes of text regions 511 to 518 (in practice, those corresponding to regions other than 511 to 513 are omitted). Description 603 is an example of a description that pastes in the cut-out photograph image. Because these are descriptions within a page, 611 and 612 allow them to be described separately for each page. Description 613 contains the content of another page.

In the electronic document of this embodiment, operation functions and the like can be added using a simple functional programming language; 621 and 622 are examples of such descriptions. Description 621 is a part that provides a link function such as the operation function 1521 of FIG. 15, and specifies the display and behavior when the element whose identification information is "Link1FromSentence1ToTable1" is selected, for example by a user operation.

Here, it indicates that a program module (function) named "DisplayFromSentence1ToTable1" is called to handle the selection. In 622, the processing performed inside the function "DisplayFromSentence1ToTable1" is written in the functional programming language mentioned above (its body is abbreviated). In this way, the link function is realized by executing programs written in this functional programming language when the electronic document is used.

Although SVG has been used as the example here, the output format is not limited to SVG; the data may instead be converted to PDF, XPS, Office Open XML, or other PDL-type data formats.

FIG. 7(a) is a block diagram showing a configuration example of the link processing unit 304 of FIG. 3. The functions of units 701 to 705 are described below. The processing performed here uses the image data 710, together with the region information 711 and the character information 712 concerning the text content and the figures and tables shown in FIG. 5(a), all of which were created by the region dividing unit 301 through the character recognition unit 303 of FIG. 3 and stored in the storage unit 211 of FIG. 2.

For the input image data, the link information addition target selection processing unit 701 selects caption-accompanied objects and their text content as the targets for link information generation.

The anchor expression extraction processing unit 702 analyzes the character information of the caption region associated with each caption-accompanied object selected by the link information addition target selection processing unit 701, and extracts the anchor expression.

Here, the character information of the caption region is analyzed to look for an anchor expression (for example, "FIG. 1" or "Table. 1"). If one is found, that portion is extracted as the anchor expression and the remainder as the caption expression. By incorporating character-code characteristics, dictionaries, and the like, the unit also has a function for excluding character strings that are not meaningful (meaningless symbol strings and so on).

This is to cope with decorations and dividing lines that appear at the boundaries of text portions of the document, and with character recognition errors in which an image is interpreted as characters. In addition, by holding multilingual character string patterns for figure numbers and the like, together with the recognition-error patterns that correspond to them, the extraction accuracy of anchor expressions can be improved and their characters can be corrected. The same applies to caption expressions: analysis by natural language processing, correction of recognition errors, and so on are possible, and a function can be provided that corrects and removes symbols and character decorations appearing at the boundary with the anchor expression or at the beginning or end.

The in-text anchor expression search processing unit 703 searches the body text of the document for the anchor expressions (for example, "FIG. 1" or "Table. 1") extracted by the anchor expression extraction processing unit 702, and detects them as the in-text anchor expressions corresponding to the objects. As a result, explanatory expressions in the body text that contain an anchor expression and explain the object are detected as candidate explanatory expressions for that object.

Here, a search index can be created to speed up the search (known index creation and search techniques can be used for building the index and for the fast search that uses it). The search can also be sped up by searching for the specific character strings of multiple anchor expressions in a single batch, which is more effective. In addition, for explanatory expressions in the body text as well, the unit holds multilingual character string patterns for figure numbers and the like, character patterns for notational variants of figure numbers, and the recognition-error patterns that correspond to them. Performing a fuzzy search with these makes it possible to improve search accuracy and to provide a correction function.
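As a rough illustration of the batch search mentioned above, the following sketch compiles several anchor expressions into one regular expression and scans each body paragraph once. The function name `find_anchors` and the tuple layout of the results are assumptions for this example; the fuzzy matching against notational variants and recognition-error patterns is not covered.

```python
import re
from typing import List, Tuple


def find_anchors(body_paragraphs: List[str],
                 anchors: List[str]) -> List[Tuple[str, int, int]]:
    """Search all anchor expressions in one pass per paragraph.

    Returns (anchor, paragraph_index, character_offset) for every hit.
    """
    # Longest anchors first so that "FIG. 12" is tried before "FIG. 1".
    ordered = sorted(set(anchors), key=len, reverse=True)
    alternation = "|".join(re.escape(a) for a in ordered)
    # The lookahead rejects "FIG. 1" when it is only a prefix of "FIG. 10".
    pattern = re.compile(rf"(?:{alternation})(?!\d)")

    hits = []
    for idx, text in enumerate(body_paragraphs):
        for m in pattern.finditer(text):
            hits.append((m.group(0), idx, m.start()))
    return hits
```

A single compiled alternation avoids rescanning the body text once per anchor, which is the effect the batch search is meant to achieve.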

The link information generation processing unit 704 creates the information needed to generate links among the caption-accompanied objects selected by the link information addition target selection processing unit 701, the in-text anchor expressions found and extracted by the in-text anchor expression search processing unit 703, and the figures or photographs. Link information requires the positions of the links themselves and the relationships between them. In the present invention, the character position information used to obtain the position of an anchor expression is not stored as is. Instead, characters are grouped in units such as paragraphs, the circumscribed rectangle of the characters in each unit is computed, and only that position information is stored.
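The circumscribed rectangle described above can be computed as in the following sketch. Boxes are assumed to be `(x, y, width, height)` tuples with the origin at the top left; that representation and the function name are assumptions for illustration.

```python
from typing import Iterable, Tuple

Box = Tuple[float, float, float, float]   # (x, y, width, height)


def circumscribed_rect(char_boxes: Iterable[Box]) -> Box:
    """Return the smallest rectangle enclosing every character box."""
    boxes = list(char_boxes)
    if not boxes:
        raise ValueError("at least one character box is required")
    left = min(x for x, _, _, _ in boxes)
    top = min(y for _, y, _, _ in boxes)
    right = max(x + w for x, _, w, _ in boxes)
    bottom = max(y + h for _, y, _, h in boxes)
    return (left, top, right - left, bottom - top)
```

Keeping one rectangle per paragraph instead of one per character reduces the amount of position data stored with the link information.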

Together with the other information, the link processing control unit 705 creates a description, in a programming language, of the processing that actually generates the links, and stores it in the electronic document. When the electronic document is used, this processing dynamically generates the link function based on that information, and the link function becomes available to the user. Details are described later. What is generated here is the information for creating links between figures or photographs and the anchor expressions written in the body text.

The link processing control unit 705 has the link information generation processing unit 704 generate the information needed for link generation, and stores it. At the time of storage, the processing written in the programming language for link generation and the information needed for link generation, such as graphics, are generated according to the description format of the electronic document and the specification of the programming language, and the results are stored. Where necessary, they are generated and stored based on predetermined link trigger and link action settings.

The link information is generated in this way, and the result is stored as link information 713 in the storage unit 211 of FIG. 2. The link processing control unit 705 also provides the function of coordinating and controlling units 701 to 704.

The operation of processing units 701 to 705 of FIG. 7(a), which make up the link processing unit 304, is explained again in the description of the processing procedure below.

FIG. 7(b) is a block diagram showing a configuration example of the correspondence expression adding unit 305 of FIG. 3. The functions of units 721 to 725 are described below. The processing performed here uses the image data 710, the region information 711 and character information 712 concerning the text and the figures and tables shown in FIG. 5(a), and the link information 713, all of which were created by units 301 to 304 of FIG. 3 and stored in the storage unit 211 of FIG. 2.

The in-text anchor neighborhood text extraction unit 721 retrieves the link information 713 and, from the links between anchor expressions it contains, obtains the location of each anchor expression in the body text and the corresponding caption of the figure or table. It then extracts, from the body text containing the anchor expression, the text information in the neighborhood of that anchor expression.

The text information extracted here is roughly a sentence or a paragraph containing the anchor expression in the body text. A suitable text length (character count) is decided from the length of that text and from how many references to other sentences, such as pronouns, it contains, and extraction is performed using that length as a guide. Conversely, when a sentence is too long, the nearby text may simply be cut out according to that guide. In this way, the unit provides the function of extracting the text information near an anchor expression from the body text that contains it. The text extracted here is called the in-text anchor neighborhood text.

The in-text and in-figure text search unit 722 searches for words and character strings in the in-text anchor neighborhood text extracted by the in-text anchor neighborhood text extraction unit 721 that also appear, as the same word string or character string, in the text inside the figure or table associated by the link. Possible search methods include comparison in units of word strings, and searching with words obtained through morphological analysis in natural language processing.

Because the text inside a figure or table is usually an item label or the like, it can readily be expected to consist of words and short phrases rather than sentences and paragraphs. It may therefore be more effective to use the figure or table text as the key when searching and comparing. Any method may be used in embodiments of the present invention. In any case, a function is provided here that extracts the text expressions appearing identically in the text inside the figure or table associated by the link. As a result, the text expressions that are identical between the words or character strings of the in-text anchor neighborhood text and the text inside the associated figure or table are extracted. These are called correspondence expressions.
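One way to picture the extraction of correspondence expressions is the sketch below, which uses each text item inside the figure or table as the search key, as suggested above. It relies on plain substring matching, which is an assumption for brevity; the morphological-analysis variant is not shown, and the function name and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Tuple


def find_correspondence_expressions(
    neighborhood_text: str,
    figure_texts: Dict[int, str],
) -> List[Tuple[str, int, int]]:
    """Find figure/table text items that also occur in the neighborhood text.

    figure_texts maps an object ID inside the figure or table to its text.
    Returns (expression, object_id, offset_in_neighborhood_text) per match.
    """
    found = []
    for obj_id, text in figure_texts.items():
        key = text.strip()
        if not key:
            continue                      # skip empty cells or labels
        offset = neighborhood_text.find(key)
        if offset >= 0:
            found.append((key, obj_id, offset))
    return found
```

Because the keys are short labels, a substring test per key is usually cheaper than tokenizing the longer body text first.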

Based on the correspondence expressions obtained by the in-text and in-figure text search unit 722, the correspondence expression content determination unit 723 decides what kind of emphasis should be applied to them. One figure or table may have several links, and one piece of link information may have several correspondence expressions. Those correspondence expressions may include text expressions with the same content. Based on these relationships, the emphasis is selected so that, for each link, the user can clearly see both how the links differ and how the correspondence expressions in the body text and in the figure or table differ.

Basically, when a link is displayed or used, the same emphasis pattern is applied to the correspondence expressions paired by that link. When several links to the same figure or table share an identical correspondence expression, the same emphasis is used for it wherever possible. In addition, properties such as whether a correspondence expression is a row or column label (name) in a table, or how close its occurrences are within a figure, are treated as attributes of the correspondence expression, and the same emphasis pattern is used accordingly.
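The first two rules above (one pattern per correspondence expression, reused across links to the same figure or table) can be sketched as follows. The pattern names in `PATTERNS` and the input layout are assumptions for illustration, and the attribute-based grouping by row or column label is not modeled.

```python
from itertools import cycle
from typing import Dict, List

# Hypothetical pool of emphasis patterns; reused cyclically if exhausted.
PATTERNS = ["underline", "box", "bold", "color:red", "color:blue"]


def assign_emphasis(links: Dict[str, List[str]]) -> Dict[str, Dict[str, str]]:
    """Assign an emphasis pattern to each correspondence expression.

    links maps a link ID to the correspondence expressions of that link;
    all links are assumed to point at the same figure or table.
    """
    palette = cycle(PATTERNS)
    by_expression: Dict[str, str] = {}
    result: Dict[str, Dict[str, str]] = {}
    for link_id, expressions in links.items():
        result[link_id] = {}
        for expr in expressions:
            # An expression already seen in another link keeps its pattern.
            if expr not in by_expression:
                by_expression[expr] = next(palette)
            result[link_id][expr] = by_expression[expr]
    return result
```

The same mapping is meant to be applied to the occurrence in the body text and to the occurrence in the figure or table, so that both ends of a link look alike.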

The emphasis patterns referred to here range from underlining or boxing characters, changing their color or font, and making them bold, to expressions with motion such as making characters blink, shake, or change color over time. Combinations of these are naturally included. Even within a single emphasis pattern, each individual emphasis has various settings and values that can be varied, such as the weight of the underline or box, the thickness of the characters, and the color scheme. The emphasis for the correspondence expressions of each link is selected in this way. This provides a function that, when a link is displayed or used, decides on an emphasis that readily shows the user where the corresponding expressions appear in the body text and in the figure or table, and how they correspond.

The correspondence expression addition processing unit 724 provides the function of generating, for each piece of link information, the emphasis information for the correspondence expressions decided by the correspondence expression content determination unit 723, and storing it. At the time of storage, the processing written in the programming language for link generation and the information needed for link generation, such as graphics, are generated according to the description format of the electronic document and the specification of the programming language, and the results are stored. Where necessary, they are generated and stored based on predetermined link trigger and link action settings.

The correspondence expression information is generated in this way, and the result is stored as correspondence expression information 714 in the storage unit 211 of FIG. 2. The correspondence expression addition control unit 725 provides the function of coordinating and controlling units 721 to 724.

The operation of units 721 to 725 of FIG. 7(b), which make up the correspondence expression adding unit 305, is explained again in the description of the processing procedure below.

Ultimately, what is generated here is output to the format conversion unit 306 of FIG. 3 and stored in the electronic document data 310.

Next, an overview of the entire processing executed by the document processing system of the embodiment is described using the flowchart of FIG. 8. The flowcharts shown in FIG. 8 and FIG. 9 are executed by the data processing unit 218 of FIG. 2 (processing units 301 to 306 of FIG. 3). In this embodiment, the CPU 205 of FIG. 2 functions as the data processing unit 218 (processing units 301 to 306 of FIG. 3) by reading and executing a computer program stored in the storage unit 211 (a computer-readable storage medium). However, the configuration is not limited to this; for example, the data processing unit 218 of FIG. 2 (processing units 301 to 306 of FIG. 3) may be implemented in hardware such as electronic circuits.

FIG. 8 is a flowchart of the processing that converts multi-page image data input by the MFP 100 of FIG. 1 into electronic document data consisting of multiple pages. As the multi-page image data, the page images of FIG. 4 are assumed to be input, for example. Each step of the flowchart of FIG. 8 is described below.

In S801, the link processing control unit 705 of FIG. 7(a) performs initialization for the subsequent processing and the preparation needed for the format conversion unit 306 of FIG. 3 to create the electronic document.

In S802, because the page images of FIG. 4 are stored in the storage unit 211 of FIG. 2, the page to be processed is selected from them in order, starting from the first page. By doing this and forming a loop with the conditional branch in S807, the processing from S802 to S806 can be applied to every page. Accordingly, the first page is selected the first time this step is executed, and the next page is selected each time it is executed thereafter. The image data of the selected page (300 in FIG. 3, 710 in FIG. 7) becomes the target of the subsequent processing.

In S803, the region dividing unit 301 performs region division, extracting regions from the one page of image data selected in S802. For example, the regions shown in FIG. 5(a) are extracted from the image data 400 of FIG. 4. Each region is treated as an object, one of the constituent elements of the document, and an object ID is used to identify it.

In S804, the attribute information adding unit 302 adds an attribute to each region divided in S803. In the example of FIG. 5(a), the "photograph" attribute is added to region 503 and the "caption" attribute to region 505, and so on.

Furthermore, information that the region accompanied by caption 505 is region 503 is also added. FIG. 5(b) shows part of the result. For each object ID in FIG. 5(a), the values of "attribute" and "object ID accompanied by the caption" in FIG. 5(b) are the result of this processing.

Furthermore, for figures and tables, the elements inside them are also regarded as objects, and information such as attributes is added to them. FIG. 5(c) shows part of the result for table 502. It shows information such as the attributes of text regions 512 to 518 inside the table (514 to 518 are similar and are therefore omitted). Here, "row position" and "column position" values are added as positions within the table, and "row label", "column label", and "value" are added as attributes.

The coordinate X, coordinate Y, width W, and height H indicate the position and size of the region of each object. They are used in S812 when computer programs written in the programming language, graphics objects, and the like are created from the link information and the correspondence expression information and stored in the electronic document data 310. Specifically, they are used to decide the positions and sizes of the display components and graphics objects created by those computer programs.

In S805, the character recognition unit 303 executes character recognition processing on the regions given a text-type attribute (body, caption, heading, subheading, and so on) in S804, and holds the result as character information associated with the target region. The result is recorded as the text information in FIG. 5(b).

In S806, the information of FIG. 5(b) and FIG. 5(c) created in S803 to S805, that is, by the region dividing unit 301, the attribute information adding unit 302, and the character recognition unit 303, is stored in the storage unit 211 of FIG. 2 as the information for one page.

In S807, it is checked whether the processing from S803 to S806 has been performed on all pages. If pages remain, the process returns to S802 to select and process a remaining page. When all pages are finished, the process proceeds to S808.
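The per-page loop of S802 to S807 can be summarized with the sketch below. The three callables stand in for the region dividing unit 301, the attribute information adding unit 302, and the character recognition unit 303; their signatures, and the list that stands in for the storage unit 211, are assumptions for illustration only.

```python
from typing import Any, Callable, Dict, List


def convert_pages(
    pages: List[Any],
    divide: Callable[[Any], list],
    add_attributes: Callable[[list], list],
    recognize: Callable[[Any, list], list],
) -> List[Dict[str, Any]]:
    """Run S803 to S806 on every page, in page order."""
    store: List[Dict[str, Any]] = []            # stands in for storage unit 211
    for page_no, image in enumerate(pages, start=1):   # S802, looped via S807
        regions = divide(image)                 # S803: region division
        regions = add_attributes(regions)       # S804: attribute addition
        regions = recognize(image, regions)     # S805: character recognition
        store.append({"page": page_no, "regions": regions})   # S806: store page
    return store                                # S808 onward uses all pages
```

Link extraction starts only after the loop ends because anchor expressions found in captions may be referenced from body text on any page.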

In S808, in order to extract and accumulate the information needed for link generation, the link processing unit 304 first has the link information addition target selection processing unit 701 of FIG. 7 select a caption-associated object and its text information. Next, the anchor expression extraction processing unit 702 extracts an anchor expression from them. This is done for every caption-associated object on every page. In practice, some objects have no anchor expression, so only those for which an anchor expression was obtained are processed in the subsequent steps.

An anchor expression is character information (a character string) for identifying, in the original document, the object to which the caption is attached, and a caption expression is character information (a character string) for describing the object. A caption attached to an object may contain only an anchor expression, only a caption expression, both, or neither.

For example, an anchor expression is often a combination of a specific character string, such as “図” (figure) or “Fig”, with a number or symbol. Therefore, an anchor character string dictionary in which such specific character strings are registered may be prepared in advance, and the anchor part (anchor character string + number or symbol) may be identified by comparing the caption expression with the dictionary. The character strings in the caption area other than the anchor expression may then be judged to be the metadata expression.

For example, in the caption “図１ＡＡＡ” (“FIG. 1 AAA”), “図１” is the anchor character string and “ＡＡＡ” is the caption character string. In some cases the caption expression has extremely few characters or does not look like a meaningful character string (for example, a symbol string such as “― ― ― ― ―”). In such cases, a mark such as a document separator may have been recognized as a character string, or something that is not text may have been recognized as characters, so the area is judged not to be a caption and no anchor character string is extracted.
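The dictionary-based split of a caption into an anchor part and the remaining text can be sketched as below. This is only an illustration under stated assumptions: the dictionary entries beyond “図” and “Fig”, the regular expression, and the “meaningless string” heuristic (shorter than two characters, or no letter or digit at all) are not taken from the source.

```python
import re

# Hypothetical anchor-string dictionary; the source names only "図" and "Fig".
ANCHOR_WORDS = ["図", "表", "Fig", "Fig.", "Table", "Table."]

# Longest entries first so that "Fig." is preferred over "Fig".
_WORDS = "|".join(re.escape(w) for w in sorted(ANCHOR_WORDS, key=len, reverse=True))
# Anchor part = dictionary word + number; \d also matches full-width digits.
ANCHOR_RE = re.compile(rf"^\s*((?:{_WORDS})\s*\d+)\s*(.*)$", re.DOTALL)


def looks_meaningless(text: str) -> bool:
    """Assumed heuristic for strings such as a row of separator marks."""
    stripped = text.strip()
    return len(stripped) < 2 or not any(ch.isalnum() for ch in stripped)


def split_caption(caption: str):
    """Return (anchor expression, remaining caption text); either may be None."""
    if looks_meaningless(caption):
        return None, None              # judged "not a caption": nothing extracted
    match = ANCHOR_RE.match(caption)
    if match is None:
        return None, caption.strip()   # caption text only, no anchor
    rest = match.group(2).strip()
    return match.group(1), (rest or None)


print(split_caption("図１ＡＡＡ"))
print(split_caption("― ― ― ― ―"))
```

A real implementation would also need to handle symbols such as “(a)” after the number, which this sketch leaves out.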

Furthermore, because characters in an anchor expression may be wrong owing to character recognition errors, a pattern dictionary for correcting character recognition errors is held and used for correction. One example is the digit “1” and the lowercase letter “l” (which humans also confuse easily).
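One simple way to apply such a correction dictionary is to rewrite look-alike characters only in the number part of an anchor expression. The sketch below assumes this approach; the table `OCR_CONFUSIONS` (apart from the “l” to “1” pair mentioned above) and the function name are hypothetical.

```python
# Hypothetical confusion table; the source gives only "1" vs. lowercase "l".
OCR_CONFUSIONS = {"l": "1", "I": "1", "O": "0"}


def correct_anchor_number(anchor_word: str, anchor_expression: str) -> str:
    """Replace digit look-alikes in the part that follows the anchor word.

    Intended for a short anchor expression such as "Fig.l", not for a whole
    caption, because ordinary letters after the number would be rewritten too.
    """
    if not anchor_expression.startswith(anchor_word):
        return anchor_expression
    tail = anchor_expression[len(anchor_word):]
    fixed = "".join(OCR_CONFUSIONS.get(ch, ch) for ch in tail)
    return anchor_word + fixed


print(correct_anchor_number("Fig.", "Fig.l"))
```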

In addition, for an anchor expression, not only the expression itself but also multilingual character string patterns for figure numbers and the like, character patterns for their notational variants, and character recognition misrecognition patterns for them may be held. One conceivable method is to build an anchor expression dictionary for fuzzy search from these and to search with it, but any method may be used in the embodiment of the present invention.

In S809, the body text is searched for the anchor expression. Specifically, in order for the link processing unit 304 to extract and accumulate the information needed for link generation, the in-text anchor expression search processing unit 703 searches the body of the document for the anchor expression extracted by the anchor expression extraction processing unit 702. The hits are treated as the anchor expressions in the body text that correspond to the object. The same anchor expression may be found at several places in the body text, for example when one figure or table is referred to many times.
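The search in S809 can return several hits for one anchor expression. A minimal sketch, assuming plain string matching on the recognized body text (the function name and the sample sentence are illustrative, and the guard against longer numbers is an added assumption):

```python
import re


def find_anchor_occurrences(body_text: str, anchor: str):
    """Return the start offset of every occurrence of `anchor` in `body_text`."""
    # The negative lookahead keeps "Table.1" from matching inside "Table.10".
    pattern = re.compile(re.escape(anchor) + r"(?!\d)")
    return [m.start() for m in pattern.finditer(body_text)]


body = "Table.1 shows sales. See Table.10 and again Table.1."
hits = find_anchor_occurrences(body, "Table.1")
print(hits)
```

Each hit would become a separate referring location for the same table, which is why one object can end up with several links.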

Ｓ８１０において、リンク処理部３０４は、リンク生成処理を行う。具体的には、リンク情報生成処理部７０４は、リンク情報付与対象選択処理部７０１で選択されたキャプション付随オブジェクトや、本文中のアンカー表現と図や写真の間にリンクを生成するために必要な情報を作成する。そして、リンク処理制御部７０５は、リンク情報生成処理部７０４で生成したリンク生成に必要な情報を、図２の記憶部２１１に格納する。 In step S810, the link processing unit 304 performs link generation processing. Specifically, the link information generation processing unit 704 is necessary for generating a link between the caption-associated object selected by the link information addition target selection processing unit 701, the anchor expression in the text, and the figure or photograph. Create information. Then, the link processing control unit 705 stores information necessary for link generation generated by the link information generation processing unit 704 in the storage unit 211 of FIG.

In S811, the correspondence expression adding unit 305 generates correspondence expression information and stores it in the storage unit 211 of FIG. 2. This processing is described in detail later.
In S812, the format conversion unit 306 generates graphic data and the like based on the image data 300 and the information stored in the storage unit 211 of FIG. 2, and converts them into the electronic document data 310. At this time, the link creation processing and the correspondence expression processing, which are written in a programming language based on the link information and its correspondence expression information, are passed to the format conversion unit 306 together with that information, and the format conversion unit 306 stores them in the electronic document. This concludes the description of FIG. 8.

The processing described above is merely processing for a general document image; its order and details may be anything as long as it is performed to realize the object link function of the present invention. For example, character recognition with error correction may be performed by applying natural language processing or dictionaries, or information may be extracted and used by means of specific expressions.

Each analysis method may also be optimized according to the document style, the classification of the document content, and so on. Information such as the content of an object may be extracted using a technique such as image recognition. The input document image may be described in a page description language or the like. Various variations are thus conceivable, and any of them may be used as long as it uses the body text and the caption expressions and anchor expressions in objects to realize the object link function of the embodiment of the present invention.

Next, the details of S811 in FIG. 8 are described with reference to FIG. 9. This processing is performed by the parts of the correspondence expression adding unit 305; it generates correspondence expression information and stores it in the storage unit 211 of FIG. 2. The description uses the case of processing the document of FIG. 4 as an example.

In S901, the correspondence expression addition control unit 725 of FIG. 7(b) performs initialization for the subsequent processing and makes the information stored in the storage unit 211 of FIG. 2, in particular the link information 713 of FIG. 7(b), available.

In S902, the correspondence expression addition control unit 725 selects and takes out, one at a time and in order, the link information made available in S901. In the present embodiment, the conditional branch in S912 forms a loop, so that the processing from S902 to S911 can be applied to each piece of link information. Thus, the first time this step is executed the first piece of link information is selected; each subsequent time the next piece is selected, and the selected link information becomes the target of the subsequent processing.

Here, assume first that the object with object ID 1001 is selected from the link information of FIG. 10(a). FIG. 10(a) was created from FIG. 5(b) and FIG. 5(c), which were created for the document of FIG. 4, and 1001 was created for the body text 506 of FIG. 5(a).

Similarly, 1002 was created for 502, which is the table in FIG. 5(a). That they have new object IDs means that they were created as new objects that are displayed and function in the electronic document in order to realize the display and function of this link. Also, as can be seen from the fact that each refers to the other by its link target object ID, they were associated with each other via the anchor expression “Table.1” in the processing of FIG. 8.

ただし、１００１は表５０２から本文テキスト５０６へのリンクであり、１００２は本文テキスト５０６から表５０２へのリンクである。ここでは１対１のリンクになっているが、双方向のリンクではなく、片方向の２つのリンクになっている。これは、実際には、図表は本文中の複数の部分（文や段落など）から参照されることがあるので、１対多のリンクになることもあるためであり、また、個々のリンクに独自の機能や表現を持たせることを可能とするためでもある。 However, 1001 is a link from the table 502 to the body text 506, and 1002 is a link from the body text 506 to the table 502. Here, the link is a one-to-one link, but it is not a bidirectional link but two links in one direction. This is because, in practice, a chart may be referenced from multiple parts (sentences, paragraphs, etc.) in the text, so it may be a one-to-many link. This is also to make it possible to have unique functions and expressions.
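Representing each direction as its own record, as described above, makes the one-to-many case straightforward. The sketch below is an assumption about how such records might look; the class `Link`, the function `build_one_way_links`, the second referring region 507, and the way IDs are numbered from 1001 are illustrative and do not reproduce FIG. 10(a).

```python
from dataclasses import dataclass
from itertools import count
from typing import List


@dataclass(frozen=True)
class Link:
    link_id: int         # ID of the link object itself
    source_region: int   # region the link starts from
    target_region: int   # region the link leads to
    anchor: str          # anchor expression that tied the two together


def build_one_way_links(object_region: int, referring_regions: List[int],
                        anchor: str, first_id: int = 1001) -> List[Link]:
    """Create two one-way links (one per direction) for every referring region."""
    ids = count(first_id)
    links = []
    for region in referring_regions:
        links.append(Link(next(ids), region, object_region, anchor))  # text -> object
        links.append(Link(next(ids), object_region, region, anchor))  # object -> text
    return links


# Table 502 referenced from body text 506 and from a hypothetical second region 507.
links = build_one_way_links(502, [506, 507], "Table.1")
for link in links:
    print(link)
```

Because every direction is a separate record, each link can carry its own display settings without affecting the others.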

In S903, the in-text anchor vicinity text extraction unit 721 takes out the captions, headings, and labels of the corresponding figure or table from the link information selected in S902, based on the link information between the anchor expressions it contains. In the example of FIG. 4 and FIG. 10(a), 502, on which 1002 is based, is found from FIG. 5(b) to be a table, and its row labels and column labels, such as “Osaka branch” and “2010”, are obtained from FIG. 5(c).

In S904, the in-text anchor vicinity text extraction unit 721 takes out the text in the vicinity of the anchor expression in the body text from the link information selected in S902, based on the link information between the anchor expressions it contains. In the example of FIG. 4 and FIG. 10(a), 506, on which 1001 is based, is found from FIG. 5(b) to be body text, and the text near the anchor expression found in the processing of FIG. 8 is obtained, yielding the expression “Annual sales of Company A ...”.

In S905, the in-text text / in-chart text search unit 722 searches the text near the anchor expression for the text expressions of the figure or table (captions, headings, labels, and so on) obtained by the in-text anchor vicinity text extraction unit 721. In the example of FIG. 4, this shows that “Osaka branch” and “2010” are included, that is, that 511 corresponds to 523 and 512 corresponds to 522 in FIG. 5(b). These correspondences are the targets to which correspondence expressions are added for this link information.
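Steps S904 and S905 amount to cutting out a window of text around the anchor and checking which chart labels occur in it. The following is a rough sketch under that reading; the fixed character radius, the function names, and the sample sentence are assumptions (the source does not say how the “vicinity” is delimited).

```python
from typing import List


def nearby_text(body: str, anchor: str, radius: int = 80) -> str:
    """Text within `radius` characters of the first occurrence of `anchor`."""
    pos = body.find(anchor)
    if pos < 0:
        return ""
    return body[max(0, pos - radius): pos + len(anchor) + radius]


def find_label_matches(labels: List[str], vicinity: str) -> List[str]:
    """Chart labels (captions, headings, row/column labels) found in the vicinity."""
    return [label for label in labels if label in vicinity]


# Illustrative sentence loosely modelled on the example document.
body = ("As shown in Table.1, annual sales of Company A "
        "at the Osaka branch in 2010 were strong.")
vicinity = nearby_text(body, "Table.1")
matches = find_label_matches(["Tokyo branch", "Osaka branch", "2010"], vicinity)
print(matches)
```

An empty result corresponds to the branch in which no correspondence is obtained and only the overall link display is created.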

In S906, it is determined whether the in-text text / in-chart text search unit 722 obtained a correspondence, and processing branches according to the result. If a correspondence was obtained, the process proceeds to S907; otherwise it proceeds to S910.

In S907, the correspondence expression content determination unit 723 performs emphasis expression method determination processing, which decides what kind of emphasis expression should be applied to the correspondences obtained by the in-text text / in-chart text search unit 722. In the example of FIG. 4, the following is decided.

FIG. 10(b) shows the result. Here, a character string area frame is set on the character string portion of each of the corresponding text expressions obtained in S905, “Osaka branch” and “2010”. The character string area frames that belong to the same 1001 are made distinguishable from one another, for example by giving them different colors, while frames that correspond to each other are shown to correspond, for example by giving them the same color. As a result, 1014 is created for “2010” and 1015 for “Osaka branch”. Similarly, for 1002, an operation button 1021, an area frame display 1022, and character string area frame displays 1024 and 1025 are created for the area 502 of FIG. 5(a).

In S908, the correspondence expression content determination unit 723 adds to the targets any item that the matched texts point to through relationships such as their arrangement in the figure or table. Positional and layout information within the figure or table is used for this. In the example of FIG. 4, the table information of FIG. 5(c) shows that “Osaka branch” and “2010” are a row and a column, respectively, and that they designate 513 of FIG. 5(a), so a character string underline 1023 is created for this “5 million”.
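Finding the cell designated by a matched row label and column label is a lookup on the row and column positions. A minimal sketch follows; the dictionary layout, the function `value_at`, the positions, and every table entry other than “Osaka branch”, “2010”, and “5 million” are invented for illustration.

```python
from typing import Dict, Optional, Tuple

# (row position, column position) -> (attribute, text)
Cells = Dict[Tuple[int, int], Tuple[str, str]]

cells: Cells = {
    (0, 1): ("column label", "2010"),
    (0, 2): ("column label", "2011"),          # invented entry
    (1, 0): ("row label", "Tokyo branch"),     # invented entry
    (2, 0): ("row label", "Osaka branch"),
    (1, 1): ("value", "8 million"),            # invented entry
    (2, 1): ("value", "5 million"),
}


def value_at(table: Cells, row_label: str, column_label: str) -> Optional[str]:
    """Text of the value cell where the labelled row and column intersect."""
    row = next((r for (r, _), (attr, text) in table.items()
                if attr == "row label" and text == row_label), None)
    col = next((c for (_, c), (attr, text) in table.items()
                if attr == "column label" and text == column_label), None)
    if row is None or col is None:
        return None
    entry = table.get((row, col))
    return entry[1] if entry is not None and entry[0] == "value" else None


print(value_at(cells, "Osaka branch", "2010"))
```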

In S909, the correspondence expression content determination unit 723 creates and adds a counterpart for each item added in S908. In the example of FIG. 4, a counterpart for 1023 is needed. However, the search performed in S905 by the in-text text / in-chart text search unit 722 has shown that the body text contains no text expression “5 million”. The text near the anchor expression is therefore selected as the counterpart, and a character string underline 1013 is created.

In S910, the correspondence expression content determination unit 723 decides the overall correspondence expressions on the chart side and the body text side. As an emphasis pattern for 1001, an operation button 1011 is added to the portion 521 of the anchor expression “Table.1”, as shown in FIG. 10(b). This is always done when a link function is added; operating the button moves the displayed part of the document to the destination of the link.

次に、本文中のアンカー表現近傍全体に対して、図１０（ｂ）に示すように、領域枠表示１０１２が追加されている。この領域枠表示は、対応する表５０２に対する枠表示と同一の強調表現になるようになっていて、例えば同一の色などに設定される。また、文書内の内容と区別できるように、異なる色や点滅するなどの動きで、それらと区別できるように設定される。 Next, an area frame display 1012 is added to the entire vicinity of the anchor expression in the text as shown in FIG. This area frame display has the same emphasized expression as the frame display for the corresponding table 502, and is set to the same color, for example. In addition, it is set so that it can be distinguished from the contents in the document by different colors or movements such as blinking.

Under the conditions above, object IDs 1011 to 1015 correspond to object IDs 1021 to 1025, respectively. Each correspondence is given a different color or expression, while the two items of a correspondence are given the same color or expression. The table 502 may also be referred to from several places because the body text contains several anchor expressions for it. In that case, objects equivalent to the area frame display 1012 and the area frame display 1022 are created for each additional anchor expression. They use the same expression pattern as the area frame displays 1012 and 1022 but are set to a different color or the like.
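The coloring rule (same color within a corresponding pair, different colors across pairs) can be sketched as below. The palette and the function name are assumptions; the source does not specify concrete colors.

```python
from itertools import cycle
from typing import Dict, List, Tuple

PALETTE = ["red", "blue", "green", "orange"]  # hypothetical palette


def assign_colors(pairs: List[Tuple[int, int]]) -> Dict[int, str]:
    """Map each object ID to a color so that both members of a pair match."""
    colors: Dict[int, str] = {}
    for (text_side_id, chart_side_id), color in zip(pairs, cycle(PALETTE)):
        colors[text_side_id] = color
        colors[chart_side_id] = color
    return colors


# Frames and underlines on the body-text side paired with their chart-side counterparts.
pairs = [(1012, 1022), (1013, 1023), (1014, 1024), (1015, 1025)]
colors = assign_colors(pairs)
print(colors)
```

With more pairs than palette entries the colors repeat, so a real implementation would need a larger palette or an additional cue such as line style or blinking.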

Ｓ９１１において、対応表現追加処理部７２４は、これまで作成された対応表現情報をリンク情報ごとにまとめて、図２の記憶部２１１に記憶する。これが、図７（ｂ）の対応表現情報７１４である。この情報をもとに、図８のＳ８１２で対応表現を実現する操作やグラフィックスオブジェクトや計算機プログラムを作成し、電子文書データに格納することになる。 In S911, the correspondence expression addition processing unit 724 collects the correspondence expression information created so far for each link information and stores the information in the storage unit 211 of FIG. This is the correspondence expression information 714 in FIG. Based on this information, an operation, a graphics object, and a computer program for realizing the corresponding expression in S812 of FIG. 8 are created and stored in the electronic document data.

In S912, it is determined whether any link information has not yet had correspondence expression information added, so that correspondence expression information is added to all link information, and processing branches according to the result. If such link information remains, the process returns to S902 and continues. If none remains, the process ends.

このようにして作成された電子文書データを利用しているところを図１１に示す。これは、図１０（ｂ）の情報に従って作成されたもので、オブジェクトＩＤ１０１１から１０１５と１０２１から１０２５までが、１１１１から１１１５と１１２１から１１２５に、対応している。１１１１と１１２１はリンク機能そのものの表示で操作可能である。 FIG. 11 shows the use of the electronic document data created in this way. This is created according to the information of FIG. 10B, and object IDs 1011 to 1015 and 1021 to 1025 correspond to 1111 to 1115 and 1121 to 1125, respectively. 1111 and 1121 can be operated by displaying the link function itself.

1111 and 1121 are displayed when this page is brought into focus by a user operation. If, in that state, 1111 or 1121 is brought into focus by a user operation, 1112 to 1115 and 1122 to 1125 are displayed as well.

However, if they are not all on the same page, only those on the displayed page are shown. When the function of 1111 or 1121 is invoked, for example by the user selecting it, the focus changes to the link destination. This may involve moving to a different displayed page.

For example, when 1111 is selected, 1121 comes into focus, and 1111 to 1115 and 1121 to 1122 remain displayed. The user is thus taken to the table that the body text expression at 1111 refers to, and can easily find the expression “5 million” there, because 1123 and 1124 are shown in linked expressions that indicate the correspondence and 1125 is also emphasized. Once the user has understood this content, selecting 1121 returns to the original part of the body text, just as selecting 1111 did, where the relevant content can be checked again through 1112, 1113, 1114, and 1115.

In this way, taking a multi-page electronic document as input, mutual links between each “object” and its “object description” are created automatically page by page, and a multi-page electronic document can be generated. At the same time, when that document is used, the individual statements in the “object description” and the specific parts of the corresponding “object” can be identified easily.

In addition, the structural relationship of identical word strings within a specific part of an object is used to decide the emphasis expression method and its characteristics. Because corresponding parts can thus be shown intuitively, a specific part of an object and the corresponding part of the object's description in the body text can be emphasized so that their correspondence is easy to identify. This saves the effort of searching for the corresponding part and improves the readability of the document.

(Second Embodiment)
The first embodiment covered use with a scanner or the like and use starting from the image or data of an electronic document. A similar link function, with correspondence expressions that make it easy to understand, can also be added while a document is being created. This makes it possible to create documents that offer readers easy-to-follow behavior and presentation.

FIG. 12(a) shows the screen during document editing. The document being created is the document of FIG. 4 of the first embodiment, shown partway through its creation. The document being created is 1200, and 1221 is the cursor, which indicates where the user is currently entering characters.

Some recent word processors for creating and editing documents predict what is being typed and show input candidates. Typically, frequently used words, sentences, and other expressions are learned and shown as input candidates before the user has typed them in full; if a candidate is appropriate, the user selects it and completes the word or sentence without typing all of it. This is called input completion based on input prediction, among other names.

Furthermore, some of them rank candidates not simply by how many times or how often something has been entered, but give priority to words and sentences that are used frequently in the document being created. This is effective when the frequency of words and sentences differs from document to document.

Consider using such a method in the following case: when the user types input that refers to a figure or table in the document, or to part of one, expressions inside a nearby figure or table are presented as prediction candidates, and one of them is selected. The selection is then not merely recorded as input characters; link information is also attached between the input portion and the figure or table that contains the selected expression.

Furthermore, because the correspondence between the wording in the body text and the figure or table is now recorded as link information, a correspondence expression for it is also proposed as a recommended candidate, and it is added if the user judges it usable and selects it. When the resulting document is read, the body text and the corresponding part of the figure or table are then easy to find and are shown clearly.

FIG. 12(b) is drawn so that the content of the document being input in FIG. 12(a) is easier to see. The processing performed by the document processing apparatus of this embodiment during document creation is described below with reference to the flowchart of FIG. 14.

S1401 is the initialization performed when the start of document editing is instructed. It makes it possible to refer to, add, modify, and delete information in the document as needed for document creation.
S1402 is where input to the document actually begins; the input is received, content included, as an operation instruction.
S1403 determines whether the operation instruction is an instruction to end input. If it is, the process proceeds to S1411. Otherwise the operation is judged to be an input instruction or input content, and the process proceeds to S1404.

S1404 checks whether text expressions in nearby figures and tables, such as captions, headings, and labels, can be predicted as the input. In the situation of FIG. 12(a), 1211 to 1213 and so on and 1207 from the table 1202, and 1204 and 1205 from the figure 1203, are taken out as the text in nearby figures and tables and compared with the text being input.
S1405 is a conditional branch on the comparison result of S1404. If such text is included, the process proceeds to S1406; if not, it proceeds to S1410.

S1406 presents to the user, as input completion candidates, the matching text (captions, headings, labels, and so on of figures and tables) obtained from the comparison in S1404. FIG. 13 shows the actual presentation. In FIG. 13, 1301 is the list of recommended completion candidates, which shows “2 million” (２００万) at 1302 and “2010” (２０１０年) at 1303. The user can complete the word or expression being typed by selecting a candidate. The user can also decline by explicitly rejecting the candidates or by ignoring the candidate display and continuing to type.
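The comparison in S1404 and the candidate list in S1406 can be approximated by checking whether the tail of the typed text is a prefix of any chart text. The sketch below assumes exactly that; the function names, the minimum prefix length of two characters, and the sample typed text are illustrative. The original Japanese strings are used because the English renderings (“2 million”, “2010”) no longer share the typed prefix “２０”.

```python
from typing import List, Tuple


def completion_candidates(typed_text: str, chart_texts: List[str],
                          min_prefix: int = 2) -> List[Tuple[str, int]]:
    """Return (chart text, overlap length) for texts that can complete the input."""
    candidates = []
    for text in chart_texts:
        # Try the longest proper prefix of `text` first.
        longest = min(len(text) - 1, len(typed_text))
        for n in range(longest, min_prefix - 1, -1):
            if typed_text.endswith(text[:n]):
                candidates.append((text, n))
                break
    return candidates


def apply_completion(typed_text: str, candidate: str, overlap: int) -> str:
    """Replace the already typed prefix with the full candidate."""
    return typed_text[:len(typed_text) - overlap] + candidate


typed = "大阪支店の２０"                        # body text ending in the typed "２０"
chart_texts = ["２００万", "２０１０年", "大阪支店"]  # texts taken from the nearby table
found = completion_candidates(typed, chart_texts)
print(found)
print(apply_completion(typed, "２０１０年", 2))
```

If the user accepts a candidate, the chart text it came from identifies which figure or table the new link should point to.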

S1407 is a conditional branch on how the user responded to the display of S1406, that is, to the predicted candidates. If the user selected a candidate proposed in S1406, the process proceeds to S1408; otherwise it proceeds to S1410.

Ｓ１４０８は、Ｓ１４０６の表示に従い選択された入力補完候補を入力内容に加える処理を行う。図１３の表示の場合で、「２０１０年」が選ばれた場合だと、１２０６の本文の最後の「２０」を「２０１０年」で置き換えることになる。これは、通常の予測による入力補完を行っているものである。 In step S1408, the input completion candidate selected according to the display in step S1406 is added to the input content. In the case of the display in FIG. 13, when “2010” is selected, the last “20” in the text of 1206 is replaced with “2010”. This is an input supplement based on normal prediction.

S1409, triggered by the input completion performed in S1408, creates link information between the body text 1206 and the table 1202. This processing is performed by units 701 to 705 in FIG. 7(a) of the first embodiment, so anchor expressions such as "Table.1" are also used in creating the link information. As a result, once input has progressed to roughly the state of the image data 400 in FIG. 4, information similar to that in FIGS. 5(b) and 5(c) is accumulated.
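As an illustration of what S1409 might record, the sketch below keeps one link entry per accepted completion. The field names (`object_id`, `anchor`, `body_span`, `matched_text`) and the classes `LinkInfo` and `LinkStore` are assumptions made for this example; the actual layout of the information in FIGS. 5(b) and 5(c) is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class LinkInfo:
    object_id: str              # hypothetical identifier, e.g. "1202" for the table
    anchor: str                 # anchor expression such as "Table.1"
    body_span: Tuple[int, int]  # where the completed text sits in the body
    matched_text: str           # the object text that was selected


@dataclass
class LinkStore:
    links: List[LinkInfo] = field(default_factory=list)

    def record(self, object_id, anchor, body_span, matched_text):
        """Add a link created when a completion candidate is accepted."""
        link = LinkInfo(object_id, anchor, body_span, matched_text)
        self.links.append(link)
        return link

    def links_for_object(self, object_id):
        """Return every link that points at the given object."""
        return [link for link in self.links if link.object_id == object_id]


if __name__ == "__main__":
    store = LinkStore()
    store.record("1202", "Table.1", (9, 14), "2010年")
    print(store.links_for_object("1202"))
```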

In S1410, the link information generated in S1409 is stored as an electronic document.

S1411 is the same as S811 in FIG. 8 of the first embodiment, and performs the processing of FIG. 9 using the correspondence expression adding unit of FIG. 7(b). This creates the correspondence expression.

In S1412, the correspondence expression obtained in S1411 is displayed so that the user can decide whether to add it as an expression that makes the input document easier to read. When the correspondence expression is displayed, emphasis expression information generation processing is performed, which generates emphasis expression information that makes the correspondence easy to identify. If the input document has been created up to the state of the image data 400 in FIG. 4, the correspondence expression candidates shown in FIG. 11 are presented.
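One way to picture the emphasis expression information is as a list of matched strings, each given its own highlight so that several correspondences can be told apart. The sketch below is only an illustration under that assumption: the function `build_emphasis`, the colour palette, and the sample texts are hypothetical, and the specification's own procedure (FIG. 9) may determine the emphasis method differently.

```python
from itertools import cycle


def build_emphasis(object_texts, body_text, palette=("red", "blue", "green")):
    """Pair each object string that also occurs in the body with a colour.

    Returns a list of dicts holding the shared text, its first position in
    the body, and the colour to use on both the object and the body side.
    """
    colours = cycle(palette)
    emphasis = []
    for text in object_texts:
        position = body_text.find(text)
        if text and position >= 0:
            emphasis.append(
                {"text": text, "body_start": position, "color": next(colours)}
            )
    return emphasis


if __name__ == "__main__":
    cells = ["2009年", "2010年", "200万"]
    body = "In 2010年 sales reached 200万."
    for entry in build_emphasis(cells, body):
        print(entry)
```

Giving each matched string a different colour is one simple way to keep two or more correspondences individually identifiable.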

S1413 is a conditional branch on the user's response to the proposal made in S1412 to add the correspondence expression. If the user selects the correspondence expression, the process proceeds to S1414. If the user does not select it, input ends.

S1414 adds the correspondence expression proposed in S1413 to the electronic document data. Unlike the first embodiment, the second embodiment involves document editing, so this is a process of recording to an electronic document file or the like when editing is complete; at that point, processing such as conversion into the electronic document file format of the first embodiment is performed. During document editing, it is common practice to expand the contents of the document into the RAM 206 of FIG. 2 or the like and convert them into a data structure suited to editing; when the document is recorded to an electronic document file, the data is converted back to the file format.

Adding the correspondence expression in this way yields the correspondence expression shown in FIG. 11. In particular, because link information is created in combination with the predictive completion function for character input during document editing, correspondence expression candidates are presented automatically, which makes it easy to add correspondence expressions.
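The branching described for S1405, S1407, and S1413 can be summarised as a trace of the steps visited. The sketch below assumes that the steps between the branches run in numerical order, as the description suggests; the function `trace_steps` and its three boolean parameters are hypothetical names for the three branch conditions.

```python
def trace_steps(matches_object_text, candidate_selected, expression_accepted):
    """List the steps visited from S1405 onward for the given branch outcomes."""
    steps = ["S1405"]
    if matches_object_text:
        # S1406 presents candidates, S1407 branches on the user's choice.
        steps += ["S1406", "S1407"]
        if candidate_selected:
            # S1408 applies the completion, S1409 creates the link information.
            steps += ["S1408", "S1409"]
    # Both "no" branches rejoin the flow at S1410.
    steps += ["S1410", "S1411", "S1412", "S1413"]
    if expression_accepted:
        steps.append("S1414")
    return steps


if __name__ == "__main__":
    print(trace_steps(True, True, True))
    print(trace_steps(True, False, False))
```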

Linking with a document editing function such as this predictive completion function makes character input easier, and correspondence expression candidates are presented without requiring any additional operations from the user, which reduces the user's burden in creating a document. Besides predictive completion of character input, a document editing apparatus may provide various other support functions, such as laying out the text and charts in a document, adding decorative parts that make them easier to understand or more attractive, and managing the numbers in anchor expressions. The present invention can likewise be applied to these functions, and its effects obtained, by extracting link information when those operations are performed.

According to this embodiment, when characters are input using input prediction, the input is compared with the text content of objects in the same document, and when it becomes more likely that the same content is being input, the potentially matching text in the charts of the same document is presented as input candidates. Consider the case where, with this method, the user types something that refers to a chart in the document or to part of one, expressions from a nearby chart are presented as prediction candidates, and one of them is selected. In this case, the selection is not merely recorded as input characters; link information can also be attached between the input portion and the chart containing the selected expression.

Furthermore, since the correspondence between the wording in the body text and the chart is recorded as link information, a correspondence expression is also proposed as a recommended candidate; if the user judges it usable and selects it, it is stored as data in the electronic document being edited. As a result, when the created document is read, the corresponding portions of the body text and the chart are easy to find and can be shown clearly. Such documents can also be created easily.

(Other embodiments)
The present invention can also be realized by executing the following processing. Software (a computer program) that realizes the functions of the above-described embodiments is supplied to a system or apparatus via a network or any of various computer-readable storage media. The computer (or CPU, MPU, or the like) of that system or apparatus then reads out and executes the program.

218 data processing unit, 300 image data, 301 area division unit, 302 attribute information addition unit, 303 character recognition unit, 304 link processing unit, 305 correspondence expression addition unit, 306 format conversion unit, 310 electronic document data

Claims

A document processing apparatus that creates a multi-page electronic document by creating, for each page of a multi-page electronic document including image data, a mutual link between an object and a description of the object, the apparatus comprising:
area dividing means for dividing the image data to obtain divided areas;
attribute information adding means for determining the attribute of each divided area obtained by the area dividing means and adding a character attribute for each area;
character recognition means for recognizing characters in the areas to which the character attribute has been added by the attribute information adding means;
link information generation processing means for examining the correspondence between the anchor expression associated with the object and the anchor expression or the description of the object in the body text of the electronic document, and generating link information that holds the correspondence between the object and the anchor expression or the description of the object in the body text;
correspondence expression adding means for generating and adding correspondence expression information that emphasizes the correspondence, at the corresponding portions of the text representation in the object and the description of the object in the body text that are associated by the link information generated by the link information generation processing means; and
format conversion means for converting into an electronic document that includes the link information and the correspondence expression information that emphasizes the correspondence.

The link information generation processing unit includes an anchor expression associated with the image data and the object, and a means for extracting a description of the object including the anchor expression and the anchor expression in the text. Document processing device.

The document processing apparatus according to claim 1, wherein the format conversion means converts into an electronic document that includes link information realizing, based on the link information, a mutual link between the object and the anchor expression or the description of the object, and correspondence expression information that emphasizes the correspondence.

The document processing apparatus according to claim 1, wherein the correspondence expression adding means uses the link information to compare the text information in the object with the text information in the description of the object in the body text, obtains the specific portion of the object that the description of the object in the body text is estimated to explain, determines an emphasis expression method and its characteristics that make it easy to identify the correspondence between the specific portion of the object and the corresponding portion of the description of the object in the body text, and generates emphasis expression information based on the determined emphasis expression method and characteristics.

The document processing apparatus according to any one of claims 1 to 3, wherein the correspondence expression adding means uses the link information to compare the text information in the object with the text information in the description of the object in the body text in units of word strings, obtains identical word strings and thereby the specific portion of the object that the description of the object in the body text is estimated to explain, determines an emphasis expression method and its characteristics, also using the structural relationship of the identical word strings within the specific portion of the object, so that the correspondence between the specific portion of the object and the corresponding portion of the description of the object in the body text is easy to identify, and generates emphasis expression information based on that determination.

The document processing apparatus according to claim 1, wherein the object includes a region of a figure, a drawing, a photograph, and an illustration.

A document processing apparatus that edits an electronic document having text, charts, and photographs as objects, the apparatus comprising:
input completion means for, when characters are input, predicting the input content from previously input content, displaying the predictions as candidates, and letting the user select one so that characters are input by input prediction;
recording means for, when character input by the input prediction is performed, comparing the input with the text content of objects in the same document, presenting text in a chart in the same document as an input candidate when it becomes more likely that the same content is being input, and recording the correspondence as link information when the presented input candidate is selected;
emphasis expression information generating means for using the link information to compare the text information in the object with the text information in the description of the object in the body text of the electronic document, obtaining the specific portion of the object that the description of the object in the body text is estimated to explain, and generating emphasis expression information that makes the correspondence easy to identify; and
storage means for storing the link information and the emphasis expression information as data in the electronic document being edited.

The document processing apparatus according to claim 7, wherein the storage means stores, as data in the electronic document being edited, link information that realizes, based on the link information, a mutual link between the object and the anchor expression or the description of the object, together with the emphasis expression information.

The document processing apparatus according to claim 7, wherein the emphasis expression information generating means uses the link information to compare the text information in the object with the text information in the description of the object in the body text, obtains the specific portion of the object that the description of the object in the body text is estimated to explain, determines an emphasis expression method and its characteristics that make it easy to identify the correspondence between the specific portion of the object and the corresponding portion of the description of the object in the body text, and generates emphasis expression information based on that determination.

The document processing apparatus according to claim 7, wherein the emphasis expression information generating means uses the link information to compare the text information in the object with the text information in the description of the object in the body text in units of word strings, obtains identical word strings and thereby the specific portion of the object that the description of the object in the body text is estimated to explain, determines an emphasis expression method and its characteristics, using the structural relationship of the identical word strings within the specific portion of the object, so that the correspondence between the specific portion of the object and the corresponding portion of the description of the object in the body text is easy to identify, and generates emphasis expression information based on that determination.

The document processing apparatus according to claim 7, wherein the object is a region of a figure, a drawing, a photograph, or an illustration.

A document processing method for creating a multi-page electronic document by creating, for each page of a multi-page electronic document including image data, a mutual link between an object and a description of the object, the method comprising:
an area dividing step of dividing the image data to obtain divided areas;
an attribute information adding step of determining the attribute of each divided area obtained in the area dividing step and adding a character attribute for each area;
a character recognition step of recognizing characters in the areas to which the character attribute has been added in the attribute information adding step;
a link information generation processing step of examining the correspondence between the anchor expression associated with the object and the anchor expression or the description of the object in the body text of the electronic document, and generating link information that holds the correspondence between the object and the anchor expression or the description of the object in the body text;
a correspondence expression adding step of generating and adding correspondence expression information that emphasizes the correspondence, at the corresponding portions of the text representation in the object and the description of the object in the body text that are associated by the link information generated in the link information generation processing step; and
a format conversion step of converting into an electronic document that includes the link information and the correspondence expression information that emphasizes the correspondence.

A document processing method for editing a document having text, charts, and photographs as objects, the method comprising:
an input completion step of, when characters are input, predicting the input content from previously input content, displaying the predictions as candidates, and letting the user select one so that characters are input by input prediction;
a recording step of, when character input by the input prediction is performed, comparing the input with the text content of objects in the same document, presenting text in a chart in the same document as an input candidate when it becomes more likely that the same content is being input, and recording the correspondence as link information when the presented input candidate is selected;
an emphasis expression information generating step of using the link information to compare the text information in the object with the text information in the description of the object in the body text, obtaining the specific portion of the object that the description of the object in the body text is estimated to explain, and generating emphasis expression information that makes the correspondence easy to identify; and
a storing step of storing the link information and the emphasis expression information as data in the electronic document being edited.

A program that causes a computer to execute each step of a document processing method for creating a multi-page electronic document by creating, for each page of a multi-page electronic document including image data, a mutual link between an object and a description of the object, the steps comprising:
an area dividing step of dividing the image data to obtain divided areas;
an attribute information adding step of determining the attribute of each divided area obtained in the area dividing step and adding a character attribute for each area;
a character recognition step of recognizing characters in the areas to which the character attribute has been added in the attribute information adding step;
a link information generation processing step of examining the correspondence between the anchor expression associated with the object and the anchor expression or the description of the object in the body text of the electronic document, and generating link information that holds the correspondence between the object and the anchor expression or the description of the object in the body text;
a correspondence expression adding step of generating and adding correspondence expression information that emphasizes the correspondence, at the corresponding portions of the text representation in the object and the description of the object in the body text that are associated by the link information generated in the link information generation processing step; and
a format conversion step of converting into an electronic document that includes the link information and the correspondence expression information that emphasizes the correspondence.

A program that causes a computer to execute each step of a document processing method for editing a document having text, charts, and photographs as objects, the steps comprising:
an input completion step of, when characters are input, predicting the input content from previously input content, displaying the predictions as candidates, and letting the user select one so that characters are input by input prediction;
a recording step of, when character input by the input prediction is performed, comparing the input with the text content of objects in the same document, presenting text in a chart in the same document as an input candidate when it becomes more likely that the same content is being input, and recording the correspondence as link information when the presented input candidate is selected;
an emphasis expression information generating step of using the link information to compare the text information in the object with the text information in the description of the object in the body text, obtaining the specific portion of the object that the description of the object in the body text is estimated to explain, and generating emphasis expression information that makes the correspondence easy to identify; and
a storing step of storing the link information and the emphasis expression information as data in the electronic document being edited.

A computer-readable storage medium storing the program according to claim 14 or 15.