JP6780380B2

JP6780380B2 - Image processing equipment and programs

Info

Publication number: JP6780380B2
Application number: JP2016168417A
Authority: JP
Inventors: 純黒木
Original assignee: Konica Minolta Inc
Current assignee: Konica Minolta Inc
Priority date: 2016-08-30
Filing date: 2016-08-30
Publication date: 2020-11-04
Anticipated expiration: 2036-08-30
Also published as: JP2018036794A

Description

The present invention relates to an image processing apparatus and a program, and more specifically to a technique for converting image data obtained by reading a paper document with handwritten additions into structured data.

Handwritten notes are often written in pencil, ballpoint pen, or the like on paper documents that were created with a word processor or presentation software and then printed. When such a paper document is imaged with a scanner or the like, subjected to character recognition, and converted into structured data such as a Microsoft Office (registered trademark) application file (for example, a PowerPoint (registered trademark) file), it is impossible to tell whether a given character was originally printed on the paper document or was added by hand. As a result, the structured data alone does not reveal what characters (information) were added.

In addition, sharing the document without the handwritten information requires the effort of erasing the handwritten characters with an eraser or correction fluid.

For example, Patent Document 1 discloses a technique in which portions determined to have been added to an original document are placed on layers of an electronic document according to the characteristics of the additions and converted into an integrated electronic document.

Patent Document 1: Japanese Unexamined Patent Publication No. 2012-48637

However, the technique described in Patent Document 1 requires the viewer to be aware that the electronic document has a layered structure. Moreover, it is difficult for the viewer to determine which portions were added, whether the integrated electronic document or the individual layers are displayed.

In view of the above, an object of the present invention is to make information added to a paper document easily visible in an electronic image generated from that paper document.

An image processing apparatus according to one aspect of the present invention includes: an area discrimination unit that analyzes image data obtained by reading a paper document with handwritten additions and discriminates between character areas and non-character areas in the image data; a character type discrimination unit that analyzes each character area discriminated by the area discrimination unit and determines whether the characters in that area are printed characters or handwritten characters; a character coding processing unit that converts the printed characters and the handwritten characters into character codes; a structured data conversion unit that converts the non-character areas discriminated by the area discrimination unit, and the character areas containing character-coded printed characters, into a main body attribute of structured data; a supplementary attribute conversion unit that converts the character-coded handwritten characters into a supplementary attribute of the structured data; and a handwritten character replacement unit that replaces the handwritten character portion of the main body area corresponding to the main body attribute, where the handwritten characters determined by the character type discrimination unit were present, with information indicating that handwritten characters were present.

According to at least one aspect of the present invention, information added to a paper document can be easily recognized visually in an electronic image generated from the paper document.
Problems, configurations, and effects other than those described above will become clear from the following description of the embodiments.

A schematic diagram showing the overall configuration of a system including the image processing apparatus according to the first embodiment of the present invention.
A block diagram showing the hardware configuration of the image processing apparatus according to the first embodiment of the present invention.
A block diagram showing the functional configuration of the image processing apparatus according to the first embodiment of the present invention.
A flowchart showing the operation of the image processing apparatus according to the first embodiment of the present invention.
A diagram showing an example of an input image containing handwritten characters.
A diagram showing the result of performing area discrimination processing on the input image according to the first embodiment of the present invention.
A flowchart showing the character type discrimination processing according to the first embodiment of the present invention.
A diagram showing the result of performing the character type discrimination processing according to the first embodiment of the present invention.
A display example of an application screen of a PowerPoint file application according to the first embodiment of the present invention.
A diagram showing an example of the file format structure of a typical PowerPoint file that includes a note attribute.
A diagram showing a display example of an application screen in which handwritten characters are placed in the note area of a PowerPoint file, according to a first example of the first embodiment of the present invention.
A diagram showing a display example of an application screen in which handwritten characters are placed in the note area of a PowerPoint file, according to a second example of the first embodiment of the present invention.
A diagram showing a display example of an application screen in which handwritten characters are placed in the comment area of a PowerPoint file, according to a first example of the second embodiment of the present invention.
A diagram showing an example of the file format structure of a typical PowerPoint file that includes a comment attribute.
A diagram showing a display example of an application screen in which handwritten characters are placed in the comment area of a PowerPoint file, according to a second example of the second embodiment of the present invention.
A diagram showing a display example of an application screen in which handwritten characters are placed in the comment area of a PowerPoint file, according to a third example of the second embodiment of the present invention.
A diagram showing a display example of an application screen in which handwritten characters are placed in the comment area of a PowerPoint file, according to a fourth example of the second embodiment of the present invention.
A diagram showing a display example of an application screen from which the content of the handwritten character information displayed in the comment area of a PowerPoint file has been deleted, according to a fifth example of the second embodiment of the present invention.
A diagram showing a display example of an application screen in which handwritten characters are placed in the annotation area of a PDF file, according to the third embodiment of the present invention.
A block diagram showing the functional configuration of the image processing apparatus according to the fourth embodiment of the present invention.

Hereinafter, examples of embodiments for carrying out the present invention will be described with reference to the accompanying drawings. The description is given in the following order. In the drawings, components having substantially the same function or configuration are denoted by the same reference numerals, and duplicate description is omitted.
1. First embodiment (example using the note area of a PowerPoint file)
2. Second embodiment (example using the comment area of a PowerPoint file)
3. Third embodiment (example using the annotation function of a PDF file)
4. Fourth embodiment (example of selecting an output form)
5. Others

<1. First Embodiment>
[System configuration]
FIG. 1 is a schematic diagram showing the overall configuration of a system including the image processing apparatus according to the first embodiment.

In the system 10 of FIG. 1, a client terminal 1, a printer controller 2, an image forming apparatus 3, a scanner 4, and a camera 5 are connected via a network N so that they can communicate with one another. The network N is, for example, a LAN conforming to a standard such as Ethernet (registered trademark). The client terminal 1, the printer controller 2, and the image forming apparatus 3 are each an example of the image processing apparatus.

The client terminal 1 is a terminal device such as a personal computer (PC). In response to a user's input operation instructing print output, the client terminal 1 transmits a print job to the printer controller 2 via the network N. The print job is, for example, data generated by the client terminal 1 in accordance with a PDL (Page Description Language) and includes output settings and input data. The client terminal 1 also receives electronic documents (image data) from other devices and stores them.

The printer controller 2 causes the image forming apparatus 3 to print out images. It receives a print job from the client terminal 1 via the network N, performs rasterization (RIP processing) on the input data extracted from the received print job, and generates bitmap data (image forming data).

Although the printer controller 2 and the image forming apparatus 3 are connected via the network N here, they may instead be connected directly. In that case, they may be connected via a dedicated line such as a video interface line.

The image forming apparatus 3 forms an image on paper and outputs it based on the print job received from the printer controller 2. The image forming apparatus 3 may be a multifunction peripheral (MFP) having multiple functions (printing, copying, scanning, and so on).

The scanner 4 reads the surface of a paper document P placed on its reading surface, generates image data (bitmap data), and outputs the image data to the client terminal 1 or the like via the network N or wirelessly.

The camera 5 captures an image of the paper document P, generates image data, and outputs it to the client terminal 1 or the like via the network N.

[Hardware configuration of each device]
FIG. 2 is a block diagram showing the hardware configuration of each device.

This section describes the hardware configuration of a computer 20 that constitutes each of the client terminal 1, the printer controller 2, the image forming apparatus 3, the scanner 4, and the camera 5 in the system 10 described above. The components of the computer 20 are selected as appropriate for the function and intended use of each device.

The computer 20 includes a CPU (Central Processing Unit) 21, a ROM (Read Only Memory) 22, and a RAM (Random Access Memory) 23, each connected to a bus 24. The computer 20 further includes a display unit 25, an operation unit 26, a non-volatile storage 27, and a network interface 28.

The CPU 21 reads from the ROM 22 the program code of the software that implements each function of the present embodiment, and executes it. The computer 20 may include another processing device, such as an MPU (Micro-Processing Unit), in place of the CPU 21.

Variables, parameters, and the like generated during arithmetic processing are temporarily written to the RAM 23. The display unit 25 is, for example, a liquid crystal display monitor and displays the results of processing performed by the computer 20. The operation unit 26 is, for example, a keyboard, a mouse, or a touch panel, through which the user can enter predetermined operations and instructions.

The non-volatile storage 27 is, for example, an HDD (Hard Disk Drive), an SSD (Solid State Drive), a flexible disk, an optical disk, a magneto-optical disk, a CD-ROM, a CD-R, a magnetic tape, or a non-volatile memory card. In addition to an OS (Operating System) and various parameters, the non-volatile storage 27 stores programs for operating the computer 20. It also stores, for example, image data of electronic documents.

The network interface 28 is, for example, a NIC (Network Interface Card) and allows various data to be exchanged between the devices via the network N, such as a LAN.

The image processing apparatus according to the present invention runs on a computer 20 such as the one shown in FIG. 2. The image processing apparatus can therefore run on a personal computer, a mobile terminal such as a smartphone or tablet, a server on the network N (for example, the printer controller 2), a multifunction device such as an MFP, the scanner 4, the camera 5, and other devices.

[Functions of the image processing apparatus]
FIG. 3 shows the functional configuration of the computer 20 provided in the image processing apparatus according to the first embodiment. An example in which the image processing apparatus is applied to the client terminal 1 is described here. Techniques for converting image data obtained by reading a paper document into the main body attribute of structured data are well known (see, for example, Japanese Unexamined Patent Publication No. 2005-149097), so a detailed description is omitted below.

As shown in FIG. 3, the client terminal 1 includes an input image receiving unit 31, an image data recording unit 32, an area discrimination unit 33, a character type discrimination unit 34, a character coding processing unit 35, a structured data conversion unit 36, a handwritten character replacement unit 37, a structured data recording unit 38, and a structured data output unit 39. The function of each unit is realized by the CPU 21 of the computer 20 executing a program stored in the ROM 22.

The input image receiving unit 31 receives image data (an input image) obtained by reading a paper document P with handwritten additions. The image data is obtained by, for example, the scanner 4 or the camera 5. Alternatively, it may be received from a server (not shown) via the network N or acquired from removable media.

The image data recording unit 32 records the image data received by the input image receiving unit 31. The non-volatile storage 27, for example, is used as the image data recording unit 32.

The area discrimination unit 33 acquires the image data from the input image receiving unit 31 or the image data recording unit 32 and analyzes it. It discriminates between character areas and non-character areas in the image data and outputs the discrimination result to the character type discrimination unit 34 and the structured data conversion unit 36. For example, the area discrimination unit 33 discriminates among character areas, graphic areas, and photographic areas.

The character type discrimination unit 34 analyzes each character area discriminated by the area discrimination unit 33 and determines whether the characters in that area are printed characters or handwritten characters. The character type discrimination unit 34 includes a printed character recognition processing unit 341, a handwritten character recognition processing unit 342, and a handwritten character determination unit 343.

The printed character recognition processing unit 341 performs printed character recognition on the image determined to be a character area and then calculates a printed character recognition degree. Likewise, the handwritten character recognition processing unit 342 performs handwritten character recognition on the same image and then calculates a handwritten character recognition degree. The printed character recognition processing unit 341 and the handwritten character recognition processing unit 342 may be combined into a single block.

The handwritten character determination unit 343 distinguishes printed characters from handwritten characters based on the ratio between the printed character recognition degree calculated by the printed character recognition processing unit 341 and the handwritten character recognition degree calculated by the handwritten character recognition processing unit 342.

The character coding processing unit 35 converts the printed characters and handwritten characters contained in the character areas received from the handwritten character determination unit 343 into character codes.

The structured data conversion unit 36 vectorizes the input image data, converts it into the main body attribute of structured data, and outputs the result to the structured data recording unit 38 or the structured data output unit 39. The structured data conversion unit 36 includes a main body attribute conversion unit 361 and a supplementary attribute conversion unit 362.

The main body attribute conversion unit 361 converts the non-character areas (graphics and photographs) discriminated by the area discrimination unit 33, and the character areas containing character-coded printed characters, into the main body attribute of the structured data. The supplementary attribute conversion unit 362 converts the character-coded handwritten characters into a supplementary attribute of the structured data and places them in the supplementary area corresponding to that attribute (a note area, a comment area, a pop-up window of an annotation function, or the like, described later).

The supplementary area corresponding to the supplementary attribute is an area defined by a comment attribute or a note attribute of the structured data used by document creation software, or an area defined by, for example, a pop-up window of an annotation function.

The handwritten character replacement unit 37 replaces the handwritten character portion of the main body area corresponding to the main body attribute, where the handwritten characters determined by the character type discrimination unit 34 were present, with information indicating that handwritten characters were present. In other words, at the position in the main body area corresponding to the handwritten character portion, the handwritten character replacement unit 37 generates information (a symbol, figure, character string, image, or the like) that indicates the association with the handwritten characters converted into the supplementary attribute.
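The division of labor among the main body attribute conversion unit 361, the supplementary attribute conversion unit 362, and the handwritten character replacement unit 37 can be illustrated with a minimal Python sketch. The `Region` and `StructuredData` types, the `convert` function, and the `[*n]` marker format are all hypothetical names introduced only for illustration; the patent does not prescribe a data model or a marker format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Region:
    """One discriminated area of the input image (hypothetical type)."""
    kind: str     # "graphic", "photo", "printed", or "handwritten"
    content: str  # character-coded text, or a label standing in for an image


@dataclass
class StructuredData:
    """Toy stand-in for a structured document (hypothetical type)."""
    body: List[str] = field(default_factory=list)           # main body attribute
    supplementary: List[str] = field(default_factory=list)  # e.g. notes or comments


def convert(regions: List[Region]) -> StructuredData:
    """Send handwritten text to the supplementary area and leave a marker in the body."""
    doc = StructuredData()
    for region in regions:
        if region.kind == "handwritten":
            # Assumed marker format; any symbol, figure, or string could be used.
            marker = f"[*{len(doc.supplementary) + 1}]"
            doc.supplementary.append(f"{marker} {region.content}")
            doc.body.append(marker)  # replaces the handwritten portion in the body
        else:
            doc.body.append(region.content)
    return doc
```

Calling `convert` on a printed line followed by a handwritten note yields a body that contains the printed line and a marker, with the note itself stored only in the supplementary list. Reusing the same marker in both places is one simple way to express the association described above.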

The structured data recording unit 38 records the structured data input from the structured data conversion unit 36. The non-volatile storage 27, for example, is used as the structured data recording unit 38.

The structured data output unit 39 outputs the structured data, either output from the structured data conversion unit 36 or read from the structured data recording unit 38, to the display unit 25 or the like.

[Operation of the image processing apparatus]
FIG. 4 is a flowchart showing the operation of the client terminal 1 to which the image processing apparatus according to the first embodiment is applied. The processing in the flowchart of FIG. 4 is realized by the CPU 21 of the computer 20 executing a program stored in the ROM 22.

First, the input image receiving unit 31 receives, as an input image, image data produced by digitizing a paper document P, which is a printed document with handwritten notes added, using an electronic imaging means such as the scanner 4 or the camera 5 (S1). FIG. 5 shows an example of the input image.

FIG. 5 shows an example of an input image containing handwritten characters.
In the input image D of FIG. 5, the words "Too late!" have been added by hand below the line printed as "April 2017: Release" in a print area 40. The area containing the arrow pointing diagonally up and to the left and the words "Too late!" is referred to as a handwriting area 41.

Next, the area discrimination unit 33 scans the input image, performs area discrimination processing, and discriminates among character areas, graphic areas, and photographic areas (S2). FIG. 6 shows the discrimination result.

FIG. 6 shows the result of performing the area discrimination processing on the input image.
Dashed frames indicate areas discriminated as graphic areas Ag, and solid frames indicate areas discriminated as character areas At. There is no photographic area in the example of FIG. 6. Within the handwriting area 41, the arrow is discriminated as a graphic area 42 and the handwritten characters as a character area 43.

Next, the area discrimination unit 33 determines whether the result of the area discrimination processing is a character area (S3). If the area is not a character area (NO in S3), the process proceeds to step S4; if it is a character area (YES in S3), the process proceeds to step S6.

For an area other than a character area (a graphic area or a photographic area) discriminated by the area discrimination unit 33, the main body attribute conversion unit 361 of the structured data conversion unit 36 performs recognition processing (S4) and converts the area into the main body attribute of the structured data (S5). After step S5, the process proceeds to step S11.

If the result of the area discrimination processing is a character area (YES in S3), the character type discrimination unit 34 performs character type discrimination processing (S6). FIG. 7 shows a flowchart (subroutine) of the character type discrimination processing.
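The branch at step S3 can be sketched in Python as a simple dispatch over the discriminated areas. This is only an illustrative outline: the dictionary layout, the `"type"` key, and the function and parameter names are assumptions, and the two handlers are placeholders for the actual recognition and conversion processing.

```python
from typing import Callable, Dict, List


def route_regions(
    regions: List[Dict[str, str]],
    to_body: Callable[[Dict[str, str]], str],
    discriminate_type: Callable[[Dict[str, str]], str],
) -> List[str]:
    """Dispatch each discriminated area as in steps S3 to S6."""
    results = []
    for region in regions:
        if region["type"] == "text":                    # S3: YES
            results.append(discriminate_type(region))   # S6: character type discrimination
        else:                                           # S3: NO
            results.append(to_body(region))             # S4 and S5: recognize, convert to body
    return results
```

For the example of FIG. 6, the arrow (graphic area 42) would be passed to `to_body`, while the handwritten words (character area 43) would be passed to `discriminate_type`.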

[Character type discrimination processing]
FIG. 7 is a flowchart showing the character type discrimination processing of step S6 in FIG. 4.

First, a character area of the input image D is input to the character type discrimination unit 34 (S21). Next, the printed character recognition processing unit 341 of the character type discrimination unit 34 performs printed character recognition on the image determined to be a character area (S22) and calculates the printed character recognition degree.

The handwritten character recognition processing unit 342 of the character type discrimination unit 34 likewise performs handwritten character recognition on the image determined to be a character area (S23) and calculates the handwritten character recognition degree.

Next, the handwritten character determination unit 343 compares the printed character recognition degree calculated by the printed character recognition processing unit 341 with the handwritten character recognition degree calculated by the handwritten character recognition processing unit 342 (S24). Here, the printed character recognition degree is denoted OCRS and the handwritten character recognition degree is denoted ICRS.

If the printed character recognition degree (OCRS) is greater than the handwritten character recognition degree (ICRS) (YES in S24), the handwritten character determination unit 343 determines that the characters in the character area are printed characters (S25).

On the other hand, if the printed character recognition degree (OCRS) is less than or equal to the handwritten character recognition degree (ICRS) (NO in S24), the handwritten character determination unit 343 determines that the characters in the character area are handwritten characters (S26).
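The decision in steps S24 to S26 amounts to a single comparison, which the following Python sketch makes explicit. The `RecognitionResult` type and `classify_character_area` function are hypothetical, the score range is an assumption, and returning the text from the winning recognizer is an illustrative choice rather than something the description specifies.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RecognitionResult:
    """Output of one recognizer for one character area (hypothetical type)."""
    text: str
    score: float  # recognition degree; assumed here to lie in [0.0, 1.0]


def classify_character_area(
    ocr: RecognitionResult, icr: RecognitionResult
) -> Tuple[str, str]:
    """Return (character type, text) following steps S24 to S26.

    ocr.score plays the role of OCRS and icr.score the role of ICRS.
    """
    if ocr.score > icr.score:        # S24: YES -> S25
        return "printed", ocr.text
    return "handwritten", icr.text   # S24: NO (OCRS <= ICRS) -> S26
```

Note that a tie is treated as handwritten, because the printed branch is taken only when OCRS is strictly greater than ICRS.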

Next, the character encoding processing unit 35 performs character encoding processing on the characters determined to be printed characters or handwritten characters in steps S25 and S26 (S27). The processing of step S27 and the processing of steps S22 and S23 may be performed in any order. After step S27 ends, the process proceeds to step S7 of FIG. 4.

Note that the printed character recognition processing unit 341 and the handwritten character recognition processing unit 342 may themselves provide the function of the character encoding processing unit 35. That is, the printed character recognition processing unit 341 and the handwritten character recognition processing unit 342 may perform character encoding processing when they carry out their respective character recognition. FIG. 8 shows an example of the character type determination result.

FIG. 8 shows the result of performing character type determination processing on the character areas of the input image D. Character areas containing handwritten characters are filled with dots.

In FIG. 8, the character area 43 of FIG. 6 is filled with dots, which shows that the characters in the character area 43 were determined to be handwritten characters (handwritten character portion 43A). Conversion to structured data is performed based on this determination result.

Structured data is data described in a format such as OOXML, ODF, or PDF (registered trademark).

OOXML (Office Open XML, OpenXML) is an XML-based file format for office suites. Microsoft Word (registered trademark), Microsoft Excel (registered trademark), and PowerPoint are OOXML compliant.

ODF (Open Document Format) is an open, XML-based file format for storing the document files handled by office software, such as word processing documents and spreadsheet worksheets.

PDF (Portable Document Format) is a format for electronic documents developed by Adobe Systems. A PDF file can store not only character information but also fonts, character sizes, character decorations, embedded images, and their layout.

Returning to the flowchart of FIG. 4, the handwritten character determination unit 343 of the character type determination unit 34 determines whether the characters in a character area of the input image D are handwritten characters (S7). If they are not handwritten characters (NO in S7), the main body attribute conversion unit 361 of the structured data conversion unit 36 converts the character string determined to be printed characters into the main body attribute of the structured data (S8). After step S8 ends, the process proceeds to step S11.

On the other hand, if the characters in the character area of the input image D are handwritten characters (YES in S7), the supplementary attribute conversion unit 362 of the structured data conversion unit 36 converts the character string determined to be handwritten characters into the supplementary attribute of the structured data (S9).
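
The branch at step S7 amounts to routing each recognized character string to one of two attributes. A minimal sketch follows, assuming a hypothetical `StructuredData` container and a hypothetical `route_regions` helper; neither name comes from the description, and the actual units 361 and 362 write into a file format rather than into Python lists.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class StructuredData:
    """Hypothetical stand-in for the converted document."""
    body: List[str] = field(default_factory=list)           # main body attribute (S8)
    supplementary: List[str] = field(default_factory=list)  # supplementary attribute (S9)


def route_regions(regions: Iterable[Tuple[str, bool]]) -> StructuredData:
    """Route (text, is_handwritten) pairs to the body or supplementary attribute."""
    doc = StructuredData()
    for text, is_handwritten in regions:
        if is_handwritten:
            doc.supplementary.append(text)  # YES in S7
        else:
            doc.body.append(text)           # NO in S7
    return doc
```

For example, `route_regions([("Schedule", False), ("Too late!", True)])` places the illustrative printed string "Schedule" in `body` and the handwritten string in `supplementary`.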

The following describes, as an example of a supplementary attribute of structured data, conversion to the note attribute of Microsoft Office PowerPoint. FIG. 9 shows a display example of a PowerPoint file opened in an application.

(Application display example)
FIG. 9 is a display example of the PowerPoint file according to the first embodiment when it is opened in an application.

The application screen 50 shown in FIG. 9 provides a plurality of menus, such as a home menu 51 and a review menu 52. Below these menus is a main body area 53 corresponding to the main body attribute, and below the main body area 53 is a note area 54 corresponding to the supplementary attribute. The main body area 53 occupies a large area in the center of the application screen 50. The file format is an XML-based format standardized as ISO/IEC 29500.

(File format structure example)
FIG. 10 shows a structural example of the file format of a typical PowerPoint file that includes a note attribute.

The objects of the main body area 53 shown in FIG. 9 are described in 'slide1.xml' in the 'slides' folder 61 (main body attribute directory), and the character string of the note area 54 is described in 'notesSlide1.xml' in the 'notesSlides' folder 62 (note attribute directory).
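
Because a PowerPoint file is a ZIP package, the two folders named above can be inspected with a standard ZIP reader. The sketch below assumes the usual package layout in which these folders sit under a top-level `ppt/` directory; the function names and the demo package (which contains placeholder XML and is not a valid presentation) are hypothetical.

```python
import io
import zipfile
from typing import List, Tuple


def list_slide_and_notes_parts(pptx_bytes: bytes) -> Tuple[List[str], List[str]]:
    """Return the slide parts and notes parts found in a PowerPoint package."""
    with zipfile.ZipFile(io.BytesIO(pptx_bytes)) as package:
        names = package.namelist()
    # Relationship files end in ".rels", so the ".xml" check excludes them.
    slides = sorted(n for n in names
                    if n.startswith("ppt/slides/") and n.endswith(".xml"))
    notes = sorted(n for n in names
                   if n.startswith("ppt/notesSlides/") and n.endswith(".xml"))
    return slides, notes


def build_demo_package() -> bytes:
    """Build a tiny stand-in package with placeholder parts (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("ppt/slides/slide1.xml", "<placeholder/>")
        package.writestr("ppt/slides/_rels/slide1.xml.rels", "<placeholder/>")
        package.writestr("ppt/notesSlides/notesSlide1.xml", "<placeholder/>")
    return buffer.getvalue()
```

Calling `list_slide_and_notes_parts(build_demo_package())` returns the slide part and the notes part separately, mirroring the split between the main body attribute directory and the note attribute directory.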

Returning to the flowchart of FIG. 4, the handwritten character replacement unit 37 next replaces the handwritten character portion in the main body area of the structured data, where the handwritten characters existed, with information (structured data) indicating that handwritten characters existed (S10). The structured data conversion unit 36 records this structured data in the structured data recording unit 38.

Next, the structured data output unit 39 outputs the structured data recorded in the structured data recording unit 38 in response to a user instruction. For example, the structured data output unit 39 displays the structured data on the display unit 25 by means of the application (S11). When step S11 ends, the processing of this flowchart ends.

[Example of information indicating that handwritten characters existed]
The following describes examples of the information indicating that handwritten characters existed, which is created in the handwritten character portion of the main body area 53 where the handwritten characters were located.

[First example]
FIG. 11 shows a display example of an application screen in which handwritten characters are placed in the note area of a PowerPoint file, according to the first example of the first embodiment.

In the application screen 50A of FIG. 11, the handwritten character string 541 reading "Too late!" is placed in the note area 54. In addition, information that can be matched with the handwritten character string 541 in the note area 54 (the ★ symbol in the example of FIG. 11) is placed in the portion of the document image 530A in the main body area 53 where the handwritten character string existed (handwritten character portion 531). In FIG. 11, the same information (★) as the information placed in the document image 530A to indicate that handwritten characters existed is also placed in the note area 54. This allows the user to easily recognize the correspondence between the character string in the note area 54 and the handwritten character portion 531 in the main body area 53.

Because a user may overlook a symbol alone and fail to notice that a handwritten character string was there, a character string stating that a handwritten character string existed may also be added. For example, in FIG. 11, the character string "handwriting present" is placed in the handwritten character portion 531. In this way, information indicating the relationship with the information indicating that handwritten characters existed may be placed in at least one of the main body area 53 and the note area 54 (supplementary area).

[Second example]
In the second example, when a paper document contains a plurality of handwritten characters, information representing the correspondence is added to each piece of information indicating that handwritten characters existed and to each piece of information indicating the relationship with that information. For example, when many handwritten character strings have been added to the paper document, the form 'symbol + serial number' may be used.

FIG. 12 shows a display example of an application screen in which handwritten characters are placed in the note area of a PowerPoint file, according to the second example of the first embodiment.

In the application screen 50B of FIG. 12, two handwritten character strings 541a and 541b exist in the note area 54, and handwritten character portions 531a and 531b exist at two places in the document image 530B of the main body area 53. The symbols '★1' and '★2' are placed in front of the handwritten character portions 531a and 531b of the main body area 53 that correspond to the handwritten character strings 541a and 541b of the note area 54.
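
The 'symbol + serial number' scheme can be sketched as below. The function name `number_handwritten_notes` and its return shape are assumptions for illustration; the description only requires that the marker in the main body area and the marker in the note area match. Using the bare symbol for a single string follows the first example (FIG. 11).

```python
from typing import List, Tuple


def number_handwritten_notes(handwritten_strings: List[str],
                             symbol: str = "★") -> Tuple[List[str], List[str]]:
    """Return (body markers, note lines) for the given handwritten strings.

    With one string the bare symbol is used; with several, a serial number
    is appended so each body marker pairs with exactly one note line.
    """
    single = len(handwritten_strings) == 1
    body_markers = []
    note_lines = []
    for index, text in enumerate(handwritten_strings, start=1):
        marker = symbol if single else f"{symbol}{index}"
        body_markers.append(marker)             # goes into the handwritten character portion
        note_lines.append(f"{marker} {text}")   # goes into the note area
    return body_markers, note_lines
```

The same marker string appears in both outputs, which is what lets a reader match a position in the main body area to its text in the note area.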

In addition, a character string added to the main body area 53 may be given a different decoration, such as a different character color, character size, character weight, or italics, so that it can be distinguished from other information originally present in the main body area 53 (printed character strings and the like). In the example of FIG. 12, the added character strings are displayed in italics.

[Effect of the first embodiment]
According to the present invention configured as described above, in processing that converts a paper document containing both printed and handwritten characters into an electronic image and vectorizes that image into a clean-copy file format, printed characters and handwritten characters can be easily distinguished.

<2. Second embodiment>
As another example of a supplementary attribute of the structured data according to the first embodiment, the following describes the case where handwritten characters are converted into the comment attribute of Microsoft Office PowerPoint.

[First example]
FIG. 13 shows a display example of an application screen in which handwritten characters are placed in the comment area of a PowerPoint file, according to the first example of the second embodiment.

In the application screen 70A of FIG. 13, a transparent object 731a is placed in the portion of the document image 730A in the main body area 73 where the handwritten characters existed (handwritten character portion 531). In addition, the handwritten character string "Too late!" is added to the comment field 751 of the comment area 75 as a comment associated with the transparent object 731a. Pressing the comment display button 521 with a mouse pointer or the like displays the comment area 75.

The transparent object 731a may be an invisible character code (a space or a tab) or a transparent graphic object.

Although the user name of the client terminal 1 (KUROKI) is displayed to the left of "Too late!" in the comment field 751, the name of the user who added the handwritten comment may instead be entered afterward by operating the operation unit 26. Alternatively, no user name need be displayed in the comment field 751.

When the comment function is used, the application automatically displays a mark 732a (a symbol, figure, or the like), indicating that a comment exists in the comment area 75, at the corresponding place in the main body area 73, which makes it possible to show the relationship with the handwritten character string in the comment field 751. The corresponding place in the main body area 73 is the position corresponding to the handwritten character portion 531.

In FIG. 13, the transparent object 731a and the mark 732a are placed in the document image 730A of the main body area 73, and the application thereby indicates that a comment exists there. Also in FIG. 13, the handwritten character string is added to the comment field 751 of the comment area 75 as a comment associated with the transparent object 731a and the mark 732a.

(File format structure example)
FIG. 14 shows a structural example of the file format of a typical PowerPoint file that includes a comment attribute.

The transparent object 731a of the main body area 73 shown in FIG. 13 is described in 'slide1.xml' in the 'slides' folder 61 (main body attribute directory), and the comment of the comment area 75 is described in 'comment1.xml' in the 'comments' folder 63 (comment attribute directory).

[Second example]
The transparent object placed in the main body area 73 of FIG. 13 may instead be a visible character string indicating that a handwritten character string was present.

FIG. 15 shows a display example of an application screen in which handwritten characters are placed in the comment area of a PowerPoint file, according to the second example of the second embodiment.

In the application screen 70B of FIG. 15, an object 731b consisting of the visible character string 'handwriting present' is placed next to the mark 732a in the handwritten character portion 531 of the document image 730B in the main body area 73.

Placing the object 731b consisting of a visible character string in the main body area 73 in this way reduces the likelihood that the user will overlook the fact that a handwritten character string was there.

[Third example]
FIG. 16 shows a display example of an application screen in which handwritten characters are placed in the comment area of a PowerPoint file, according to the third example of the second embodiment.

In the application screen 70C of FIG. 16, a mark 732c is placed at the upper right of the object 731b consisting of the visible character string 'handwriting present' in the handwritten character portion 531 of the document image 730C in the main body area 73.

[Fourth example]
FIG. 17 shows a display example of an application screen in which handwritten characters are placed in the comment area of a PowerPoint file, according to the fourth example of the second embodiment.

In the application screen 70D of FIG. 17, only the mark 732a is displayed in the handwritten character portion 531 of the document image 730D in the main body area 73, and either no object is placed there or the transparent object 731a (FIG. 13) is placed there. The comment field 751 of the comment area 75 then displays the handwritten comment "Too late!" together with a visible character string 751d, 'handwriting present', indicating the existence of a handwritten character string.

As a result, even when only the mark 732a is displayed in the main body area 73 and no visible character string indicating the existence of a handwritten character string is displayed there, the user can recognize that handwritten characters existed by looking at the visible character string 751d 'handwriting present' in the comment area 75.

[Fifth example]
FIG. 18 shows a display example of an application screen in which the handwritten character information displayed in the comment area of a PowerPoint file has been deleted, according to the fifth example of the second embodiment.

In the application screen 70E of FIG. 18, the same document image 730D as in FIG. 17 is displayed in the main body area 73, but the comment in the comment field 751 of the comment area 75 has been deleted.

XML-based document creation applications such as Microsoft Office provide a batch deletion function that deletes the character strings in a supplementary area (note area or comment area) all at once. When handwritten character strings have been placed in the supplementary area, using this batch deletion function makes it easy to meet a request to distribute the document without the handwritten character strings. This example is also applicable to the first embodiment.

<3. Third embodiment>
PDF and Microsoft Word also have a similar comment function (annotation function). By using this function when converting to the respective format, the same effects as in the first and second embodiments can be obtained.

FIG. 19 shows a display example of an application screen in which handwritten characters are placed in the annotation area of a PDF file, according to the third embodiment.

In the application screen 70E of FIG. 19, an annotation icon 81 consisting of a figure, symbol, or the like is placed in the handwritten character portion 531 of the document image 80, which corresponds to the main body area, and a pop-up window 82 linked to the annotation icon 81 is displayed. The pop-up window 82 is the supplementary area corresponding to the supplementary attribute. The handwritten character string 'Too late!' is placed in the pop-up window 82.

In the case of PDF, this can be achieved by registering the handwritten character string (annotation data) in the Annots array of the page object of each page.
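
As a rough illustration of what an entry in a page's Annots array looks like, the sketch below formats a PDF text annotation dictionary as a string. It is not a PDF writer: the function names are hypothetical, and a real file would still need the dictionary to be written as an object and referenced from the page's `/Annots` array. The `/Type`, `/Subtype`, `/Rect`, and `/Contents` keys and the UTF-16BE text string with a leading byte order mark are standard PDF constructs.

```python
from typing import Tuple


def pdf_text_string(text: str) -> str:
    """Encode text as a PDF hexadecimal string (UTF-16BE with byte order mark)."""
    encoded = ("\ufeff" + text).encode("utf-16-be")
    return "<" + encoded.hex().upper() + ">"


def text_annotation(rect: Tuple[int, int, int, int], contents: str) -> str:
    """Format a text annotation dictionary for a handwritten character string.

    rect is (x1, y1, x2, y2) in page coordinates, i.e. where the annotation
    icon is placed over the former handwritten character portion.
    """
    x1, y1, x2, y2 = rect
    return (
        "<< /Type /Annot /Subtype /Text "
        f"/Rect [{x1} {y1} {x2} {y2}] "
        f"/Contents {pdf_text_string(contents)} >>"
    )
```

The hexadecimal form is used here so that non-ASCII strings such as the handwritten comment in FIG. 19 can be carried without escaping.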

According to the third embodiment described above, the same effects as those of the first and second embodiments are obtained even when a PDF file is used.

Even in the case of PDF, an object consisting of a visible character string (for example, 'handwritten characters present') may be placed near the annotation icon 81.

<4. Fourth embodiment>
The application that implements this structured data conversion function (on the client terminal 1, the image forming apparatus 3, or the like) is given a UI on which the output form can be selected. The user can then choose among options for the handwritten character string such as "place in the note area", "place in the comment area", "place in the main body area", and "delete completely and place nowhere".

FIG. 20 shows the functional configuration of the computer 20A included in the image processing apparatus according to the fourth embodiment. Here, an example in which the image processing apparatus is applied to the client terminal 1 is described.

The computer 20A of FIG. 20 differs from the computer 20 of FIG. 3 in that it includes an output form setting unit 90.

The output form setting unit 90 displays, on the display unit 25, a format selection screen (not shown) that prompts the user to select the format (data format) of structured data into which the electronic data obtained by electronic imaging means such as the scanner 4 is to be converted. Next, when the selected format offers a plurality of output forms (arrangements of the handwritten character string), the output form setting unit 90 displays an output form selection screen (not shown) for selecting among them. The output form setting unit 90 then outputs the handwritten character string to the application screen in the output form selected by the user.

When "delete completely and place nowhere" is selected, the output form setting unit 90 does not reflect the added information, including information other than handwritten characters (for example, an arrow determined to be the graphic area 42), in the converted structured data during vectorization. This makes it possible to share and distribute the document to other users without the handwritten information.
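
The four choices offered by the output form setting unit 90 could be modeled as an enumeration with a small dispatch function. The names `OutputForm` and `place_handwritten` and the dictionary-based result are assumptions made for this sketch; the description does not specify how the selection is represented internally.

```python
from enum import Enum
from typing import Dict, List


class OutputForm(Enum):
    """User-selectable placement of handwritten character strings."""
    NOTE_AREA = "note"
    COMMENT_AREA = "comment"
    BODY_AREA = "body"
    DISCARD = "discard"  # "delete completely and place nowhere"


def place_handwritten(texts: List[str], form: OutputForm) -> Dict[str, List[str]]:
    """Return the handwritten strings grouped by the area they are placed in."""
    placed: Dict[str, List[str]] = {"body": [], "note": [], "comment": []}
    if form is OutputForm.DISCARD:
        # Nothing handwritten is carried into the converted structured data.
        return placed
    placed[form.value].extend(texts)
    return placed
```

Keeping the discard case as an explicit enumeration member, rather than as an empty selection, makes the "share without handwritten information" path a deliberate choice in the UI.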

<5. Others>
Note that the image processing apparatus can also be applied to electronic imaging means such as the scanner 4 and the camera 5, provided they are equipped with a computer such as that shown in FIG. 2.

Furthermore, the present invention is not limited to the embodiments described above, and various other applications and modifications are of course possible without departing from the gist of the present invention as set forth in the claims.

For example, the embodiments above describe the configurations of the apparatus and system in detail and concretely in order to explain the present invention clearly, and the invention is not necessarily limited to configurations that include every element described. Part of the configuration of one embodiment may be replaced with the configuration of another embodiment, and the configuration of another embodiment may be added to the configuration of one embodiment. For part of the configuration of each embodiment, other configurations may also be added, deleted, or substituted.

Some or all of the configurations, functions, processing units, processing means, and the like described above may be implemented in hardware, for example by designing them as integrated circuits. The configurations, functions, and the like described above may also be implemented in software, with a processor interpreting and executing programs that realize the respective functions. Information such as the programs, tables, and files that realize each function can be stored in a memory, a recording device such as a hard disk or SSD (Solid State Drive), or a recording medium such as an IC card, SD card, or DVD.

The control lines and information lines shown are those considered necessary for the explanation, and not all control lines and information lines of a product are necessarily shown. In practice, almost all of the components may be considered to be interconnected.

In this specification, the processing steps that describe time-series processing include not only processing performed in time series in the described order but also processing that is executed in parallel or individually (for example, parallel processing or object-based processing) without necessarily being processed in time series.

1 ... client terminal, 4 ... scanner, 5 ... camera, 10 ... system, 21 ... CPU, 27 ... non-volatile storage, 31 ... input image receiving unit, 33 ... area determination unit, 34 ... character type determination unit, 35 ... character encoding processing unit, 36 ... structured data conversion unit, 37 ... handwritten character replacement unit, 39 ... structured data output unit, 40 ... printed character area, 41 ... handwritten area, 42 ... graphic area, 43 ... character area, 43A ... handwritten character portion, 53 ... main body area, 54 ... note area, 61 ... main body attribute folder, 62 ... note attribute folder, 63 ... comment attribute folder, 73 ... main body area, 74 ... supplementary area, 75 ... comment area, 361 ... main body attribute conversion unit, 362 ... supplementary attribute conversion unit, 521 ... comment display button, 531, 531a, 531b ... handwritten character portion, 541, 541a, 541b ... handwritten character string, 751 ... comment field, P ... paper document, D ... image data

Claims

An image processing apparatus comprising:
an area determination unit that analyzes image data obtained by reading a paper document with handwritten additions and distinguishes between character areas and non-character areas included in the image data;
a character type determination unit that analyzes a character area determined by the area determination unit and determines whether the characters in the character area are printed characters or handwritten characters;
a character encoding processing unit that converts the printed characters and the handwritten characters into character codes;
a structured data conversion unit that converts the non-character areas determined by the area determination unit and the character areas containing the character-encoded printed characters into a main body attribute of structured data;
a supplementary attribute conversion unit that converts the character-encoded handwritten characters into a supplementary attribute of the structured data; and
a handwritten character replacement unit that replaces the handwritten character portion, in the main body area corresponding to the main body attribute, where the handwritten characters determined by the character type determination unit existed, with information indicating that the handwritten characters existed,
wherein the same information as the information indicating that the handwritten characters existed, which is placed in the main body area, is also placed in the supplementary area corresponding to the supplementary attribute.

An area determination unit that analyzes the image data obtained by reading a handwritten paper document and discriminates between the character area and the non-character area included in the image data.
A character type determination unit that analyzes the character area determined by the area determination unit and determines whether the character existing in the character area is a printed character or a handwritten character.
A character encoding processing unit that converts the printed characters and the handwritten characters into character codes,
A structured data conversion unit that converts an area other than the character determined by the area determination unit and a character area including the character-encoded printed character into the main body attribute of the structured data.
A supplementary attribute conversion unit that converts the character-encoded handwritten characters into supplementary attributes of the structured data, and
A handwritten character replacement unit that replaces the handwritten character portion of the main body area corresponding to the main body attribute, in which the character type determination unit has determined that a handwritten character exists, with information indicating that the handwritten character exists.
An image processing device comprising the above units, wherein information indicating relevance to the information indicating the existence of the handwritten character is arranged in at least one of the main body area and the supplementary area corresponding to the supplementary attribute, and, when a plurality of the handwritten characters are present, information indicating a correspondence relationship is added to each of the pieces of information indicating the existence of the plurality of handwritten characters and to each of the pieces of information indicating relevance to them.
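
As a non-limiting illustration of the correspondence information recited above, the following Python sketch numbers each handwritten note so that the marker left in the main body area and the entry placed in the supplementary area can be matched. The function name `number_handwritten_notes` and the `*1`, `*2` label format are assumptions made for this example only.

```python
def number_handwritten_notes(notes, marker="*"):
    """Return (body_markers, supplementary_entries) for a list of
    character-encoded handwritten notes, in reading order."""
    body_markers = []
    supplementary_entries = []
    for index, text in enumerate(notes, start=1):
        label = f"{marker}{index}"  # shared label = correspondence information
        body_markers.append(label)
        supplementary_entries.append(f"{label} {text}")
    return body_markers, supplementary_entries


body, supplement = number_handwritten_notes(["check Q3", "ask Sato"])
print(body)        # ['*1', '*2']
print(supplement)  # ['*1 check Q3', '*2 ask Sato']
```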

The image processing apparatus according to claim 1, wherein the information indicating the presence of the handwritten character in the main body area is decorated so as to be distinguishable from other information existing in the main body area.

The image processing apparatus according to claim 2, wherein the information indicating the existence of the handwritten character in the main body area and the information indicating the relevance to that information are decorated so as to be distinguishable from other information existing in the main body area.

The image processing apparatus according to claim 1 or 2, wherein the supplementary area corresponding to the supplementary attribute is an area defined by the comment attribute of the structured data or an area defined by the note attribute used in document creation software.

The image processing apparatus according to claim 5, wherein the structured data used in the document creation software is data described in the OOXML or ODF format.
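
As a non-limiting illustration of a comment attribute in OOXML, the following Python sketch builds the comments part of a WordprocessingML document with the standard library. The choice of WordprocessingML (rather than, for example, PresentationML), the helper names `w` and `build_comments_part`, and the author value are assumptions made for this example. A complete file would also need the package relationships, content types, and comment references in the main document part, all of which are omitted here.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(tag):
    """Qualify a tag or attribute name with the WordprocessingML namespace."""
    return f"{{{W_NS}}}{tag}"


def build_comments_part(notes, author="scanner"):
    """Build a <w:comments> element holding one <w:comment> per note."""
    root = ET.Element(w("comments"))
    for comment_id, text in enumerate(notes):
        comment = ET.SubElement(
            root, w("comment"), {w("id"): str(comment_id), w("author"): author}
        )
        paragraph = ET.SubElement(comment, w("p"))
        run = ET.SubElement(paragraph, w("r"))
        ET.SubElement(run, w("t")).text = text  # character-encoded handwriting
    return root


root = build_comments_part(["check Q3", "ask Sato"])
print(ET.tostring(root, encoding="unicode"))
```

The `w:id` value plays the role of the correspondence information: the main document part would refer to the same id at the position where the handwritten character was replaced.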

The image processing apparatus according to claim 1 or 2, wherein, when the structured data is data described in PDF (registered trademark), the annotation function defined by the Annots array is used for the supplementary attribute.
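
As a non-limiting illustration of the Annots array, the following Python sketch emits a PDF text annotation object and the page-dictionary entry that refers to it. The helper names, object numbers, and rectangle coordinates are assumptions made for this example. Only ASCII text is handled, and the rest of the PDF file (page objects, cross-reference table, trailer) is omitted.

```python
def pdf_literal_string(text):
    """Wrap ASCII text as a PDF literal string, escaping \\, ( and )."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def annotation_object(object_number, rect, contents):
    """Return an indirect object holding a /Text annotation dictionary."""
    x1, y1, x2, y2 = rect
    dictionary = (
        "<< /Type /Annot /Subtype /Text "
        f"/Rect [{x1} {y1} {x2} {y2}] "
        f"/Contents {pdf_literal_string(contents)} >>"
    )
    return f"{object_number} 0 obj\n{dictionary}\nendobj"


def annots_entry(object_numbers):
    """Return the page-dictionary entry listing annotations by reference."""
    references = " ".join(f"{number} 0 R" for number in object_numbers)
    return f"/Annots [ {references} ]"


print(annotation_object(12, (100, 700, 120, 720), "check Q3"))
print(annots_entry([12, 13]))  # /Annots [ 12 0 R 13 0 R ]
```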

The image processing apparatus according to any one of claims 1 to 7, wherein the information indicating the existence of the handwritten character is at least one of a symbol, a character string, a figure, and an image.

A procedure for analyzing image data obtained by reading a paper document with handwritten additions and determining a character area and a non-character area included in the image data,
A procedure for analyzing the determined character area and determining whether a character existing in the character area is a printed character or a handwritten character,
A procedure for converting the printed characters and the handwritten characters into character codes,
A procedure for converting the determined non-character area and the character area including the character-encoded printed characters into the main body attribute of the structured data,
A procedure for converting the character-encoded handwritten characters into the supplementary attribute of the structured data,
A procedure for replacing the handwritten character portion of the main body area corresponding to the main body attribute, in which the handwritten character exists, with information indicating that the handwritten character exists, and
A procedure for arranging, in the supplementary area corresponding to the supplementary attribute, the same information as the information indicating that the handwritten character exists that is arranged in the main body area.
A program that causes a computer to execute the above procedures.

A procedure for analyzing image data obtained by reading a paper document with handwritten additions and determining a character area and a non-character area included in the image data,
A procedure for analyzing the determined character area and determining whether a character existing in the character area is a printed character or a handwritten character,
A procedure for converting the printed characters and the handwritten characters into character codes,
A procedure for converting the determined non-character area and the character area including the character-encoded printed characters into the main body attribute of the structured data,
A procedure for converting the character-encoded handwritten characters into the supplementary attribute of the structured data,
A procedure for replacing the handwritten character portion of the main body area corresponding to the main body attribute, in which the handwritten character exists, with information indicating that the handwritten character exists, and
A procedure for arranging information indicating relevance to the information indicating the existence of the handwritten character in at least one of the main body area and the supplementary area corresponding to the supplementary attribute and, when a plurality of the handwritten characters are present, adding information indicating a correspondence relationship to each of the pieces of information indicating the existence of the plurality of handwritten characters and to each of the pieces of information indicating relevance to them.
A program that causes a computer to execute the above procedures.