JPH0612540B2 - Document creation support device

Info

Publication number: JPH0612540B2
Application number: JP2172326A
Authority: JP
Inventors: 淳一大住
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 1990-06-28
Filing date: 1990-06-28
Publication date: 1994-02-16
Anticipated expiration: 2009-02-16
Also published as: GB2247803B; GB9113488D0; DE4121564A1; JPH0460759A; DE4121564C2; GB2247803A

Description

[Detailed Description of the Invention]

[Field of Industrial Application]

The present invention relates to document creation devices such as word processors and workstations, and more particularly to a document creation support device that electronically creates the layout of a document having a complex layout structure in which characters, figures, images, and the like are mixed.

[Conventional technology]

When a document is created with a document creation device such as a word processor or a workstation, it is sometimes desirable to include in a single document, in addition to characters, vector graphics defined by vector information such as straight lines and curves, or bitmap images defined by bit patterns, such as images read by an image reader. When a document mixing characters, vector graphics, bitmap images, and the like is created, the device processes the data for each of these differently. Therefore, to improve processing efficiency, special areas called character frames, graphic frames, and image frames are set in the document, and when the document is created or edited, the attribute of the area being processed is determined and processing appropriate to that attribute is performed. In this specification, the term "image" in the broad sense includes characters, vector graphics, bitmap images, and the like, and in the narrow sense refers only to bitmap images.

When creating such a mixed document, the document creator has had to set frames of the appropriate attributes in the document while considering the layout, and then adjust their sizes and positions.

[Problems to be Solved by the Invention]

However, frames must be set and adjusted separately for character frames, graphic frames, and image frames, and the same work has to be repeated when there are several frames of the same attribute. Document creation is therefore cumbersome and time-consuming. In particular, even when a similar document already exists and the creator wants to use the same layout, the creator must enter everything item by item, which is not only laborious but also mentally taxing.

One conceivable way to improve this is to create many documents with basic layouts in advance, then search for and copy one whenever needed and enter the content.

With this method, however, the necessary layouts must be registered in advance, and the initial input work is very laborious. Moreover, documents with every required layout must be registered, which requires an enormous storage capacity. Using a registered layout also requires searching for the desired one, and the search becomes difficult when many layouts are registered.

The present invention solves these problems. When a document with a layout similar to the one to be created is available, it automatically generates the layout of the various frames on the spot and thereby improves the efficiency of document creation.

[Means for Solving the Problems]

To achieve the above object, the document creation support device of the present invention is used in a document creation device that, when creating a document, sets different areas according to the attributes of the images contained in the document to be created and performs different processing on each area. The document creation support device comprises an image input unit that inputs a template document as an image; a layout analysis unit that determines the attributes of the images in the input document and, based on the differences among these attributes, extracts layout information indicating the size, position, and attribute of each area; and a layout generation instruction unit that, based on the obtained layout information, instructs that areas similar to those of the template document be set in the document to be created.

The layout analysis unit can be composed of a region analysis unit that extracts character regions, graphic regions, and image regions from the document image; a character part analysis unit that analyzes the attributes of the character part within a character region; and a graphic part analysis unit that analyzes the graphic part within a graphic region, extracts the line components of the figure, and outputs them as vector data.

[Action]

In the present invention, when the layout of a document is set, an image of a template document is input and the layout analysis unit identifies the type of each of its regions. For example, the circumscribed rectangles of the connected images contained in the document are obtained, and character regions, graphic regions, and image regions are distinguished by the size, arrangement, and so on of these rectangles. Based on the analyzed regions, corresponding regions are set in the newly created document in response to instructions from the layout generation instruction unit. The new document therefore has its regions set in the same layout as the template document.

[Embodiment]

The features of the present invention are described concretely below on the basis of an embodiment, with reference to the drawings.

FIG. 1 shows the configuration of a document creation device, such as a word processor, to which the document creation support device of the present invention is applied. The document creation support device 1 comprises an image input unit 2 that inputs a document as an image, a layout analysis unit 3 that extracts the document layout from the input image, and a layout generation instruction unit 4 that instructs the generation of graphic frames, character frames, image frames, and the like according to the analyzed layout. The layout specified by the layout generation instruction unit 4 is rendered by a layout allocation unit 5 in a form the user can actually see. Data entered from a keyboard/mouse 6 is converted into characters or figures and taken in by a character/figure input unit 7, placed by a content allocation unit 8 into the frames laid out earlier, and displayed on a display unit 9. When a layout is entered manually from the keyboard/mouse 6, a layout coordinate/attribute input unit 10 converts the entered data and sends it to the layout generation instruction unit 4.

FIG. 2 shows the configuration of the layout analysis unit 3. A region analysis unit 11 divides the document image input from the image input unit 2 into character, vector graphic, and bitmap image regions. For character regions, a character part analysis unit 12 extracts data such as the location of the character region, the character size, the line spacing, and the line direction (character frame / character part attribute data). For graphic regions, a graphic part analysis unit 13 detects vertical and horizontal lines such as figure borders, table borders, ruled lines, and column dividers, and outputs them as vector data (graphic frame / ruled line data). Image parts are output as frame data.

Next, the operation of the layout analysis unit 3 is described.

First, the original image that serves as the template is input through the image input unit 2 (see FIG. 1), and the binarized document image is analyzed by the region analysis unit 11. The region analysis method is not limited to any particular one. One method is to take the circumscribed rectangle enclosing each connected image and to classify the rectangles by their size, arrangement, and so on.
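As an illustration of the circumscribed-rectangle approach mentioned above, the following Python sketch collects the bounding rectangle of each connected group of black pixels in a binarized page. The function name, the list-of-rows bitmap representation, and the use of 8-connectivity are assumptions made for this example; the specification does not prescribe them.

```python
from collections import deque


def bounding_rectangles(bitmap):
    """Return (top, left, bottom, right) for each connected group of black pixels.

    `bitmap` is a list of rows in which 1 is black and 0 is white.
    Coordinates are inclusive. 8-connectivity is an assumption of this sketch.
    """
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    rects = []
    for y in range(height):
        for x in range(width):
            if bitmap[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill over one connected component,
            # growing its bounding rectangle as pixels are visited.
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny in range(cy - 1, cy + 2):
                    for nx in range(cx - 1, cx + 2):
                        if (0 <= ny < height and 0 <= nx < width
                                and bitmap[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            rects.append((top, left, bottom, right))
    return rects
```

Rectangles are returned in the order in which their first pixel is met during a top-to-bottom, left-to-right scan.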

FIG. 3 shows an example of the region analysis procedure.

First, circumscribed rectangles are extracted (step 101). Rectangles that overlap one another are merged by replacing them with a single circumscribed rectangle that encloses them all (step 102). A circumscribed rectangle whose height or width exceeds the character size can be identified as a bitmap image or a vector graphic (step 103). For example, if characters of up to 36 points are handled, anything whose height or width is greater than about 13 mm is regarded as a bitmap image or a vector graphic. Among the large circumscribed rectangles, those with an aspect ratio close to 1 and a high proportion of black pixels inside the rectangle can be identified as image regions (step 104). Here, an aspect ratio close to 1 means one in the range 1/3 to 3. The other large circumscribed rectangles can be regarded as graphic regions. Small circumscribed rectangles are judged to be characters, and this can be confirmed by examining the periodicity of the characters in the horizontal and vertical directions (step 105). A set of circumscribed rectangles whose horizontal and vertical character periods obtained in this way are nearly constant is extracted as a group of characters with the same attributes, that is, as a character region (step 106). This processing extracts the character, graphic, and image regions and yields the size and position of each.
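Steps 102 to 104 can be sketched roughly as follows in Python. Rectangles are given as (left, top, right, bottom) in millimetres. The function names and the `dense` threshold of 0.5 for a "high" black-pixel ratio are assumptions of this sketch; the text gives only the 36-point example and the 1/3 to 3 aspect-ratio range. The periodicity check of steps 105 and 106 is not covered here.

```python
POINT_MM = 25.4 / 72  # one typographic point (1/72 inch) in millimetres


def max_char_size_mm(max_points=36):
    """Size limit for characters: 36 pt is 12.7 mm, i.e. about 13 mm."""
    return max_points * POINT_MM


def overlaps(a, b):
    """True if rectangles a and b share any area (touching edges do not count)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def merge_overlapping(rects):
    """Step 102: replace overlapping rectangles by their enclosing rectangle."""
    rects = list(rects)
    merged = True
    while merged:
        merged = False
        for i in range(len(rects)):
            for j in range(i + 1, len(rects)):
                if overlaps(rects[i], rects[j]):
                    a, b = rects[i], rects[j]
                    rects[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del rects[j]
                    merged = True
                    break
            if merged:
                break  # restart, since the union may overlap other rectangles
    return rects


def classify(rect, black_ratio, char_limit_mm=13.0, dense=0.5):
    """Steps 103-104: label a rectangle as character, image, or graphic.

    `dense` (hypothetical value) is the black-pixel ratio treated as "high".
    """
    width = rect[2] - rect[0]
    height = rect[3] - rect[1]
    if width <= char_limit_mm and height <= char_limit_mm:
        return "character"
    if height > 0 and 1 / 3 <= width / height <= 3 and black_ratio >= dense:
        return "image"
    return "graphic"
```

A long, thin, sparsely filled rectangle such as a ruled line falls through to "graphic", which matches the treatment of the remaining large rectangles described above.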

In the region analysis described above, each region was identified by the size, aspect ratio, and so on of its circumscribed rectangle. Other algorithms can also be adopted, such as one that combines the marginal distribution method and the black-pixel connection method, as described in Kida et al., "Method of Constructing an Automatic Document Recognition System," Journal of the Institute of Image Electronics Engineers of Japan, Vol. 15, No. 2 (1986), pp. 107-115.

Next, the character part analysis unit 12 is described. This unit analyzes the internal structure of each region that the region analysis unit 11 extracted as a character region. Here too, the method is not limited to any particular one. One method uses the circumscribed rectangles extracted earlier and their periods. FIG. 4 shows an example of the general procedure.

First, the direction of the character strings is detected. Normally the spacing between characters is narrower than the spacing between lines. The gaps between rectangles are therefore extracted (step 201), and the direction, horizontal or vertical, in which the average gap between circumscribed rectangles is smaller is judged to be the direction in which the character strings run (step 202). That is, when the vertical gap is larger than the horizontal gap, the text is judged to be written horizontally and the line spacing is taken to be the vertical period (step 203); otherwise the text is judged to be written vertically and the line spacing is taken to be the horizontal period (step 204).

Next, the character size is obtained from the maximum width and the maximum height of the circumscribed rectangles in the region (step 205). The attributes within each character region can thus be determined.
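Steps 201 to 205 might be sketched as below. Projecting the rectangles onto each axis and merging overlapping intervals is one simple way to measure the gaps and is an assumption of this example, as are the function names and the returned dictionary. Rectangles are (left, top, right, bottom).

```python
from statistics import mean


def merged_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals, sorted by start."""
    result = []
    for start, end in sorted(intervals):
        if result and start <= result[-1][1]:
            result[-1] = (result[-1][0], max(result[-1][1], end))
        else:
            result.append((start, end))
    return result


def mean_gap(intervals):
    """Average white gap between consecutive merged intervals (0 if none)."""
    merged = merged_intervals(intervals)
    gaps = [b[0] - a[1] for a, b in zip(merged, merged[1:])]
    return mean(gaps) if gaps else 0.0


def mean_pitch(intervals):
    """Average start-to-start distance between merged intervals (0 if none)."""
    merged = merged_intervals(intervals)
    pitches = [b[0] - a[0] for a, b in zip(merged, merged[1:])]
    return mean(pitches) if pitches else 0.0


def analyze_text_region(rects):
    """Estimate writing direction, line pitch, and character size."""
    xs = [(r[0], r[2]) for r in rects]  # projections onto the horizontal axis
    ys = [(r[1], r[3]) for r in rects]  # projections onto the vertical axis
    horizontal_gap = mean_gap(xs)
    vertical_gap = mean_gap(ys)
    if vertical_gap > horizontal_gap:
        # Lines are farther apart vertically, so the text runs horizontally.
        direction, line_pitch = "horizontal", mean_pitch(ys)
    else:
        direction, line_pitch = "vertical", mean_pitch(xs)
    # Step 205: character size from the largest width and height.
    char_size = (max(r[2] - r[0] for r in rects),
                 max(r[3] - r[1] for r in rects))
    return {"direction": direction, "line_pitch": line_pitch,
            "char_size": char_size}
```

For two lines of three 10 x 10 characters with a 2-unit character gap and an 8-unit line gap, the sketch reports horizontal writing with a line pitch of 18.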

Next, the graphic part analysis unit 13 analyzes the regions that the region analysis unit 11 judged to be graphics and vectorizes the line drawings. It would be possible to vectorize every detected line drawing, but here only long vertical and horizontal line segments are extracted, since these are structurally important and considered highly reusable: frame information, ruled lines, table borders made up of vertical and horizontal straight lines, column dividers, and the like. FIG. 5 shows an example of the analysis procedure.

To extract vertical and horizontal lines, black pixels are traced horizontally and vertically, and only runs of at least a predetermined length are kept (steps 301 and 302). The resulting vertical and horizontal lines are then converted into vector information expressed by start point, end point, thickness, and so on (step 303). Among the vectorized lines, a line segment that exists on its own is regarded as a ruled line or a column divider (step 304). A part made up of several horizontal and vertical lines combined is regarded as a table (step 305), and horizontal and vertical lines that together form a single rectangle are regarded as the border of a figure (step 306). Anything else is regarded as a general figure.
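The run tracing and vectorization of steps 301 to 303 could look roughly like this in Python. The bitmap representation (a list of rows, 1 for black), the function names, and the rule that runs on adjacent scan lines with identical extent form one thicker line are assumptions of this sketch. The classification of steps 304 to 306 is not included.

```python
def horizontal_segments(bitmap, min_length):
    """Keep horizontal runs of black pixels at least `min_length` long.

    Returns (y, x_start, x_end) tuples with x_end inclusive.
    """
    segments = []
    for y, row in enumerate(bitmap):
        x = 0
        while x < len(row):
            if row[x] == 1:
                start = x
                while x < len(row) and row[x] == 1:
                    x += 1
                if x - start >= min_length:
                    segments.append((y, start, x - 1))
            else:
                x += 1
    return segments


def vertical_segments(bitmap, min_length):
    """Same scan on the transposed bitmap; returns (x, y_start, y_end)."""
    transposed = [list(column) for column in zip(*bitmap)]
    return horizontal_segments(transposed, min_length)


def vectorize(segments):
    """Merge runs on adjacent scan lines with the same extent into one line.

    Each result has the scan position of its first run, the start and end
    along the line, and a thickness counted in scan lines.
    """
    vectors = []
    for pos, start, end in sorted(segments, key=lambda s: (s[1], s[2], s[0])):
        last = vectors[-1] if vectors else None
        if (last and last["start"] == start and last["end"] == end
                and last["pos"] + last["thickness"] == pos):
            last["thickness"] += 1
        else:
            vectors.append({"pos": pos, "start": start, "end": end,
                            "thickness": 1})
    return vectors
```

Short runs, such as the strokes of characters, are dropped by the `min_length` filter, so only the long rules survive to be vectorized.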

This completes the layout analysis: the image of the original read by the image input unit 2 has been separated into character regions, image regions, and graphic regions. Graphic regions also include tables and lines. For character regions, attributes such as the character size, line spacing, and line direction are also attached. Based on this information, the layout generation instruction unit 4 shown in FIG. 1 specifies the layout to the layout allocation unit 5, which allocates each region.

For example, when a template document containing a character part a, a bitmap image part b, a graphic part c, a divider line d, a table e, and so on, as shown in FIG. 6(a), is read by the image input unit 2 and layout analysis is performed on the image, a result like that shown schematically in FIG. 6(b) is obtained. In FIG. 6(b), A is a character region, B is an image region, C is a graphic region, D is a divider line, and E is a table structure. Because the regions found by the layout analysis are displayed on the screen of the display unit 9 on which the document is shown, the document creator can produce a document with the desired layout simply by using the keyboard/mouse 6 to designate the region in which characters or figures are to be entered and then entering the desired content in each region.

When a bitmap image is to be embedded directly in the document, the image data input from the image input unit 2 can be sent to the content allocation unit 8 and composited there.

The document creator may also modify the regions generated by the layout analysis, edit them into the desired layout, and then enter the content.

Although the above embodiment was described using document creation as an example, the invention is applicable not only to documents but also to the creation of predetermined forms such as slips.

In this embodiment, the table portion merely reproduces the vertical and horizontal lines from the vector data obtained by the analysis. However, if the document creation device can generate and manage table structures, the table can instead be output as information indicating its attributes, such as the number of rows and columns, rather than as raw vector data.
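One way such table attributes might be derived from the vector data is sketched below. It assumes a fully ruled grid in which every rule spans the whole table, so that the row and column counts are one less than the number of distinct horizontal and vertical rules. The function name and the `tolerance` parameter, which groups rules lying almost at the same position, are hypothetical.

```python
def table_shape(horizontal_rules, vertical_rules, tolerance=2):
    """Estimate (rows, columns) of a fully ruled table.

    horizontal_rules: y coordinates of the horizontal lines
    vertical_rules:   x coordinates of the vertical lines
    Rules within `tolerance` of the first rule of a group count as one rule.
    """
    def distinct(values):
        groups = []
        for value in sorted(values):
            if not groups or value - groups[-1] > tolerance:
                groups.append(value)
        return len(groups)

    rows = max(distinct(horizontal_rules) - 1, 0)
    columns = max(distinct(vertical_rules) - 1, 0)
    return rows, columns
```

Tables with merged cells or partial rules would need a more detailed analysis of where the lines intersect.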

In the above embodiment, the information extracted from the character part was limited to the character size, the line spacing, and vertical or horizontal writing. Differences in typeface or in language, for example between Japanese and English, may also be identified and added as attributes.

Furthermore, by performing character recognition on the character part and figure recognition on the graphic part, not only the layout information but also the content itself can be reused.

[Effect of the Invention]

As described above, the present invention automatically generates a layout from the image of a scanned original. When a document with a similar layout is created, the work of setting and correcting each region is therefore eliminated or simplified for the document creator, and the efficiency of document creation improves. In addition, because standard layout documents need not be created or registered in advance, no preparatory procedure is required, and no large-capacity storage device is needed.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a document creation device to which the document creation support device of the present invention is applied. FIG. 2 is a block diagram showing the configuration of the layout analysis unit. FIG. 3 is a flowchart showing an example of the region analysis procedure. FIG. 4 is a flowchart showing an example of the character part analysis procedure. FIG. 5 is a flowchart showing an example of the graphic part analysis procedure. FIGS. 6(a) and 6(b) are explanatory diagrams schematically showing an original image and an example of the layout analysis result.

1: Document creation support device
2: Image input unit
3: Layout analysis unit
4: Layout generation instruction unit
5: Layout allocation unit
6: Keyboard/mouse
7: Character/figure input unit
8: Content allocation unit
9: Display unit
10: Layout coordinate/attribute input unit
11: Region analysis unit
12: Character part analysis unit
13: Graphic part analysis unit

Claims

[Claims]

1. A document creation support device for use in a document creation device that, when creating a document, sets different areas according to the attributes of the images contained in the document to be created and performs different processing on each area, the document creation support device comprising: an image input unit that inputs a template document as an image; a layout analysis unit that determines the attributes of the images in the input document and, based on the differences among these attributes, extracts layout information indicating the size, position, and attribute of each area; and a layout generation instruction unit that, based on the obtained layout information, instructs that areas similar to those of the template document be set in the document to be created.

2. The document creation support device according to claim 1, wherein the layout analysis unit comprises: a region analysis unit that extracts character regions, graphic regions, and image regions from the document image; a character part analysis unit that analyzes the attributes of the character part within the character region; and a graphic part analysis unit that analyzes the graphic part within the graphic region, extracts the line components of the figure, and outputs them as vector data.