JPH05324709A - Bilingual image forming device

Info

Publication number: JPH05324709A
Application number: JP4151316A
Authority: JP
Inventors: Tetsuo Fujisawa; 哲夫藤沢; Takako Satou; 多加子佐藤
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1992-05-19
Filing date: 1992-05-19
Publication date: 1993-12-07

Abstract

PURPOSE: To improve translation efficiency by searching for translation information with a detected specific symbol removed when that symbol is at the end of a line, and with the symbol included when it is not.
CONSTITUTION: It is judged whether there is a hyphen, the specific symbol (S1001); if there is none, the process returns. If there is a hyphen, it is judged whether the divided word flag is set (S1002). If the flag is set, the translated word is composited at the center of the line below the recognized word (S1003), and the process returns. If the flag is not set, it is judged whether the number of characters before the hyphen is larger than the number of characters after it (S1004). If it is not larger, the translated word is composited on the line below the part following the hyphen (S1006). If the number of characters before the hyphen is larger, the translated word is composited on the line below the part preceding the hyphen (S1005). As a result, the position of the translated word relative to the word is made clear.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a bilingual image forming apparatus applicable to copying machines, facsimile machines, and similar equipment that have a bilingual function and use digital image data. More specifically, it relates to a bilingual image forming apparatus that optically reads a document (or receives image information via a communication line), recognizes words in the read (or received) image information, and outputs the translated word corresponding to each word.

[0002]

[Prior Art] A person reading text written in a foreign language often has no difficulty with most of the text, yet cannot read on simply because he or she does not know an appropriate translation for a few specific words. In such a situation, looking up the meaning of those words in a dictionary and then reading on is a common experience.

[0003] As a translating copier that automatically translates and outputs text written in a foreign language, there is, for example, the "translation copying apparatus" disclosed in Japanese Patent Laid-Open No. 62-154845. According to the technique disclosed in that publication, first, image reading means reads an original image; second, identifying means classifies the read original image into picture information and character information; and third, translating means recognizes the character information character by character, translates its content into another language, and outputs a translated image together with the original image. With the technique disclosed in that publication, a translated image of the character information is therefore obtained together with a copy of the original image, which saves the effort of using a dictionary.

[0004]

[Problems to Be Solved by the Invention] The prior-art "translation copying apparatus" described above does save the effort of using a dictionary, because a translated image of the character information is obtained together with a copy of the original image. However, for the many users who have a certain level of language knowledge, and considering that current translation technology is not entirely error-free and that implementation is costly, full translation is highly redundant and is not necessarily a desirable function.

[0005] Furthermore, in the prior art, for a document image with multiple columns, the position of the translated word corresponding to a word in the document image is not clear. Adding translated words also substantially changes the content of the output image, so the original layout breaks down and the copied image becomes very hard to read. In addition, compound words in which words are joined by special symbols such as "-", "&", and "/", and single complete words that have been divided at the end of a line and split across two lines, cannot be processed properly, which lowers translation efficiency.

[0006] The present invention has been made in view of the above problems. Its objects are to make clear the position of the translated word corresponding to a word in the document image, to lay out and output the content of the copied image so that it is easy to read even when adding translated words substantially changes the content of the output image, and to improve translation efficiency by properly processing compound words joined by special symbols and similar cases.

[0007]

[Means for Solving the Problems] The present invention provides a bilingual image forming apparatus comprising: image reading means for optically reading a document image and converting it into image information; image storage means for storing the image information read by the image reading means or image information input via a communication line; image processing means for performing various kinds of image processing on the image information; image recording means for recording an image on a recording medium based on the image signal output by the image processing means; translation storage means for storing the document image information and translated-word information for that information; and control means for performing bilingual processing on the document image information based on the information in the translation storage means. The control means detects a predetermined symbol and, when it judges that the symbol is at the end of a line, removes the predetermined symbol and searches the translated-word information; when it judges that the symbol is not at the end of a line, it searches the translated-word information with the predetermined symbol included.

[0008] It is also desirable that, when the control means has searched the translated-word information with the predetermined symbol removed, it compares the number of characters in the part before the predetermined symbol with the number of characters in the part after it, and composites the translated-word information below the part with the larger number of characters.

[0009] It is also desirable that, when the control means has searched the translated-word information with the predetermined symbol included and judges that the translation storage means contains no corresponding translated word, it removes the part after the last predetermined symbol and searches the translated-word information again.

[Operation] In the present invention, a special symbol is detected. When the symbol is judged to be at the end of a line, it is removed and the translated-word information is searched; when it is judged not to be at the end of a line, the translated-word information is searched with the special symbol included.
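The rule stated in this paragraph can be sketched as a small helper that builds the dictionary search key. This is only an illustration: the name `build_search_key`, its parameters, and the step of rejoining the fragment from the next line are assumptions, since the text states only that the symbol is removed at a line end and kept otherwise.

```python
def build_search_key(token: str, at_line_end: bool, continuation: str = "") -> str:
    """Build the search key for a recognized token that may end in a hyphen.

    token: recognized token, e.g. "trans-" or "electron-spin"
    at_line_end: True when the hyphen is the last character on its line
    continuation: assumed input, the fragment that starts the next line
    """
    if at_line_end and token.endswith("-"):
        # A hyphen at a line end marks a word split across lines:
        # drop the hyphen and rejoin the two fragments.
        return token[:-1] + continuation
    # A hyphen inside a line joins complete words: keep it in the key.
    return token


print(build_search_key("trans-", True, "lation"))   # translation
print(build_search_key("electron-spin", False))     # electron-spin
```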

[0010] When the translated-word information has been searched with the special symbol removed, the number of characters in the part before the special symbol is compared with the number of characters in the part after it, and the translated-word information is composited below the part with the larger number of characters.
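A minimal sketch of this comparison follows. The function name `placement_part` and its string return values are hypothetical. The tie-breaking follows the abstract (S1004 to S1006), where the part before the symbol is chosen only when it is strictly longer.

```python
def placement_part(before: str, after: str) -> str:
    """Return "before" or "after": the fragment under which the translation goes."""
    # Per the abstract, "before" wins only when it has strictly more characters;
    # otherwise (including a tie) the part after the symbol is used.
    return "before" if len(before) > len(after) else "after"


print(placement_part("transla", "tion"))   # before
print(placement_part("in", "formation"))   # after
```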

[0011] When the translated-word information has been searched with the special symbol included and it is judged that the dictionary contains no corresponding translated word, the part after the last special symbol is removed and the translated-word information is searched again.

[0012]

[Embodiment] An embodiment of the bilingual image forming apparatus according to the present invention is described below with reference to the drawings, in the following order:
(1) Schematic configuration of the bilingual image forming apparatus
(2) Configuration of the scanner unit and printer unit
(3) Processing operation of the CPU
(3)- Document recognition processing
(3)- Classification of special symbols
(3)- Paragraph, line, and page compositing processing
(3)- Translation dictionary search processing
(3)- Translated-word storage processing
(3)- Inset composite image generation processing
(3)- Translated-word compositing processing

[0013] (1) Schematic configuration of the bilingual image forming apparatus

FIG. 1 is a block diagram showing the schematic configuration of a bilingual image forming apparatus that is one embodiment of the present invention. The apparatus comprises: a scanner unit 101 that reads a document image; a CPU 109 that controls the entire apparatus; a ROM 110 that stores the control program; a RAM 111 used temporarily by the control program; an input image memory 104 that stores the read image; an output image memory 105 that stores the output image; a recognition dictionary memory 106 that stores data used for character recognition; a translation dictionary memory 107 that stores words and their corresponding translated words; an output font memory 108 that stores the character font data used for image generation; an internal system bus 115 that carries data between the components; an A/D converter 102 that converts analog signals into digital signals; I/Fs 103 and 113 that interface with the system bus 115; a printer unit 114 that outputs the image stored in the output image memory; an operation board 112 for giving instructions such as start and stop; and so on.

[0014] This bilingual image forming apparatus also has a facsimile communication function. Reference numeral 116 denotes a network control unit (NCU) that closes and opens the telephone line, sends dial signals, detects the ringing frequency, and so on. Reference numeral 117 denotes a modem (modulator-demodulator) that modulates data when sending it to the telephone line (an analog line) and demodulates data on reception. Reference numeral 118 denotes a communication control unit that performs processing before image information is sent to the telephone line, such as recompressing the data or reducing the image information to match the facsimile capabilities of the receiving side. Description of the other parts required for the facsimile communication function is omitted.

[0015] (2) Configuration of the scanner unit and printer unit

FIG. 2 is an explanatory diagram showing the schematic internal configuration of the scanner unit 101 and the printer unit 114. In the scanner unit 101, the contact glass 201 on which the document is placed is illuminated by light sources 202a and 202b, and light reflected from the document is imaged on the light-receiving surface of the CCD image sensor 209 via mirrors 203 to 207 and lens 208. The light source 202 (202a, 202b) and mirror 203 are mounted on a carriage 210 that moves beneath the contact glass 201, parallel to it, in the sub-scanning direction. Mirrors 204 and 205 are mounted on a carriage 211 that moves in the sub-scanning direction in conjunction with carriage 210 at half its speed. Scanning in the main scanning direction is performed by the solid-state scanning of the CCD image sensor 209. The document image is read by the CCD image sensor 209, and the whole document is scanned as the optical system moves as described above. Reference numeral 239 in the figure denotes a pressure plate for pressing the document.

[0016] The printer unit 114 consists of a laser writing system, an image reproducing system, and a paper feed system. The laser writing system includes a laser output unit 221, an imaging lens 222, and a mirror 223. Inside the laser output unit 221 are a laser diode serving as the laser light source and a polygon mirror rotated at high speed by a motor. The laser beam output from the laser writing system is directed onto the photosensitive drum 224 of the image reproducing system. Arranged around the photosensitive drum 224 are a charging charger 225, an eraser 226, a developing unit 227, a transfer charger 228, a separation charger 229, a separation claw 230, a cleaning unit 231, and so on. A beam sensor (not shown) that generates a main-scanning synchronization signal is placed near one end of the photosensitive drum 224, at a position irradiated by the laser beam.

[0017] The image reproduction process in the printer unit 114 is briefly as follows. The peripheral surface of the photosensitive drum 224 is uniformly charged to a high potential by the charging charger 225. When laser light strikes this surface, the potential of the irradiated portion drops. Because the laser light is switched ON/OFF according to the black/white of the image being recorded, the irradiation forms on the surface of the photosensitive drum 224 a potential distribution corresponding to the recorded image, that is, an electrostatic latent image. When the portion bearing the electrostatic latent image passes the developing unit 227, toner adheres according to the potential level, and the latent image becomes a visible toner image. A recording sheet 232 is fed from the paper feed cassette at a predetermined timing to the portion where the toner image has been formed, and overlaps the toner image. The toner image is transferred to the recording sheet 232 by the transfer charger 228, and the sheet is then separated from the photosensitive drum 224 by the separation charger 229 and the separation claw 230. The separated recording sheet 232 is conveyed by the conveyor belt 234, heated by the fixing roller 235, which has a built-in heater, and then ejected to the output tray 236.

[0018] In this embodiment, as shown in FIG. 2, the printer unit 114 has two paper feed systems. One feed system has an upper paper feed cassette 233a and a manual feed tray 233c; recording sheets 232a set in the upper paper feed cassette 233a or the manual feed tray 233c are fed by the feed roller 237a. The other feed system has a lower paper feed cassette 233b; recording sheets 232b in the lower paper feed cassette 233b are fed by the feed roller 237b. A recording sheet 232 fed by either feed roller stops temporarily in contact with the registration rollers 238 and is then sent to the transfer section of the photosensitive drum 224 at a timing synchronized with the progress of the recording process.

[0019] (3) Processing operation of the CPU

FIGS. 3 to 10 are flowcharts each showing a processing operation of the CPU 109. First, refer to the flowchart of FIG. 3. When the power is turned ON, the processing mode and other settings are initialized (S301). Next, it is judged whether a bilingual operation instruction has been given from the operation board 112 (S302). If no bilingual operation instruction has been given, a normal copying operation (S303) is performed, and the apparatus returns to waiting for a bilingual operation instruction (S302).

[0020] Conversely, if a bilingual operation instruction has been given, it is then judged whether a start instruction has been given from the operation board 112 (S304). When a start instruction is judged to have been given, the document image set on the scanner unit 101 is read (S305) and stored in the input image memory 104. Next, the number of document pages is counted (S306), and document recognition (S307) is executed.

[0021] (3)- Document recognition processing

FIG. 4 shows the document recognition processing (S307). Document recognition first executes column recognition (S401). Column recognition treats as a column boundary any place where, in either the main scanning direction or the sub-scanning direction, a portion of the read image data containing no characters (a white area) extends for at least a certain width. Next, column position recognition (S402) is executed. Column position recognition assigns pixel-unit coordinates to the image information read by the scanner unit 101 and stored in the input image memory 104, and finds the coordinates that each recognized column occupies in the image. For example, for the column beginning with "Chapter 1" in the document shown in FIG. 13, a rectangular region that completely contains that column is found as shown in FIG. 14, and the upper-left coordinates (bsx, bsy) and lower-right coordinates (bex, bey) of that region are recognized.
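The white-gap rule and the bounding rectangle can be illustrated with a projection profile. The sketch below is a simplification under stated assumptions: the image is a list of rows with 1 for ink and 0 for white, columns are split only along the x axis (the text applies the rule in both scanning directions), and the names `split_on_gaps`, `column_boxes`, and `min_gap` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of pixels: 1 = ink, 0 = white


def split_on_gaps(profile: List[int], min_gap: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) ranges of ink separated by white runs >= min_gap."""
    segments = []
    start = None      # first inked index of the segment being built
    last_ink = None   # most recent inked index
    for i, count in enumerate(profile):
        if count == 0:
            continue
        if start is None:
            start = i
        elif i - last_ink - 1 >= min_gap:
            # The white run since the last ink is wide enough: close the segment.
            segments.append((start, last_ink))
            start = i
        last_ink = i
    if start is not None:
        segments.append((start, last_ink))
    return segments


def column_boxes(image: Image, min_gap: int) -> List[Tuple[int, int, int, int]]:
    """Return (bsx, bsy, bex, bey) for each column found along the x axis."""
    x_profile = [sum(col) for col in zip(*image)]  # ink count per pixel column
    boxes = []
    for bsx, bex in split_on_gaps(x_profile, min_gap):
        # Tighten the rectangle vertically to the rows that contain ink.
        inked_rows = [y for y, row in enumerate(image) if any(row[bsx:bex + 1])]
        boxes.append((bsx, inked_rows[0], bex, inked_rows[-1]))
    return boxes


page = [
    [1, 1, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 1, 1],
]
print(column_boxes(page, min_gap=3))  # [(0, 0, 1, 1), (5, 0, 6, 2)]
```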

[0022] Next, line recognition (S403) is executed. Line recognition is performed for each recognized column, and treats as a line boundary any place where, in the sub-scanning direction, a portion of the read image data containing no characters (a white area) extends for at least a certain height. Next, line position recognition (S404) is executed. Line position recognition assigns pixel-unit coordinates to the image information read by the scanner unit 101 and stored in the input image memory 104, and finds the coordinates that each recognized line occupies in the image. For example, for the line beginning with "from a single" in the document shown in FIG. 13, a rectangular region that completely contains that line is found as shown in FIG. 14, and the upper-left coordinates (lsx, lsy) and lower-right coordinates (lex, ley) of that region are recognized.
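As a hedged illustration of line recognition inside one column, the sketch below scans the rows of a column rectangle and closes a line whenever a run of white rows reaches a threshold. The binary image representation, the function name `line_boxes`, and the `min_gap` parameter are assumptions made for this example.

```python
from typing import List, Tuple

Image = List[List[int]]            # rows of pixels: 1 = ink, 0 = white
Box = Tuple[int, int, int, int]    # (left x, top y, right x, bottom y)


def _tight_box(image: Image, bsx: int, bex: int, top: int, bottom: int) -> Box:
    """Shrink the horizontal extent to the inked pixels between rows top..bottom."""
    xs = [x for y in range(top, bottom + 1)
          for x in range(bsx, bex + 1) if image[y][x]]
    return (min(xs), top, max(xs), bottom)


def line_boxes(image: Image, column: Box, min_gap: int) -> List[Box]:
    """Return (lsx, lsy, lex, ley) for each text line inside one column box."""
    bsx, bsy, bex, bey = column
    boxes = []
    lsy = None        # top row of the line being built
    last_ink = bsy    # most recent row that contained ink
    blank_run = 0     # consecutive white rows since last_ink
    for y in range(bsy, bey + 1):
        if any(image[y][bsx:bex + 1]):
            if lsy is None:
                lsy = y
            elif blank_run >= min_gap:
                # Enough white rows passed: close the previous line.
                boxes.append(_tight_box(image, bsx, bex, lsy, last_ink))
                lsy = y
            last_ink = y
            blank_run = 0
        else:
            blank_run += 1
    if lsy is not None:
        boxes.append(_tight_box(image, bsx, bex, lsy, last_ink))
    return boxes


page = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
print(line_boxes(page, (0, 0, 3, 3), min_gap=1))  # [(1, 0, 2, 1), (0, 3, 3, 3)]
```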

[0023] Next, character recognition (S405) is executed. The character recognition algorithm uses a combination of conventional techniques such as template matching and topological matching. Next, word extraction (S406) is executed on the characters recognized by character recognition. Word extraction is performed line by line. A word boundary is taken wherever, in the main scanning direction, a portion of the read image data containing no characters (a white area) extends for at least a certain width, or wherever a delimiter such as a space character or punctuation mark occurs. Next, classification of special symbols (S407) is executed. After that, word position recognition (S408) is executed. Word position recognition assigns pixel-unit coordinates to the image data read by the scanner unit 101 and stored in the input image memory 104, and finds the coordinates that each recognized word occupies in the image. For example, for the word "This" in the document shown in FIG. 13, a rectangular region that completely contains "This" is found as shown in FIG. 14, and the upper-left coordinates (wsx, wsy) and lower-right coordinates (wex, wey) of that region are recognized.
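Word extraction as described here can be sketched over a list of already recognized characters. Everything concrete in the example is an assumption for illustration: the `(character, left x, right x)` glyph tuples, the delimiter set, the `min_gap` threshold, and the function names. Only horizontal coordinates are tracked, and hyphens are deliberately not treated as delimiters so that the later special-symbol handling can see them.

```python
from typing import List, Tuple

Glyph = Tuple[str, int, int]   # (character, left x, right x), ordered left to right
Word = Tuple[str, int, int]    # (text, wsx, wex)

DELIMITERS = set(" .,;:!?")    # assumed delimiter set; "-" is intentionally excluded


def _close(glyphs: List[Glyph]) -> Word:
    """Join glyphs into a word with its horizontal extent."""
    return ("".join(g[0] for g in glyphs), glyphs[0][1], glyphs[-1][2])


def extract_words(glyphs: List[Glyph], min_gap: int) -> List[Word]:
    """Split one line of glyphs into words at delimiters and at wide white gaps."""
    words: List[Word] = []
    current: List[Glyph] = []
    for glyph in glyphs:
        char, left, _ = glyph
        gap_too_wide = bool(current) and left - current[-1][2] - 1 >= min_gap
        if char in DELIMITERS or gap_too_wide:
            if current:
                words.append(_close(current))
            current = []
        if char not in DELIMITERS:
            current.append(glyph)
    if current:
        words.append(_close(current))
    return words


line = [("T", 0, 4), ("h", 6, 9), ("i", 11, 12), ("s", 14, 17),
        (" ", 18, 22), ("i", 23, 24), ("s", 26, 29), (".", 31, 32)]
print(extract_words(line, min_gap=5))  # [('This', 0, 17), ('is', 23, 29)]
```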

[0024] (3)- Classification of special symbols

FIG. 5 is a flowchart showing the special symbol classification operation. First, line-break position recognition is executed (S501); next, character recognition is executed (S502); and then it is judged whether there is a hyphen (S503). If it is judged that there is no hyphen, the process returns to step 502, and steps 502 to 503 are repeated until a hyphen is found. Conversely, if it is judged that there is a hyphen, it is next judged whether the hyphen is at the end of a line (S504). If it is at the end of a line, the divided word flag is reset (S505). Conversely, if it is not at the end of a line, the divided word flag is set (S506), and the process returns.
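The flag logic of steps S503 to S506 can be sketched on a line of recognized text. The function name `classify_hyphens` and the choice to ignore trailing whitespace are assumptions. Note the polarity taken from the text: the flag is set when the hyphen is not at the line end and reset when it is.

```python
from typing import List, Tuple


def classify_hyphens(line: str) -> List[Tuple[int, bool]]:
    """Return (index, divided_word_flag) for every hyphen in one recognized line."""
    text = line.rstrip()  # assumed: trailing whitespace does not count as line content
    last = len(text) - 1
    # S504 to S506: flag reset (False) for a hyphen at the line end, set (True) otherwise.
    return [(i, i != last) for i, ch in enumerate(text) if ch == "-"]


print(classify_hyphens("an electron-spin reso-"))  # [(11, True), (21, False)]
```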

[0025] (3)- Paragraph, line, and page compositing processing

Next, returning to the flowchart of FIG. 3, paragraph, line, and page compositing (S308) is executed. FIG. 6 shows this processing. The processing first executes read-image copying (S601), which copies the image information read by the scanner unit 101 and stored in the input image memory 104 into the output image memory 105.

[0026] Next, column frame drawing (S602) is executed. Based on the coordinate data obtained by column position recognition in step 402, column frame drawing draws, for each column, a rectangle that completely contains the column. That is, for each column it draws the rectangle whose diagonal corners are (bsx, bsy) and (bex, bey). FIG. 15 shows an example in which column frames have been drawn.
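Drawing the frame from its two diagonal corners can be sketched as follows. The output image memory is modeled as a list of pixel rows, and `draw_frame` is a hypothetical name. As a simplification, the frame is drawn directly on the edge of the bounding rectangle; an actual device might add a margin so the frame does not overwrite character pixels.

```python
from typing import List


def draw_frame(memory: List[List[int]], bsx: int, bsy: int, bex: int, bey: int) -> None:
    """Set the border pixels of the rectangle with corners (bsx, bsy) and (bex, bey)."""
    for x in range(bsx, bex + 1):
        memory[bsy][x] = 1   # top edge
        memory[bey][x] = 1   # bottom edge
    for y in range(bsy, bey + 1):
        memory[y][bsx] = 1   # left edge
        memory[y][bex] = 1   # right edge


output_memory = [[0] * 5 for _ in range(4)]
draw_frame(output_memory, 1, 0, 3, 2)
for row in output_memory:
    print(row)
```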

[0027] Next, column number compositing (S603) is executed. Column number compositing adds a sequential column number at the upper left of each column recognized by column recognition in step 401. To add a column number, the font data for the number is read from the output font memory 108 and expanded into the output image memory 105, using (bsx, bsy) as the reference coordinates. That is, expansion of the font data starts at (bsx, bsy). FIG. 15 shows an example in which column numbers have been drawn.

[0028] Next, line number compositing (S604) is executed. Line number compositing adds a sequential line number at the left of each line recognized by line recognition in step 403. To add a line number, the font data for the number is read from the output font memory 108 and expanded into the output image memory 105, using (lsx, lsy) as the reference coordinates. That is, expansion of the font data starts at (lsx, lsy). FIG. 15 shows an example in which line numbers have been drawn. In this example, the line numbers that are output are limited to multiples of 5.
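Selecting which line numbers to draw, and where, can be sketched as below. The function name `line_number_labels`, the `step` parameter, and the 1-based numbering are assumptions. Rendering the font data itself is outside the scope of the sketch.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (lsx, lsy, lex, ley) of one recognized line


def line_number_labels(boxes: List[Box], step: int = 5) -> List[Tuple[int, Tuple[int, int]]]:
    """Return (line_number, (lsx, lsy)) for lines whose number is a multiple of step."""
    return [(n, (box[0], box[1]))
            for n, box in enumerate(boxes, start=1)   # lines numbered from 1 (assumed)
            if n % step == 0]


# Twelve lines, 20 pixels apart, all starting at x = 10.
lines = [(10, 20 * i, 200, 20 * i + 15) for i in range(12)]
print(line_number_labels(lines))  # [(5, (10, 80)), (10, (10, 180))]
```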

[0029] Next, page number compositing (S605) is executed. Page number compositing adds the page number counted by the document page count in step 306 at the bottom center of the image in the output image memory 105. As with column numbers, the font data for the page number is read from the output font memory 108 and expanded into the output image memory 105. The reference coordinates, namely the coordinates of the bottom center of the image in the output image memory 105, are determined in advance, and expansion of the font data starts at those coordinates. FIG. 15 shows an example in which a page number has been drawn.

[0030] Through the above processing, an image in which paragraph (column) numbers, line numbers, and page numbers are composited onto the read document image is created in the output image memory 105. Next, print output (S309) is executed, in which the printer unit 114 records this image (see FIG. 3).

[0031] (3)- Translation dictionary search processing

Next, a translation dictionary search (S310) is performed. FIG. 7 shows this processing. The translation dictionary search first executes recognized-word readout (S701), which reads out one of the words extracted and stored by the word extraction processing of step 406. Next, it is judged whether the recognized word contains a hyphen (S702). If it does, it is next judged whether the divided word flag is set (S703). Here, a "divided word" means complete words joined together, for example "electron-spin"; that is, the compound word mentioned earlier. If the flag is judged to be set, a dictionary search is executed with the hyphen included (S704). It is then judged whether the searched word is in the dictionary (S705). If it is not, a dictionary search is executed with the word after the last hyphen removed (S706), and it is judged whether the result is in the dictionary (S707). If it is not in the dictionary, it is next judged whether the searched word contains a hyphen (S708). If there is no hyphen, the output flag is reset (S709). Conversely, if there is a hyphen, the process returns to step 706, and the subsequent processing is repeated until it is judged that no hyphen remains. If it is judged in step 707 that the word is in the dictionary, translated-word storage is executed (S710) and the output flag is set (S711).

【００３２】上記ステップ７０９／７１１の出力フラグ
のリセット／セット後，読み出した認識単語の検索が全
て完了か否かを判断する（Ｓ７１２）。全て完了してい
ないと判断した場合には，読み出した認識単語のうち，
除去された単語の読み出しを実行した（Ｓ７１３）後，
上記ステップ７０５へ戻り，読み出した認識単語の検索
が全て完了したと判断するまで以降の処理を繰り返す。
反対に，上記ステップ７１２において，読み出した認識
単語の検索が全て完了したと判断した場合には，更に，
全認識単語完了か否かを判断し（Ｓ７１９），完了であ
ればリターンし，反対に，完了していなければ上記ステ
ップ７０１に戻り，全認識単語完了と判断するまで以降
の処理を繰り返す。After the output flag is reset or set in step 709 or 711, it is determined whether the search of the recognized word that was read has been fully completed (S712). If not, the words that were removed are read out (S713), the process returns to step 705, and the subsequent processing is repeated until the search of the recognized word is judged complete. If it is determined in step 712 that the search of the recognized word is complete, it is further determined whether all recognized words have been processed (S719). If so, the process returns; if not, it goes back to step 701 and repeats until all recognized words have been processed.
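The loop of steps S704 to S713 can be read as a longest-prefix search over the hyphen-separated parts of a divided word. The following is a minimal sketch under that reading, not the patent's implementation: the function name `search_divided_word`, the use of a plain Python dict as the dictionary, and the representation of a reset output flag as `None` are all assumptions made for illustration.

```python
def search_divided_word(word, dictionary):
    """Look up a hyphenated word, dropping trailing parts until a match is found.

    Returns a list of (headword, translation) pairs; translation is None when
    the output flag would be reset (no dictionary entry). Hypothetical helper.
    """
    parts = word.split("-")
    results = []
    start = 0
    while start < len(parts):
        end = len(parts)
        # S704/S706: try the longest candidate first, then remove the part
        # after the last hyphen and search again.
        while end > start:
            candidate = "-".join(parts[start:end])
            if candidate in dictionary:          # S705/S707: found
                results.append((candidate, dictionary[candidate]))  # S710/S711
                break
            end -= 1
        else:
            # S708/S709: no hyphen left and still not found
            results.append((parts[start], None))
            end = start + 1
        # S713: continue with the parts that were removed
        start = end
    return results


sample = {"electron": "電子", "spin": "紡ぐ"}
print(search_divided_word("electron-spin", sample))
```

With a dictionary that contains the whole group word, the first search succeeds and no splitting takes place.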

【００３３】また，上記ステップ７０３において，分割
単語フラグがセットされていないと判断した場合には，
ハイフォンを除くと共に次の認識単語を読み出して接続
する（Ｓ７１４）。その後，該単語が辞書にあるか否か
を判断し（Ｓ７１５），辞書にあると判断た場合には訳
語記憶を実行し（Ｓ７１６），出力フラグをセットする
（Ｓ７１７）。反対に，辞書にないと判断した場合には
出力フラグをリセットする（Ｓ７１８）。また，上記テ
ップ７０２において，認識単語の中にハイフォンがない
と判断した場合には，上記ステップ７１５に移行する。（３）− 訳語記憶処理ここで，上記訳語記憶処理（Ｓ７１０／Ｓ７１６）につ
いて図８のフローチャートを用いて説明する。この訳語
記憶処理は，単語とその単語に対応する訳語を訳語辞書
メモリ１０７から読み出して記憶する。訳語辞書メモリ
１０７には単語とその単語に対応する訳語がペアで記憶
されているが，訳語は一つの単語に対して複数の意味が
登録されている。それぞれの訳語は品詞別に優先順位が
付けられている。If it is determined in step 703 that the divided word flag is not set, the hyphen is removed and the next recognized word is read out and joined to the current one (S714). It is then determined whether the resulting word is in the dictionary (S715). If it is, translated word storage is executed (S716) and the output flag is set (S717); if it is not, the output flag is reset (S718). If it is determined in step 702 that the recognized word contains no hyphen, the process proceeds to step 715. (3) - Translated word storage processing. The translated word storage processing (S710/S716) is described here with reference to the flowchart of FIG. 8. This processing reads a word and its corresponding translated word from the translated word dictionary memory 107 and stores them. The translated word dictionary memory 107 stores each word paired with its translated words, and multiple meanings may be registered for one word. The translated words are prioritized by part of speech.

【００３４】次に，品詞特定により品詞が特定されたか
否かを品詞特定検査（Ｓ８０１）で判断する。品詞が特
定されていた場合は，特定された品詞の訳語を読み出す
（Ｓ８０３）。反対に，品詞が特定されていない場合
は，複数の品詞があるか否かを判断する（Ｓ８０２）。
複数の品詞がある場合は，第一優先品詞の訳語を読み出
す（Ｓ８０４）。複数の品詞がない場合は，訳語を読み
出す（Ｓ８０５）。Next, the part-of-speech identification check (S801) determines whether a part of speech has been identified by part-of-speech identification. If it has, the translated word for the identified part of speech is read (S803). If it has not, it is determined whether the word has multiple parts of speech (S802). If it does, the translated word for the first-priority part of speech is read (S804); if it does not, the translated word is simply read (S805).

【００３５】例えば，“ｐｌａｙ”という単語は，訳語
辞書メモリ１０７内において，次のように記憶されてい
る。ｐｌａｙ［動］遊ぶ；〜する；演奏する［名］遊
びこれは，第一優先品詞が動詞で，そのときの訳語が「遊
ぶ」，「〜する」，「演奏する」を表し，第二優先品詞
が名詞で，そのときの訳語が「遊び」であることを表
す。即ち，品詞特定で動詞と特定されれば，「遊ぶ」が
読み出され，名詞と特定されれば「遊び」が読み出され
る。また品詞が特定されている場合は，第一優先品詞の
訳語である「遊ぶ」が読み出される。このようにして読
み出された訳語は単語と共に記憶される（Ｓ８０６）。For example, the word "play" is stored in the translated word dictionary memory 107 as follows: play [verb] 遊ぶ; 〜する; 演奏する [noun] 遊び. This indicates that the first-priority part of speech is a verb, with the translated words 「遊ぶ」, 「〜する」, and 「演奏する」, and that the second-priority part of speech is a noun, with the translated word 「遊び」. That is, if the part of speech is identified as a verb, 「遊ぶ」 is read, and if it is identified as a noun, 「遊び」 is read. If the part of speech has not been identified, 「遊ぶ」, the translated word of the first-priority part of speech, is read. The translated word read in this way is stored together with the word (S806).
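The selection in steps S801 to S805 can be sketched as follows. This is an illustrative sketch only: the in-memory layout of the dictionary (a list of part-of-speech entries in priority order), the name `read_translation`, and the fallback to the first-priority entry when the identified part of speech is not registered are assumptions, not details given in the text.

```python
# Entries are listed in part-of-speech priority order (assumed layout).
DICTIONARY = {
    "play": [("verb", ["遊ぶ", "〜する", "演奏する"]), ("noun", ["遊び"])],
}


def read_translation(word, part_of_speech=None):
    """Return the translated word to store for `word` (hypothetical helper)."""
    entries = DICTIONARY[word]
    if part_of_speech is not None:               # S801: part of speech identified
        for entry_pos, translations in entries:
            if entry_pos == part_of_speech:
                return translations[0]           # S803
    # S802/S804/S805: not identified, so use the first-priority entry
    # (which is the only entry when the word has a single part of speech).
    return entries[0][1][0]


print(read_translation("play", "noun"))
print(read_translation("play"))
```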

【００３６】次に，図３のフローチャートに戻り，出力
ページ数カウント（Ｓ３１１）を実行する。本発明で
は，読み取った原稿画像より，画像生成手段によって生
成され出力される画像の方が情報量が多いため，１ペー
ジの原稿画像に対して，複数ページの出力画像が対応す
る場合がある。出力ページ数カウント処理では，この出
力画像のページ数をカウントする。Next, returning to the flowchart of FIG. 3, the number of output pages is counted (S311). In the present invention, since the image generated and output by the image generation unit has a larger amount of information than the read document image, a plurality of pages of output images may correspond to one page of the document image. In the output page number counting process, the number of pages of this output image is counted.

【００３７】（３）− はめ込み合成画像生成処理次に，はめ込み合成画像生成処理（Ｓ３１２）を実行す
る。この処理を図９に示す。はめ込み合成画像生成は，
初めに段落番号カウント（Ｓ９０１）を実行する。段組
番号カウントは，上記ステップ４０１における段組認識
により認識された段組を順番にカウントする。次に，段
組番号合成（Ｓ９０２）を実行する。段組番号合成は，
カウントされた段組番号に対応するフォントデータを出
力フォントメモリ１０８から読み出して，出力画像メモ
リ１０５に展開することにより実行する。この段組番号
合成を実行した例を，図１６に示す。(3) - Inset composite image generation processing. Next, inset composite image generation (S312) is executed. This process is shown in FIG. 9. Inset composite image generation first executes column number counting (S901), which counts, in order, the columns recognized by the column recognition in step 401. Next, column number composition (S902) is executed: the font data corresponding to the counted column number is read from the output font memory 108 and expanded into the output image memory 105. FIG. 16 shows an example in which column number composition has been executed.

【００３８】次に，行番号カウント（Ｓ９０３）を実行
する。行番号カウントは，上記ステップ４０３の行認識
により認識された行を順番にカウントする。次に，行番
号合成（Ｓ９０４）を実行する。行番号合成は，カウン
トされた行番号に対応するフォントデータを出力フォン
トメモリ１０８から読み出して，出力画像メモリ１０５
に展開することにより実行する。この行番号合成を実行
した例を，図１６に示す。ここでは，５の倍数の行番号
のみ展開を行っている。Next, line number counting (S903) is executed. Line number counting counts, in order, the lines recognized by the line recognition in step 403. Next, line number composition (S904) is executed: the font data corresponding to the counted line number is read from the output font memory 108 and expanded into the output image memory 105. FIG. 16 shows an example in which line number composition has been executed. In that example, only line numbers that are multiples of 5 are expanded.

【００３９】次に，読み取り行複写（Ｓ９０５）を実行
する。読み取り行複写は，スキャナ部１０１により読み
取り，入力画像メモリ１０４に格納されている画像情報
から，上記ステップ４０４の行位置認識により求めた座
標（ｌｓｘ，ｌｓｙ），（ｌｅｘ，ｌｅｙ）を対角に持
つ長方形領域を，出力画像メモリ１０５に複写すること
により実行される。Next, read line copying (S905) is executed. Read line copying takes the image information read by the scanner unit 101 and stored in the input image memory 104, and copies to the output image memory 105 the rectangular region whose diagonal corners are the coordinates (lsx, lsy) and (lex, ley) obtained by the line position recognition in step 404.

【００４０】次に，出力フラグ検査（Ｓ９０６）を実行
する。出力フラグ検査は，上記ステップ７１１／７１７
の出力フラグセット，ステップ７０９／７１８の出力フ
ラグリセットによって付けられた各単語ごとの出力フラ
グがセットされているか，リセットされているかを判断
する。出力フラグがセットされていると判断した場合に
は，訳語読み取り（Ｓ９０７）を実行する。訳語読み取
りは，上記ステップ７１０／７１６における訳語記憶に
より記憶された訳語を読み出す。続いて，訳語合成（Ｓ
９０８）を実行する。Next, the output flag check (S906) is executed. This check determines, for each word, whether the output flag has been set (by the output flag set of step 711/717) or reset (by the output flag reset of step 709/718). If the output flag is set, translated word reading (S907) is executed, which reads the translated word stored by the translated word storage in step 710/716. Translated word synthesis (S908) is then executed.

【００４１】（３）− 訳語合成処理訳語合成処理（Ｓ９０８）を図１０に示すフローチャー
トに基づいて説明する。まず，ハイフォンが有るか否か
を判断し（Ｓ１００１），無ければリターンする。反対
に，ハイフォンがあれば，分割単語フラグがセットされ
ているか否かを判断し（Ｓ１００２），セットされてい
ると判断された場合，認識単語の下行の中央に訳語合成
を実行して（Ｓ１００３），リターンする。反対に，分
割単語フラグがセットされていないと判断した場合に
は，次に，ハイフォン以前の文字数の方が以降より大き
いか否かを判断する（Ｓ１００４）。大きくないと判断
した場合には，ハイフォン以降の単語の下行に訳語合成
を実行し（Ｓ１００６），反対に，大きいと判断した場
合には，ハイフォン以前の単語の下行に訳語合成を実行
する。(3) - Translated word synthesis processing. The translated word synthesis processing (S908) is described with reference to the flowchart shown in FIG. 10. First, it is determined whether there is a hyphen (S1001); if there is none, the process returns. If there is a hyphen, it is determined whether the divided word flag is set (S1002). If the flag is set, the translated word is synthesized at the center of the line below the recognized word (S1003), and the process returns. If the flag is not set, it is then determined whether the number of characters before the hyphen is larger than the number after it (S1004). If it is not larger, the translated word is synthesized on the line below the word after the hyphen (S1006); if it is larger, the translated word is synthesized on the line below the word before the hyphen (S1005).
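The placement decision of steps S1001 to S1006 reduces to a small function. The sketch below is illustrative: the name `translation_anchor`, the string return values, and the convention of passing the word with its hyphen still in place are assumptions introduced here.

```python
def translation_anchor(word, divided_word_flag):
    """Decide where the translated word is placed relative to a hyphenated word.

    Returns "center", "before", "after", or None when the word has no hyphen
    (hypothetical labels for the outcomes of S1003, S1005, S1006, and S1001).
    """
    if "-" not in word:                      # S1001: no hyphen, nothing to decide
        return None
    if divided_word_flag:                    # S1002: genuine divided word
        return "center"                      # S1003
    before, _, after = word.partition("-")
    if len(before) > len(after):             # S1004
        return "before"                      # S1005
    return "after"                           # S1006 (also used for ties)


print(translation_anchor("communi-cation", divided_word_flag=False))
```

Note that a tie in character counts falls through to the part after the hyphen, since S1004 only tests whether the leading part is strictly larger.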

【００４２】次に，図９のフローチャートに戻り，行内
の全単語についての処理が終了したか否かを判断し（Ｓ
９０９），終了していなければ，次の単語について上記
ステップ９０６の「出力フラグ検査」から同処理を繰り
返す。行内の全単語について処理が終了したと判断した
場合は，段組内の全行が終了しているか否かの判断を行
う（Ｓ９１０）。終了していなければ，次の行について
上記ステップ９０３の「行番号カウント」から同じ処理
を繰り返す。段組内の全行についての処理が終了したと
判断した場合は，ページ内の全段が終了しているか否か
の判断を行う（Ｓ９１１）。終了していなければ，次の
段について上記ステップ９０１の「段組番号カウント」
から同じ処理を繰り返す。Next, returning to the flowchart of FIG. 9, it is determined whether processing has finished for all words in the line (S909). If not, the same processing is repeated for the next word from the "output flag check" of step 906. If processing has finished for all words in the line, it is determined whether all lines in the column have been processed (S910). If not, the same processing is repeated for the next line from the "line number counting" of step 903. If all lines in the column have been processed, it is determined whether all columns in the page have been processed (S911). If not, the same processing is repeated for the next column from the "column number counting" of step 901.

【００４３】次に，ページ番号合成（Ｓ９１２）を実行
する。ページ番号合成処理は図６に示したステップ６０
５の「ページ番号合成」と同じ処理を実行するが，図１
６に示すように原稿画像のページ数と生成画像のページ
数をハイフォンで接続した番号を合成する。Next, page number composition (S912) is executed. This performs the same processing as the "page number composition" of step 605 shown in FIG. 6, except that, as shown in FIG. 16, the number that is composed joins the page number of the document image and the page number of the generated image with a hyphen.

【００４４】図３のフローチャートに戻って，次にプリ
ント出力（Ｓ３１３）を実行する。プリント出力処理は
上記ステップ３０９の「プリント出力」と同じで，上記
ステップ３１２の「はめ込み合成画像生成」により生成
された，画像をプリンタ部１１４によりプリントする。
次に，読み取った原稿に対応するはめ込み合成画像のプ
リントが全て完了したか否かを判断する（Ｓ３１４）。
まだ，完了していないと判断した場合は，上記ステップ
３１１に戻り，次のページ数カウントから同じ処理を繰
り返す。Returning to the flowchart of FIG. 3, print output (S313) is then executed. The print output process is the same as the "print output" in step 309, and the image generated by the "inset composite image generation" in step 312 is printed by the printer unit 114.
Next, it is determined whether or not all the inlaid composite images corresponding to the read document have been printed (S314).
If it is determined that the process is not completed yet, the process returns to step 311, and the same process is repeated from the next page number count.

【００４５】プリントが全て完了した場合，読取原稿を
全て読み取り，処理が完了した否かを判断する（Ｓ３１
５）。まだ，完了していない場合は，次の原稿について
上記ステップ３０５の「原稿読み取り」から同じ処理を
繰り返す。以上の処理が全て終了したら，再び上記ステ
ップ３０２の「対訳動作指示」に戻る。When all printing is complete, it is determined whether all documents to be read have been read and processed (S315). If not, the same processing is repeated for the next document from the "document reading" of step 305. When all of the above processing has finished, the process returns to the "parallel translation operation instruction" of step 302.

【００４６】以上のように，分割英単語のような特殊記
号を含んだ単語の検索を行う構成の場合，ＯＣＲ処理に
おいて特殊文字の検索，制御を行う。これは，辞書検索
を行う１単位と見なされた文字列中の特殊記号或いは登
録された文字，記号を認識し，前記記号或いは文字の前
後の文字を１つの辞書検索単位として見なす。このと
き，行末に所定記号（一般にはハイフォン）が認識され
た場合はその前後の文字列を各々辞書検索単位と見なさ
ず，次の文字列（次の行頭の単語）と連結させて辞書検
索１単位とする動作を行い，フォーマットは特殊文字を
考慮した形式でメモリに格納する。As described above, in a configuration that searches for words containing a special symbol, such as divided English words, special characters are detected and handled during OCR processing. Special symbols, or registered characters and symbols, are recognized within a character string regarded as one dictionary search unit, and the character strings before and after that symbol or character are treated as dictionary search units. However, when the predetermined symbol (generally a hyphen) is recognized at the end of a line, the character strings before and after it are not each treated as a dictionary search unit; instead, the string before it is joined to the next character string (the word at the beginning of the next line) to form one dictionary search unit. The result is stored in memory in a format that takes the special characters into account.

【００４７】図１１は，単語，分割単語の構造フォーマ
ットの内容を説明するものである。まず単語の構造体フ
ォーマットのメンバーである文字数は，ＯＣＲで単語と
認識された文字の数を表し，分割単語オフセットは認識
単語が分割単語の場合，その行での分割単語の個数のオ
フセット値をあらわしている。分割単語でないものには
適当な固定値を代入しておく。図１１では０ｘｆｆを代
入している。次に，分割単語の構造体フォーマットのメ
ンバーである文字数は，認識された分割単語の文字数を
表し，分割単語残り数は注目分割単語に連結している分
割単語の個数を表す。FIG. 11 illustrates the structure formats for words and divided words. In the word structure format, the member "number of characters" is the number of characters recognized as a word by OCR, and the member "divided word offset" is, when the recognized word is a divided word, the offset value of the divided-word count in that line. For words that are not divided words, a suitable fixed value is assigned; in FIG. 11 this value is 0xff. In the divided word structure format, the member "number of characters" is the number of characters of the recognized divided word, and the member "remaining divided words" is the number of divided words connected to the divided word of interest.
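As a rough illustration of the two formats, the records could be modelled as below. The class and field names are hypothetical, and the exact field widths and ordering in FIG. 11 are not reproduced; only the members named in the text and the 0xff sentinel come from the description.

```python
from dataclasses import dataclass

NOT_DIVIDED = 0xFF  # fixed value assigned to words that are not divided words


@dataclass
class WordRecord:
    """Word structure: character count plus divided word offset (assumed names)."""
    char_count: int
    divided_word_offset: int = NOT_DIVIDED

    def is_divided(self) -> bool:
        return self.divided_word_offset != NOT_DIVIDED


@dataclass
class DividedWordRecord:
    """Divided word structure: character count plus remaining divided words."""
    char_count: int
    remaining: int


plain = WordRecord(char_count=4)
compound = WordRecord(char_count=13, divided_word_offset=0)
print(plain.is_divided(), compound.is_divided())
```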

【００４８】図１２は，行末にハイフォンがきたときの
処理の説明図である。１）は行末にハイフォンがある場
合の原稿の状態を示す。ここで，“ｃｏｍｍｕｎｉ―”
の単語の文字数と分割単語の文字数を比較する。その結
果，２）に示したように８と７で分割単語数が１少ない
ことがわかる。そして，この単語は行に存在する最終単
語という情報が既知であるため，分割単語として扱わな
いと認識することができる。そして，行末の分割単語を
分割単語テキストデータから取り出し，次の行頭の単語
を単語テキストから取り出し両者を連結させてから辞書
検索を実行する。次に，行末の文字数と，行頭の文字数
を比較し，３），４）に示すように文字数の多い方に前
記辞書検索処理により検索された訳語を出力させること
ができる。FIG. 12 is an explanatory diagram of the processing performed when a hyphen falls at the end of a line. 1) shows the state of the document when there is a hyphen at the end of a line. Here, the number of characters of the word "communi-" is compared with the number of characters of the divided word. As shown in 2), the counts are 8 and 7, so the divided word count is one less. Because this word is known to be the last word in the line, it can be recognized that it should not be treated as a divided word. The divided word at the end of the line is then taken from the divided word text data, the word at the beginning of the next line is taken from the word text, and the two are joined before the dictionary search is executed. Next, the number of characters at the end of the line is compared with the number at the beginning of the next line, and, as shown in 3) and 4), the translated word found by the dictionary search is output under whichever part has more characters.
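The joining of a line-end fragment with the first word of the next line can be sketched as follows. The function name `dictionary_units` and the representation of a page as a list of word lists are assumptions for illustration; the sketch only shows how search units are formed, not how OCR output is actually stored.

```python
def dictionary_units(lines):
    """Build dictionary search units from lines of recognized words.

    A word ending in "-" at the end of a line is joined, without the hyphen,
    to the first word of the next line. Hyphens elsewhere are left in place.
    """
    units = []
    carry = ""
    for words in lines:
        for index, word in enumerate(words):
            if carry:
                word = carry + word          # join with the fragment carried over
                carry = ""
            is_last = index == len(words) - 1
            if is_last and word.endswith("-"):
                carry = word[:-1]            # drop the line-end hyphen and wait
                continue
            units.append(word)
    if carry:
        units.append(carry)                  # fragment with no following line
    return units


page = [["data", "communi-"], ["cation", "electron-spin"]]
print(dictionary_units(page))
```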

【００４９】また，図１７は，特殊記号により接続され
ている複数の単語から構成される組単語が，前記訳語記
憶手段に記憶されていないとき，組単語の接続を分解
し，分割された単語に対して訳語出力するように制御す
るための動作を示すフローチャートである。FIG. 17 is a flowchart showing the operation for controlling the apparatus so that, when a group word consisting of multiple words connected by special symbols is not stored in the translated word storage means, the group word is broken apart at its connections and translated words are output for the resulting words.

【００５０】初めに，特殊記号「−」，「＆」，「／」
等を含む英単語の検索に関して説明する。特殊記号
「−」，「＆」，「／」を含む英単語が与えられたと
き，その英単語を適宜辞書検索し，該英単語を辞書内に
見出せなかったときには，その英単語を特殊記号の部分
で分割し，該分割した各々の英単語について再度検索処
理を実行する。以下に出力例を示す。入力英単語：ｅｌｅｃｔｒｏｎ−ｓｐｉｎ見出し英単語１：ｅｌｅｃｔｒｏｎ訳語１：電子見出し英単語２：ｓｐｉｎ訳語２：紡ぐ他のモードでは，入力英単語：ｅｌｅｃｔｒｏｎ−ｓｐｉｎ見出し英単語１：ｅｌｅｃｔｒｏｎ訳語１：［名］電子，エレクトロン見出し英単語２：ｓｐｉｎ訳語２：［動］紡ぐ，（糸を）かける，回す，
スピンする上記に示した例は，訳語辞書メモリ１０７の内容が以下
のようになっている場合である。即ち，ｅｌｅｃｔｒｏｎ［名］電子，エレクトロンｓｐｉｎ［動］紡ぐ，（糸を）かける，回す，
スピンするFirst, searching for English words that contain special symbols such as "-", "&", and "/" is described. When an English word containing one of these special symbols is given, the word is first looked up in the dictionary as is. If it is not found, the word is divided at the special symbol and the search is executed again for each of the resulting English words. An output example is shown below.
Input English word: electron-spin
Headword 1: electron
Translated word 1: 電子
Headword 2: spin
Translated word 2: 紡ぐ
In another mode:
Input English word: electron-spin
Headword 1: electron
Translated word 1: [noun] 電子, エレクトロン
Headword 2: spin
Translated word 2: [verb] 紡ぐ, (糸を)かける, 回す, スピンする
The examples above assume that the translated word dictionary memory 107 contains the following entries:
electron [noun] 電子, エレクトロン
spin [verb] 紡ぐ, (糸を)かける, 回す, スピンする

【００５１】次に，図１７において，ａｂｃ−ｄｅ−ｆ
ｇｈの訳語検索処理を例に採って説明する。まず，ａｂ
ｃ−ｄｅ−ｆｇｈの状態で訳語検索を実行する（Ｓ１７
０１）。その後，ａｂｃ−ｄｅ−ｆｇｈの状態で，その
訳語が辞書にあるか否かを判断する（Ｓ１７０２）。辞
書にあると判断した場合には，リターンして上記説明し
た処理を実行する。反対に，辞書にないと判断した場合
には，次に，ａｂｃ−ｄｅの状態で訳語検索を実行する
（Ｓ１７０３）。その後，ａｂｃ−ｄｅの状態で，その
訳語が辞書にあるか否かを判断する（Ｓ１７０４）。辞
書にあると判断した場合には，ｆｇｈの訳語検索を実行
する（Ｓ１７０９）。反対に，辞書にないと判断した場
合には，次に，ａｂｃで訳語検索を実行する（Ｓ１７０
５）。次に，ｄｅ−ｆｇｈの状態で訳語検索を実行する
（Ｓ１７０６），その後，ｄｅ−ｆｇｈの状態で，その
訳語が辞書にあるか否かを判断する（Ｓ１７０７）。辞
書にあると判断した場合には，リターンする。反対に，
辞書にないと判断した場合には，ｄｅで訳語検索を実行
し（Ｓ１７０８），更にｆｇｈで訳語検索を実行した
（Ｓ１７０９）後リターンする。Next, the translated word search processing of FIG. 17 is described using abc-de-fgh as an example. First, a translated word search is executed for abc-de-fgh as a whole (S1701), and it is determined whether a translated word for abc-de-fgh is in the dictionary (S1702). If it is, the process returns and the processing described above is executed. If it is not, a translated word search is executed for abc-de (S1703), and it is determined whether a translated word for abc-de is in the dictionary (S1704). If it is, a translated word search for fgh is executed (S1709). If it is not, a translated word search is executed for abc (S1705). Next, a translated word search is executed for de-fgh (S1706), and it is determined whether a translated word for de-fgh is in the dictionary (S1707). If it is, the process returns. If it is not, a translated word search is executed for de (S1708), then for fgh (S1709), and the process returns.
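The FIG. 17 sequence for a three-part group word can be transcribed step by step. This is a minimal sketch that follows the order of steps S1701 to S1709 as described; the function name `search_three_part_word` and the dict-based dictionary are assumptions, and the sketch handles exactly three parts, like the abc-de-fgh example.

```python
def search_three_part_word(a, b, c, dictionary):
    """Search a group word a-b-c in the order of steps S1701 to S1709.

    Returns a dict of the headwords that were found and their translations.
    """
    found = {}

    def search(key):
        if key in dictionary:
            found[key] = dictionary[key]
            return True
        return False

    if search(f"{a}-{b}-{c}"):       # S1701/S1702: whole group word
        return found
    if search(f"{a}-{b}"):           # S1703/S1704: first two parts
        search(c)                    # S1709
        return found
    search(a)                        # S1705
    if search(f"{b}-{c}"):           # S1706/S1707: last two parts
        return found
    search(b)                        # S1708
    search(c)                        # S1709
    return found


print(search_three_part_word("abc", "de", "fgh", {"abc": "1", "de-fgh": "2"}))
```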

【００５２】また，上記実施例にあっては，対訳画像形
成装置における各種制御をソフトウェアにより実現して
いるが，処理の高速性に鑑み必要に応じて（全体的に或
いは部分的に）ハードウェアに置換して構成してもよ
い。In the above embodiment, the various controls in the bilingual image forming apparatus are implemented in software; however, where processing speed requires it, they may be replaced, wholly or partly, with hardware.

【００５３】[0053]

【発明の効果】以上説明した通り，本発明による対訳画
像形成装置によれば，特殊記号を検出し，且つ，それが
行末にあると判断したときは該特殊記号を除去して前記
訳語情報を検索し，行末にないと判断したときは前記特
殊記号を含めて訳語情報を検索し，また，特殊記号を除
去して訳語情報を検索した場合には，該特殊記号以前の
部分における文字数を以降の部分における文字数とを比
較し，文字数の大きい方の部分の下に訳語情報を合成
し，また，特殊記号を含めて訳語情報を検索したとき
に，辞書に該当する訳語がないと判断したときには，末
尾（最後）の特殊記号以降の部分を除去して訳語情報を
検索するため，原稿画像中に存在する単語に対応する訳
語の位置を明確にし，また，訳語付加により出力画像の
内容が大幅に変更したとしても，複写画像の内容を読み
やすくレイアウトして出力すると共に，特殊記号により
接続されている組単語を処理を適切に行い，対訳効率を
向上させることができる。As described above, the bilingual image forming apparatus according to the present invention detects a special symbol and, when it determines that the symbol is at the end of a line, removes the special symbol and searches the translated word information; when it determines that the symbol is not at the end of a line, it searches the translated word information with the special symbol included. When the translated word information is searched with the special symbol removed, the number of characters in the portion before the special symbol is compared with the number in the portion after it, and the translated word information is synthesized below the portion with more characters. When the translated word information is searched with the special symbol included and no corresponding translated word is found in the dictionary, the portion from the last special symbol onward is removed and the translated word information is searched again. As a result, the position of the translated word corresponding to each word in the document image is made clear; even if adding translated words substantially changes the content of the output image, the copied image is laid out and output in an easily readable form; and group words connected by special symbols are processed appropriately, improving translation efficiency.

[Brief description of drawings]

【図１】本発明による対訳画像形成装置の構成を示すブ
ロック図である。FIG. 1 is a block diagram showing a configuration of a bilingual image forming apparatus according to the present invention.

【図２】本発明による対訳画像形成装置を具体化した複
写機の構成を示す説明図である。FIG. 2 is an explanatory diagram showing a configuration of a copying machine that embodies a bilingual image forming apparatus according to the present invention.

【図３】図１に示した対訳画像形成装置のメイン動作を
示すフローチャートである。FIG. 3 is a flowchart showing the main operation of the bilingual image forming apparatus shown in FIG. 1.

【図４】図３に示した原稿認識処理の動作を示すフロー
チャートである。FIG. 4 is a flowchart showing the operation of the document recognition process shown in FIG. 3.

【図５】図３に示した特殊記号の分類動作を示すフロー
チャートである。FIG. 5 is a flowchart showing the special symbol classification operation shown in FIG. 3.

【図６】図３に示した段落，行，ページ合成処理の動作
を示すフローチャートである。FIG. 6 is a flowchart showing the operation of the paragraph, line, and page composition process shown in FIG. 3.

【図７】図３に示した訳語辞書検索処理の動作を示すフ
ローチャートである。FIG. 7 is a flowchart showing the operation of the translation dictionary search process shown in FIG. 3.

【図８】図７に示した訳語記憶処理の動作を示すフロー
チャートである。FIG. 8 is a flowchart showing the operation of the translated word storage process shown in FIG. 7.

【図９】図３に示したはめ込み合成画像生成処理の動作
を示すフローチャートである。FIG. 9 is a flowchart showing the operation of the inset composite image generation process shown in FIG. 3.

【図１０】図８に示した訳語合成処理の動作を示すフロ
ーチャートである。FIG. 10 is a flowchart showing the operation of the translated word synthesis process shown in FIG. 8.

【図１１】単語，分割単語の構造フォーマットの内容を
示す説明図である。FIG. 11 is an explanatory diagram showing contents of a structural format of a word and a divided word.

【図１２】行末にハイフォンがきたときの処理を示す説
明図である。FIG. 12 is an explanatory diagram showing a process when a hyphen comes to the end of a line.

【図１３】原稿画像例を示す説明図である。FIG. 13 is an explanatory diagram illustrating an example of a document image.

【図１４】図１３に示した原稿画像に対し座標指定した
状態を示す説明図である。FIG. 14 is an explanatory diagram showing a state in which coordinates are designated for the document image shown in FIG. 13.

【図１５】図１３に示した原稿画像に対し段落番号を付
けた状態を示す説明図である。FIG. 15 is an explanatory diagram showing a state in which paragraph numbers are attached to the document image shown in FIG. 13.

【図１６】図１３に示した原稿画像に対して，この発明
による一連の対訳付加処理を実行した後に出力される複
写画像例を示す説明図である。FIG. 16 is an explanatory diagram showing an example of a copied image output after the series of parallel translation adding processes according to the present invention has been executed on the document image shown in FIG. 13.

【図１７】組単語における分割処理動作の一例を示すフ
ローチャートである。FIG. 17 is a flowchart showing an example of division processing operation for a group of words.

[Explanation of symbols]

１０１スキャナ部１０４入力画像
メモリ１０５出力画像メモリ１０６認識辞書
メモリ１０７訳語辞書メモリ１０８出力フォ
ントメモリ１０９ＣＰＵ１１０ＲＯＭ１１１ＲＡＭ１１４プリンタ
部１１５システムバス１１６網制御部
（ＮＣＵ）１１７モデム１１８通信制御
部
101 Scanner unit
104 Input image memory
105 Output image memory
106 Recognition dictionary memory
107 Translation dictionary memory
108 Output font memory
109 CPU
110 ROM
111 RAM
114 Printer unit
115 System bus
116 Network control unit (NCU)
117 Modem
118 Communication control unit

Claims

[Claims]

1. A bilingual image forming apparatus comprising: image reading means for optically reading a document image and converting it into image information; image storage means for storing the image information read by the image reading means or image information input via a communication line; image processing means for executing various image processing on the image information; image recording means for recording an image on a recording medium based on an image signal output by the image processing means; translated word storage means for storing translated word information corresponding to the document image information; and control means for executing bilingual translation processing on the document image information based on the information in the translated word storage means, wherein the control means detects a predetermined symbol and, when it determines that the symbol is at the end of a line, removes the predetermined symbol and searches the translated word information, and, when it determines that the symbol is not at the end of a line, searches the translated word information with the predetermined symbol included.

2. The bilingual image forming apparatus according to claim 1, wherein, when the translated word information has been searched with the predetermined symbol removed, the control means compares the number of characters in the portion before the predetermined symbol with the number of characters in the portion after it, and synthesizes the translated word information below the portion having the larger number of characters.

3. The bilingual image forming apparatus according to claim 1, wherein, when the translated word information has been searched with the predetermined symbol included and the control means determines that no corresponding translated word exists in the translated word storage means, the control means removes the portion from the last predetermined symbol onward and searches the translated word information.