JPH0296467A - Document accumulating device - Google Patents

Document accumulating device

Info

Publication number
JPH0296467A
Authority
JP
Japan
Prior art keywords
information
document
character
picture information
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP63248383A
Other languages
Japanese (ja)
Inventor
Hisao Hayashi
久雄 林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1988-09-30
Filing date
1988-09-30
Publication date
1990-04-09
Application filed by NEC Corp
Priority to JP63248383A
Publication of JPH0296467A

Landscapes

  • Storing Facsimile Image Data (AREA)
  • Document Processing Apparatus (AREA)

Abstract

PURPOSE: To reduce the quantity of accumulated information and use the storage part effectively by providing a device that extracts mark-expressible information areas from document information and obtains the corresponding marks, and a device that obtains the inexpressible areas as picture information, and by synthesizing and accumulating the obtained marks and picture information. CONSTITUTION: When a document written on an original is accumulated, the document is converted by an original reader part 1 into picture information 101 expressed as a bit image. Characters are then recognized by a character recognizing part 2. Document information whose characters have been recognized successfully is converted into a character code 104; document information whose characters could not be recognized is output as picture information 102 and compressed by an encoder part 3 into encoded picture information 103. The document information thus separated is synthesized by an information synthesizing part 4 and accumulated in an accumulating part 5 as synthetic information 105. When the accumulated information is fetched, an information separating part 6 separates accumulated information 106 into a character code 109 and encoded picture information 107, and the encoded picture information 107 is decoded by a decoding part 7 and output as picture information 108.

Description

DETAILED DESCRIPTION OF THE INVENTION

Field of Industrial Application

The present invention relates to a document storage device, and more particularly to a document storage device that stores documents written on originals.

Prior Art

Conventionally, document storage devices of this type have been structured to store all document information as image information.

Problems to Be Solved by the Invention

Because the conventional document storage device described above processes all document information as image information, the amount of information to be stored is large and a large-capacity storage unit is required.

The present invention has been made to overcome this drawback inherent in the prior art. Its object is therefore to provide a novel document storage device that allows the storage device to be used efficiently by reducing the amount of information to be stored.

Means for Solving the Problems

To achieve the above object, the document storage device according to the present invention comprises: a first means for extracting, from document information, an information region that can be represented by symbols and obtaining the corresponding symbols; a second means for obtaining, as image information, an information region of the document information that cannot be represented by symbols; and a means for combining and storing the symbols and image information obtained by the first and second means.

Embodiment

Next, a preferred embodiment of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a block diagram showing one embodiment of the present invention.

Referring to FIG. 1, when a document written on an original is to be stored, the document is first converted by the original reading unit 1 into image information 101 expressed as a bit image.

Next, characters are recognized by character recognition processing. Document information whose characters are recognized successfully is converted into character code 104. Document information whose characters could not be recognized is output unchanged as image information 102 and compressed by the encoding unit 3 to obtain encoded image information 103. The document information thus separated into character code 104 and encoded image information 103 is combined by the information synthesis unit 4 and stored in the storage unit 5 as composite information 105.
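The storage path described above can be sketched as follows. This is only an illustration under stated assumptions: the `Region` structure, the `"char"`/`"image"` tags, and the use of `zlib` in place of the encoding unit 3 are all hypothetical, since the source does not specify a compression scheme or a data layout.

```python
import zlib
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Region:
    """One information region of the scanned page (hypothetical structure)."""
    bitmap: bytes               # raw bit-image data for the region
    char_code: Optional[str]    # recognized character, or None if recognition failed


def store_document(regions: List[Region]) -> List[Tuple[str, bytes]]:
    """Split regions into character codes and compressed images, then merge them."""
    composite = []
    for region in regions:
        if region.char_code is not None:
            # Recognition succeeded: keep only the character code (104).
            composite.append(("char", region.char_code.encode("utf-8")))
        else:
            # Recognition failed: compress the bit image (102 -> 103).
            # zlib is an assumed stand-in for the unspecified encoder.
            composite.append(("image", zlib.compress(region.bitmap)))
    return composite  # plays the role of composite information 105


page = [
    Region(bitmap=b"\x00" * 72, char_code="A"),
    Region(bitmap=b"\xff\x00" * 36, char_code=None),
    Region(bitmap=b"\x00" * 72, char_code="B"),
]
stored = store_document(page)
```

Keeping the segments in page order is a design choice of this sketch; it lets a later retrieval step rebuild the page without extra position data.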

When a stored document is retrieved, the information separation unit 6 separates the stored information 106 into character code 109 and encoded image information 107. The encoded image information 107 is decoded into a bit image by the decoding unit 7 and output as image information 108. The character code 109 and image information 108 thus obtained are output by the display control unit 8 to the CRT 9 and the printer 10 as CRT signal 110 and print signal 111, respectively.
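A minimal sketch of the retrieval path, assuming the stored information is a list of tagged segments and that images were compressed with `zlib`. Both the tags and the codec are assumptions made for illustration; the source names only the separation and decoding steps.

```python
import zlib
from typing import List, Tuple, Union


def retrieve_document(
    stored: List[Tuple[str, bytes]]
) -> List[Tuple[str, Union[str, bytes]]]:
    """Separate stored segments and decode the image segments back to bit images."""
    output = []
    for kind, payload in stored:
        if kind == "char":
            # Character code segment (109): passed through as text.
            output.append(("char", payload.decode("utf-8")))
        elif kind == "image":
            # Encoded image segment (107): decoded to a bit image (108).
            output.append(("image", zlib.decompress(payload)))
        else:
            raise ValueError(f"unknown segment kind: {kind!r}")
    return output


stored = [("char", b"A"), ("image", zlib.compress(b"\xff\x00" * 36))]
restored = retrieve_document(stored)
```

A display or print stage would then render the `"char"` entries with a font and the `"image"` entries as bitmaps.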

FIG. 2 is a flowchart showing an example of the processing performed by the character recognition unit 2 in FIG. 1.

Referring to FIG. 2, when processing starts, character segmentation processing S2 is first performed to cut out a character from the image information of the document.

If a character cannot be cut out, processing proceeds to S7, and the information region currently being processed is captured as image information as is. If a character can be cut out, processing proceeds to S4, where the feature points of the character are calculated, and dictionary search processing S5 is performed on the basis of those feature points to determine whether a matching character exists in the dictionary. If there is no matching character, the information region currently being processed is captured as image information in S7; if there is a matching character, its character code is obtained.
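The branching in this flowchart can be sketched as below. The segmentation rule, the feature definition (set pixels per row), and the dictionary contents are toy assumptions chosen only to make the control flow runnable; the source does not describe how segmentation or feature extraction actually work.

```python
from typing import Dict, Optional, Tuple, Union

Bitmap = Tuple[int, ...]     # one int per row, bits are pixels (toy format)
Features = Tuple[int, ...]

MAX_CHAR_ROWS = 3            # assumed size limit for a single character


def segment_character(region: Bitmap) -> Optional[Bitmap]:
    """S2/S3: toy segmentation that fails for blank or oversized regions."""
    if not any(region) or len(region) > MAX_CHAR_ROWS:
        return None
    return region


def compute_features(glyph: Bitmap) -> Features:
    """S4: toy feature points, here the number of set pixels in each row."""
    return tuple(bin(row).count("1") for row in glyph)


def recognize(
    region: Bitmap, dictionary: Dict[Features, str]
) -> Tuple[str, Union[str, Bitmap]]:
    glyph = segment_character(region)      # S2: cut out a character
    if glyph is None:                      # S3: segmentation failed
        return ("image", region)           # S7: keep region as image information
    features = compute_features(glyph)     # S4: feature points
    char = dictionary.get(features)        # S5: dictionary search
    if char is None:                       # S6: no matching character
        return ("image", region)           # S7
    return ("char", char)                  # S8: character code obtained


DICTIONARY = {(1, 3, 1): "+", (3, 0, 3): "="}

plus = recognize((0b010, 0b111, 0b010), DICTIONARY)
unknown = recognize((0b111, 0b101, 0b111), DICTIONARY)
picture = recognize((1, 2, 3, 4), DICTIONARY)
```

Both failure branches return the untouched region, which mirrors how the flowchart routes segmentation failures and dictionary misses to the same step S7.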

FIG. 3 is a diagram showing an example of the processing result of the information synthesis unit 4 in FIG. 1.

In FIG. 3, a character code header D1, a character code attribute D2, an image information header D4, and an image information attribute D5 are added so that the character code D3 and the encoded image information D6 can be distinguished when the document is retrieved.
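One way to picture such a header/attribute/payload layout is the sketch below. The byte values of the headers and the choice of a 4-byte payload length as the only attribute are assumptions for illustration; the source does not define the contents of D1, D2, D4, or D5.

```python
import struct
from typing import Iterator, List, Tuple

CHAR_HEADER = 0x01     # stands in for character code header D1 (value assumed)
IMAGE_HEADER = 0x02    # stands in for image information header D4 (value assumed)
PREFIX = ">BI"         # 1-byte header + 4-byte big-endian length (assumed attribute)


def pack_segments(segments: List[Tuple[int, bytes]]) -> bytes:
    """Serialize segments as header, attribute (payload length), payload."""
    out = bytearray()
    for header, payload in segments:
        out += struct.pack(PREFIX, header, len(payload))
        out += payload
    return bytes(out)


def unpack_segments(blob: bytes) -> Iterator[Tuple[int, bytes]]:
    """Walk the byte string and yield (header, payload) pairs in order."""
    offset = 0
    while offset < len(blob):
        header, length = struct.unpack_from(PREFIX, blob, offset)
        offset += struct.calcsize(PREFIX)
        yield header, blob[offset:offset + length]
        offset += length


segments = [
    (CHAR_HEADER, "AB".encode("utf-8")),     # character codes (D3)
    (IMAGE_HEADER, b"\x10\x20\x30"),         # encoded image information (D6)
]
blob = pack_segments(segments)
recovered = list(unpack_segments(blob))
```

Because every segment announces its own type and length, a reader can split the stream into character codes and encoded images without inspecting the payloads.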

Effects of the Invention

As described above, the present invention has a first means for extracting, from document information, an information region that can be represented by symbols and obtaining the corresponding symbols, a second means for obtaining, as image information, an information region that cannot be represented by symbols, and a means for combining and storing the symbols and image information obtained by the first and second means. As a result, the amount of information to be stored is reduced and the storage unit can be used efficiently.
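The size of the saving depends on figures the source does not give. The sketch below uses purely hypothetical numbers (1,000 characters per page, a 24 x 24 dot bit image per character, 2-byte character codes, 90% of characters recognized, no image compression, no header overhead) to show how the comparison could be worked out.

```python
def stored_bytes(total_chars: int, recognized_chars: int,
                 bitmap_bytes: int, code_bytes: int) -> int:
    """Bytes needed when recognized characters are stored as codes
    and the remaining characters are stored as uncompressed bit images."""
    unrecognized = total_chars - recognized_chars
    return recognized_chars * code_bytes + unrecognized * bitmap_bytes


# All values below are illustrative assumptions, not figures from the source.
TOTAL_CHARS = 1000
BITMAP_BYTES = 24 * 24 // 8    # 24 x 24 dots, 1 bit per dot
CODE_BYTES = 2
RECOGNIZED = 900

all_image = stored_bytes(TOTAL_CHARS, 0, BITMAP_BYTES, CODE_BYTES)
mixed = stored_bytes(TOTAL_CHARS, RECOGNIZED, BITMAP_BYTES, CODE_BYTES)
ratio = mixed / all_image
```

Under these assumed numbers the mixed representation needs one eighth of the bytes of the all-image representation; real savings would vary with the recognition rate and the image codec.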

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing one embodiment of the present invention, FIG. 2 is a flowchart showing an example of character recognition processing, and FIG. 3 is a diagram showing an example of combining character codes and image information.

1: original reading unit, 2: character recognition unit, 3: encoding unit, 4: information synthesis unit, 5: storage unit, 6: information separation unit, 7: decoding unit, 8: display control unit, 9: CRT, 10: printer

101: document information, 102: image information, 103: encoded image information, 104: character code, 105: composite information, 106: stored information, 107: encoded image information, 108: image information, 109: character code, 110: CRT signal, 111: print signal

S1: start, S2: character segmentation processing, S3: character segmentation determination, S4: character feature point calculation processing, S5: dictionary search processing, S6: matching character determination, S7: image information capture processing, S8: character code determination processing, S9: end

D1: character code header, D2: character code attribute, D3: character code, D4: image information header, D5: image information attribute, D6: encoded image information

Patent applicant: NEC Corporation
Agent: Patent attorney Yutaro Kumagai

Claims (1)

[Claims]

A document storage device for storing document information, characterized by comprising: first means for extracting, from the document information, an information region that can be represented by symbols and obtaining the corresponding symbols; second means for obtaining, as image information, an information region of the document information that cannot be represented by symbols; and third means for combining and storing the symbols and image information obtained by the first and second means.
JP63248383A 1988-09-30 1988-09-30 Document accumulating device Pending JPH0296467A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP63248383A JPH0296467A (en) 1988-09-30 1988-09-30 Document accumulating device

Publications (1)

Publication Number Publication Date
JPH0296467A true JPH0296467A (en) 1990-04-09

Family

ID=17177289

Family Applications (1)

Application Number Title Priority Date Filing Date
JP63248383A Pending JPH0296467A (en) 1988-09-30 1988-09-30 Document accumulating device

Country Status (1)

Country Link
JP (1) JPH0296467A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS63155945A (en) * 1986-12-19 1988-06-29 Nec Corp Storage and distribution system for article data

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6355800B1 (en) * 2017-06-28 2018-07-11 ヤフー株式会社 Learning device, generating device, learning method, generating method, learning program, and generating program
JP2019008742A (en) * 2017-06-28 2019-01-17 ヤフー株式会社 Learning device, generation device, learning method, generation method, learning program, and generation program
