JPH0983789A - Video input device

Info

Publication number: JPH0983789A
Application number: JP7240167A
Authority: JP
Inventors: Makoto Senda; 誠千田
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1995-09-19
Filing date: 1995-09-19
Publication date: 1997-03-28

Abstract

PROBLEM TO BE SOLVED: To convert character information included in an input image into text data. SOLUTION: Still image information input from a camera part 1 is processed by a video input processing part 3, stored in a video memory part 4 as image data, and stored in a storage part 6. An image decision part 30 then decides, pixel by pixel or for each image block obtained by dividing the area, whether the image data contains characters. When characters are included, a character recognition part 13 converts them into text data in sequence. A graphic composition part 13 combines the converted text data with the image data, keeping the two associated, and the result is stored in the storage part 6 and displayed on a monitor 16.

Description

Detailed Description of the Invention

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a video input device capable of converting character information contained in a captured image into text data.

[0002]

2. Description of the Related Art

Conventionally, the camera unit of a video input device has about 400,000 effective input pixels, which is not a high enough resolution to handle a high-definition still image as it is. In addition, an image captured by such a camera is generally output as a standard video signal such as NTSC, PAL, or SECAM. Therefore, to obtain a high-quality still image with a conventional camera, it is common to divide the entire scene to be captured into a plurality of screens, capture each divided screen, and then combine the divided screens. The configuration and operation of such a conventional video input device are described below with reference to FIGS. 16 and 17.

[0003] FIG. 16 is a block diagram of a conventional video input device.

[0004] In the figure, 101 is a camera unit used for capturing persons or documents; 102 is a drive unit that moves the imaging range of the camera unit 101; 103 is a video input processing unit that converts the video signal input from the camera unit 101 into video data; 104 is a video memory unit that stores the converted video data; 105 is an overall control unit that controls processing from video input to output; 106 is a storage unit that stores video data; 107 is an operation unit for adjusting the device and operating the drive unit 102 of the camera unit 101; and 108 is a monitor that displays video data.

[0005] In this configuration, the drive unit 102 is first driven according to an instruction from the overall control unit 105 to move the imaging area of the camera unit 101 to a predetermined position. Next, the video signal input from the camera unit 101 is converted into video data by the video input processing unit 103. If the input video signal is a composite signal such as NTSC or PAL, the video input processing unit 103 performs YC separation into a luminance signal (hereinafter, Y signal) and a color difference signal (hereinafter, C signal), further separates the C signal into Cr and Cb signals to obtain Y, Cr, and Cb signals, and then performs A/D conversion. If color space conversion is necessary, the signals are further converted into R, G, and B signals. If format conversion, resolution conversion, enlargement/reduction, or the like is required, pixel density conversion and the accompanying interpolation by filters or the like are performed. The video data obtained by this processing is acquired repeatedly, once for each divided screen, and stored sequentially in predetermined areas of the video memory unit 104; the video data of the screens is then stitched together and combined. In this way, a single high-quality still image is obtained. The combined high-quality still image data is stored in the storage unit 106 by the overall control unit 105 and, when display is required, transferred to and displayed on the monitor 108. This operation is shown in the flowchart of FIG. 17.

[0006] FIG. 17 is a flowchart showing conventional still image input processing.

[0007] In the figure, it is first determined whether input of a still image has been requested from the operation unit 107 (step S101). If not, a moving image is input from the camera unit 101 and transferred to the monitor 108 for display (steps S112 to S113). If a still image is to be input, the resolution of the still image to be captured is set (step S102). The imaging capability of an ordinary camera over its imageable area is, for example, 768 horizontal pixels × 494 vertical lines for NTSC and 752 horizontal pixels × 582 vertical lines for PAL, and the resolution is this imaging capability divided by the dimensions of the object to be imaged. To improve the resolution, the number of divided screens of the object is set to n (step S103). Steps S104 to S109 form the divided-screen input routine: the drive unit 102 of the camera unit 101 is driven in accordance with the division number n to move to the first imaging area position. The video of that imaging area is then input from the camera unit 101, processed by the video input processing unit 103, and stored in a predetermined area of the video memory unit 104. Repeating this processing n times while sequentially changing the imaging area and the memory space of the video memory unit 104 enables high-quality input, storage, and display of a still image.
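As an illustration only, the relationship between imaging capability, object size, and the division number n can be sketched in Python. The function names, the millimetre units, and the target resolution are assumptions introduced for this example and do not appear in the specification; overlap between divided screens is ignored.

```python
import math

# Imaging capability quoted in the text (horizontal pixels, vertical lines).
NTSC = (768, 494)
PAL = (752, 582)


def resolution(capability, object_size_mm):
    """Pixels per millimetre: imaging capability divided by object dimensions."""
    return (capability[0] / object_size_mm[0],
            capability[1] / object_size_mm[1])


def divided_screens(capability, object_size_mm, target_px_per_mm):
    """Return (columns, rows, n) of divided screens needed to reach the target.

    Assumes non-overlapping screens, which is a simplification.
    """
    cols = math.ceil(object_size_mm[0] * target_px_per_mm / capability[0])
    rows = math.ceil(object_size_mm[1] * target_px_per_mm / capability[1])
    return cols, rows, cols * rows


if __name__ == "__main__":
    a4_landscape = (297, 210)  # hypothetical object: an A4 sheet in millimetres
    print(resolution(NTSC, a4_landscape))
    print(divided_screens(NTSC, a4_landscape, 8.0))
```

Under these assumptions, an A4 sheet captured in a single NTSC screen gives roughly 2.6 pixels per millimetre horizontally, and a target of 8 pixels per millimetre needs 4 × 4 = 16 divided screens.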

[0008]

PROBLEMS TO BE SOLVED BY THE INVENTION

However, in the above conventional example, capturing a high-quality still image required capturing the entire screen at high quality. Depending on the object being imaged, only one part of the screen may need a high-quality image, so the high-quality image data for the remaining parts, and the processing time the device spends obtaining it, are wasted. Furthermore, when an object containing a document or characters is captured as image data, that portion is also handled as image data, so it requires considerably more storage capacity than storing it as text data would, and the text has to be entered again as text data whenever it is needed in that form.

[0009] An object of the present invention is therefore to provide a video input device capable of converting character information contained in a captured image into text data.

[0010]

MEANS FOR SOLVING THE PROBLEMS

To achieve the above object, the video input device of the present invention has the following features.

[0011] Namely, a video input device that has a camera unit whose imaging area is movable, and that captures an image picked up by the camera unit as a still image, comprises: image discrimination means for discriminating, among the image characteristics of the still image, a region having character characteristics; character region dividing means for dividing the region having character characteristics; character recognition means for recognizing characters in each character region divided by the character region dividing means; and code conversion means for converting the characters recognized by the character recognition means into codes. As a result, a character region included in the image data is converted into individual character codes, such as text data.

[0012] The device may further comprise extraction-region scaling means for extracting a region in which the character recognition means cannot recognize characters and increasing the magnification of the extracted region until the characters become recognizable, the scaled region then being subjected to character recognition again by the character recognition means. Alternatively, the device may comprise resolution-increasing means for extracting a region in which the character recognition means cannot recognize characters and raising the resolution of the extracted region until the characters become recognizable, the higher-resolution region then being subjected to character recognition again by the character recognition means. Alternatively, the device may comprise both the extraction-region scaling means and the resolution-increasing means, the region after scaling and/or resolution increase being subjected to character recognition again by the character recognition means. In this way, image quality is improved only for regions in which characters cannot be recognized at the image quality available at the time of capture.
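The retry behaviour described above, in which magnification is raised until recognition succeeds, might be sketched as follows. This is a minimal illustration and not the claimed implementation: `capture` and `recognize` are hypothetical callbacks standing in for the camera and the character recognition means, and the step factor and upper limit are assumed values.

```python
def recognize_with_zoom(capture, recognize, start=1.0, step=1.5, max_mag=8.0):
    """Re-capture a region at increasing magnification until text is recognized.

    capture(mag)      -> image of the region at the given magnification (hypothetical)
    recognize(image)  -> recognized text, or None if recognition fails (hypothetical)
    Returns (text, magnification) on success, or (None, None) if the limit is reached.
    """
    mag = start
    while mag <= max_mag:
        text = recognize(capture(mag))
        if text is not None:
            return text, mag
        mag *= step  # enlarge only the region that could not be recognized
    return None, None


if __name__ == "__main__":
    # Stand-in callbacks: the "image" is just the magnification, and
    # recognition succeeds once the magnification reaches 3x.
    result = recognize_with_zoom(lambda m: m, lambda img: "A" if img >= 3 else None)
    print(result)
```

The same loop shape would apply to the resolution-increasing variant, with the magnification replaced by whatever parameter controls resolution.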

[0013] The device may further comprise first combining means for associating the characters coded by the code conversion means with the still image or the character region and combining them. Alternatively, it may comprise second combining means for associating a region in which the character recognition means cannot recognize characters with the still image or the character region and combining them. These produce still image data in which the data of the character regions can be handled as codes.

[0014]

DESCRIPTION OF THE EMBODIMENTS

Embodiments of the present invention are described in detail below with reference to the drawings.

[0015] <First Embodiment> First, a high-resolution video input device using screen synthesis, which is the first embodiment of the present invention, is described with reference to the drawings.

[0016] FIG. 1 is a block diagram of the high-resolution video input device using screen synthesis according to the first embodiment of the present invention.

[0017] In the figure, 1 is a camera unit used for capturing persons or documents; 2 is a drive unit that moves the imaging range of the camera unit 1; 3 is a video input processing unit that converts the video signal input from the camera unit 1 into video data; 4 is a video memory unit that stores the converted video data; 5 is an overall control unit that controls processing from video input to output; 6 is a storage unit that stores video data; 7 is an operation unit for adjusting the device and operating the drive unit 2 of the camera unit 1; 11 is a lens unit that has a function of improving resolution by raising the magnification with a zoom function so that the imaged area becomes smaller, and a function of improving resolution by using a lens-based variable optical axis to shift the optical axis by an amount smaller than the imaging range of each CCD element; 12 is a lens control unit that controls the magnification and the optical axis shift amount of the lens unit 11 according to instructions from the overall control unit 5; 13 is a graphic synthesis unit that combines the image data output from the video memory unit 4 with graphic data from the graphic memory unit 14; 14 is a graphic memory unit that stores graphic data from the overall control unit 5; 15 is a video output processing unit that converts the video signal from the graphic synthesis unit 13 into a signal for display on the display unit; 17 is a video synthesis processing unit that captures a plurality of screens of image data output from the video input processing unit 3 and combines them; and 18 is a video synthesis memory unit that stores the video data of the plurality of screens under the control of the video synthesis processing unit 17. In the device of the first embodiment, the lens unit 11 and the lens control unit 12 may be omitted (they are essential in the second embodiment described later).

[0018] Next, the camera unit 1 is described with reference to FIG. 3.

[0019] FIG. 3 is an internal block diagram of the camera unit 1 according to the first embodiment of the present invention.

[0020] In the figure, 201 is an aperture shutter; 202 is an optical low-pass filter; 203 is a CCD driver that drives the photoelectric conversion element 204 in synchronization with a synchronization signal from a synchronization signal generator (not shown); 204 is a photoelectric conversion element that receives the optical signal of the imaged object, converts it into an electric signal, and outputs it; 205 is an automatic gain control unit (AGC) that amplifies the signal output from the photoelectric conversion element 204; 206 is an aperture photometry circuit that measures the aperture amount from the signal output by the photoelectric conversion element 204; 207 is an iris drive unit that drives the aperture shutter 201 to adjust the aperture based on the aperture photometry circuit 206; 208 is a parallel flat plate that shifts the optical axis by a minute amount to perform pixel shifting; 209 is a field readout control unit that controls how the signal from the photoelectric conversion element 204 is output; 210 is a low-pass filter drive unit that inserts and removes the low-pass filter 202; and 211 is a parallel flat plate drive unit that controls the optical axis shift of the parallel flat plate 208.

[0021] In the device of the first embodiment, the parallel flat plate 208 and the parallel flat plate drive unit 211 may be omitted (they are essential in the second embodiment described later). The low-pass filter drive unit 210 is unnecessary when the parallel flat plate 208 and the parallel flat plate drive unit 211 are not provided; in that case, the low-pass filter 202 is left permanently in place.

[0022] As described above, the optical information captured by the camera unit 1 is converted into an electric signal and output, and its output level is automatically adjusted into an appropriate range by aperture and gain adjustment. The camera unit 1 can output signals suited to moving images and to still images, and the moving image mode and still image mode can be switched in response to an operator's request at the operation unit 7.

[0023] Next, the output signal modes are described with reference to FIGS. 5 and 6.

[0024] FIG. 5 is a diagram showing the filter array on the photoelectric conversion element 204 according to the first embodiment of the present invention.

[0025] In the figure, Cy (cyan), Ye (yellow), G (green), and Mg (magenta) filters are arranged as shown in FIG. 5, an arrangement known as a complementary color checkered array (although G is a primary color, not a complementary color). Compared with receiving light through filters of the primary colors R (red), G (green), and B (blue), receiving light through the complementary color filters Cy, Ye, and Mg passes a mixture of two primary colors for each complementary color, so more information is obtained and sensitivity improves. The G and Mg filters alternate from line to line, and the signals of vertically adjacent filters in the array are added before output, so the signal output from the camera unit 1 is as shown for the field readout mode in FIG. 6.

[0026] FIG. 6 is a diagram showing the filter array in field readout according to the first embodiment of the present invention.

[0027] In the figure, odd lines repeat Cy + G and Ye + Mg, and even lines repeat Cy + Mg and Ye + G. The filter characteristics are set so that the Y signal and the C signals are obtained by the following equations:

Y = {(Cy + G) + (Ye + Mg)} × 1/2
R - Y = (Ye + Mg) - (Cy + G)
-(B - Y) = (Ye + G) - (Cy + Mg)

The video input processing unit 3 therefore generates the Y and C signals by performing these additions and subtractions. This field readout mode is used for moving image processing and for still image processing in which the screen is divided spatially.
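The additions and subtractions above can be written out as a small Python sketch. The function name and the sample filter values are assumptions made for illustration; only the three equations come from the text.

```python
def field_readout_to_yc(cy, ye, g, mg):
    """Derive Y, R-Y and B-Y from the four filter outputs of the checkered array.

    Odd lines deliver the pair sums (Cy + G) and (Ye + Mg);
    even lines deliver (Cy + Mg) and (Ye + G).
    """
    odd_a, odd_b = cy + g, ye + mg
    even_a, even_b = cy + mg, ye + g

    y = (odd_a + odd_b) / 2          # Y = {(Cy + G) + (Ye + Mg)} x 1/2
    r_minus_y = odd_b - odd_a        # R - Y = (Ye + Mg) - (Cy + G)
    b_minus_y = -(even_b - even_a)   # -(B - Y) = (Ye + G) - (Cy + Mg)
    return y, r_minus_y, b_minus_y


if __name__ == "__main__":
    # Arbitrary sample values, not taken from the specification.
    print(field_readout_to_yc(cy=10, ye=20, g=5, mg=8))
```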

[0028] Next, the video input processing unit 3 is described with reference to FIG. 4.

[0029] FIG. 4 is an internal block diagram of the video input processing unit 3 according to the first embodiment of the present invention.

[0030] In the figure, 301 is a delay circuit covering two horizontal lines, which outputs the signal from the camera unit 1 with no delay, with a one-line delay, and with a two-line delay. The output signals themselves have the same structure as shown in FIG. 6. 302 is a horizontal/vertical aperture correction unit that generates the Y signal by adding the odd-numbered and even-numbered samples of the output signals from the delay circuit 301, and performs horizontal/vertical aperture correction using the three lines of signals generated by the delay circuit; 303 is a gamma correction unit that applies gamma correction to the Y signal output from the horizontal/vertical aperture correction unit 302; 304 is a synchronous detection unit that uses the three lines of signals from the delay circuit 301 and adds and subtracts the odd-numbered and even-numbered samples of each signal to generate the Y, Cr (R-Y), and Cb (B-Y) signals; 305 is an RGB matrix conversion unit that applies a matrix conversion to convert the YCrCb signals into RGB signals; 306 is a white balance adjustment unit that adjusts the RGB signals so that the white level obtained by combining them matches a reference white level, in order to keep RGB color reproduction constant as the color temperature of the light source changes at the time of imaging; 307 is a gamma correction processing unit that applies gamma correction to the RGB signals; and 308 is a color difference matrix conversion unit that converts the RGB signals into Cr (R-Y) and Cb (B-Y) signals. In other words, the video input processing unit 3 converts the electric signal carrying the optical information received from the camera unit 1 into video information in the form of Y, U, and V signals and outputs it, applying aperture correction and gamma correction, and white balance adjustment for the color signals, so that the video information is automatically adjusted to an appropriate level.

[0031] Next, the method by which the high-resolution video input device using screen synthesis inputs video when capturing an imaging target at high quality is described with reference to FIG. 9.

[0032] FIG. 9 is a diagram showing video input by screen division according to the first embodiment of the present invention.

[0033] In FIG. 9, to input the imaging target at high quality, the high-resolution video input device using screen synthesis drives the drive unit 2 to move the imaging area of the camera up, down, left, and right, and captures video for each divided screen (the example in FIG. 9 shows the case where the image area marked with a circle has been captured as video data). Controlling the drive motor that divides the imaging area is straightforward, because current drive motors are sufficiently precise. Combining the divided screens inside the device is harder: the boundaries between the combined screens are discontinuous, and each divided screen is captured at a slightly different time, so combining them as they are yields an unnatural image. To minimize the effect of these time differences when the divided screens are combined, techniques such as the following are used:

(1) Keep the environment constant during capture, or store the state of the video signal near the boundaries of the divided screens and adjust the video signal so that the levels near the boundaries match each other.
(2) Capture the divided screens so that they overlap near their boundaries, and align them by pattern matching.
(3) Apply processing such as correcting the screen distortion that arises during capture, and then combine.

These techniques resolve the discontinuity at the boundaries and make it possible to obtain a good still image.
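Technique (1), matching the signal level near the boundary, might look like the following minimal sketch. It assumes two grayscale screens of equal height stored as lists of rows and corrects only a constant brightness offset; the function names and this offset model are assumptions for illustration, not the method specified in the text.

```python
def match_boundary_level(left, right):
    """Shift the brightness of `right` so its first column matches,
    on average, the last column of `left`."""
    left_edge = [row[-1] for row in left]
    right_edge = [row[0] for row in right]
    offset = sum(left_edge) / len(left_edge) - sum(right_edge) / len(right_edge)
    return [[pixel + offset for pixel in row] for row in right]


def stitch_horizontally(left, right):
    """Join two divided screens side by side after level matching."""
    adjusted = match_boundary_level(left, right)
    return [l_row + r_row for l_row, r_row in zip(left, adjusted)]


if __name__ == "__main__":
    left = [[10, 20], [10, 22]]
    right = [[25, 30], [27, 30]]   # captured slightly brighter than `left`
    print(stitch_horizontally(left, right))
```

A real device would also need to handle gain differences, overlap regions, and vertical neighbours, which this sketch leaves out.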

[0034] Next, the image determination unit 30 is described. The image determination unit 30 determines, for each pixel of the imaging area or for each image block obtained by dividing that area, whether it is a region of characters or the like, and reports the result to the overall control unit 5. The determination methods are described below.

[0035] (1) One determination method discriminates on the basis of the frequency components of the image. A natural image contains many flat areas, so its frequency components tend to be concentrated at low frequencies. Characters and figures contain many outlines, and their contrast is high so that the outlines stand out, so they tend to have more high-frequency components than a natural image. Finely textured patterns are finer still than characters or figures and therefore show even more high-frequency components. Because of these properties, examining the frequency components of an image makes it possible to determine whether it is a natural image or characters and figures. Specifically, the image region is decomposed into frequency components by a spatial-to-frequency transform (FFT, DCT, or the like), the value of each frequency component is extracted, and a threshold on the low-frequency components is set according to the above criteria; a region whose value is below the threshold is judged to be a natural image, and one whose value is above it is judged to be characters, figures, or the like. This method discriminates well, but it requires fairly complex, large-scale processing.
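One possible reading of the frequency-based determination is sketched below in pure Python. It computes a naive 2-D DCT of a square block and measures what fraction of the non-DC energy lies in the higher-frequency coefficients. The choice of DCT over FFT, the frequency cutoff, the energy-ratio measure, and the 0.5 threshold are all assumptions made for this example; the text does not specify them.

```python
import math


def dct2(block):
    """Naive orthonormal 2-D DCT-II of a square block (list of lists)."""
    n = len(block)
    scale = [math.sqrt(1.0 / n)] + [math.sqrt(2.0 / n)] * (n - 1)
    # basis[k][i] = cos((2i + 1) * k * pi / (2n))
    basis = [[math.cos((2 * i + 1) * k * math.pi / (2 * n)) for i in range(n)]
             for k in range(n)]
    return [[scale[u] * scale[v] * sum(block[x][y] * basis[u][x] * basis[v][y]
                                       for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]


def high_frequency_ratio(block, cutoff=None):
    """Fraction of non-DC energy in coefficients with u + v >= cutoff."""
    n = len(block)
    if cutoff is None:
        cutoff = n // 2  # assumed split between "low" and "high" frequencies
    coeffs = dct2(block)
    total = high = 0.0
    for u in range(n):
        for v in range(n):
            if u == 0 and v == 0:
                continue  # skip the DC term (average brightness)
            energy = coeffs[u][v] ** 2
            total += energy
            if u + v >= cutoff:
                high += energy
    # A perfectly flat block has no AC energy; treat it as low-frequency.
    return high / total if total > 1e-9 else 0.0


def looks_like_text(block, threshold=0.5):
    """Assumed decision rule: mostly high-frequency energy suggests text or figures."""
    return high_frequency_ratio(block) > threshold


if __name__ == "__main__":
    ramp = [[x + y for y in range(8)] for x in range(8)]             # smooth gradient
    checker = [[255 if (x + y) % 2 else 0 for y in range(8)] for x in range(8)]
    print(looks_like_text(ramp), looks_like_text(checker))
```

The quadruple loop in `dct2` also illustrates the remark that this approach is costly: a direct transform of an n × n block takes on the order of n^4 multiplications.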

[0036] (2) The next method makes the determination from the variation in the image data: the absolute value of the difference between the mean of the target pixels and each pixel value is accumulated, and the determination is based on the total. When the accumulated value is small, the region is judged to be flat, with few abrupt changes, and therefore close to a natural image. When it is large, the region can be regarded as having high contrast and abrupt changes, as with characters, figures, or patterns.
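The accumulated absolute difference described here is short enough to state directly. In this sketch the block is assumed to be a flat list of grayscale pixel values, and the function name is hypothetical.

```python
def sum_abs_deviation(pixels):
    """Accumulate |pixel - mean| over all pixels of a block."""
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels)


if __name__ == "__main__":
    print(sum_abs_deviation([100, 102, 98, 100]))  # nearly flat block: small total
    print(sum_abs_deviation([0, 255, 0, 255]))     # high-contrast block: large total
```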

[0037] Another method makes the determination using the standard deviation of the image.

[0038] The accumulation

γ = γ + (pixel value - mean of pixel values)^2 (where ^2 denotes squaring)

is used: the squared difference between each pixel value and the mean of the target pixels is accumulated, and the result is used for the determination. When the standard deviation is small, the region is judged to be flat, with few abrupt changes, and therefore close to a natural image. When the standard deviation is large, the region can be regarded as having high contrast and abrupt changes, as with characters, figures, or patterns. A threshold is set on this basis; a region whose value is below the threshold is judged to be a natural image, and one whose value is above it is judged to be characters, figures, or the like. This method uses simple arithmetic and can be processed at high speed, although it is less accurate than the frequency-component method.
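The accumulation of γ and the threshold decision can be sketched as follows. The block is assumed to be a flat list of grayscale values; the threshold of 40 and the function names are assumptions for illustration only, since the text gives no concrete value.

```python
import math


def accumulate_gamma(pixels):
    """gamma = gamma + (pixel value - mean)^2, accumulated over the block."""
    mean = sum(pixels) / len(pixels)
    gamma = 0.0
    for p in pixels:
        gamma = gamma + (p - mean) ** 2
    return gamma


def standard_deviation(pixels):
    """Population standard deviation derived from the accumulated gamma."""
    return math.sqrt(accumulate_gamma(pixels) / len(pixels))


def classify_block(pixels, threshold=40.0):
    """Assumed rule: above the threshold means characters or figures."""
    if standard_deviation(pixels) > threshold:
        return "text_or_figure"
    return "natural"


if __name__ == "__main__":
    print(classify_block([100, 102, 98, 100]))  # nearly flat block
    print(classify_block([0, 255, 0, 255]))     # high-contrast block
```

A single pass over the pixels after computing the mean is all that is required, which is consistent with the remark that this method is fast.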

【００３９】（３）更に、画像内に存在するエッジ等を
検出して判定する方法がある。つまり、エッジが多く存
在する部分は、文字や図形であると判別できる。逆に、
エッジが少ない場合には、自然画像であると判別でき
る。具体的な判別方法としては、差分フィルタ等により
局所的なモード変化を検出する方法があり、線形１階差
分のソーベルオペレータや２階差分のラプラシアン等の
差分フィルタで画像データに対して重み付け処理（マス
ク）して、その値（絶対値）が大きい場合には、その画
像領域内にエッジが存在すると判別する。これにより、
画像ブロック内にあるエッジ数をしきい値として設定
し、そのしきい値より小さい場合には自然画像と判別
し、そのしきい値より大きい場合には文字・図形等と判
別する。(3) A further method judges by detecting edges and the like present in the image. A portion containing many edges can be judged to be characters or figures. Conversely, a portion with few edges can be judged to be a natural image. As a concrete method, a local mode change is detected with a difference filter or the like: the image data is weighted (masked) with a difference filter such as the Sobel operator, a linear first-order difference, or the Laplacian, a second-order difference, and when the resulting value (absolute value) is large, an edge is judged to exist in that image region. A threshold is then set on the number of edges in an image block: a block with fewer edges than the threshold is judged to be a natural image, and a block with more is judged to be characters, figures, or the like.
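
As a minimal sketch of the edge-count judgment, the following applies the Sobel masks to a block and counts the pixels whose response exceeds a magnitude threshold. The helper names, the use of |gx| + |gy| as the response, and both threshold values are assumptions for illustration; the specification only names the Sobel and Laplacian filters.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T

def filter3x3(img, kernel):
    """Weight the interior of img with a 3x3 mask (no padding)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * img[dy:dy + h - 2, dx:dx + w - 2]
    return out

def count_edges(block, magnitude_threshold):
    """Count pixels whose first-order difference response is large."""
    img = np.asarray(block, dtype=np.float64)
    magnitude = np.abs(filter3x3(img, SOBEL_X)) + np.abs(filter3x3(img, SOBEL_Y))
    return int(np.count_nonzero(magnitude > magnitude_threshold))

def classify_by_edges(block, magnitude_threshold, count_threshold):
    """Fewer edges than the threshold: natural image; more: text or figure."""
    n = count_edges(block, magnitude_threshold)
    return "text_or_figure" if n > count_threshold else "natural"
```

A block that is uniform produces no edge pixels, while a block containing a sharp vertical step produces a band of edge pixels along the step.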

【００４０】また、自然画像領域の判別精度を向上させ
る方法としては次のような方法がある。まず、エッジ検
出により画像の特性が変化する境界を検出し、ある画像
の輪郭を抽出することで上記の判別により画像特性の異
なる境界を明確にする。例えば、ある文章に画像や図形
がはめ込まれている場合などは、そのはめ込んだ境界付
近では画像の特性に大きな変化があらわれるので境界を
明確に定めて分割することが可能になる。また、細線化
処理による輪郭検出方法もある。As a method of improving the discrimination accuracy of the natural image area, there are the following methods. First, a boundary where image characteristics change is detected by edge detection, and a contour of a certain image is extracted to clarify the boundary where image characteristics differ by the above determination. For example, when an image or figure is embedded in a sentence, a large change occurs in the characteristics of the image near the embedded boundary, so that the boundary can be clearly defined and divided. There is also a contour detection method by thinning processing.

【００４１】次に、判別の誤動作を避ける方法もある。
例えば、平坦な画像すべてを自然画像として判別する
と、背景色（白地、黒地など）までも高画質入力してし
まう場合があるので、それを避けるために背景色は予め
登録しておくか、画像領域で背景色になりうる部分の画
像を取り込んで比較判別し、同じであれば背景と認識し
て高画質での入力処理は行わず、違えば背景と異なると
認識して高画質での入力処理を行う。There is also a method for avoiding erroneous discrimination. For example, if every flat image is judged to be a natural image, even the background color (a white background, a black background, and so on) may be input at high image quality. To avoid this, the background color is registered in advance, or an image of a portion of the image region that can be the background color is captured and compared. If the two match, the region is recognized as background and high-quality input processing is not performed; if they differ, the region is recognized as different from the background and high-quality input processing is performed.
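
The background check might look like the following sketch, which compares the mean color of a flat block with a registered background color. The function names, the RGB representation, and the tolerance of 8 levels are assumptions made for this example only.

```python
import numpy as np

def is_background(block, background_color, tolerance):
    """True when the block's mean color is within tolerance of the registered background."""
    pixels = np.asarray(block, dtype=np.float64).reshape(-1, 3)
    mean_color = pixels.mean(axis=0)
    diff = np.abs(mean_color - np.asarray(background_color, dtype=np.float64))
    return bool(np.all(diff <= tolerance))

def needs_high_quality_input(block, is_flat, background_color, tolerance=8.0):
    """Flat blocks are natural-image candidates, unless they are only background."""
    return bool(is_flat) and not is_background(block, background_color, tolerance)
```

With a white background registered, a uniformly white block is skipped, whereas a flat block of any other color is still passed on to high-quality input.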

【００４２】上述の（１）〜（３）における複数の判別
方法を併用することにより、より正確で精度の高い画像
判別が可能となる。こうして、自然画、文字や図形等の
画像を判定された領域を特定することが可能になる。
尚、上述の画像判別方法以外の方法であっても、本発明
に適用可能であることは言うまでもない。Using two or more of the discrimination methods (1) to (3) above in combination enables more accurate image discrimination. In this way, the regions judged to be natural images, characters, figures, and so on can be identified.
Needless to say, methods other than the above-described image discrimination method can be applied to the present invention.

【００４３】次に、文字認識部３１について説明する。
まず、画像データの２値化処理を行い、１つの文字の単
位に切り出し、切り出された文字の寸法を算出し、認識
処理のために参照する文字の寸法に正規化を行う。更
に、この正規化された文字に対して、予め蓄積部６に記
憶している辞書から文字を引出し、差分比較して評価を
行い、差分が最も小さい文字を見出し、その文字コード
を認識結果として出力する。ここで、文字の切り出しに
ついては、書物や新聞等のように活字で一定の間隔で文
字が印刷されている場合は、文字間の空間領域を抽出し
て文字を１文字単位に分離、抽出することが可能であ
る。Next, the character recognition unit 31 will be described. First, the image data is binarized and segmented into individual characters, the dimensions of each segmented character are calculated, and the character is normalized to the dimensions of the characters referenced in recognition. Each normalized character is then compared, by taking differences, against the characters retrieved from a dictionary stored in advance in the storage unit 6; the character with the smallest difference is found, and its character code is output as the recognition result. As for character segmentation, when characters are printed in type at regular intervals, as in books and newspapers, the blank regions between characters can be extracted to separate and extract the characters one by one.
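
The sequence of binarization, size normalization, and smallest-difference lookup can be sketched as below. This is a simplified illustration under stated assumptions: the glyph is taken to be already segmented, the dictionary is a plain mapping from character code to a binary template, resizing uses nearest-neighbor sampling, and the difference is the count of differing pixels. None of these details are given in the specification.

```python
import numpy as np

def binarize(gray, threshold=128):
    """Dark pixels become 1, light pixels become 0."""
    return (np.asarray(gray) < threshold).astype(np.uint8)

def normalize(glyph, size):
    """Nearest-neighbor resize of a binary glyph to size x size."""
    h, w = glyph.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return glyph[np.ix_(rows, cols)]

def recognize(gray_glyph, dictionary, size=8):
    """Return (character code, difference) of the closest dictionary template."""
    glyph = normalize(binarize(gray_glyph), size)
    best_code, best_diff = None, None
    for code, template in dictionary.items():
        diff = int(np.count_nonzero(glyph != template))  # differing pixels
        if best_diff is None or diff < best_diff:
            best_code, best_diff = code, diff
    return best_code, best_diff
```

A caller could treat a result whose difference exceeds some limit as "unrecognizable", which is the condition that triggers the re-capture steps described later in the flowchart.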

【００４４】また、文字認識の方法であるが、大きくは
パターン整合法と構造解析法とに分けられる。パターン
整合法は、辞書にある各文字の標準文字のテンプレート
と画像入力された文字とを重ねあわせる方法である。一
方、構造解析方法は、文字を構成する線素の方向や大き
さ、形状、接続、交差点等の構造を表現するいくつかの
特徴を抽出し、辞書にある各文字の特徴と画像入力され
た文字の特徴とを照合する方法である。更に、前述のよ
うに１文字単位で認識するには精度や品質において限界
があることから、自然言語処理技術を取り込んで、単語
或は文章の前後関係による判断や文脈による判断により
認識する方法を導入することで認識率を飛躍的に向上さ
せることができる。尚、上述の文字認識方法以外の方法
であっても、本発明に適用可能であることは言うまでも
ない。Character recognition methods are broadly divided into pattern matching methods and structural analysis methods. The pattern matching method superimposes the standard template of each character in the dictionary on the character input as an image. The structural analysis method, on the other hand, extracts several features that express structure, such as the direction, size, shape, connections, and intersections of the line elements that make up a character, and matches the features of each character in the dictionary against the features of the character input as an image. Furthermore, because recognition one character at a time, as described above, is limited in accuracy and quality, the recognition rate can be improved dramatically by incorporating natural language processing technology and introducing recognition based on the context of words or sentences. Needless to say, methods other than the character recognition methods described above are also applicable to the present invention.

【００４５】次に、文字認識部３１の文字認識処理の流
れについて図１４及び図１５を参照して説明する。Next, the flow of character recognition processing of the character recognition unit 31 will be described with reference to FIGS. 14 and 15.

【００４６】図１４（図１４Ａ〜図１４Ｄ）は、本発明
の第１の実施形態としての文字認識処理を示すフローチ
ャートである。FIG. 14 (FIGS. 14A to 14D) is a flow chart showing the character recognition processing as the first embodiment of the present invention.

【００４７】図中、ステップＳ１において静止画像の入
力か否かを判断する。静止画像でない場合は、撮像部か
ら動画像を入力し（ステップＳ１３）、モニタ１６に転
送して表示する（ステップＳ１４）。静止画像の入力の
場合は、ステップＳ２において、全領域について高画質
な入力要求かを判断する。In the figure, in step S1, it is determined whether or not a still image is input. If it is not a still image, a moving image is input from the image pickup unit (step S13), transferred to the monitor 16 and displayed (step S14). In the case of inputting a still image, it is determined in step S2 whether the input request is of high image quality for the entire area.

【００４８】ＹＥＳの場合は、高画質入力の解像度を設
定し（ステップＳ３）、画面分割数も設定する（ステッ
プＳ４）。次に、ステップＳ５〜ステップＳ１１におい
て最初の撮像領域に駆動部２を制御して位置を合わせ、
映像を入力し映像信号処理された画像データは映像合成
処理部１７により映像合成メモリ部１８に記憶される。
映像合成メモリ部１８に画像データを記憶する際には、
取り込む複数の分割図面が重ならないように映像合成処
理部１７によりメモリの範囲が指定されている。こうし
て、最初の画像データが映像合成メモリ部の所定領域に
記憶される。この動作を画面の分割した数だけ繰り返
し、取り込み完了すると、その取り込まれた全ての画像
データを全体制御部５は蓄積部６に蓄積する。その後
は、動画像モードと静止画像モードの選択に戻る。If YES, the resolution for high-quality input is set (step S3) and the number of screen divisions is also set (step S4). Next, in steps S5 to S11, the drive unit 2 is controlled to position the camera at the first imaging region, the video is input, and the image data that has undergone video signal processing is stored in the video composition memory unit 18 by the video composition processing unit 17. When image data is stored in the video composition memory unit 18, the video composition processing unit 17 specifies the memory range so that the captured divided screens do not overlap. In this way, the first image data is stored in a predetermined area of the video composition memory unit 18. This operation is repeated for the number of screen divisions, and when capture is complete, the overall control unit 5 stores all of the captured image data in the storage unit 6. After that, the process returns to the selection between the moving image mode and the still image mode.
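
The non-overlapping memory ranges used for screen composition can be illustrated with the sketch below. It assumes equally sized tiles captured in row-major order and a single two-dimensional array standing in for the video composition memory; the names `tile_range` and `compose` are hypothetical.

```python
import numpy as np

def tile_range(index, cols, tile_h, tile_w):
    """Memory range (row slice, column slice) reserved for one divided screen."""
    r, c = divmod(index, cols)
    return slice(r * tile_h, (r + 1) * tile_h), slice(c * tile_w, (c + 1) * tile_w)

def compose(tiles, rows, cols):
    """Write each captured tile into its own range of the composition memory."""
    tile_h, tile_w = tiles[0].shape
    memory = np.zeros((rows * tile_h, cols * tile_w), dtype=tiles[0].dtype)
    for i, tile in enumerate(tiles):
        memory[tile_range(i, cols, tile_h, tile_w)] = tile
    return memory
```

Because each range is derived from the tile index alone, no two divided screens can be written to the same memory area.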

【００４９】ＮＯの場合は、カメラ部１から入力した映
像を映像信号処理して映像メモリ部４、グラフィック合
成部１３、映像出力部１５を介してモニタ１６に出力し
（ステップＳ２１）、操作部７からの静止画像の入力指
示に応じて、全体制御部５で静止画像での取り込み領域
を示すグラフィックデータ（例えば、領域を囲むワクな
ど）と撮像範囲を制御する操作方法を示すグラフィック
データをグラフィックメモリ（例えば、ズームやパン、
チルト等の操作画面）に書き込み、撮像中の画像と合成
してモニタに表示する（ステップＳ２２）。次に、表示
されているグラフィック画面により、操作部７からの制
御によりレンズ制御部１２、カメラ部１を制御して静止
画像の撮像領域を設定し（ステップＳ２３）、その設定
された撮像領域の静止画像を映像メモリ部４に記憶する
（ステップＳ２４）。更に、全体制御部５に転送し、蓄
積部６に蓄積される（ステップＳ２５）。If NO, the video input from the camera unit 1 undergoes video signal processing and is output to the monitor 16 via the video memory unit 4, the graphic composition unit 13, and the video output unit 15 (step S21). In response to a still image input instruction from the operation unit 7, the overall control unit 5 writes to the graphic memory both graphic data indicating the still image capture region (for example, a frame surrounding the region) and graphic data indicating how to control the imaging range (for example, an operation screen for zoom, pan, tilt, and so on), and these are combined with the image being captured and displayed on the monitor (step S22). Next, using the displayed graphic screen, the lens control unit 12 and the camera unit 1 are controlled from the operation unit 7 to set the imaging region of the still image (step S23), and the still image of the set imaging region is stored in the video memory unit 4 (step S24). The still image is then transferred to the overall control unit 5 and stored in the storage unit 6 (step S25).

【００５０】次にステップＳ２６において、文字認識処
理をするかを判断する。ＮＯの場合は、そのまま動画像
モードと静止画像モードの選択に戻る。一方、ＹＥＳの
場合には、前述した画像判定処理を施して（ステップＳ
２７）、文字領域があるかを判別する（ステップＳ２
８）。ＮＯの場合は、そのまま動画像モードと静止画像
モードの選択に戻る。一方、ＹＥＳの場合には、、前述
した文字認識処理を全ての文字領域について施し、その
文字領域を識別するグラフィックデータを映像と合成し
て表示する（ステップＳ２９〜ステップＳ３１）。ステ
ップＳ３２において、文字認識が不可能な領域があるか
を判別する。ＮＯの場合は、その認識領域の文字情報を
蓄積済の静止画像に登録し（ステップＳ３３）、蓄積部
６に蓄積し（ステップＳ３４）、動画像と静止画像モー
ドの判断に戻る。ここで、登録とは、最初に蓄積した画
像データへの追加登録、或は、関連づけて蓄積すること
である。その登録情報としては、位置情報、解像度情
報、データ種別等がある。一方ＹＥＳの場合は、その認
識が不可能な領域を抽出して登録し、その中から最初の
撮像領域を指定し（ステップＳ４１）、その撮像領域の
中心が映像取り込みの中心になるように駆動部２を駆動
する（ステップＳ４２）。次にステップＳ４３におい
て、ズームアップが可能かどうかを判断する。ＹＥＳの
場合は、レンズ部１１を制御して倍率をアップして撮像
する（ステップＳ４４）。Next, in step S26, it is determined whether character recognition processing is to be performed. If NO, the process returns directly to the selection between the moving image mode and the still image mode. If YES, the image determination processing described above is performed (step S27), and it is determined whether there is a character region (step S28). If NO, the process returns directly to the selection between the moving image mode and the still image mode. If YES, the character recognition processing described above is performed on all character regions, and graphic data identifying those character regions is combined with the video and displayed (steps S29 to S31). In step S32, it is determined whether there is a region in which character recognition is impossible. If NO, the character information of the recognized regions is registered with the already stored still image (step S33) and stored in the storage unit 6 (step S34), and the process returns to the selection between the moving image mode and the still image mode. Here, registration means either adding to the image data stored first or storing in association with it. The registered information includes position information, resolution information, data type, and the like. If YES, the unrecognizable regions are extracted and registered, the first of them is designated as the imaging region (step S41), and the drive unit 2 is driven so that the center of that imaging region becomes the center of video capture (step S42). Next, in step S43, it is determined whether zooming in is possible. If YES, the lens unit 11 is controlled to increase the magnification, and the image is captured (step S44).

【００５１】ここで、レンズ制御部１２を制御して、撮
像領域を望遠側に変倍し、撮像範囲内に合わせるが、そ
の方法について図１５を参照して説明する。Here, the lens control unit 12 is controlled to zoom the imaging region toward the telephoto side so that it fits within the imaging range; the method is described with reference to FIG. 15.

【００５２】図１５は、本発明の第１の実施形態として
の撮像領域の変倍処理を示す図である。FIG. 15 is a diagram showing the scaling processing of the image pickup area according to the first embodiment of the present invention.

【００５３】図中、最初に取り込み蓄積された静止画像
の領域は、（Ｘ0，Ｙ0），（Ｘ1，Ｙ1），（Ｘ2，Ｙ
2），（Ｘ3，Ｙ3）で示されており、その領域内での高
画質での入力が指定されている領域が（ｘ0，ｙ0），
（ｘ1，ｙ1），（ｘ2，ｙ2），（ｘ3，ｙ3）で示されて
いる。ここで、まず、高画質入力に指定された領域を撮
像領域の中央にするために、（Ｘ0，Ｙ0），（Ｘ1，Ｙ
1），（Ｘ2，Ｙ2），（Ｘ3，Ｙ3）から（Ｘ’0，Ｙ’
0），（Ｘ’1，Ｙ’1），（Ｘ’2，Ｙ’2），（Ｘ’3，
Ｙ’3）になるように撮像領域を移動させる。この移動
量は、（Ｘ0，Ｙ0），（Ｘ1，Ｙ1），（Ｘ2，Ｙ2），
（Ｘ3，Ｙ3）の中心位置と（ｘ0，ｙ0），（ｘ1，ｙ
1），（ｘ2，ｙ2），（ｘ3，ｙ3）の中心位置のずれ分
で算出される。次に、設定された解像度に応じて高画質
入力に指定された領域を変倍するためには、（Ｘ’0，
Ｙ’0），（Ｘ’1，Ｙ’1），（Ｘ’2，Ｙ’2），
（Ｘ’3，Ｙ’3）から（Ｘ”0，Ｙ”0），（Ｘ”1，
Ｙ”1），（Ｘ”2，Ｙ”2），（Ｘ”3，Ｙ”3）に撮像
領域を変倍させる。その変倍量は、指定された解像度と
最初に蓄積された静止画像の解像度との倍率により変倍
率が算出される。この変倍処理の一連の動作は、一括し
て行うことはもちろん可能である。In the figure, the region of the still image captured and stored first is indicated by (X0, Y0), (X1, Y1), (X2, Y2), (X3, Y3), and the region within it that is designated for high-quality input is indicated by (x0, y0), (x1, y1), (x2, y2), (x3, y3). First, to bring the region designated for high-quality input to the center of the imaging region, the imaging region is moved from (X0, Y0), (X1, Y1), (X2, Y2), (X3, Y3) to (X'0, Y'0), (X'1, Y'1), (X'2, Y'2), (X'3, Y'3). The amount of movement is calculated from the offset between the center of (X0, Y0), (X1, Y1), (X2, Y2), (X3, Y3) and the center of (x0, y0), (x1, y1), (x2, y2), (x3, y3). Next, to scale the region designated for high-quality input according to the set resolution, the imaging region is scaled from (X'0, Y'0), (X'1, Y'1), (X'2, Y'2), (X'3, Y'3) to (X"0, Y"0), (X"1, Y"1), (X"2, Y"2), (X"3, Y"3). The scaling factor is calculated from the ratio between the designated resolution and the resolution of the still image stored first. This series of operations can of course be performed in a single step.
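
The movement and scaling of the imaging region can be worked through numerically as in the following sketch. It assumes axis-aligned rectangles given as four corner points, resolutions expressed in any common unit, and a zoom that narrows the field of view about the new center; the function names and sample values are hypothetical.

```python
def center(corners):
    """Center of a region given as a list of (x, y) corner points."""
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return sum(xs) / len(xs), sum(ys) / len(ys)

def pan_and_zoom(frame, target, initial_resolution, requested_resolution):
    """Return the movement, the scaling factor, and the scaled imaging region."""
    fcx, fcy = center(frame)
    tcx, tcy = center(target)
    # Movement amount: offset between the two centers.
    shift = (tcx - fcx, tcy - fcy)
    # Scaling factor: designated resolution over the first-stored resolution.
    zoom = requested_resolution / initial_resolution
    moved = [(x + shift[0], y + shift[1]) for x, y in frame]
    # Zooming toward telephoto shrinks the imaged region about its center.
    scaled = [(tcx + (x - tcx) / zoom, tcy + (y - tcy) / zoom) for x, y in moved]
    return shift, zoom, scaled

frame = [(0, 0), (640, 0), (640, 480), (0, 480)]
target = [(400, 300), (560, 300), (560, 420), (400, 420)]
print(pan_and_zoom(frame, target, 100, 200))
```

In this example the designated region still lies inside the scaled imaging region, so it can be cropped from the stored image afterward, as the next paragraph describes.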

【００５４】変倍された（Ｘ”0，Ｙ”0），（Ｘ”1，
Ｙ”1），（Ｘ”2，Ｙ”2），（Ｘ”3，Ｙ”3）の領域
の画像データは、映像メモリ部４に記憶される。全体制
御部５は、その記憶された画像データの中から更に、指
定された領域である（ｘ0，ｙ0），（ｘ1，ｙ1），（ｘ
2，ｙ2），（ｘ3，ｙ3）の画像データのみを抽出する。
そして、上記の手順により撮像された撮像領域の映像を
映像メモリに記憶し、その記憶した領域すべてを文字認
識処理する（ステップＳ４５〜ステップＳ４７）。その
結果、認識が不可能な領域がない場合には、その領域の
文字情報を蓄積済の静止画像に登録し（ステップＳ４
９）、蓄積部６に蓄積し（ステップＳ５０）、すでに登
録されている認識不能領域すべてが完了した場合には、
動画像モードと静止画像モードの選択に戻る（ステップ
Ｓ５１）。認識が不可能な領域がまだある場合には、認
識が不可能な次の領域に撮像領域を指定し（ステップＳ
５２）、その撮像領域の中心が映像取り込みの中心にな
るように駆動部を駆動して倍率をアップし映像記憶して
文字認識処理までを繰り返す。上記の手順を繰り返し、
レンズ制御不能で倍率アップができなくなった場合に
は、画素ずらしを行なうかを判断する（ステップＳ６
１）。The image data of the scaled region (X"0, Y"0), (X"1, Y"1), (X"2, Y"2), (X"3, Y"3) is stored in the video memory unit 4. From the stored image data, the overall control unit 5 further extracts only the image data of the designated region (x0, y0), (x1, y1), (x2, y2), (x3, y3). The video of the imaging region captured by the above procedure is stored in the video memory, and character recognition processing is performed on all of the stored regions (steps S45 to S47). If, as a result, no unrecognizable region remains, the character information of the region is registered with the already stored still image (step S49) and stored in the storage unit 6 (step S50), and once all of the registered unrecognizable regions have been processed, the process returns to the selection between the moving image mode and the still image mode (step S51). If unrecognizable regions still remain, the next unrecognizable region is designated as the imaging region (step S52), the drive unit is driven so that the center of that imaging region becomes the center of video capture, and the sequence of increasing the magnification, storing the video, and performing character recognition is repeated. When, after repeating this procedure, the lens can no longer be controlled to increase the magnification, it is determined whether to perform pixel shifting (step S61).

【００５５】ＹＥＳの場合には、画素ずらしによる解像
度アップが可能かを判断し、ＹＥＳであれば解像度を設
定し、画面分割数ｎを算出して、その画面分割数分、画
素ずらし制御をして映像情報を入力し記憶し画面合成
し、このｎ個の合成した映像情報に静止画像処理を行な
い、その静止画像のすべてに対して文字認識処理を行な
う（ステップＳ６５〜ステップＳ７３）。その結果、認
識が不可能な領域がない場合には、ステップＳ４９に進
み、その認識領域の文字情報を蓄積済の静止画像に登録
して蓄積部６に蓄積し、既に登録されている認識不能領
域すべてが完了した場合には、動画像モードと静止画像
モードの選択に戻る。認識が不可能な領域がある場合に
は、ステップＳ６２に戻る。上記の手順をくり返し、解
像度アップが不可能になった場合及びステップＳ６１で
ＮＯ（画素ずらしでない場合）には、認識が不可能な領
域の映像データを画素ずらしによる静止画像の場合には
静止画像メモリから、通常の静止画像の場合には映像メ
モリから映像データを読み出し（ステップＳ７５）、そ
の読み出された映像データのままを蓄積済みの静止画像
に登録し（ステップＳ７６）、ステップＳ５０に進む。If YES, it is determined whether the resolution can be increased by pixel shifting. If it can, the resolution is set, the number of screen divisions n is calculated, pixel shifting is controlled for that number of divisions while the video information is input, stored, and combined, still image processing is performed on the n pieces of combined video information, and character recognition processing is performed on all of the resulting still image (steps S65 to S73). If, as a result, no unrecognizable region remains, the process proceeds to step S49, the character information of the recognized region is registered with the already stored still image and stored in the storage unit 6, and once all of the registered unrecognizable regions have been processed, the process returns to the selection between the moving image mode and the still image mode. If an unrecognizable region remains, the process returns to step S62. When, after repeating this procedure, the resolution can no longer be increased, or when the answer in step S61 is NO (pixel shifting is not performed), the video data of the unrecognizable region is read out, from the still image memory in the case of a pixel-shifted still image or from the video memory in the case of a normal still image (step S75), the video data is registered as it is with the already stored still image (step S76), and the process proceeds to step S50.
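
The escalation described in the flowchart (zoom in, then pixel shifting, then keep the image data as it is) can be summarized by the sketch below. The callable-based interface is purely hypothetical and stands in for the drive unit, lens control, and character recognition unit; step numbers are omitted because the mapping to individual steps is not exact.

```python
def resolve_region(region, capture_attempts, recognize):
    """Try each capture attempt in order; fall back to keeping the image.

    capture_attempts: callables that take the region and return an image,
        ordered from zoom steps to pixel-shift steps (hypothetical interface).
    recognize: callable that returns text, or None when recognition fails.
    """
    last_image = None
    for capture in capture_attempts:
        last_image = capture(region)
        text = recognize(last_image)
        if text is not None:
            # Recognized: register the character information for this region.
            return {"region": region, "type": "text", "data": text}
    # Neither zooming nor pixel shifting helped: register the image as it is.
    return {"region": region, "type": "image", "data": last_image}
```

The returned "type" field corresponds loosely to the data type mentioned among the registered information.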

【００５６】＜第２の実施形態＞次に、本発明の第２の
実施形態である画素ずらしによる高解像度映像入力装置
について説明する。この画素ずらしによる高解像度映像
入力装置は、基本的な公正及び動作が、第１の実施形態
における画面合成による高解像度映像入力装置と同様な
ため、異なる点についてのみ以下に説明をする。<Second Embodiment> Next, a high-resolution video input device using pixel shifting, which is a second embodiment of the present invention, will be described. Because the basic configuration and operation of this device are the same as those of the high-resolution video input device using screen composition in the first embodiment, only the differences are described below.

【００５７】図２は、本発明の第２の実施形態としての
画素ずらしによる高解像度映像入力装置のブロック構成
図である。FIG. 2 is a block diagram of a high-resolution video input device using pixel shifting according to a second embodiment of the present invention.

【００５８】図中、２０は、平行平板等による画素ずら
しされた色フィルタデータをカメラ部１から入力し、合
成する画素ずらしデータ合成処理部、２１は、合成され
たデータを記憶する画素ずらしデータメモリ部、２２
は、画素ずらしデータメモリ部２１の色フィルタデータ
を画像データに変換処理する静止画像処理部である。In the figure, 20 is a pixel shift data composition processing unit that inputs from the camera unit 1 color filter data that has been pixel-shifted by a parallel plate or the like and combines it, 21 is a pixel shift data memory unit that stores the combined data, and 22 is a still image processing unit that converts the color filter data in the pixel shift data memory unit 21 into image data.

【００５９】静止画像モードの場合には、画像領域を半
画素、または１画素分づつずらして各フィルタごとに撮
像することによって、解像度を向上させたり、光軸が同
一のフィルタ情報（Ｃｙ，Ｙｅ，Ｍｇ，Ｇｒ）が得られ
色再現性の非常に優れた静止画像を生成するモードがあ
る。この場合には、第１の実施形態におけるフィールド
読み出しモードとは異なり、上下のフィルタイメージを
加算せず、フィルタイメージのまま読み出すフレーム読
み出しモードを使用する（従って、図３のカメラ部１に
おける２０９は、画素ずらしによるデータ取り込の場合
はフレーム読み出しを行なうフレーム読み出し制御部と
なる）。このフレーム読み出しモードを、図７及び図８
に示す。The still image mode includes a mode in which the image region is shifted by half a pixel or by one pixel and captured through each filter, which improves the resolution and yields filter information (Cy, Ye, Mg, Gr) on the same optical axis, producing a still image with very good color reproducibility. In this case, unlike the field read mode of the first embodiment, a frame read mode is used in which the upper and lower filter images are not added and the filter images are read out as they are (accordingly, when data is captured by pixel shifting, 209 in the camera unit 1 of FIG. 3 serves as a frame read control unit that performs frame reading). This frame read mode is shown in FIGS. 7 and 8.

【００６０】図７は、本発明の第２の実施形態としての
フレーム読み出しにおけるフィルタ配列を示す図であ
り、フィルタ配列を順次そのまま読み出している。FIG. 7 is a diagram showing a filter array in frame reading according to the second embodiment of the present invention, in which the filter array is sequentially read as it is.

【００６１】図８は、本発明の第２の実施形態としての
フレーム読み出しにおけるフィルタ配列を示す図であ
り、フィルタ配列の奇数列のみを最初に読み出し、次に
偶数列のみを読みだすことによるフィルタ種別ごとの読
み出しである。FIG. 8 is a diagram showing a filter array in frame reading according to the second embodiment of the present invention, in which only the odd-numbered columns of the filter array are read out first and then only the even-numbered columns, giving a readout for each filter type.

【００６２】次に、静止画像処理部２２について補足説
明をすると、内部構成は映像入力処理部３とほぼ同じ
で、画素ずらしデータメモリ部２１から、フィルタ情報
（Ｃｙ，Ｙｅ，Ｍｇ，Ｇｒ）を読み出し、Ｃｙ＋Ｇ、Ｙ
ｅ＋Ｍｇ、Ｃｙ＋Ｍｇ、Ｙｅ＋Ｇの加算処理をした後
は、映像入力処理部３と同じ処理を行う。As a supplementary note on the still image processing unit 22, its internal configuration is almost the same as that of the video input processing unit 3. It reads the filter information (Cy, Ye, Mg, Gr) from the pixel shift data memory unit 21, performs the additions Cy+G, Ye+Mg, Cy+Mg, and Ye+G, and thereafter performs the same processing as the video input processing unit 3.

【００６３】次に、画素ずらしの方法について図１０〜
図１２に示す。Next, the pixel shifting method is shown in FIGS. 10 to 12.

【００６４】図１０は、本発明の第２の実施形態として
の画素ずらしによる映像の入力を示す図である。FIG. 10 is a diagram showing image input by pixel shifting according to the second embodiment of the present invention.

【００６５】図中、レンズ部１１において光軸ｂから光
軸ａに光軸を微小にずらすことでカメラ部１で撮像され
る画像領域が微小にずれる。そこで光軸を微小にずらし
ながらその都度、映像を取り込むことで、カメラ部１の
撮像素子の画素数があたかも増して解像度が向上したの
と同等の効果が得られる（図１０の例では、○印の画像
を取り込んだ例である）。光軸を微小に変化させる機構
は、プリズムレンズの頂角を可変させる機構であり、互
いに平行に配されたガラス板４０間をシリコンオイル４
１で満たし、その周囲をシールしたものであり、レンズ
制御部１２によって両ガラス板間の傾きを変化させ、頂
角を可変にするものである。ここで、レンズ部１１を微
小移動させるため、レンズ制御部１２の制御にはかなり
の制度が要求される。画面合成においては、合成画面間
の境界の不連続性が解消されるので、合成後の不自然さ
はなくなる。特に、解像度が変化しても、連続性が失わ
れないので、高画質の静止画像を入力するには、最適な
方法である。但し、複数の画面の取り込みには時間的な
ずれが生じているため、時間的ずれによる影響を極力押
さえるようにその間は一定の環境にするか、複数の画面
の画面全体の平均的な映像信号の状態を記憶しておき、
画面全体の平均レベルがお互い同一のレベルになるよう
に映像信号を調整することで時間的なずれ問題を解決
し、良好な静止画像を得ることが可能となる。In the figure, slightly shifting the optical axis in the lens unit 11 from optical axis b to optical axis a slightly shifts the image region captured by the camera unit 1. By capturing the video each time the optical axis is slightly shifted, an effect is obtained equivalent to increasing the number of pixels of the image sensor of the camera unit 1 and thereby improving the resolution (the example in FIG. 10 captures the image at the positions marked with circles). The mechanism that slightly changes the optical axis varies the apex angle of a prism lens: the space between glass plates 40 arranged parallel to each other is filled with silicone oil 41 and sealed around its periphery, and the lens control unit 12 changes the inclination between the two glass plates to vary the apex angle. Because the lens unit 11 is moved by minute amounts, considerable precision is required of the control by the lens control unit 12. With respect to screen composition, the discontinuity at the boundaries between the combined screens is eliminated, so the combined result no longer looks unnatural. In particular, continuity is not lost even when the resolution changes, which makes this the most suitable method for inputting a high-quality still image. However, because the multiple screens are captured at different times, the effect of the time lag is suppressed as far as possible either by keeping the environment constant during capture or by storing the average video signal state of each whole screen and adjusting the video signals so that the average levels of the whole screens match one another. This resolves the time lag problem and makes it possible to obtain a good still image.
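
The adjustment of average levels between screens captured at different times might be sketched as follows. The text only says that the video signals are adjusted so that the average levels match; the use of a multiplicative gain toward the mean of all screens is an assumption of this example, as is the function name.

```python
import numpy as np

def match_average_levels(frames):
    """Scale each frame so that all frames share the same average level."""
    frames = [np.asarray(f, dtype=np.float64) for f in frames]
    means = [f.mean() for f in frames]     # stored average level of each screen
    target = sum(means) / len(means)       # common level to be matched
    # Multiplicative gain is an assumption; an offset would also be possible.
    return [f * (target / m) if m > 0 else f for f, m in zip(frames, means)]
```

After this step every frame has the same mean, so slow changes in illumination between captures do not appear as level differences in the combined image.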

【００６６】図１１は、本発明の第２の実施形態として
の平行平板を用いた画素ずらしによる映像の入力を示す
図である。FIG. 11 is a diagram showing image input by pixel shifting using a parallel plate as the second embodiment of the present invention.

【００６７】平行平板（Parallel Plate）４２を斜めに
傾けることによって、光が物質を通過する際の屈折率に
より生じる入射光の角度のずれを利用し、光軸を微小に
ずらしながら映像を取り込むことで、解像度や色再現性
を向上させることができる。Tilting the parallel plate 42 at an angle exploits the angular deviation of incident light caused by the refractive index when light passes through a substance; capturing the video while slightly shifting the optical axis in this way improves the resolution and color reproducibility.

【００６８】図１２は、本発明の第２の実施形態として
の平行平板による光軸のずれについての説明図である。FIG. 12 is an explanatory diagram of the deviation of the optical axis due to the parallel plate as the second embodiment of the present invention.

【００６９】平行平板４２は、光軸と垂直であれば、光
軸のずれは発生しないが、図に示すように、光が平行平
板４２の斜め方向から入射されると、物体固有の屈折率
により入射角に対して屈折が生じる。この屈折自体は物
質が均一で変化がなければ常に一定であるが、物体の厚
みが増すとそれに応じて変化する。更に、光が物体を通
過すると、逆の屈折が生じて物体に入射した時の光軸と
平行な光となる。従って、図１２で示した長さｄが光軸
のずれとなる。長さｄは、下記の式により求めることが
できる。If the parallel plate 42 is perpendicular to the optical axis, no shift of the optical axis occurs. However, as shown in the figure, when light enters the parallel plate 42 obliquely, it is refracted relative to the angle of incidence according to the refractive index specific to the material. This refraction is always constant as long as the material is uniform and unchanging, but it changes accordingly as the thickness of the plate increases. When the light leaves the plate, the opposite refraction occurs, and the light emerges parallel to the optical axis it had on entering the plate. The length d shown in FIG. 12 is therefore the shift of the optical axis. The length d can be obtained from the following equations.

【００７０】ｎ＝ｓｉｎｉ／ｓｉｎθ （ｎ：屈折率）ｘ＝Ｌ・（ｔａｎｉ−ｔａｎθ）ｄ＝ｃｏｓｉ・ｘこれにより、ｄ＝ｃｏｓｉ・Ｌ・（ｓｉｎｉ／ｃｏｓｉ−ｔａｎθ）＝Ｌ・［ｓｉｎｉ−ｃｏｓｉ・ｔａｎ｛ａｒｃｓｉｎ（ｓｉｎｉ／ｎ）｝］と求められる。$n = \sin i / \sin\theta$ (n: refractive index), $x = L\,(\tan i - \tan\theta)$, $d = \cos i \cdot x$. From these, $d = \cos i \cdot L\,(\sin i / \cos i - \tan\theta) = L\,[\sin i - \cos i \cdot \tan\{\arcsin(\sin i / n)\}]$ is obtained.
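
The formula for d can be evaluated numerically as in the sketch below. Since FIG. 12 is not reproduced here, the reading of i as the angle of incidence, θ as the angle of refraction, and L as the plate thickness is an assumption inferred from the equations; the function names, the bisection search, and the sample values (a 1 mm plate, n = 1.5, a 0.005 mm half-pixel shift) are likewise illustrative.

```python
import math

def axis_shift(thickness, incidence_deg, n):
    """d = L * [sin i - cos i * tan{arcsin(sin i / n)}]."""
    i = math.radians(incidence_deg)
    theta = math.asin(math.sin(i) / n)   # from n = sin i / sin(theta)
    return thickness * (math.sin(i) - math.cos(i) * math.tan(theta))

def tilt_for_shift(thickness, n, target_shift):
    """Find the tilt angle in degrees that gives target_shift, by bisection.

    Assumes the target lies between the shifts at 0 and 89 degrees, over
    which d increases monotonically with the angle of incidence.
    """
    lo, hi = 0.0, 89.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if axis_shift(thickness, mid, n) < target_shift:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

angle = tilt_for_shift(1.0, 1.5, 0.005)
print(angle, axis_shift(1.0, angle, 1.5))
```

Solving for the tilt angle in this way shows how a required shift, such as one pixel or half a pixel, translates into a plate angle for a given thickness and refractive index.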

【００７１】この長さｄが、撮像素子の画素間の長さと
同じであれば１画素ずらし、１／２であれば半画素ずら
しての撮像が可能になる。その画素ずらしによる撮像を
イメージした図を図１３に示す。If the length d is the same as the length between the pixels of the image pickup device, the image can be shifted by one pixel, and if it is ½, the image can be shifted by a half pixel. FIG. 13 is a diagram showing an image of the image pickup by the pixel shift.

【００７２】図１３は、本発明の第２の実施形態として
の画素ずらしによる撮像についての説明図である。FIG. 13 is an explanatory diagram of image pickup by pixel shifting according to the second embodiment of the present invention.

【００７３】ここでａ１１は、ホームポジションであ
り、ｂ１１とｃ１１とｄ１１とは、ａ１１のホームポジ
ションから平行平板４２による画素ずらしにより、１画
素ずらした場合である。つまり撮像している対象物は同
一であって、ａ１１はＣｙ，ｂ１１はＹｅ，ｃ１１はＭ
ｇ，ｄ１１はＧｒの各フィルタイメージで撮像してい
る。Here, a11 is the home position, and b11, c11, and d11 are positions shifted by one pixel from the home position a11 by pixel shifting with the parallel plate 42. That is, the object being imaged is the same, and a11 is captured through the Cy filter, b11 through Ye, c11 through Mg, and d11 through Gr.

【００７４】半画素ずらしの場合は、ホームポジション
から垂直方向はそのままで水平右方向へ半画素ずらした
位置を新たなホームポジションａ１２とし、１画素ずら
したのが、ｂ１２、ｃ１２、ｄ１２である。同様に、ａ
２１を新たなホームポジションとした半画素ずらしによ
り、ｂ２１、ｃ２１、ｄ２１、ａ２２を新たなホームポ
ジションとした半画素ずらしにより、ｂ２２、ｃ２２、
ｄ２２と順次撮像することで高画質な静止画像を生成す
ることが可能となる。In the case of half-pixel shifting, the position shifted half a pixel horizontally to the right of the home position, with the vertical position unchanged, is taken as a new home position a12, and b12, c12, and d12 are the positions shifted by one pixel from it. Similarly, half-pixel shifting with a21 as a new home position gives b21, c21, and d21, and half-pixel shifting with a22 as a new home position gives b22, c22, and d22. Capturing these in sequence makes it possible to generate a high-quality still image.

【００７５】尚、本発明は、複数の機器から構成される
システムに適用しても、本実施形態のように１つの機器
からなる装置に適用しても良い。また、本発明はシステ
ム或は装置にプログラムを供給することによって実施さ
れる場合にも適用できることは言うまでもない。この場
合、本発明に係るプログラムを格納した記憶媒体が本発
明を構成することになる。そして、該記憶媒体からその
プログラムをシステム或は装置に読み出すことによっ
て、そのシステム或は装置が、予め定められた仕方で動
作する。The present invention may be applied to a system composed of a plurality of devices or an apparatus composed of a single device as in this embodiment. Further, it goes without saying that the present invention can be applied to the case where it is implemented by supplying a program to a system or an apparatus. In this case, the storage medium storing the program according to the present invention constitutes the present invention. Then, by reading the program from the storage medium to the system or device, the system or device operates in a predetermined manner.

【００７６】＜実施形態の効果＞（１）予め静止画像として取り込んでおいた画像データ
又は、モニタ１６を見ながら操作部７によりリアルタイ
ムに取り込んだ静止画像の文字領域について、画像判定
処理により文字領域を抽出し、文字認識を行いテキスト
データに変換し、改めてその静止画像と関連付けてその
文字情報を登録／蓄積することが可能になる。これによ
り処理するデータ量、記憶容量、処理時間を大幅に削減
することが可能となる。（２）入力文字が小さくて認識が不可能な場合に、ズー
ムや画素ずらしによって解像度を向上させ、かつ、カメ
ラ駆動部２によって撮像領域へのカメラ部１の制御を木
目細かく行なうことが可能なので、文字認識の精度を必
要とする部分のみを精度の向上が可能となる。<Effects of the Embodiments> (1) For image data captured in advance as a still image, or for a still image captured in real time with the operation unit 7 while viewing the monitor 16, character regions can be extracted by the image determination processing, recognized and converted into text data, and the character information can then be registered and stored in association with that still image. This makes it possible to greatly reduce the amount of data to be processed, the storage capacity, and the processing time. (2) When input characters are too small to be recognized, the resolution is increased by zooming or pixel shifting, and the camera drive unit 2 allows fine control of the camera unit 1 toward the imaging region, so accuracy can be improved only in the portions that need it for character recognition.

【００７７】[0077]

【発明の効果】以上説明したように、本発明によれば、
取り込んだ画像に含まれる文字情報をテキストデータに
変換することが可能な映像入力装置の提供が実現する。As described above, the present invention makes it possible to provide a video input device capable of converting character information contained in a captured image into text data.

【００７８】[0078]

[Brief description of drawings]

FIG. 1 is a block configuration diagram of a high-resolution video input device using screen synthesis according to a first embodiment of the present invention.

FIG. 2 is a block configuration diagram of a high-resolution video input device using pixel shifting according to a second embodiment of the present invention.

FIG. 3 is an internal block diagram of the camera unit 1 according to the first embodiment of the present invention.

FIG. 4 is an internal block diagram of the video input processing unit 3 according to the first embodiment of the present invention.

FIG. 5 is a diagram showing the filter array on the photoelectric conversion element 204 according to the first embodiment of the present invention.

FIG. 6 is a diagram showing the filter array in field readout according to the first embodiment of the present invention.

FIG. 7 is a diagram showing the filter array in frame readout according to the second embodiment of the present invention.

FIG. 8 is a diagram showing the filter array in frame readout according to the second embodiment of the present invention.

FIG. 9 is a diagram showing image input by screen division according to the first embodiment of the present invention.

FIG. 10 is a diagram showing image input by pixel shifting according to the second embodiment of the present invention.

FIG. 11 is a diagram showing image input by pixel shifting using a parallel plate according to the second embodiment of the present invention.

FIG. 12 is an explanatory diagram of the optical axis shift caused by the parallel plate according to the second embodiment of the present invention.

FIG. 13 is an explanatory diagram of imaging by pixel shifting according to the second embodiment of the present invention.

FIG. 14A is a flowchart showing the character recognition process according to the first embodiment of the present invention.

FIG. 14B is a flowchart showing the character recognition process according to the first embodiment of the present invention.

FIG. 14C is a flowchart showing the character recognition process according to the first embodiment of the present invention.

FIG. 14D is a flowchart showing the character recognition process according to the first embodiment of the present invention.

FIG. 15 is a diagram showing the scaling process of the imaging region according to the first embodiment of the present invention.

FIG. 16 is a block configuration diagram of a video input device as a conventional example.

FIG. 17 is a flowchart showing still image input processing as a conventional example.

[Explanation of symbols]

1 Camera unit
2 Drive unit
3 Video input processing unit
4 Video memory unit
5 Overall control unit
6 Storage unit
7 Operation unit
8 Monitor
11 Lens unit
12 Lens control unit
13 Graphic synthesis unit
14 Graphic memory unit
15 Video output processing unit
16 Monitor
17 Video synthesis processing unit
18 Video synthesis memory unit
20 Pixel-shift data synthesis processing unit
21 Pixel-shift data memory unit
22 Still image processing unit
30 Image determination unit
31 Character recognition unit
40 Plate glass
41 Silicone oil
42 Parallel plate
101 Camera unit
102 Drive unit
103 Video input processing unit
104 Video memory unit
105 Overall control unit
106 Storage unit
107 Operation unit
108 Monitor
201 Aperture/shutter
202 Optical low-pass filter
203 CCD driver
204 Photoelectric conversion element
205 Automatic gain control unit (AGC)
206 Aperture photometry circuit
207 Iris drive unit
208 Parallel plate
209 Frame (field) readout control unit
210 Low-pass filter drive unit
211 Parallel plate drive unit
301 Delay circuit
302 Horizontal/vertical aperture correction unit
303 Gamma correction unit
304 Synchronous detection unit
305 RGB matrix conversion unit
306 White balance adjustment unit
307 Gamma correction processing unit
308 Color difference matrix conversion unit

Claims

[Claims]

1. A video input device that has a camera unit capable of moving its imaging region and that captures an image picked up by the camera unit as a still image, the video input device comprising: image determination means for determining, from the image characteristics of the still image, a region having character characteristics; character region dividing means for dividing the region having the character characteristics into character regions; character recognition means for recognizing characters in each character region divided by the character region dividing means; and code conversion means for converting the characters recognized by the character recognition means into codes.

2. The video input device according to claim 1, further comprising extraction region scaling means for extracting a region in which the character recognition means cannot recognize characters and increasing the scaling ratio until characters in the extracted region can be recognized, wherein the character recognition means performs character recognition anew on the scaled region.

3. The video input device according to claim 1, further comprising resolution increasing means for extracting a region in which the character recognition means cannot recognize characters and increasing the resolution until characters in the extracted region can be recognized, wherein the character recognition means performs character recognition anew on the region whose resolution has been increased.

4. The video input device according to claim 1, comprising the extraction region scaling means and the resolution increasing means, wherein the character recognition means performs character recognition anew on the region after the scaling and/or resolution-increasing processing.

5. The video input device according to claim 1, further comprising first synthesizing means for synthesizing the characters coded by the code conversion means in association with the still image or the character region.

6. The video input device according to claim 1, further comprising second synthesizing means for synthesizing a region in which the character recognition means cannot recognize characters with the still image or the character region.

7. The video input device according to claim 3, wherein pixel shifting is used as the resolution increasing means.
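As an illustration of the pixel shifting named in claim 7, the sketch below merges several exposures of the same scene into one frame with a finer sampling grid. It rests on assumptions not stated in the claims: four exposures are taken, each offset by half a pixel pitch horizontally, vertically, or both, and the function name `interleave_pixel_shifted` and its arguments are hypothetical.

```python
def interleave_pixel_shifted(f00, f10, f01, f11):
    """Merge four H x W frames into one 2H x 2W frame.

    Assumed inputs (lists of rows):
      f00: unshifted exposure
      f10: exposure shifted by half a pixel horizontally
      f01: exposure shifted by half a pixel vertically
      f11: exposure shifted by half a pixel in both directions
    """
    height, width = len(f00), len(f00[0])
    out = [[0] * (2 * width) for _ in range(2 * height)]
    for y in range(height):
        for x in range(width):
            # Each source pixel lands on its own site of the finer grid.
            out[2 * y][2 * x] = f00[y][x]
            out[2 * y][2 * x + 1] = f10[y][x]
            out[2 * y + 1][2 * x] = f01[y][x]
            out[2 * y + 1][2 * x + 1] = f11[y][x]
    return out


f00 = [[1, 2], [3, 4]]
f10 = [[10, 20], [30, 40]]
f01 = [[100, 200], [300, 400]]
f11 = [[1000, 2000], [3000, 4000]]
merged = interleave_pixel_shifted(f00, f10, f01, f11)
for row in merged:
    print(row)
```

Simple interleaving is used here only because it makes the doubling of the sampling density in each direction easy to see; the synthesis actually performed by the device may differ.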