JP2002358529A - Method and device for processing image

Info

Publication number: JP2002358529A
Application number: JP2001167016A
Authority: JP
Inventors: Daisuke Abe; 大輔阿部; Nobuhiro Tsunashima; 宣浩綱島; Morihito Shiobara; 守人塩原
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2001-06-01
Filing date: 2001-06-01
Publication date: 2002-12-13
Anticipated expiration: 2021-06-01
Also published as: JP4509423B2

Abstract

PROBLEM TO BE SOLVED: To provide an image processing method and apparatus that read character and symbol information on an object by resolution-enhancing processing based on a plurality of images extracted from video of the moving object. SOLUTION: Image regions p0 to p3 containing character or symbol information relating to the object are searched for and extracted from a plurality of time-series images in video of the moving object. The motion of the object is assumed to be, for example, a uniform motion L, and alignment is performed by correcting the positions of the image regions p0 to p3 so that they are equally spaced along the motion L at times t0 to t3. The aligned image regions p0 to p3 are superimposed and resolution enhancement is performed, and the characters and symbols contained in the resulting high-resolution image are read. Using a plurality of aligned image regions of the same object improves the accuracy of the resolution enhancement.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an image processing method and an image processing apparatus that enable characters and symbols in video to be read automatically, and more particularly to image processing that can read characters and symbols at high resolution from a plurality of extracted images.

[0002]

2. Description of the Related Art

Conventionally, at schools and companies for example, a camera is installed at the entrance of a building to read the characters on name tags worn by people entering and leaving. Systems have been installed that manage the entry and exit of people to and from the building based on the name-tag characters read in this way.

[0003] At manufacturing sites such as product factories, when shipments are sorted while travelling on a belt conveyor, logistics monitoring systems are used in which a camera installed on site captures video of the packages moving on the conveyor, and the characters on the sticker affixed to each package are read from that video so that each moving package is automatically sorted by destination.

[0004] A technique for reading characters and symbols that move in video is explained using, for example, a logistics monitoring system such as the one shown in FIG. 1, which comprises a belt conveyor 2, a lighting device 3, a camera 4, a recognition device 5, and a monitor. In a production factory or the like, the camera 4 installed beside the belt conveyor 2 captures video of a moving shipment 1, the recognition device 5 reads from that video the characters on the label 7 affixed to the shipment 1, and the shipment 1 is sorted by destination or the like. Logistics monitoring of the shipment 1 is carried out in this way.

[0005] Characters and symbols on the label 7 in the video have been read using, for example, an image processing method called template matching (Institute of Television Engineers of Japan (ed.), "Image Engineering - Electronics of Images -", pp. 132-133). This technique compares the part of the object in which characters are written against character templates held in advance as a "dictionary", reads the character as the dictionary template it most closely resembles, and outputs the result.
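The dictionary lookup described in [0005] can be sketched as follows. This is a minimal illustration, not the method of the cited reference: the 3x3 binary templates, the pixel-agreement similarity measure, and the names `similarity`, `read_character`, and `DICTIONARY` are all assumptions made for the example.

```python
def similarity(patch, template):
    """Fraction of pixels on which two equal-sized binary images agree."""
    total = 0
    agree = 0
    for row_p, row_t in zip(patch, template):
        for a, b in zip(row_p, row_t):
            total += 1
            agree += (a == b)
    return agree / total


def read_character(patch, dictionary):
    """Return the label of the dictionary template most similar to the patch."""
    return max(dictionary, key=lambda label: similarity(patch, dictionary[label]))


# Hypothetical 3x3 "dictionary" of character templates (1 = ink, 0 = background).
DICTIONARY = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}

# An "L" with one pixel lost, as might happen when the character is imaged small.
noisy_L = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]
print(read_character(noisy_L, DICTIONARY))  # prints L
```

With so few pixels per character, losing one or two more pixels makes the templates hard to tell apart, which is the failure mode that [0006] goes on to describe.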

[0006] With this reading method, because the camera 4 is some distance from the shipment 1, characters written small on the label 7 are crushed in the image and come to resemble other characters, which makes character recognition difficult. As a result, only characters that appear reasonably large can be handled. To read characters and symbols on the label 7 with this technique, they must therefore be captured at a large size, which limits the field of view of the camera 4. In a logistics monitoring system, however, the shipments 1 moving on the belt conveyor 2 come in various sizes, and the position of the label 7 differs with the type of shipment 1. For such reasons, the field of view of the camera 4 needs to be as wide as possible so that the label 7 can be detected reliably.

[0007] To monitor a wider area with this technique, it has been disclosed, for example in Japanese Unexamined Patent Application Publication No. H11-284981, that the wide area be divided into a plurality of regions and a plurality of cameras be used, with one region assigned to each camera. With this approach, however, not only does the number of cameras increase, but peripheral equipment such as lighting devices and processing devices increases with it, which raises problems of cost and of securing installation space.

[0008]

PROBLEMS TO BE SOLVED BY THE INVENTION

As described above, in the conventional reading methods there is a trade-off between the number of cameras and the range over which characters and symbols can be read. When a wide area is captured with a small number of cameras, the resolution of the target characters and symbols is low and the information on the characters and patterns is insufficient, so that parts of them are crushed or missing and they cannot be read.

[0009] To solve this problem, resolution enhancement techniques that raise the resolution of an image through image processing have been developed (for example, Aoki et al., "Super Resolution Processing from Multiple Digital Images", 2nd Image Sensing Symposium, pp. 65-70). These techniques use a plurality of images from the video and combine the pixel information of the individual images to generate a sharp image of high resolution.

[0010] The principle of the resolution enhancement processing is as follows. Resolution enhancement means modelling the image formation process and solving its inverse problem, thereby estimating the high-definition image that gave rise to the formed images. Specifically, a high-resolution image is generated by, for example, the following method. Let F(X, Y) be the ideal high-resolution image, and let F_k(X, Y) be the high-resolution image in the k-th frame, obtained from F(X, Y) by a coordinate transformation. That is, if the coordinate transformation is

X = X_k(x, y), Y = Y_k(x, y) ... (1)

then the high-resolution image in the k-th frame is

F_k(X, Y) = F(X_k(x, y), Y_k(x, y)) ... (2)

[0011] Letting the k-th low-resolution observed image be G_k(i, j), we have

[0012]

(Equation 1)

[0013] Here, w(i, j; x, y) is a window function (point spread function, PSF) determined by the spatial position and aperture characteristics of each CCD pixel. For integer values (i, j), it is assumed that

w(i, j; x, y) = 1 if i - 0.5 < x < i + 0.5 and j - 0.5 < y < j + 0.5
w(i, j; x, y) = 0 otherwise ... (4)

This means that each light-receiving element is square and that the elements cover the image plane without gaps. The following discussion also holds when w(i, j; x, y) is a function of another form. From equations (2) and (3),

[0014]

(Equation 2)

[0015] Here, x_k(X, Y) and y_k(X, Y) are the inverse of the coordinate transformation of equation (1), and ∂(x, y)/∂(X, Y) is the Jacobian of the change of variables (a scale factor expressing by how much the area is magnified at each location). Setting

w_k(i, j; X, Y) = w(i, j; x_k(X, Y), y_k(X, Y)) · ∂(x, y)/∂(X, Y) ... (6)

we obtain

[0016]

(Equation 3)

[0017] This equation, (7), expresses the relationship between the image G_k(i, j) observed as the k-th frame and the ideal high-resolution image F(X, Y). The problem to be solved is to find, given the observed images G_k(i, j), the ideal high-resolution image F(X, Y) underlying them. Equation (7) itself holds even when F(X, Y) has arbitrarily high resolution (X and Y being arbitrary real numbers), but if the resolution is too high there are too many unknowns and the solution is not uniquely determined. It is therefore assumed that F(X, Y) is expressed, from a discrete image H(I, J) that has values only on the integer lattice and using the window function w(I, J; X, Y) defined above, as

[0018]

(Equation 4)

[0019] F(X, Y) then becomes a step function, and the number of unknowns is reduced to the number of integer lattice points. Other discrete representations of the image are possible, but the argument holds for any window function.
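The equation images referred to as Equations 1 to 4 are not reproduced in this text. The following is a hedged reconstruction inferred only from the surrounding definitions, that is, equations (1), (2), (4) and (6) and the statements about the Jacobian and the window function. The assignment of the numbers (3), (5), (7) and (8) is inferred from the way the text cites them, and the published equations may differ in notation.

```latex
\begin{align}
G_k(i,j) &= \iint w(i,j;x,y)\,
            F\bigl(X_k(x,y),\,Y_k(x,y)\bigr)\,dx\,dy \tag{3}\\
G_k(i,j) &= \iint w\bigl(i,j;\,x_k(X,Y),\,y_k(X,Y)\bigr)\,F(X,Y)\,
            \frac{\partial(x,y)}{\partial(X,Y)}\,dX\,dY \tag{5}\\
G_k(i,j) &= \iint w_k(i,j;X,Y)\,F(X,Y)\,dX\,dY \tag{7}\\
F(X,Y)   &= \sum_{I}\sum_{J} H(I,J)\,w(I,J;X,Y) \tag{8}
\end{align}
```

In this reading, (3) already has the frame image of equation (2) substituted into it, (5) follows from the change of variables from (x, y) to (X, Y), (7) follows from (5) by the definition (6), and (8) is the step-function representation that reduces the unknowns to the lattice values H(I, J).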

[0020]

(Equation 5)

[0021] It then suffices to find the high-resolution image H(I, J) that best satisfies equation (11) in the least-squares sense. That is, one finds the H(I, J) that minimizes the following evaluation function:

[0022]

(Equation 6)

[0023] The processing described above generates the resolution-enhanced image. However, because such a technique generates the enhanced image by establishing correspondences between the pixels of the individual images and superimposing their pixel values, the images must be aligned with an accuracy of one pixel or better; if even one of the images is misaligned, a sharp high-resolution image cannot be obtained. Alignment of the images is therefore important. When the target image itself is of low resolution, however, the target occupies only a small part of the image, and precise alignment is difficult.
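The least-squares step can be illustrated with a one-dimensional toy problem. The equation images for (11) and (12) are not available here, so the sketch assumes the usual discrete form: each observed low-resolution pixel is a weighted sum of high-resolution samples H(I), and the evaluation function is the sum of squared differences between observed and predicted pixels. The weights, the steepest-descent solver, and all names (`evaluation`, `minimise`, `FRAMES`) are assumptions made for illustration, not the procedure of the cited paper.

```python
def evaluation(H, frames):
    """Sum of squared differences between observed and predicted low-res pixels."""
    total = 0.0
    for weights, observed in frames:
        predicted = sum(w * h for w, h in zip(weights, H))
        total += (observed - predicted) ** 2
    return total


def minimise(frames, size, step=0.5, iterations=300):
    """Steepest descent on the evaluation function, starting from H = 0."""
    H = [0.0] * size
    for _ in range(iterations):
        grad = [0.0] * size
        for weights, observed in frames:
            residual = observed - sum(w * h for w, h in zip(weights, H))
            for I, w in enumerate(weights):
                grad[I] -= 2.0 * w * residual
        H = [h - step * g for h, g in zip(H, grad)]
    return H


# Each entry is (weights over the 4 high-res samples, observed low-res value).
# Every low-res pixel averages two neighbouring high-res samples; the third
# observation comes from a frame shifted by one high-res sample.
FRAMES = [
    ([0.5, 0.5, 0.0, 0.0], 15.0),
    ([0.0, 0.0, 0.5, 0.5], 35.0),
    ([0.0, 0.5, 0.5, 0.0], 25.0),
]

H = minimise(FRAMES, size=4)
print(evaluation([0.0] * 4, FRAMES), evaluation(H, FRAMES))
```

With three observations and four unknowns this toy system does not determine H uniquely, which mirrors the remark in [0017] about too many unknowns. The weights also encode the frame shifts, so a wrong shift produces wrong weights and a wrong H; that is why the text stresses alignment to one pixel or better.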

[0024] It is therefore an object of the present invention to provide an image processing apparatus and an image processing method that constrain the positions at which the object to be read is cut out of the individual images in accordance with the motion of the object moving in the video, thereby enabling precise alignment across all of the images and further improving the resolution enhancement.

[0025]

MEANS FOR SOLVING THE PROBLEMS

To solve the above problems, the present invention provides an image processing method for automatically reading target information relating to a moving target object from video of that object, the method including: a search step of searching, in each of a plurality of time-series images in the video, for an image region of a predetermined extent that contains the target information; an alignment step of correcting the position of each image region found to a position consistent with the movement of the target object, thereby aligning the image regions; and a resolution-enhancing image processing step of superimposing the aligned image regions and performing high-resolution processing.

[0026] The target information includes characters or symbols, and in the search step the image region is searched for in each of the images based on layout information relating to the characters or symbols. Furthermore, in the alignment step the movement of the target object is taken to be uniform linear motion, and alignment is performed by correcting the position of each image region to a position on that uniform linear motion; when the position of an image region deviates from the position of the moving target object by more than a predetermined value, that image region is excluded from the alignment processing.

[0027] The present invention also provides an image processing apparatus for automatically reading target information relating to a moving target object from video of that object, the apparatus comprising: search means for searching, in each of a plurality of time-series images in the video, for an image region of a predetermined extent that contains the target information; alignment means for correcting the position of each image region found to a position consistent with the movement of the target object, thereby aligning the image regions; and resolution-enhancing image processing means for superimposing the aligned image regions and performing high-resolution processing.

[0028]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Embodiments in which the present invention is applied to a character recognition processing apparatus using the resolution enhancement technique are described below.

[First Embodiment]

As shown in FIG. 1, when a fixed camera 4 captures a package 1 carried along on the belt conveyor 2, the image of the package 1 displayed on the monitor 6 moves across the screen as the belt conveyor moves. From each of a plurality of images containing the moving package 1, a character-like region is cut out, for example using information on the arrangement of the character strings on the target. Because the positions of the cut-out character regions contain some error due to vibration of the package 1 and the like, the errors between images accumulate and a large misalignment arises overall.

[0029] In this embodiment, therefore, the motion of the target moving in the video is modelled over a short time interval as some definite motion. For example, the motion of the package 1 on the belt conveyor 2 is taken to be uniform linear motion over a short interval, and the positions at which the label 7 affixed to the package 1 (the target to be read) is cut out of the individual images are constrained so that they follow uniform linear motion. This makes precise alignment possible across all of the images. Such alignment allows the resolution enhancement to be carried out well and raises the resolution of the characters and symbols on the target, so that the characters and symbols written on the label 7 can be read easily.

[0030] FIG. 2 shows the block configuration of a character recognition processing apparatus that includes the alignment processing according to the first embodiment. The character recognition processing apparatus 10 is supplied with an analog video input signal obtained from an image input device such as the camera 4, or with a digital video input signal such as the output of an image recording device. The apparatus has an image control unit 12, which receives the signal after conversion to a digital signal by the A/D conversion unit 11 when the input is an analog video signal, or receives the digital signal as it is when the input is a digital video signal such as the output of an image recording device. The apparatus further comprises: a first image storage unit 13 that stores the images input to the image control unit 12; a second image storage unit 14 that stores the results of processing by the target search unit 15 and the alignment processing unit 16; the target search unit 15, which searches for the moving target based on a given layout; the alignment processing unit 16, which aligns the positions of the target in the images at the respective times; a resolution enhancement processing unit 17 that performs resolution enhancement from the result of aligning the target positions at the respective times; and a character recognition unit 18 that reads character strings and symbols from the resolution-enhanced image.

[0031] The operation of the character recognition processing apparatus 10 according to the first embodiment is described below in detail, block by block.

- A/D conversion of the input image: The analog video obtained from an image input device such as the camera 4 is digitized by the A/D conversion unit 11 and output to the image control unit 12 in the following stage. When the input video is digital, the A/D conversion unit 11 is omitted from the configuration.

- Control of the A/D-converted image: The image control unit 12 controls the digital video signal formed by the output images of the A/D conversion unit 11 or of an image recording device or the like, stores the images in the first image storage unit 13, and sends them to the target search unit 15 in the following stage.

- Image storage: The first image storage unit 13 stores the image data of the digital video signal input to the image control unit 12. Because it has enough memory capacity to hold image data for a certain length of time, it can continue to store incoming image data while the target search unit 15 in the following stage is processing.

- Storage of processing results: The second image storage unit 14 stores the results of processing by the target search unit 15 and the alignment processing unit 16. Although the second image storage unit 14 is shown separately from the first image storage unit 13, the two may be implemented as a single storage unit; they are shown separately for convenience of explanation.

- Search for the recognition target: The target search unit 15 determines whether the images of the video signal sent from the image control unit 12 contain the target.

[0032] First, it is checked whether the target is present in an image that was input from the image control unit 12 and captured at a certain time t. Specifically, binarization and labeling are applied to the input image to detect character-like parts, as in FIG. 3(a), where the character-like parts are drawn as circles or ellipses. Next, layout information indicating the arrangement of the target's character strings and the like is given in advance, for example for a name tag as shown in FIG. 3(b). The layout information need not specify which characters or symbols are present; it only needs to be detailed enough for their arrangement to be determined.
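Binarization followed by labeling can be sketched in a few lines. The patent does not state the threshold, the connectivity, or the output format, so the fixed threshold of 128, the dark-ink-on-light-background convention, the 4-connectivity, and the bounding-box output below are all assumptions made for the example.

```python
from collections import deque


def binarise(image, threshold=128):
    """Mark dark pixels (assumed to be ink) as 1 and everything else as 0."""
    return [[1 if value < threshold else 0 for value in row] for row in image]


def label_components(binary):
    """Return one bounding box (top, left, bottom, right) per 4-connected blob."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited ink pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Tiny grey-level image with two dark blobs on a white background.
IMAGE = [
    [255, 255, 255, 255, 255, 255],
    [255,  20,  30, 255,  10, 255],
    [255,  25, 255, 255,  15, 255],
    [255, 255, 255, 255, 255, 255],
]
parts = label_components(binarise(IMAGE))
print(parts)  # prints [(1, 1, 2, 2), (1, 4, 2, 4)]
```

Each bounding box plays the role of one of the circles or ellipses in FIG. 3(a): it records where a character-like part is without saying which character it is.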

[0033] A region of predetermined extent is then sought in which the arrangement of the character-like parts resembles the layout information. If the similarity to the layout information is at or below a certain threshold, a signal indicating "no target" is sent to the image control unit 12; the image control unit 12 then outputs the image for the next time, stored in the first image storage unit 13, to the target search unit 15, which searches that image.

[0034] If the similarity of that image to the layout information is at or above the threshold, the arrangement is judged to be similar and, as shown in FIG. 3(a), the coordinates (x1, y1), (x2, y2), (x3, y3), (x4, y4) of the four corners of the region P of predetermined extent at that next time, together with the image data of the region P, are output to and stored in the second image storage unit 14. At the same time, a signal indicating "target present" is sent to the image control unit 12.
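The patent does not define how similarity to the layout is computed. The sketch below uses one hypothetical measure, the fraction of layout slots that contain the centre of a detected part, purely to make the thresholding logic concrete. The slot representation, the candidate origins, and the function names are assumptions. The text says "at or below" for rejection and "at or above" for acceptance, which leaves the exact-threshold case ambiguous; this sketch treats it as a match.

```python
def centre(box):
    """Centre (row, col) of a bounding box given as (top, left, bottom, right)."""
    top, left, bottom, right = box
    return ((top + bottom) / 2.0, (left + right) / 2.0)


def layout_similarity(parts, layout, origin):
    """Fraction of layout slots, placed at origin, that contain a part centre."""
    oy, ox = origin
    hits = 0
    for top, left, bottom, right in layout:
        for part in parts:
            cy, cx = centre(part)
            if top + oy <= cy <= bottom + oy and left + ox <= cx <= right + ox:
                hits += 1
                break
    return hits / len(layout)


def find_target(parts, layout, candidate_origins, threshold):
    """Return (origin, score) for the best placement, or None for 'no target'."""
    best = max(candidate_origins, key=lambda o: layout_similarity(parts, layout, o))
    score = layout_similarity(parts, layout, best)
    return (best, score) if score >= threshold else None


# Hypothetical name-tag layout: two text lines, as boxes relative to the tag corner.
LAYOUT = [(0, 0, 2, 6), (4, 0, 6, 10)]
# Character-like parts found in the frame: two on the tag, one unrelated blob.
PARTS = [(10, 20, 12, 24), (14, 21, 16, 28), (40, 40, 41, 41)]
ORIGINS = [(0, 0), (10, 20)]

print(find_target(PARTS, LAYOUT, ORIGINS, threshold=0.8))  # prints ((10, 20), 1.0)
```

A return value of `None` corresponds to the "no target" signal, after which the next stored frame would be searched.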

[0035] In this way, if the times t stored in the first image storage unit 13 run from time t0 to time t(n-1), the image control unit 12 inputs the image for each time to the target search unit 15 in sequence, and the target search unit 15 searches the images in the order in which they arrive. The number n of input images is the number used by the resolution enhancement processing unit 17.

[0036] Next, the position of the target is tracked over times t1 to t(n-1). Specifically, tracking is achieved by, for example, template matching in which the region p0 detected at time t0, with corner coordinates (x10, y10), (x20, y20), (x30, y30), (x40, y40), is used as the template. The coordinates (x1i, y1i), (x2i, y2i), (x3i, y3i), (x4i, y4i) of the four corners of the target region pi obtained by tracking at each time ti, together with the image data of the region pi, are output to and stored in the second image storage unit 14. Here, if the number of images processed is n, then i = 0 to (n - 1).
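Tracking by template matching can be sketched as an exhaustive search for the window that differs least from the region cut out at time t0. The sum-of-squared-differences score, the full-frame search, and the corner ordering (upper left, upper right, lower right, lower left) are assumptions; the patent only states that the region p0 is used as the template.

```python
def ssd(image, template, top, left):
    """Sum of squared differences between the template and one image window."""
    total = 0
    for dy, row in enumerate(template):
        for dx, value in enumerate(row):
            diff = image[top + dy][left + dx] - value
            total += diff * diff
    return total


def track(image, template):
    """Return the four (x, y) corners of the window that best matches the template."""
    th, tw = len(template), len(template[0])
    best = None
    for top in range(len(image) - th + 1):
        for left in range(len(image[0]) - tw + 1):
            score = ssd(image, template, top, left)
            if best is None or score < best[0]:
                best = (score, top, left)
    _, top, left = best
    return [(left, top), (left + tw - 1, top),
            (left + tw - 1, top + th - 1), (left, top + th - 1)]


# Region p0 cut out of the frame at t0 (hypothetical grey levels).
TEMPLATE = [[9, 5],
            [3, 7]]
# Frame at t1: the same pattern moved two pixels right, with one noisy pixel.
FRAME_T1 = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 9, 5, 0],
    [0, 0, 0, 3, 6, 0],
    [0, 0, 0, 0, 0, 0],
]
print(track(FRAME_T1, TEMPLATE))  # prints [(3, 1), (4, 1), (4, 2), (3, 2)]
```

The result is accurate only to a whole pixel, and noise can move the best match, which is why the positions found this way still need the correction described later.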

[0037] FIG. 4 shows a concrete example, for n = 4, of the target regions obtained by the search processing; the target regions are drawn as the rectangular frames p0 to p3. Repeating the above processing for the n images to be processed yields the coordinates of the four corners of the target region at each time. Based on these corner coordinates, the alignment processing unit 16 in the following stage aligns the target regions of the images.

- Alignment of the images: When processing in the target search unit 15 is complete, the image for each time stored in the second image storage unit 14 and the coordinates of the four corners of the target region in each image are input in sequence to the alignment processing unit 16, which aligns the target across the n input images from time t0 to time t(n-1).

[0038] First, one of the n images is taken and an image enlarged to a certain size (hereinafter the reference image) is created from it. Projective transformations are then found that align the remaining (n - 1) images with that image. The enlarged size is the size of the image generated by the resolution enhancement. In general, when four point correspondences are given between two images, the projective transformation relating the two images is determined. Let the four vertices in the m-th input image (m < n) be (xm1, ym1), (xm2, ym2), (xm3, ym3), (xm4, ym4), and let the corresponding coordinates in the reference image be (xM1, yM1), (xM2, yM2), (xM3, yM3), (xM4, yM4). The projective transformation that maps the reference image to the m-th input image is then

xmi = (a1·xMi + a2·yMi + a3) / (a7·xMi + a8·yMi + a9)
ymi = (a4·xMi + a5·yMi + a6) / (a7·xMi + a8·yMi + a9)
(i = 1, 2, 3, 4, for the four vertices) ... (13)

[0039] The coordinates of the four corners of each image obtained by the target search unit 15 are substituted into equation (13). This gives eight linear equations in the nine unknown transformation coefficients a1 to a9; however, equation (13) is scale-invariant, being unchanged when a1 to a9 are all multiplied by a constant, so imposing a constraint such as a9 = 1 determines the solution uniquely. This yields the projective transformation that maps the reference image to the m-th input image. Applying the inverse of this transformation to the m-th input image brings it into alignment with the reference image.
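Setting a9 = 1 and multiplying equation (13) through by its denominator gives two linear equations per corner, so four corners give an 8x8 system for a1 to a8. The sketch below solves that system with plain Gauss-Jordan elimination. The function names and the sample corner coordinates are assumptions made for the example; in practice a library routine would normally be used.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_projective(reference_pts, input_pts):
    """Coefficients [a1..a9] of equation (13), with a9 fixed to 1."""
    A, b = [], []
    for (X, Y), (x, y) in zip(reference_pts, input_pts):
        # x * (a7 X + a8 Y + 1) = a1 X + a2 Y + a3, and likewise for y.
        A.append([X, Y, 1, 0, 0, 0, -X * x, -Y * x])
        b.append(x)
        A.append([0, 0, 0, X, Y, 1, -X * y, -Y * y])
        b.append(y)
    return solve_linear(A, b) + [1.0]


def apply_projective(a, X, Y):
    """Map a reference-image point (X, Y) into the input image using (13)."""
    d = a[6] * X + a[7] * Y + a[8]
    return ((a[0] * X + a[1] * Y + a[2]) / d,
            (a[3] * X + a[4] * Y + a[5]) / d)


# Hypothetical corners: a 40x20 reference image and a small, slightly skewed region.
REFERENCE = [(0, 0), (40, 0), (40, 20), (0, 20)]
DETECTED = [(12.0, 7.0), (22.0, 7.5), (22.5, 12.5), (11.5, 12.0)]

coeffs = fit_projective(REFERENCE, DETECTED)
print([apply_projective(coeffs, X, Y) for X, Y in REFERENCE])
```

Because the four detected corners fix the transformation exactly, any error in a detected corner passes straight into the coefficients; this is the weakness that the motion constraint in the following paragraphs is meant to address.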

[0040] In practice, because the resolution of the target region in each image is low, the target-region positions obtained contain errors. The alignment is therefore corrected in accordance with the motion of the target object. Viewed over short time intervals, a moving target object can be regarded as performing a definite, predetermined motion. By correcting the target-region positions detected at the respective times so that they satisfy that motion, the target regions can be aligned precisely.

[0041] Specifically, when the positions of the target regions p0 to p3 after the respective inverse projective transformations at times t0 to t3 are as shown in FIG. 4, for example, the motion of the target region over the short interval is fitted to a predetermined motion as shown in FIG. 5. For example, the motion of the target object is regarded as uniform linear motion; in FIG. 5 the object is taken to move as indicated by the broken-line arrow L. The figure shows the motion of the target object as seen on the screen of the monitor 6.

【００４２】[0042] First, of the points at the four corners of the target area at each time, the upper-left point of each of the areas p0 to p3 is taken as the position representing that area. Its coordinates, marked with circles in FIG. 5, are denoted (x10, y10), (x11, y11), (x12, y12), and (x13, y13). Then, as shown in FIG. 6, the representative positions circled in FIG. 5 are corrected so that they line up on the straight line L representing the motion of the target object. As illustrated, the coordinates are corrected to (xm10, ym10), (xm11, ym11), (xm12, ym12), and (xm13, ym13). At this stage, however, the coordinates have only been placed on the line L; they have not yet been corrected to constant-velocity motion.
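
The step of placing the representative points on the line L can be sketched as follows. The text does not say how L is determined, so this sketch assumes, purely for illustration, that L is the best-fit line through the detected points (the principal axis through their centroid) and that each point is moved to its orthogonal projection on that line. The function name `project_onto_motion_line` is hypothetical.

```python
import math


def project_onto_motion_line(points):
    """Project 2-D points onto their best-fit line (an assumed choice of L)."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    sxx = sum((p[0] - cx) ** 2 for p in points)
    syy = sum((p[1] - cy) ** 2 for p in points)
    sxy = sum((p[0] - cx) * (p[1] - cy) for p in points)
    # Direction of the principal axis of the scatter matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    dx, dy = math.cos(theta), math.sin(theta)
    projected = []
    for x, y in points:
        s = (x - cx) * dx + (y - cy) * dy  # signed distance along the line
        projected.append((cx + s * dx, cy + s * dy))
    return projected
```

After this step the points are collinear but, as the paragraph notes, their spacing along the line is still uneven.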

【００４３】[0043] Next, so that the coordinates follow the constant-velocity motion along the straight line L as shown in FIG. 7, they are corrected so that the displacement in each time interval is equal. The image data of each target area is then recalculated using the projective transformation of equation (13). As shown in FIG. 7, the corrected coordinates are (xM10, yM10), (xM11, yM11), (xM12, yM12), and (xM13, yM13), so that the target areas p0 to p3 follow uniform linear motion along the line L.
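
One way to obtain positions with equal displacement per time interval is a least-squares fit of a constant-velocity model, sketched below. This is a substituted technique for illustration only: the text does not state how the equal spacing is computed, and the function name `fit_uniform_motion` is hypothetical. The sketch corrects only the representative points; the recalculation of the image data with equation (13) is not shown.

```python
def fit_uniform_motion(times, points):
    """Fit x(t) = x0 + vx*t, y(t) = y0 + vy*t and return the fitted positions.

    The fitted positions lie on one straight line and move by the same
    amount in equal time intervals.
    """
    n = len(times)
    t_mean = sum(times) / n
    denom = sum((t - t_mean) ** 2 for t in times)
    if denom == 0:
        raise ValueError("need at least two distinct times")
    fitted_axes = []
    for axis in (0, 1):
        vals = [p[axis] for p in points]
        v_mean = sum(vals) / n
        # Least-squares velocity along this axis.
        velocity = sum((t - t_mean) * (v - v_mean)
                       for t, v in zip(times, vals)) / denom
        fitted_axes.append([v_mean + velocity * (t - t_mean) for t in times])
    return list(zip(*fitted_axes))
```

The difference between each fitted position and the detected position gives the correction to apply to that frame's target area.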

【００４４】[0044] Because the coordinates at each time are constrained by the overall motion in this way, errors arising between the images at successive times do not accumulate, which enables precise alignment without large deviations. The case in which the projective transformation is obtained solely from the correspondence of the four corner points, without the constraint of the object's motion, is also included. As described above, the positions of the target at times t0 to t3 are aligned, a superimposed image is created, and that image is output to the high-resolution processing unit 17.

・High-resolution processing of the superimposed images

The high-resolution processing unit 17 performs high-resolution processing on the n images generated by the alignment processing unit 16. High-resolution processing here means modeling the image formation process and solving the inverse problem to estimate the high-definition image that gave rise to the observed images. Specifically, it is the processing based on equations (1) to (12) described above. H(I, J) is determined so as to minimize the evaluation function expressed by equation (12). The high-resolution image generated by this processing is output to the character recognition unit 18.

・Character recognition from the high-resolution image

The character recognition unit 18 performs character recognition using the image output from the high-resolution processing unit 17.

【００４５】[0045] Specifically, for example, patterns of the characters that may appear in the character layout (kanji, numerals, alphabetic letters, and so on), in different sizes and phases, are prepared in advance as a dictionary so that they can be recognized. Template matching is then performed between the characters in the high-resolution image and the dictionary images, and the entries that match best are output as the recognition result for the characters and the like. As the template matching method, equation (14) is evaluated over the i-th pixels X_i, Y_i of the input images (X, Y); the smaller the matching degree M, the more similar the regions are taken to be. Here M ≥ 0.

【００４６】[0046]

【数７】 (Equation 7)
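
Equation (14) is not reproduced in this text, so the following sketch only illustrates the behavior described: a matching degree M that is non-negative and smaller for more similar regions. It assumes, as one common choice, the sum of absolute pixel differences; the actual form of equation (14) may differ. The names `matching_degree` and `recognize` and the tiny dictionary in the usage note are hypothetical.

```python
def matching_degree(image, template):
    """Sum of absolute pixel differences (assumed form of M); 0 means identical."""
    if len(image) != len(template) or any(
            len(r) != len(t) for r, t in zip(image, template)):
        raise ValueError("image and template must have the same shape")
    return sum(abs(a - b)
               for row_i, row_t in zip(image, template)
               for a, b in zip(row_i, row_t))


def recognize(image, dictionary):
    """Return (label, M) for the dictionary template with the smallest M."""
    label, template = min(dictionary.items(),
                          key=lambda kv: matching_degree(image, kv[1]))
    return label, matching_degree(image, template)
```

For example, with a dictionary holding 3×3 templates for "I" and "L", a slightly noisy "I" patch is still assigned to "I" because its M is the smallest. In practice the dictionary would hold each character at several sizes and phases, as the paragraph above describes.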

【００４７】[0047] The recognition results for the characters and the like obtained by the above processing are output to a separately provided sorting device downstream of the belt conveyor 2. A distribution monitoring system can thus be constructed in which, for example, the sorting device automatically sorts the luggage 1, which is the target object, or the results are output to the monitor 6 so that a worker sorts the luggage 1.

[Second Embodiment]

FIG. 8 shows the block configuration of the character recognition processing device 10 according to the second embodiment.

【００４８】[0048] This configuration adds an image selection processing unit 19 after the alignment processing unit 16 of the character recognition processing device 10 of the first embodiment shown in FIG. 2. In the first embodiment, high-resolution processing is performed using n input images, namely the images at each time from t0, when the target is found, to tn-1. However, the n input images are likely to include images unsuitable for high-resolution processing, such as images that could not be aligned well. Using such images in the high-resolution processing can degrade the result.

【００４９】[0049] In the second embodiment, therefore, images with large positional deviation are removed from the n images so that good high-resolution results are obtained. Specifically, for example, the template matching shown in equation (12) in the high-resolution processing described above is performed between each aligned image and the reference image; the larger the matching degree M, the larger the positional deviation is considered to be. The image selection processing unit 19 therefore calculates the matching degree and removes the images whose value exceeds a threshold, after which high-resolution processing is performed on the remaining images, which have relatively small positional deviation.
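
The selection step can be sketched as a simple filter. This is an illustration under stated assumptions: the matching degree is taken here to be the mean absolute pixel difference against the reference image, which is not necessarily the measure the text intends, and the names `mean_abs_diff` and `select_images` and the threshold value in the usage note are hypothetical.

```python
def mean_abs_diff(image, reference):
    """Mean absolute pixel difference (assumed matching degree M, M >= 0)."""
    total = sum(abs(p - q)
                for row_i, row_r in zip(image, reference)
                for p, q in zip(row_i, row_r))
    count = sum(len(row) for row in image)
    return total / count


def select_images(aligned, reference, threshold):
    """Keep (index, image) pairs whose matching degree does not exceed threshold."""
    kept = []
    for idx, img in enumerate(aligned):
        if mean_abs_diff(img, reference) <= threshold:
            kept.append((idx, img))
    return kept
```

For example, `select_images(aligned, reference, threshold=1.0)` would drop any aligned image whose mean pixel difference from the reference exceeds 1.0; only the kept images would then be passed on to the high-resolution processing.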

【００５０】[0050] Removing images unsuitable for processing from the processing targets thus allows the high-resolution processing to be performed with even higher accuracy. Although the description so far has used a distribution monitoring system as an example, the invention is not limited to this. The high-resolution image processing using multiple images according to the present embodiment can be applied wherever target information attached to a moving object must be recognized, such as a system for checking people entering and leaving a building.

【００５１】[0051] The target information is likewise not limited to characters or symbols, and may be a figure such as a specific shape on the object. In that case, a figure recognition unit is provided in place of the character recognition unit after the high-resolution processing unit. The operation of the character recognition processing device to which the high-resolution processing of the present embodiment is applied has been described for an example in which the object being searched for moves in uniform linear motion. However, if the object is known to move along a fixed path within the range captured by the camera, that path may be taken as the motion of the search target and each target area corrected so as to be constrained to it.

【００５２】[0052] Furthermore, although the examples so far search for a single target object within the screen captured by the camera, two or more target objects can be searched for simultaneously. In that case, determining the motion of each target object in advance allows wide-area monitoring of each target object based on the images captured by a single camera.

【００５３】[0053] In addition, when the screen captured by the camera is shaking, a sharper high-resolution image can be created, even if the target object is stationary, by constraining each target area to stationary motion as its fixed path.

(Supplementary Note 1) An image processing method for automatically reading target information relating to a moving target object from video capturing the target object, the method comprising: a search step of searching, in each of a plurality of time-series images in the video, for an image area of a predetermined range containing the target information; an alignment step of correcting the position of each image area found to a position consistent with the movement of the target object, thereby aligning the image areas; and a high-resolution image processing step of superimposing the aligned image areas and performing high-resolution processing.

(Supplementary Note 2) The image processing method according to Supplementary Note 1, wherein the target information includes characters or symbols, and in the search step the image area is searched for in each of the images based on layout information relating to the characters or symbols.

(Supplementary Note 3) The image processing method according to Supplementary Note 1 or 2, wherein in the alignment step the target object is regarded as moving at each minute time interval, and alignment is performed by correcting the position of the image area according to the movement position of the target object corresponding to that time.

(Supplementary Note 4) The image processing method according to Supplementary Note 3, wherein in the alignment step the movement of the target object is regarded as uniform linear motion, and alignment is performed by correcting the position of each image area to a position on the line of that uniform linear motion.

(Supplementary Note 5) The image processing method according to any one of Supplementary Notes 1 to 4, wherein in the alignment step, when the position of an image area deviates from the movement position of the target object by more than a predetermined value, that image area is excluded from the alignment processing.

(Supplementary Note 6) The image processing method according to any one of Supplementary Notes 2 to 5, further comprising a character recognition processing step of recognizing the character or symbol information based on the high-resolution image.

(Supplementary Note 7) An image processing apparatus for automatically reading target information relating to a moving target object from video capturing the target object, the apparatus comprising: search means for searching, in each of a plurality of time-series images in the video, for an image area of a predetermined range containing the target information; alignment means for correcting the position of each image area found to a position consistent with the movement of the target object, thereby aligning the image areas; and high-resolution image processing means for superimposing the aligned image areas and performing high-resolution processing.

(Supplementary Note 8) The image processing apparatus according to Supplementary Note 7, wherein the target information includes characters or symbols, and the search means searches for the image area in each of the images based on layout information relating to the characters or symbols.

(Supplementary Note 9) The image processing apparatus according to Supplementary Note 7 or 8, wherein the alignment means regards the target object as moving at each minute time interval and performs alignment by correcting the position of the image area according to the movement position of the target object corresponding to that time.

(Supplementary Note 10) The image processing apparatus according to Supplementary Note 9, wherein the alignment means regards the movement of the target object as uniform linear motion and performs alignment by correcting the position of each image area to a position on the line of that uniform linear motion.

(Supplementary Note 11) The image processing apparatus according to any one of Supplementary Notes 7 to 10, wherein the alignment means comprises image selection means that, when the position of an image area deviates from the movement position of the target object by more than a predetermined value, excludes that image area from the alignment processing.

(Supplementary Note 12) The image processing apparatus according to any one of Supplementary Notes 8 to 11, further comprising character recognition processing means for recognizing the character or symbol information based on the high-resolution image.

【００５４】[0054]

[Effects of the Invention]

According to the present invention, the position at each time of a moving object bearing characters or symbols in the video can be aligned precisely by constraining it to the object's motion, so the high-resolution processing performs well and the characters and symbols can be read automatically. This allows the camera's field of view to be widened. For example, when the invention is applied to a distribution monitoring system, a single camera can monitor multiple belt conveyors, leading to a substantial reduction in system cost.

[Brief description of the drawings]

FIG. 1 is a diagram showing the schematic configuration of a distribution monitoring system.

FIG. 2 is a diagram showing the block configuration of a character recognition processing device to which the first embodiment is applied.

FIG. 3 is a diagram illustrating the search for a recognition target based on layout information.

FIG. 4 is a diagram showing the position of the recognition target at each time as displayed on the monitor.

FIG. 5 is a diagram illustrating how a representative point of each target image is specified in order to correct the motion of the recognition target over a short time span.

FIG. 6 is a diagram explaining how the representative points of the target images are lined up on a straight line and the image positions are corrected according to the motion of the recognition target.

FIG. 7 is a diagram showing the state after each target image has been corrected so that the displacement in each time interval is equal, giving the recognition target uniform linear motion.

FIG. 8 is a diagram showing the block configuration of a character recognition processing device to which the second embodiment is applied.

[Explanation of symbols]

1 ... Luggage
2 ... Belt conveyor
3 ... Lighting device
4 ... Camera
5 ... Recognition device
6 ... Monitor
7 ... Label
10 ... Character recognition processing device
11 ... A/D converter
12 ... Image control unit
13 ... First image storage unit
14 ... Second image storage unit
15 ... Object search unit
16 ... Alignment processing unit
17 ... High-resolution processing unit
18 ... Character recognition unit
19 ... Image selection processing unit

Continued from the front page
(51) Int.Cl.7: // H04N 5/915; FI: H04N 5/91 K
(72) Inventor: Morito Shiohara (塩原守人), c/o Fujitsu Limited, 4-1-1 Kamiodanaka, Nakahara-ku, Kawasaki, Kanagawa
F-terms (reference): 5B057 CA12 CA16 CB12 CB16 CE03 5C052 AA00 DD04 5C053 FA11 GB19 HA29 KA01 KA24 LA01 5C054 AA01 EJ07 FC13 HA03 5L096 FA69 HA04 JA03

Claims

[Claims]

1. An image processing method for automatically reading target information relating to a moving target object from video capturing the target object, the method comprising: a search step of searching, in each of a plurality of time-series images in the video, for an image area of a predetermined range containing the target information; an alignment step of correcting the position of each image area found to a position consistent with the movement of the target object, thereby aligning the image areas; and a high-resolution image processing step of superimposing the aligned image areas and performing high-resolution processing.

2. The image processing method according to claim 1, wherein the target information includes characters or symbols, and in the search step the image area is searched for in each of the images based on layout information relating to the characters or symbols.

3. The image processing method according to claim 1, wherein in the alignment step the movement of the target object is regarded as uniform linear motion, and alignment is performed by correcting the position of each image area to a position on the line of that uniform linear motion.

4. The image processing method according to claim 1, wherein in the alignment step, when the position of an image area deviates from the movement position of the target object by more than a predetermined value, that image area is excluded from the alignment processing.

5. An image processing apparatus for automatically reading target information relating to a moving target object from video capturing the target object, the apparatus comprising: search means for searching, in each of a plurality of time-series images in the video, for an image area of a predetermined range containing the target information; alignment means for correcting the position of each image area found to a position consistent with the movement of the target object, thereby aligning the image areas; and high-resolution image processing means for superimposing the aligned image areas and performing high-resolution processing.