JP2017098878A

JP2017098878A - Information terminal device and program

Info

Publication number: JP2017098878A
Application number: JP2015231709A
Authority: JP
Inventors: Haruhisa Kato (加藤 晴久)
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2015-11-27
Filing date: 2015-11-27
Publication date: 2017-06-01
Anticipated expiration: 2035-11-27
Also published as: JP6478282B2

Abstract

PROBLEM TO BE SOLVED: To realize accurate recognition of an imaging target from a captured image, even when the preview display desired by the user yields a captured image that falls short of the accuracy needed for recognition.

SOLUTION: A processed image obtained by processing the captured image is used as the preview display of the captured image. When the captured image covers a region R511 that is unsuitable for recognition, the preview shows a region R512 that the user may not want; when the captured image covers a region R521 that is suitable for recognition, the preview shows a region R522 that the user may want. The user is thereby led to spontaneously perform imaging suitable for recognition processing. The processing can include enlargement, reduction, translation, brightness conversion, and the like.

SELECTED DRAWING: Figure 5

Description

The present invention relates to an information terminal device and a program that realize highly accurate recognition of an imaging target from a captured image, even when the preview display of the imaging that the user desires is inappropriate for recognition processing.

A device that recognizes an object from an image can convert analog information printed on media that are easy to distribute and present into digital information, which improves user convenience.

For example, in Non-Patent Document 1, feature points are detected from an image, local image feature amounts are calculated from the neighborhood of each feature point, and the type of the object is recognized by matching them against local image feature amounts accumulated in advance.

In particular, the following devices have been disclosed for improving recognition accuracy.

Patent Document 1 discloses a method of recognizing an image by removing outliers, which act as noise, with a two-stage Hough transform and identifying the cluster of inliers used for recognition. Patent Document 2 discloses a technique that increases the number of dimensions of local image feature amounts while eliminating redundant dimensions.

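Patent Document 1: Japanese Unexamined Patent Application Publication No. 2013-025799
Patent Document 2: Japanese Unexamined Patent Application Publication No. 2012-181765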

Non-Patent Document 1: D. G. Lowe, "Object recognition from local scale-invariant features," Proc. of IEEE International Conference on Computer Vision (ICCV), pp. 1150-1157, 1999.

However, the conventional techniques of Non-Patent Document 1, Patent Document 1, and Patent Document 2 all share the problem that recognition fails when there are few feature points and local feature amounts suitable for recognition.

This problem can be examined as follows. There are three situations in which a sufficient number of feature points cannot be obtained. The first situation is where the imaging target (as a whole) has few feature points to begin with. The second situation is where, even though the imaging target (as a whole) has many feature points, the camera is so close to the target that only part of it is imaged. The third situation is where, even though the imaging target (as a whole) has many feature points, it is imaged from too far away.

One effective solution to the first situation is to group several imaging targets into a single imaging target, so that even if each individual target has few feature points, the group as a whole has enough feature points for recognition. For example, consider recognizing a page of a catalog or similar printed matter on which several products appear. If an administrator or the like registers each product image as a separate imaging target, products without enough feature points cannot be recognized. Registering several product images together as one imaging target secures enough feature points and improves recognition accuracy.

However, even if the administrator or the like takes the measure of registering several product images together as one imaging target, the second situation arises when a user holds the imaging unit over the individual product of interest, because the number of feature points is then insufficient. From the user's standpoint, holding the imaging unit so as to image only the product of interest is a natural, intuitive operation, so the second situation is likely to occur often.

The feature points are insufficient in the second situation because they do not appear in the captured image, whereas in the third situation they are insufficient because the resolution of the captured image drops and the fine patterns of the imaging target are lost. For example, when posters, exhibits, and the like are to be recognized, users are often in an environment where they cannot pick up the recognition target. In such an environment users tend to image the recognition target from a distance, so the third situation arises and recognition often fails.

In short, in each of the first to third situations there is a range of distances and/or positional relationships to the imaging target that is suitable for recognition, and outside that range the local feature amounts become too few for recognition. The conventional techniques could not address this problem. As illustrated above, even if the administrator or the like sets the feature points and local image feature amounts of the recognition targets on the assumption that imaging is performed within the suitable range, the user who actually performs the imaging often departs from that range, for example because imaging within the suitable range runs counter to intuition. This is one cause of the problem.

This problem arises generally whenever recognition is performed from a captured image. It is not limited to recognition using feature points and local image feature amounts, and arises with other recognition methods as well.

In view of the above problems of the prior art, an object of the present invention is to provide an information terminal device and a program capable of recognizing an imaging target from a captured image with high accuracy.

To achieve the above object, the present invention is characterized by the following (1) to (10).

(1) The invention comprises an imaging unit that images an imaging target to obtain a captured image, a recognition unit that analyzes the captured image to recognize the imaging target, a processing unit that applies predetermined processing to the captured image to obtain a processed image, and a display unit that displays the processed image as a preview. The predetermined processing is determined in advance according to candidates for the imaging target and candidates for how the imaging is performed. When the imaging unit is imaging the imaging target in a state where the obtained captured image is inappropriate for recognition by the recognition unit, the processing makes the obtained processed image inappropriate as a preview on the display unit; and when the imaging unit is imaging the imaging target in a state where the obtained captured image is appropriate for recognition by the recognition unit, the processing makes the obtained processed image appropriate as a preview on the display unit.

(2) In feature (1), the predetermined processing further extracts a partial region of the captured image as the processed image; the state inappropriate for recognition arises because the imaging unit is too close to the recognition target; and the state inappropriate as a preview on the display unit arises because the recognition target appears too large.

(3) In feature (1), the predetermined processing is further an enlargement process; the state inappropriate for recognition arises because the imaging unit is too close to the recognition target; and the state inappropriate as a preview on the display unit arises because the recognition target appears too large.

(4) In feature (1), the predetermined processing is further a reduction process; the state inappropriate for recognition arises because the imaging unit is too far from the recognition target; and the state inappropriate as a preview on the display unit arises because the recognition target appears too small.

(5) In feature (1), the predetermined processing further adds pincushion distortion; the state inappropriate for recognition arises because the imaging unit is too far from the recognition target; and the state inappropriate as a preview on the display unit arises because the recognition target appears too small.

(6) In feature (1), the predetermined processing further moves the center position of the processed image so that it differs from the center position of the captured image; the state inappropriate for recognition arises because the imaging unit images the recognition target off its center; and the state inappropriate as a preview on the display unit is one in which the recognition target appears off center.

(7) In feature (1), the predetermined processing further uses brightness conversion to enhance or reduce the gradation of the processed image relative to that of the captured image; the state inappropriate for recognition arises because the imaging unit images the recognition target with its gradation impaired; and the state inappropriate as a preview on the display unit is one in which the recognition target appears with its gradation impaired.

(8) In any of features (1) to (7), the recognition unit further calculates local image feature amounts from the captured image, compares them with local image feature amounts calculated in advance, for reference, for each of a plurality of predetermined recognition targets, and recognizes a recognition target determined to be similar as the imaging target in the captured image.

(9) In feature (2), the recognition unit further calculates local image feature amounts from the captured image, compares them with local image feature amounts calculated in advance, for reference, for each of a plurality of predetermined recognition targets, and recognizes a recognition target determined to be similar as the imaging target in the captured image. In addition, among the reference local image feature amounts of the recognition target determined to be similar, it takes those for which a correspondence with a local image feature amount in the captured image was obtained and, based on the distribution of their feature point coordinates, estimates the region of the captured image and the region of the processed image as the partial region.

(10) A program that causes a computer to function as the information terminal device of any of features (1) to (9).

According to features (1) and (10), a processed image obtained by processing the captured image is used as the image that serves as the preview display. As a result, at the moment the user obtains a preview display that the user judges appropriate, the captured image is also in a state suitable for recognition processing, and using that captured image for recognition realizes highly accurate recognition.

According to features (2) to (7), the preview display that the user judges appropriate can be obtained, in accordance with the way the user is expected to perform imaging, by applying to the captured image one of the following kinds of processing: enlargement (including cropping only a part), reduction, pincushion distortion, center shift, or brightness conversion.

According to feature (8), the recognition result for the imaging target in the captured image can be obtained based on local image feature amounts.

According to feature (9), the recognition unit can additionally obtain information on the region of the captured image and information on the region shown to the user as the processed image.

FIG. 1 is a functional block diagram of an information terminal device according to an embodiment.
FIG. 2 is a functional block diagram of the recognition unit in an embodiment.
FIG. 3 is a functional block diagram of the processing unit.
FIG. 4 is a diagram for explaining the enlargement processing (processing that extracts a partial region) performed by the enlargement unit.
FIG. 5 is a diagram showing an example for explaining the effect of applying the enlargement processing (processing that extracts a partial region) performed by the enlargement unit.
FIG. 6 is a diagram for explaining the reduction processing performed by the reduction unit.
FIG. 7 is a diagram showing an example for explaining the effect of applying the reduction processing performed by the reduction unit.
FIG. 8 is a diagram for explaining the translation processing (center shift processing) performed by the translation unit.
FIG. 9 is a diagram showing an example for explaining the effect of applying the translation processing (center shift processing) performed by the translation unit.

FIG. 1 is a functional block diagram of an information terminal device according to an embodiment. The information terminal device 1 includes an imaging unit 2, a recognition unit 3, a processing unit 4, and a display unit 5. The functions of the parts 1 to 5 in FIG. 1 are outlined below.

The imaging unit 2 images the imaging target under the user's imaging operation U(t) at each time t and outputs the captured image P1(t) to the calculation unit. The user is assumed to perform the imaging operation U(t) so that the captured image P1(t) contains an imaging target that is known in advance. The imaging target may be, for example, a marker, printed matter, or a three-dimensional object bearing a pattern whose features are known. A digital camera that comes standard on a mobile terminal can serve as the hardware for the imaging unit 2.

The user's imaging operation U(t) on the imaging unit 2 at each time t refers to imaging with the imaging unit 2 placed in the arrangement (camera position and orientation) that the user desires relative to the imaging target. In other words, it is the ordinary imaging operation of "holding up" the imaging unit 2 (camera) toward the imaging target, whether the user holds it by hand or mounts it on a stand. Because such an operation generally varies over time (although it may also stay constant), it is written as U(t). In addition, when a light source is arranged for the imaging target and the user can adjust it, adjusting the position of the light source and the like may also be included in the imaging operation U(t).

At each time t, the recognition unit 3 recognizes the imaging target from the captured image P1(t) captured by the imaging unit 2 and outputs the recognition result R(t) to the user or the like. Existing methods such as QR Code (registered trademark) reading, character recognition, and specific object recognition using local image feature amounts can be used for the recognition processing in the recognition unit 3.

In one embodiment, the recognition unit 3 can additionally send the processing unit 4 a processing instruction I(t2) that specifies parameters and the like for the processing by which the processing unit 4 (described later) converts the captured image P1(t2) into the processed image P2(t2) at the current time t2. The recognition unit 3 can determine the processing instruction I(t2) for the current time t2 based on the recognition result R(t1) at a past time t1 (t1 < t2). The past time t1 used for this determination need not be a single time; several times may be used, or a predetermined past period relative to the current time t2.

An embodiment that omits the processing instruction I(t2) is also possible. In that case, regardless of the actual recognition state in the recognition unit 3, the processing unit 4 always applies the predetermined type of processing for its embodiment (described later) with predetermined parameters. In an embodiment that uses the processing instruction I(t2), the processing unit 4 likewise applies the predetermined type of processing for its embodiment independently of the actual recognition state in the recognition unit 3, but the parameters used can be set, based on past recognition results in the recognition unit 3, to reflect the tendency in the past history of the user's imaging operation U(t). The processing instruction I(t2) is explained again in supplementary explanation (1), described later.

FIG. 2 is a functional block diagram of the recognition unit 3 in one embodiment. The recognition unit 3 includes a calculation unit 31, a storage unit 32, and a collation unit 33. In an embodiment that performs recognition using local image feature amounts, each unit operates as follows.

The calculation unit 31 first detects feature points of the imaging target from the captured image P1(t) captured by the imaging unit 2. Distinctive points such as corners of the recognition target can be used as the feature points. Existing methods for detecting distinctive points, such as SIFT (Scale-Invariant Feature Transform) and SURF (Speeded Up Robust Features), can be used for detection.

Next, the calculation unit 31 calculates local image feature amounts from the captured image P1(t) captured by the imaging unit 2, centered on the coordinates of each detected feature point. Existing methods for calculating such feature amounts, such as SIFT (Scale-Invariant Feature Transform) and SURF (Speeded Up Robust Features), can be used.

The feature points and local image feature amounts calculated by the calculation unit 31 in this way are output to the collation unit 33 as the feature information F(t) of the captured image P1(t) at each time t.

For each of a plurality of predetermined recognition targets serving as references, the storage unit 32 stores feature information calculated from an image of that recognition target by the same processing that the calculation unit 31 applies to the captured image. To speed up the processing in the collation unit 33 described next, the feature information may be summarized by vector quantization, a hash function, or the like before it is stored.
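As a rough illustration of the summarization step, the following sketch replaces each descriptor with the index of its nearest codeword, which is one simple form of vector quantization. The names `CODEBOOK`, `quantize`, and `summarize`, and the codebook values, are hypothetical; the source does not specify a particular quantization or hashing scheme.

```python
from math import dist  # Euclidean distance (Python 3.8+)

# Hypothetical codebook of representative descriptors (illustrative values only).
CODEBOOK = [
    (0.0, 0.0, 0.0, 0.0),
    (1.0, 0.0, 1.0, 0.0),
    (0.0, 1.0, 0.0, 1.0),
]

def quantize(descriptor, codebook=CODEBOOK):
    """Return the index of the codeword nearest to the descriptor."""
    return min(range(len(codebook)), key=lambda i: dist(descriptor, codebook[i]))

def summarize(descriptors, codebook=CODEBOOK):
    """Replace each descriptor with a compact codeword index."""
    return [quantize(d, codebook) for d in descriptors]
```

Storing small integer indices instead of full descriptors lets later matching compare indices rather than full vectors, at the cost of some precision.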

At each time t, the collation unit 33 evaluates the similarity between the feature information F(t) of the imaging target input from the calculation unit 31 and the feature information of each recognition target stored in the storage unit 32. If any recognition targets have a similarity above a preset threshold, those one or more recognition targets, or alternatively the recognition target with the highest similarity, are output to the processing unit 4, the user, and the like as the recognition result R(t) for the imaging target in the captured image P1(t). The recognition result R(t) at each time t may instead be held in the information terminal device 1 without being output to the user or the like, and output only after an output request is received from the user or the like.
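The two selection rules described above (all targets above a threshold, or the single most similar target) can be sketched as follows. The function name `select_results` and the dictionary layout mapping a target identifier to its similarity are assumptions made for illustration.

```python
def select_results(similarities, threshold=None):
    """Pick recognition results from a {target_id: similarity} mapping.

    With a threshold, return every target whose similarity exceeds it,
    most similar first (possibly an empty list). Without a threshold,
    return only the single most similar target.
    """
    if not similarities:
        return []
    if threshold is not None:
        hits = [t for t, s in similarities.items() if s > threshold]
        return sorted(hits, key=lambda t: similarities[t], reverse=True)
    best = max(similarities, key=similarities.get)
    return [best]
```

For example, with similarities `{"poster": 0.9, "catalog": 0.6, "flyer": 0.2}`, a threshold of 0.5 selects `["poster", "catalog"]`, while omitting the threshold selects `["poster"]`.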

Existing methods that use the Hamming distance, Euclidean distance, Mahalanobis distance, or the like between pieces of feature information can be used to evaluate similarity. Alternatively, correspondences between the most similar pieces of individual feature information of the recognition target and the captured image may first be determined from such a distance, and the sum of the similarities over those correspondences then computed; or the well-known RANSAC (Random Sample Consensus) may be used to obtain the overall correspondence and similarity between the feature information of the recognition target and that of the captured image.
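A minimal sketch of two of these options follows: a Hamming distance for binary feature information, and a summed similarity over nearest-neighbor correspondences using the Euclidean distance. The cutoff `max_distance` and the similarity formula `max_distance - distance` are assumptions introduced for illustration, not something the source prescribes.

```python
from math import dist  # Euclidean distance (Python 3.8+)

def hamming(a, b):
    """Hamming distance between two equal-length bit sequences."""
    return sum(x != y for x, y in zip(a, b))

def match_score(query, reference, max_distance):
    """Sum a similarity over nearest-neighbor correspondences.

    Each query descriptor is paired with its closest reference descriptor.
    Pairs farther apart than max_distance are ignored (assumed cutoff);
    a kept pair contributes max_distance - distance (assumed similarity).
    """
    if not reference:
        return 0.0
    score = 0.0
    for q in query:
        d = min(dist(q, r) for r in reference)
        if d <= max_distance:
            score += max_distance - d
    return score
```

A higher score means that more query descriptors found a close counterpart in the reference set, which is the quantity a threshold or maximum would then be applied to.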

At each time t, the processing unit 4 receives the captured image P1(t) from the imaging unit 2, applies image conversion processing such as enlargement or reduction to it, and outputs the result to the display unit 5 as the processed image P2(t).
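As one concrete instance of such a conversion, the sketch below extracts the central region of the captured image, corresponding to the enlargement by partial-region extraction named in feature (2). The image is modeled as a plain list of pixel rows, and the function name `crop_center` and the parameter `fraction` are hypothetical.

```python
def crop_center(image, fraction):
    """Return the central region of a row-major image (a list of rows).

    fraction is the share of each dimension that is kept, 0 < fraction <= 1.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    height, width = len(image), len(image[0])
    new_h = max(1, round(height * fraction))
    new_w = max(1, round(width * fraction))
    top = (height - new_h) // 2
    left = (width - new_w) // 2
    return [row[left:left + new_w] for row in image[top:top + new_h]]
```

On a 4x4 image, `crop_center(image, 0.5)` returns the middle 2x2 block; scaling that block up to the display size would give the user an enlarged preview while the full 4x4 image remains available for recognition.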

At each time t, the display unit 5 receives the processed image P2(t) from the processing unit 4 and displays it to the user. A display that comes standard on a mobile terminal can be used as the display unit 5.

As described above, the display unit 5 displays the processed image P2(t), which is the result of processing the captured image P1(t) captured by the imaging unit 2 at each time t. From the user's standpoint, the display unit 5 therefore provides the so-called camera preview interface while the imaging unit 2, acting as a camera, images the imaging target as video.

The user therefore performs the imaging operation U(t) on the imaging unit 2 at each time t while watching the display unit 5 as a camera preview to check whether the imaging is proceeding as the user wishes. In the present invention, the display unit 5 that provides the camera preview does not display the captured image P1(t) obtained by the imaging unit 2 as is, but displays the processed image P2(t) produced by the processing unit 4. This achieves the following first and second matters at the same time.

The first matter is that the processed image P2(t) shown on the display unit 5 is perceived, by a user who is not aware that any processing is taking place, as the image exactly as captured by the imaging unit 2, and that it appears captured in the form the user desires (for example, focused only on the article the user is interested in). The second matter is that the recognition unit 3 performs recognition on the captured image P1(t), not on the processed image P2(t), and thereby recognizes the imaging target with high accuracy.
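The split between what is recognized and what is displayed can be sketched for a single time step as follows. The function `handle_frame` and the toy stand-ins `count_features` and `middle_pixel` are hypothetical; they only illustrate that recognition receives P1(t) while the display receives P2(t).

```python
def handle_frame(captured, recognize, process):
    """Run one time step: recognize on P1(t), build the preview P2(t).

    recognize and process are caller-supplied callables standing in for
    the recognition unit and the processing unit.
    """
    result = recognize(captured)   # recognition sees the unprocessed image
    preview = process(captured)    # the display sees only the processed image
    return result, preview

# Toy stand-ins (assumptions): nonzero pixels act as "feature points",
# and the preview keeps only the single middle pixel of the image.
def count_features(image):
    return sum(1 for row in image for px in row if px)

def middle_pixel(image):
    r, c = len(image) // 2, len(image[0]) // 2
    return [[image[r][c]]]
```

In this toy setup the recognizer counts five feature points in a 3x3 frame even though the preview shows only one pixel, which mirrors the idea that the captured image carries more information than the user sees.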

The present invention thus solves the problem of the prior art described above by simultaneously achieving the first matter, that imaging is performed in the form the user desires (or the user perceives it so), and the second matter, that recognition is highly accurate. The prior art does not display the preview on the display unit 5 by way of the processing unit 4; the captured image P1(t) is both shown as the preview as is and used as the input to recognition. Consequently, even when the user obtains a satisfactory preview and the first matter is achieved, recognition is not necessarily accurate, so the second matter is not necessarily achieved. Likewise, when the second matter is achieved in the prior art, the first matter is not necessarily achieved. The present invention, in contrast, makes it possible to achieve both at the same time.

As described above, the processing unit 4 processes the captured image P1(t) in such a way that the first and second matters are achieved together, and thereby obtains the processed image P2(t). In other words, the processing is designed so that whenever the processed image P2(t) provides the preview the user wants, the captured image P1(t) is one from which highly accurate recognition is possible.

Schematically, the processing is designed to lead the user to perform the following imaging operation U(t) spontaneously.

Consider the initial time t1 at which the user starts imaging and, without paying particular attention to the imaging operation U(t1), simply tries to get the desired target into the preview image. As explained in connection with the problem of the prior art, the captured image P1(t1) may then be in a state that is not favorable for high-accuracy recognition. At such an initial time t1, therefore, the processing is applied so that the processed image P2(t1) used for the preview appears not to be captured in the state the user wants.

A user who sees the processed image P2(t1), which is a preview in an undesirable state at the initial time t1, will spontaneously and consciously adjust the imaging operation U(t) after time t1 so as to obtain, at a later time t2, a processed image P2(t2) that is a preview in the desired state. It then suffices that the captured image P1(t2) at that time t2 is one from which high-accuracy recognition is possible.

As is clear from the above consideration of the user's behavior (the imaging operation U(t)) from the initial time t1 to the later time t2, the processing in the processing unit 4 only needs to make the relationship between the captured image P1(t) and the processed image P2(t) satisfy the following three relations, (Relation 1) to (Relation 3).

(Relation 1): as described for the initial time t1, when the user's imaging operation U(t1) leaves the captured image P1(t1) in a state unsuitable for high-accuracy recognition, the processed image P2(t1) is in a state that is not the preview the user wants.

(Relation 2): as described for the later time t2, when the user's imaging operation U(t2) leaves the captured image P1(t2) in a state suitable for high-accuracy recognition, the processed image P2(t2) is at the same time in a state that is the preview the user wants.

(Relation 3): the intermediate states between (Relation 1) and (Relation 2) are set as follows. As described for the user's behavior from the initial time t1 to the later time t2, when imaging has resulted in (Relation 1), the relationship between the captured image P1(t) and the processed image P2(t) in the intermediate states is set so that the user is motivated to adjust the imaging operation spontaneously and converge on an imaging operation state in which (Relation 2) holds.

Note that, because of (Relation 3), once (Relation 2) is satisfied in the user's imaging operation U(t), whether from the very start of imaging or partway through, the user is motivated to continue imaging while maintaining the imaging operation state that satisfies (Relation 2). This holds as long as the user keeps trying to image the same target (for example, the user keeps imaging a particular poster and does not finish and move on to another poster). Although (Relation 3) was described as motivating the user to "converge", the imaging operation state of (Relation 2) reached by that convergence need not be a single optimal point; it only has to exist as a set of imaging operation states with some extent.

The processing performed by the processing unit 4 has so far been explained conceptually, together with the thinking behind it. The details of that processing are described below. Although (Relation 1) to (Relation 3) were all stated above, the processing examples that follow are ones in which satisfying (Relation 1) and (Relation 2) automatically satisfies (Relation 3) as well.

The processing unit 4 can process the captured image P1(t) into the processed image P2(t) according to various embodiments. FIG. 3 is a functional block diagram of the processing unit 4, showing the elements used to realize those embodiments. The processing unit 4 includes an enlargement unit 41, a reduction unit 42, a translation unit 43, and a brightness conversion unit 44. Each of them processes the captured image P1(t) to obtain the processed image P2(t): the enlargement unit 41 by enlarging, the reduction unit 42 by reducing, the translation unit 43 by translating, and the brightness conversion unit 44 by converting brightness.

The units 41 to 44 carry out the processing in the respective embodiments. Only one of them may be applied, or the processing may be realized by any combination of two or more of them. Which processing to apply, and with what parameters, for the imaging targets actually expected can be set in advance, for example by manual judgment of an administrator.

In particular, when the feature information of a plurality of predetermined recognition targets is registered in advance in the storage unit 32 of the recognition unit 3 described with reference to FIG. 2, the administrator or the like may also register in advance which type of processing (or which combination) the processing unit 4 is to apply. Which processing is appropriate is determined case by case, depending on what the recognition target is and how the user is likely to perform the imaging operation; specific examples are given later in the detailed descriptions of the units 41 to 44. Accordingly, the advance registration in the storage unit 32 and the advance setting of the processing in the processing unit 4 are preferably made so that the recognition targets whose feature information is stored in the storage unit 32 form a set for which a common processing is appropriate.

That is, the predetermined processing performed by the processing unit 4 is linked to the information registered in advance in the storage unit 32, and may be fixed in advance according to the candidate imaging target and the candidate manner in which that target is imaged. For example, the candidate target may be a catalog page that tends to be imaged from too close, which is unsuitable for recognition; conversely, the candidate target may be a poster that tends to be imaged from too far away, which is also unsuitable for recognition.

As for the parameters with which the preset processing is applied, the recognition unit 3 may, as described earlier, determine the parameters that the processing unit 4 should apply at the current time t2 based on the recognition result R(t1) at a past time t1.

Each of the units 41 to 44 is described in detail below.

FIGS. 4 and 5 are diagrams for explaining the enlargement processing performed by the enlargement unit 41 and its effect.

As shown in FIG. 4, the captured image P1(t) obtained by the imaging unit 2 is assumed to have a size of 2N_x × 2N_y. That is, the captured image P1(t) has 2N_x pixels in the horizontal direction and 2N_y pixels in the vertical direction, for a total of 2N_x × 2N_y pixels. Pixel positions are described using the xy coordinate system shown in FIG. 4, whose origin O is the center of the captured image P1(t). In this coordinate system, for example, the four vertices of the captured image P1(t) are (±N_x, ±N_y) (any combination of signs). (The upper-left corner of an image is usually taken as the origin, but here the center of the image is used for convenience.)

As shown in FIG. 4, the enlargement unit 41 generates the enlarged processed image P2(t) as an image that covers only a part of the captured image P1(t) near its center O. Specifically, given a predetermined ratio r (0 < r < 1) that determines how large that part is, the enlarged processed image P2(t) is generated as a rectangular image of size (number of pixels) 2rN_x × 2rN_y, with four vertices (±rN_x, ±rN_y) (any combination of signs), centered on the origin O, and with edges parallel to the x-axis and y-axis like those of the captured image P1(t).

Because the processed image P2(t) is a part of the captured image P1(t), it looks as if only that part of P1(t) had been photographed from close up (even though it has fewer pixels than P1(t)); in this sense it has undergone "enlargement" processing. The enlargement unit 41 may additionally adjust the resolution and output, as the processed image P2(t), an image fitted to the size of the screen area shown on the display unit 5. For example, if the display area of the display unit 5 has the same size 2N_x × 2N_y as the captured image P1(t), the cropped image may be stretched to 2N_x × 2N_y, lowering its resolution by a factor of r, and output as the processed image P2(t).
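The crop-and-stretch behavior of the enlargement unit 41 can be sketched as follows. This is only an illustrative sketch, not the implementation in the patent: the function names `center_crop`, `resize_nearest`, and `enlarge_preview` are hypothetical, images are represented as plain lists of pixel rows, and nearest-neighbor interpolation is assumed because the text does not say how the stretch is done.

```python
def center_crop(image, r):
    """Return the central part of `image` whose sides are r times the original.

    `image` is a list of rows; r plays the role of the ratio r (0 < r < 1).
    """
    if not 0 < r < 1:
        raise ValueError("r must satisfy 0 < r < 1")
    height, width = len(image), len(image[0])
    crop_h = max(1, round(height * r))
    crop_w = max(1, round(width * r))
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def resize_nearest(image, out_h, out_w):
    """Resize by nearest-neighbor sampling (an assumed interpolation method)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def enlarge_preview(captured, r):
    """Build the preview P2(t): crop the center of P1(t), then stretch it
    back to the size of P1(t), as when the display area matches P1(t)."""
    height, width = len(captured), len(captured[0])
    return resize_nearest(center_crop(captured, r), height, width)
```

The recognition side would keep using the full `captured` image, while only the result of `enlarge_preview` is shown to the user.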

Accordingly, the frame-shaped region "P1(t) \ P2(t)", shown shaded in gray in FIG. 4, is captured in the captured image P1(t) but excluded from the processed image P2(t), and is therefore invisible to the user looking at the preview on the display unit 5. This region, although invisible to the user, remains available to the recognition unit 3 as part of the captured image P1(t), and this is what allows the present invention to be effective when enlargement is applied as one embodiment of the processing.

The effect of the enlargement processing is, in outline, as follows. When the user points the imaging unit 2 at the imaging target, the user expects the captured image P1(t) to be previewed on the display unit 5. When the enlarged processed image P2(t) is displayed instead, the user does not suspect that it has been processed and mistakenly believes that the imaging unit 2 is too close to the target. At the moment of this misperception, (Relation 1) holds. The user then naturally moves the imaging unit 2 away from the target, so that the captured image comes to contain more features and recognition accuracy improves. That is, the user naturally moves to a state in which (Relation 2) holds, and stays in it.

A specific example of the effect of the enlargement processing outlined above is described with reference to FIG. 5. As shown in [1] of FIG. 5, the imaging target Ob1 is (the whole of) a page of a shopping catalog or the like, and it contains, as partial targets, (an image of) a television Ob2 and (an image of) a sofa Ob3. [2] and [3] of FIG. 5 show examples of the captured image P1(t) and the processed image P2(t) for which (Relation 1) and (Relation 2), introduced in the conceptual description of the processing unit 4, respectively hold.

In [2] of FIG. 5, at the initial time t1, the user, without paying particular attention to the imaging operation U(t1), holds up the imaging unit 2 so that only the television Ob2 of interest is in view. As a result, the captured image P1(t1) covers a region R511 that shows almost nothing but the television Ob2, while, as indicated by the arrow, the enlarged processed image P2(t1) appearing in the preview consists of a region R512 in which the television Ob2 is far too large and is not shown in its entirety.

In [2] of FIG. 5, the region R511 of the captured image P1(t1) covers only the vicinity of the television Ob2, which is just one part of the whole recognition target Ob1 set by the administrator or the like (the whole target whose feature information is computed in advance and stored in the storage unit 32). The region is therefore unsuitable for recognition by the recognition unit 3. At the same time, in the region R512 of the processed image P2(t1), the television Ob2 of interest to the user appears so large that only part of it is visible, so the preview is not in the state the user wants. Thus [2] of FIG. 5 is a state in which (Relation 1) holds.

To obtain the desired preview showing the whole television Ob2, the user therefore moves from the state [2] of FIG. 5 at the initial time t1 to an imaging operation that moves the imaging unit 2 away from the target, and, as shown in [3] of FIG. 5, reaches at a later time t2 an imaging operation state U(t2) in which the desired preview is obtained. Because the imaging unit 2 has been moved away, the processed image P2(t2) shown in the preview consists of a region R522 that properly covers the whole television Ob2 of interest. At the same time, without the user noticing, the captured image P1(t2) covers a region R521 that also includes the sofa Ob3, which lies near the television Ob2 and was not in view at the initial time t1. The captured image thus covers almost the whole catalog page Ob1 that the administrator had in mind (that is, whose feature information was stored in the storage unit 32). Therefore [3] of FIG. 5 is a state in which (Relation 2) holds.

As illustrated by the catalog page Ob1 of FIG. 5, applying the enlargement unit 41 makes highly accurate recognition possible even when, from the standpoint of recognition, it is preferable to image a wide region, yet the user's imaging operation tends to focus on only a part of it such as the television Ob2 or the sofa Ob3. The enlargement unit 41 can be applied in exactly the same way to targets other than catalog pages whenever the region for which reference feature information is computed and the region the user is actually likely to focus on are related in this way.

When catalog pages are the recognition targets (that is, when recognizing catalog pages is the intended use of the information terminal device 1), one or more pages of one or more catalogs are each registered in the storage unit 32 as recognition targets, and the processing unit 4 is preset to apply the processing of the enlargement unit 41. The recognition result then indicates which of the registered pages is being imaged by the imaging unit 2.

The enlargement processing of the present invention can also provide the following additional effect.

In the example of FIG. 5, the object of interest to the user is the television Ob2, whose image consists mostly of flat regions. It is therefore an example from which feature information for recognition (local image features) can be extracted only insufficiently, because there are few feature points to begin with. In the prior art, even if an object such as the television Ob2 is set as an individual recognition target, recognition accuracy cannot be obtained because the feature information itself is lacking. In the present invention, by contrast, even when the user is focusing on an object such as the television Ob2 that is inherently hard to recognize for lack of feature information, the whole page Ob1, including other objects such as the sofa Ob3 that supply additional feature information, can be registered as the recognition target. The recognition result for the television Ob2, which is part of the page, can thus be obtained through the recognition result for the whole page Ob1.

In one embodiment, the processing unit 4 (enlargement unit 41) and the recognition unit 3 cooperate so that the recognition unit 3 obtains, through the recognition result for the whole page Ob1 as described above, the further information that the user is focusing on the television Ob2 within it. The case of the enlargement unit 41 is described below, but this embodiment can be carried out in the same way with the translation unit 43 described later, because there too the processed image P2(t) exists as a partial region of the captured image P1(t) in a predetermined relationship. Specifically, after the collation unit 33 has obtained, as the recognition result R(t), which of the recognition targets stored in the storage unit 32 appears in the captured image P1(t), the following processing is performed.

As the first process, from the feature information of the recognized target stored in the storage unit 32, the items that were matched with feature information of the captured image P1(t) during collation by the collation unit 33 (that is, the feature information actually visible in P1(t)) are extracted, and the region occupied by the captured image P1(t) is estimated from the coordinate distribution of the feature points of the extracted items. The region can be estimated, for example, as a rectangle covering that coordinate distribution. The estimated rectangle thus contains the coordinate distribution of the extracted feature points, and its coordinates are expressed in the feature-point coordinate system of the recognition target's feature information stored in advance in the storage unit 32. Infinitely many rectangles cover the coordinate distribution, so one is chosen by some predetermined criterion; for example, the rectangle of minimum area may be used as the estimate. In addition to minimum area, the condition that the rectangle's shape (aspect ratio) match that of the processed image P2(t) may also be imposed.
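The region estimate of the first process can be sketched as below. This is a hedged illustration under stated assumptions: the function name `bounding_rectangle` is hypothetical, the rectangle is assumed to be axis-aligned (the text only requires a covering rectangle chosen by some criterion), and when an aspect ratio is requested the smallest such rectangle is assumed to be centered on the tight bounding box.

```python
def bounding_rectangle(points, aspect=None):
    """Smallest axis-aligned rectangle covering the matched feature points.

    points: list of (x, y) feature-point coordinates of the registered target
            that were matched against the captured image.
    aspect: optional width/height ratio the rectangle must have.
    Returns (x_min, y_min, x_max, y_max).
    """
    if not points:
        raise ValueError("at least one matched feature point is required")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    if aspect is not None:
        width, height = x_max - x_min, y_max - y_min
        cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2
        # Grow only the side that is too short for the requested ratio,
        # which keeps the area as small as the constraint allows.
        if width < aspect * height:
            width = aspect * height
        else:
            height = width / aspect
        x_min, x_max = cx - width / 2, cx + width / 2
        y_min, y_max = cy - height / 2, cy + height / 2
    return (x_min, y_min, x_max, y_max)
```

Without `aspect`, the result is the tight bounding box; with it, the box is widened or heightened just enough to reach the requested ratio.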

As the second process, the region estimated above is regarded as the whole region of the captured image P1(t), and the relationship by which the processing unit 4 (enlargement unit 41) obtains the processed image P2(t) as a partial region (the positional relationship between P1(t) and P2(t), as in FIG. 4) is applied to it. This yields which partial region of the whole estimated region (the region of P1(t)) is the region the user is focusing on as the processed image P2(t). The identified partial region is therefore also expressed in the feature-point coordinate system of the recognition target's feature information stored in advance in the storage unit 32.
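For the enlargement case, the second process amounts to applying the same centered ratio r to the estimated region. The following is a minimal sketch under that assumption; `attention_region` is a hypothetical name, and the rectangle format (x_min, y_min, x_max, y_max) is chosen here for illustration only.

```python
def attention_region(estimated, r):
    """Sub-region of the estimated P1(t) region that corresponds to P2(t).

    estimated: (x_min, y_min, x_max, y_max) in the registered target's
               feature-point coordinates.
    r: the enlargement ratio (0 < r < 1) used to crop P2(t) out of P1(t).
    """
    if not 0 < r < 1:
        raise ValueError("r must satisfy 0 < r < 1")
    x_min, y_min, x_max, y_max = estimated
    cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2
    half_w = r * (x_max - x_min) / 2
    half_h = r * (y_max - y_min) / 2
    # Same center as the estimated region, sides scaled by r (as in FIG. 4).
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)
```

For the translation unit 43 the sub-region would be offset rather than centered, so a different mapping would be needed there.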

In the embodiment above, information on the region of the processed image P2(t) can be obtained as the user's region of interest even when that region does not contain sufficient feature information. In another embodiment, the region of the processed image P2(t) is assumed to contain sufficient feature information. After collation by the collation unit 33, the feature information of the captured image P1(t) that lies within the region of the processed image P2(t) is taken, the feature information of the recognized target in the storage unit 32 that was matched with it is identified, and a region covering the coordinate distribution of that identified feature information is estimated as the region of the processed image P2(t). The region covering the coordinate distribution can be determined in the same way as in the embodiment above.

FIGS. 6 and 7 are diagrams for explaining the reduction processing performed by the reduction unit 42 and its effect.

As shown in FIG. 6, the captured image P1(t) obtained by the imaging unit 2 has a size of 2N_x × 2N_y, the xy coordinate system is taken with the image center O as the origin, and the four vertices are (±N_x, ±N_y) (any combination of signs). That is, the captured image P1(t) and the processed image P2(t) are described using exactly the same coordinate system as in FIG. 4, and the size and other properties of the captured image P1(t) are the same as in FIG. 4.

As indicated by the four dotted lines in FIG. 6, the reduction unit 42 obtains the processed image P2(t) by scaling the captured image P1(t) by a predetermined ratio r (0 < r < 1). The processed image P2(t) thus has a size of 2rN_x × 2rN_y, is centered on the origin O, and has four vertices (±rN_x, ±rN_y) (any combination of signs). The reduction ratio is denoted by the same letter r as in FIG. 4 for the enlargement processing, but the value of r used for enlargement and that used for reduction need not be the same. Unlike in the enlargement processing, the processed image P2(t) obtained by reduction shows the entire range that appeared in the original captured image P1(t); it is the captured image P1(t) with its pixels uniformly thinned out so that their number along each side is reduced by the factor r.
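The uniform thinning of pixels can be sketched as follows. This is an assumption-laden illustration rather than the patented implementation: `reduce_image` is a hypothetical name, the image is a plain list of pixel rows, and simple nearest-neighbor decimation is assumed since the text only says that pixels are thinned out uniformly.

```python
def reduce_image(image, r):
    """Shrink `image` so that each side has about r times as many pixels.

    The whole field of view is kept; pixels are sampled at regular
    intervals (nearest-neighbor decimation, an assumed method).
    """
    if not 0 < r < 1:
        raise ValueError("r must satisfy 0 < r < 1")
    in_h, in_w = len(image), len(image[0])
    out_h = max(1, round(in_h * r))
    out_w = max(1, round(in_w * r))
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]
```

In contrast to cropping, every part of the captured scene still appears in the output, only smaller.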

The effect of the reduction processing is, in outline, as follows. When the user points the imaging unit 2 at the recognition target, the user expects the captured image P1(t) to be previewed on the display unit 5. When the reduced processed image P2(t) is displayed instead, the user does not suspect that it has been processed and mistakenly believes that the imaging unit 2 is too far from the target. The user then naturally moves the imaging unit 2 closer to the target, so that the captured image comes to contain more features and recognition accuracy improves.

A specific example of the effect of the reduction processing outlined above is described with reference to FIG. 7. In the example of FIG. 7, the imaging target Ob4 shown in [1] is, for example, a poster which, because of where it is posted or for similar reasons, users tend to image from a distance. As shown in [2], at the initial time t1 the user photographs the poster Ob4 from far away and the captured image P1(t1) occupies a region R711; the processed image P2(t1) shown in the preview is then reduced and consists of a small region R712. Seeing the region R712 in the preview, the user feels that the poster is small and far away, and after time t1 performs an imaging operation U(t) that brings the imaging unit 2 closer to the poster Ob4.

Thus, as shown in [3] of FIG. 7, at the subsequent time t2 the captured image P1(t2) is obtained as occupying the range R721, which is closer to the poster Ob4, and the processed image P2(t2) of the preview occupies the range R722, a reduced version of the range R721. Since the poster Ob4 no longer appears too small, the user no longer perceives it as distant.

The examples [2] and [3] of FIG. 7 also satisfy (Relation 1) and (Relation 2) described above. In [2], the poster Ob4 to be recognized appears in the captured image P1(t1), formed as the range R711, much smaller than it did when the administrator or the like registered it in advance. Some of the pre-registered feature points therefore cannot be calculated because of insufficient resolution, and the poster is captured in a state unsuitable for recognition processing. In addition, in the processed image P2(t1), formed as the range R712, the poster Ob4 is too small, so the preview is not what the user desires. Accordingly, [2] corresponds to (Relation 1), and [3], in which this situation has been resolved, corresponds to (Relation 2).
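The reduction described above can be sketched in a few lines. This is a minimal illustration only, assuming the reduced image is centered on an otherwise blank preview of the original size and that nearest-neighbour sampling is acceptable. The function name `reduce_preview`, the linear ratio `r`, and the `fill` value are hypothetical and do not appear in the specification.

```python
def reduce_preview(captured, r, fill=0):
    """Shrink `captured` (a list of pixel rows) by the linear ratio r (0 < r <= 1)
    with nearest-neighbour sampling and centre it on a canvas of the original size.
    Assumption: the area around the reduced image is filled with `fill`."""
    if not 0 < r <= 1:
        raise ValueError("r must satisfy 0 < r <= 1")
    height, width = len(captured), len(captured[0])
    small_h = max(1, round(height * r))
    small_w = max(1, round(width * r))
    top, left = (height - small_h) // 2, (width - small_w) // 2
    canvas = [[fill] * width for _ in range(height)]
    for y in range(small_h):
        src_y = min(height - 1, int(y / r))      # nearest source row
        for x in range(small_w):
            src_x = min(width - 1, int(x / r))   # nearest source column
            canvas[top + y][left + x] = captured[src_y][src_x]
    return canvas


# Hypothetical 8x6 captured image P1(t) with distinct pixel values.
captured = [[100 + 10 * y + x for x in range(8)] for y in range(6)]
preview = reduce_preview(captured, 0.5)          # processed image P2(t)
```

With r = 0.5 the target occupies only a quarter of the preview area, which is the cue that would lead the user to move closer.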

FIGS. 8 and 9 are diagrams for explaining the translation processing performed as the processing by the translation unit 43, and its effect.

As shown in FIG. 8, the captured image P1(t) obtained by the imaging unit 2 has a size of 2N_x × 2N_y, the coordinate system xy is taken so that the image center O is the origin, and the four vertices are (±N_x, ±N_y) (any combination of signs). That is, the captured image P1(t) and the processed image P2(t) are described using exactly the same coordinate system as in FIG. 4, and the size of the captured image P1(t) and the like are the same as in FIG. 4.

As FIG. 8 shows for the translated processed image P2(t), the translation processing moves the center C of the processed image P2(t) relative to the center (origin O) of the captured image P1(t). That is, the translated processed image P2(t) is obtained by extracting, as a rectangular part of the captured image P1(t), a region whose center C is displaced from the origin O.

The translation processing can therefore also be regarded as the enlargement processing described with reference to FIG. 4 (processing that limits the processed image P2(t) to a region occupying the ratio r of the captured image P1(t)) with a shift of the center position added. As in the enlargement processing, the region P1(t) \ P2(t) is captured in the captured image P1(t) and thus contributes to recognition accuracy, but it is excluded from the processed image P2(t) and is therefore invisible to the user viewing the preview on the display unit 5.

As described later with reference to FIG. 9, displaying the processed image P2(t) as the preview corrects the user's imaging operation U(t) by motivating the user to move the imaging unit 2 in the direction of the vector V shown in FIG. 8 in order to obtain a preview that the user finds appropriate. As shown in FIG. 8, the vector V points from the center C of the processed image P2(t) to the center O of the captured image P1(t).

The effect of processing by translation (movement of the image center position) is outlined as follows. Suppose that the imaging unit 2 is mounted away from the center of the display unit 5 and that the user holds the display unit 5 directly in front of the imaging target. The user expects the captured image P1(t) taken by the imaging unit 2 to be previewed on the display unit 5. When the display unit 5 instead shows the processed image P2(t), translated in the direction that emphasizes the displacement of the imaging target (the direction of the vector V in FIG. 8), the user does not suspect that the image has been processed and mistakenly concludes that the imaging unit 2 is far off the imaging target. The user therefore naturally moves the device so that the imaging unit 2 squarely faces the imaging target (moves it in the direction of the vector V in FIG. 8), so that the captured image P1(t) contains more features and recognition accuracy improves. For example, when the imaging unit 2 of a smartphone or the like is mounted at the right edge of the back of the display unit 5 and the display unit 5 itself is held in front of the imaging target, the imaging target appears toward the left in the captured image, so it is desirable to translate the image so as to emphasize the displacement to the right.

A specific example of the effect of the translation processing outlined above is described with reference to FIG. 9. In FIG. 9, as shown in [1], the imaging target is the entire poster Ob1, the same as in [1] of FIG. 4, and it contains the television Ob2 and the sofa Ob3 as parts. As in the example of FIG. 4, the user in FIG. 9 is interested in the television Ob2. Although the entire poster Ob1 should be imaged from the viewpoint of recognition accuracy, the user tends to image only the television Ob2.

Accordingly, as shown in [2] of FIG. 9, at the initial time t1 the user performs imaging so that the captured image P1(t1) occupies the range R911, in which almost nothing but the television Ob2 is captured. The translation processing shown in FIG. 8 (a translation toward the lower left from O to C, opposite to the vector V) is then applied, and the processed image P2(t1) used as the preview occupies the region R912. In this region the television Ob2 is shifted toward the upper right (opposite to the lower-left translation) and protrudes beyond the image.

After time t1, the user therefore moves the imaging unit 2 toward the upper right to correct the preview that protrudes at the upper right. As a result, at time t2, as shown in [3] of FIG. 9, the captured image P1(t2) captures the television Ob2 at the lower left and the sofa Ob3 at the upper right, and occupies the region R921, which covers almost the entire poster Ob1. At this time the processed image P2(t2) shown as the preview occupies the region R922, with the television Ob2 at its center as the user desires.

The examples [2] and [3] of FIG. 9 thus satisfy (Relation 1) and (Relation 2), respectively, in the same way as described for [2] and [3] of FIG. 4, which concern the same target, the poster Ob1. As is clear from the examples [2] and [3] of FIG. 9, the direction of the translation processing (the direction opposite to the vector V in FIG. 8) is preferably set to the same direction in which the user is expected to skew the imaging.
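The geometry of the translation processing can be sketched as follows, using the image-centered coordinate system of FIG. 8. This is a minimal sketch under stated assumptions: the function name `translated_crop` and its parameters are hypothetical, the crop is taken to have the linear ratio r of the captured image, and the example values (N_x = 400, N_y = 300, C = (-150, -100)) are illustrative only.

```python
def translated_crop(n_x, n_y, r, center):
    """Return (bounds, v) for a crop of linear ratio r centred at C = `center`.

    bounds = (x_min, x_max, y_min, y_max) in the xy system whose origin is the
             centre O of the 2*n_x by 2*n_y captured image P1(t)
    v      = vector from the crop centre C to the image centre O
    """
    c_x, c_y = center
    half_w, half_h = r * n_x, r * n_y
    bounds = (c_x - half_w, c_x + half_w, c_y - half_h, c_y + half_h)
    # The processed image P2(t) must be a part of the captured image P1(t).
    if bounds[0] < -n_x or bounds[1] > n_x or bounds[2] < -n_y or bounds[3] > n_y:
        raise ValueError("crop must lie inside the captured image")
    v = (-c_x, -c_y)  # O - C
    return bounds, v


# Crop centre C placed to the lower left of O, as in the FIG. 9 example.
bounds, v = translated_crop(n_x=400, n_y=300, r=0.5, center=(-150, -100))
```

In this example V points toward the upper right, which matches the direction in which the user of FIG. 9 moves the imaging unit 2.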

The lightness conversion unit 44 performs processing that, by lightness conversion, emphasizes or reduces the gradation of the processed image P2(t) relative to the gradation of the captured image P1(t). Unlike the units 41 to 43 described above, the lightness conversion unit 44 does not need to change the region that the processed image P2(t) occupies relative to the captured image P1(t), or its size. If the captured image P1(t) has a size of 2N_x × 2N_y, the lightness-converted processed image may have the same size, 2N_x × 2N_y.

Obtaining the processed image P2(t) with the lightness conversion unit 44 provides the following effect. Suppose that the gradation of the captured image P1(t) is impaired by specular reflection, too much or too little light, or the like, and that feature points in the affected region are lost. When the user holds the imaging unit 2 over the imaging target, the user expects the captured image P1(t) taken by the imaging unit 2 to be previewed on the display unit 5. When the display unit 5 instead shows the processed image P2(t), lightness-converted so as to emphasize the gradation, the user does not suspect that the image has been processed and mistakenly concludes that the amount of light or the light source is inappropriate. The user therefore naturally corrects the imaging operation U(t) by moving so as to correct the amount of light or the light source, so that the captured image P1(t) contains more features and recognition accuracy improves.

When blown-out highlights or the like are expected because the light source is too strong, the lightness conversion unit 44 preferably performs the lightness conversion with a predetermined parameter so as to emphasize the gradation (that is, so that the blown-out highlights are emphasized further). Conversely, when the image is expected to be indistinct because the light source is weak and the scene is dim, the lightness conversion is preferably performed so as to reduce the gradation (that is, so that the image becomes even more indistinct). In either case (although the causes of the impairment are opposite), when imaging is performed in an environment in which the gradation is impaired, making the user perceive the impairment more strongly prompts the user to redo the imaging so that the gradation is not impaired.

As described above, (Relation 1) and (Relation 2) also hold for lightness conversion, before and after the user's imaging operation is corrected.
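One way to picture the lightness conversion is a linear tone curve about a mid-gray pivot. This is a sketch only: the specification speaks of a "predetermined parameter" without fixing the conversion, so the linear form, the function name `convert_lightness`, and the parameters `gain` and `pivot` are all assumptions made for illustration, for 8-bit grayscale values.

```python
def convert_lightness(pixels, gain, pivot=128):
    """Apply a linear tone curve to 8-bit grayscale rows.

    gain > 1 emphasizes gradation (bright areas clip to white sooner),
    gain < 1 reduces gradation (the image becomes flatter and dimmer in contrast).
    Both `gain` and `pivot` are hypothetical parameters.
    """
    return [
        [max(0, min(255, round(pivot + gain * (p - pivot)))) for p in row]
        for row in pixels
    ]


row = [[0, 128, 200, 254]]                 # hypothetical pixels of P1(t)
emphasized = convert_lightness(row, 2.0)   # preview for an overly bright scene
flattened = convert_lightness(row, 0.5)    # preview for a dim scene
```

With gain 2.0 the bright pixel 200 already clips to white, which exaggerates blown-out highlights in the preview. With gain 0.5 the values crowd toward mid-gray.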

As described above, according to the present invention, the imaging target can be recognized with high accuracy by imaging it with the imaging unit 2. In particular, using the processed image P2(t) rather than the captured image P1(t) for the preview shown to the user prompts the user to spontaneously perform imaging with the imaging unit 2 in a state in which a captured image P1(t) suitable for recognition processing is acquired, which makes highly accurate recognition possible.

Supplementary matters, including other embodiments of the present invention, are described below.

(1) The parameters of the processing (for example, the enlargement ratio 1/r in the enlargement processing of FIG. 4) may be values set in advance by an administrator or the like. For the administrator or the like to determine in advance concrete parameters suited to the actual environment in which imaging takes place, the following approach may be used, for example.

Specifically, the number of feature points, the dimensions, and the type of the recognition targets registered in the storage unit 32 can be used. When the number of feature points is used, the magnification of enlargement or reduction is desirably made inversely proportional to the number of feature points. For example, the fewer the feature points, the more the imaging target is enlarged. When the dimensions of the imaging target are used, the magnification is desirably made proportional to the dimensions. For example, a small imaging target is reduced so as to guide the user toward capturing it large in the captured image, and a large imaging target is enlarged so as to guide the user toward capturing the whole of it. When the type of the imaging target is used, the magnification is desirably made proportional to the assumed imaging distance. For example, it is desirable to reduce an imaging target that is expected to be imaged from a long distance (something fixed in place that cannot be moved, such as a poster) and to enlarge an imaging target that is expected to be imaged from a short distance (something that can be picked up and imaged, such as a catalog).
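The first two rules above (magnification inversely proportional to the number of feature points, and proportional to the dimensions) can be sketched as simple functions. This is an illustration under assumptions, not the method of the specification: the function names, the reference values (500 feature points, 300 mm), and the convention that a magnification above 1 means enlargement and below 1 means reduction are all hypothetical.

```python
def magnification_from_feature_count(count, reference_count=500, base=1.0):
    """Inversely proportional rule: fewer registered feature points give a
    larger magnification. `reference_count` is a hypothetical count at which
    the magnification equals `base`."""
    if count <= 0:
        raise ValueError("count must be positive")
    return base * reference_count / count


def magnification_from_size(size_mm, reference_size_mm=300.0, base=1.0):
    """Proportional rule: small targets are reduced (value below 1) and large
    targets are enlarged (value above 1). `reference_size_mm` is hypothetical."""
    if size_mm <= 0:
        raise ValueError("size_mm must be positive")
    return base * size_mm / reference_size_mm


sparse = magnification_from_feature_count(250)   # few feature points
dense = magnification_from_feature_count(1000)   # many feature points
small = magnification_from_size(150.0)           # small target
large = magnification_from_size(600.0)           # large target
```

In practice an administrator would choose the reference values to suit the registered targets and the display, and might clamp the result to a usable range.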

The number of feature points, the dimensions, and the type described above are assumed to be the same or nearly the same for every one of the recognition targets stored in the storage unit 32, so that typical values of the number of feature points, the dimensions, and the type can be used in a fixed manner. For example, it is preferable not to register in the storage unit 32, at the same time, a group of posters expected to be photographed from far away and a group of catalog pages expected to be photographed from close up.

Similarly, the targets stored in the storage unit 32 for recognition by the recognition unit 3 may be grouped in advance according to what they have in common in the way they are imaged. When the user performs imaging and recognition, the group to be recognized is specified beforehand, and common processing suited to that group is performed. In the example above, the information terminal device 1 is told in advance, as prior knowledge, whether recognition is to be performed on the poster group or on the catalog page group, and predetermined processing suited to that group is performed.

Alternatively, the recognition unit 3 may generate the processing instruction I(t) described above so that the parameters of the image conversion processing change according to the tendencies of the user's way of imaging. Specifically, it is desirable to compare the original recognition target expected to be recognized by the recognition unit 3 with the imaging target contained in the captured image after recognition, obtain parameters that reflect the user's imaging tendency at the initial time t1, and set the parameters so that the difference is corrected in subsequent imaging.
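The adaptation to a user's imaging tendency could be sketched as a running update of one parameter. The specification does not state an update rule, so everything here is an assumption for illustration: the parameter is taken to be the crop ratio r of the enlargement processing, `observed_coverage` is taken to be the linear fraction of the registered target that appeared in the captured image at the initial time t1, and the exponential smoothing with factor `alpha` is a swapped-in, generic technique.

```python
def update_crop_ratio(current_r, observed_coverage, alpha=0.3):
    """Blend the crop ratio toward the coverage the user produced at time t1.

    Hypothetical rationale: if the user habitually frames only a fraction c of
    the target, a preview cropped to about r = c looks as desired only once the
    captured image covers the whole target.
    """
    if not 0 < observed_coverage <= 1:
        raise ValueError("observed_coverage must satisfy 0 < c <= 1")
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must satisfy 0 <= alpha <= 1")
    return (1 - alpha) * current_r + alpha * observed_coverage


r = 0.8                                   # hypothetical initial crop ratio
for coverage in (0.4, 0.4, 0.4):          # coverage measured in three sessions
    r = update_crop_ratio(r, coverage, alpha=0.5)
```

After three sessions r has moved from 0.8 most of the way toward the observed coverage of 0.4. A smaller alpha would adapt more slowly and be less sensitive to a single unusual session.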

(2) The hardware configuration of an ordinary computer can be adopted as the hardware configuration with which the information terminal device 1 realizes its units (the units described with reference to FIGS. 1, 2, and 3).

That is, the hardware configuration of the information terminal device 1 that realizes the units of FIGS. 1 to 3 may be that of a mobile terminal such as a smartphone or tablet, or that of a desktop, laptop, or other general computer. Specifically, a general computer hardware configuration can be adopted that includes a CPU (central processing unit), a temporary storage device that provides a work area for the CPU, a secondary storage device that stores data such as programs, various input/output devices, and a bus that carries data between them. The units of FIGS. 1 to 3 are realized by the CPU reading and executing a program stored in the secondary storage device. The present invention can also be provided as such a program. The input/output devices can be chosen according to the required functions from among a camera for acquiring images, a display, a touch panel or keyboard for receiving user input, a microphone and speaker for audio input and output, a communication interface for wired or wireless communication with the outside, and the like.

(3) The recognition unit 3 has been described for the case in which the element configuration of FIG. 2 performs recognition based on local image features. Other well-known kinds of recognition, such as character recognition, code recognition for QR Code (registered trademark) and the like, and template matching, can be performed with the same element configuration as in FIG. 2. That is, the feature information F(t) that the calculation unit 31 obtains from the captured image P1(t) is replaced with the information used in the recognition processing in question. Likewise, the feature information stored in the storage unit 32 is replaced with the information used in that recognition processing, and the collation processing in the collation unit 33 is replaced with the collation processing used in that recognition processing.

(4) The processing of each unit in FIG. 1 and the like has been described as being performed at each time t. The times t may be any times at which real-time processing is performed at a predetermined rate. Alternatively, all or part of the processing of the units in FIG. 1 and the like may be performed only at a time t specified by the user. For example, the processing of the recognition unit 3 may be performed at, or after, the point at which the user judges that an appropriate preview has been obtained and indicates this to the information terminal device 1.

(5) The above description assumed that the imaging size of the imaging unit 2 (the number of imaging elements in the vertical and horizontal arrays) and the display size of the display unit 5 (the number of display elements in the vertical and horizontal arrays) are the same, as described with reference to FIGS. 4, 6, 8, and the like in terms of sizes such as N_x × N_y. When the imaging size of the imaging unit 2 differs from the display size of the display unit 5, the present invention may be applied by converting the resolution of the captured image P1(t) obtained by the imaging unit 2 to match the display size of the display unit 5 and treating the result as the "captured image P1(t)" of the above description.

That is, a resolution conversion unit (a functional unit not shown) may be provided between the imaging unit 2 and the processing unit 4 in FIG. 1, and the image that this unit has matched to the size of the display unit 5 may be used as the input to the processing of the processing unit 4 described above. If the information terminal device 1 already applies such a resolution conversion unit as standard when using its preview function (independently of the present invention) because the imaging size and the display size differ, then, when the present invention is applied, the processing of the processing unit 4 may be applied to the captured image P1(t) whose resolution has already been converted by that standard function.

For example, when the imaging size of the imaging unit 2 is 800×600 and the display size of the display unit 5 is 400×300, the embodiment in which the reduction unit 42 reduces the image to 1/2 may be applied as follows. The captured image P1(t) obtained at 800×600 is first converted to half resolution, 400×300, to match the display size of the display unit 5. This image is the input to the reduction processing of the reduction unit 42, which produces a 200×150 image as the processed image P2(t).

Conversely, with the same imaging size of 800×600 and display size of 400×300, consider the embodiment in which the enlargement unit 41 displays only a 1/2 partial region (the partial region P2(t) for r = 1/2 in FIG. 4). The 1/2 partial region cut out of the 800×600 image is 400×300 and matches the display size, so the processing of the enlargement unit 41 can be applied without any resolution conversion.
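The size arithmetic of these two examples can be checked with a short sketch. The helper name `scale_size` is hypothetical; the sizes 800×600 and 400×300 and the factor 1/2 are the ones given in the text.

```python
def scale_size(size, factor):
    """Scale a (width, height) pair by a linear factor, rounding to pixels."""
    width, height = size
    return (round(width * factor), round(height * factor))


capture_size = (800, 600)   # imaging size of the imaging unit 2
display_size = (400, 300)   # display size of the display unit 5

# Reduction embodiment: resolution conversion first, then reduction to 1/2.
to_display = display_size[0] / capture_size[0]
converted = scale_size(capture_size, to_display)   # matches the display size
reduced = scale_size(converted, 0.5)               # processed image P2(t)

# Enlargement embodiment: a partial region with r = 1/2 cut from the capture.
cropped = scale_size(capture_size, 0.5)
needs_conversion = cropped != display_size
```

Here `needs_conversion` is false, which corresponds to the statement that the enlargement unit 41 can be applied without resolution conversion in this case.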

(7) As another embodiment of the enlargement unit 41, instead of cutting out only a part of the captured image P1(t) as the processed image P2(t) as described with reference to FIG. 4, the entire captured image P1(t) may be enlarged as it is by a predetermined magnification to form the processed image P2(t). This embodiment corresponds to applying the reduction unit 42 described with reference to FIG. 6 with a magnification r such that r > 1. As in the embodiment of the enlargement unit 41 described above, the effect is that the user mistakenly perceives the captured image P1(t) as too large, that is, as taken too close to the imaging target, and is prompted to image the target from farther away.

When this other embodiment is applied in the enlargement unit 41, it is preferable that the imaging size be smaller than the display size and that the standard preview place the captured image in only a part of the full display area without applying resolution conversion. The processing of the enlargement unit 41 then places the captured image, which occupied only that part, at a larger size, which prompts the user to act as described above.

(8) As another embodiment of the reduction unit 42, instead of reducing the captured image P1(t) to obtain the processed image P2(t) as described with reference to FIG. 6, the processed image P2(t) may be obtained by applying pincushion distortion with a predetermined parameter to the captured image P1(t) without any reduction. For example, contracting the image toward its center by a predetermined ratio makes the imaging user mistakenly perceive the imaging target as far away, in much the same way as in the reduction embodiment. The reduction processing and the pincushion distortion may also be combined.

1 ... information terminal device, 2 ... imaging unit, 3 ... recognition unit, 4 ... processing unit, 5 ... display unit

Claims

An imaging unit that captures an imaging target and obtains a captured image;
A recognition unit that analyzes the captured image and recognizes the imaging target;
A processing unit that performs a predetermined processing on the captured image to obtain a processed image;
A display unit that displays the processed image as a preview,
The predetermined processing process is a processing process determined in advance according to a candidate for an imaging target and a candidate for how to perform the imaging,
When the imaging unit images the imaging target and the obtained captured image is in a state inappropriate for recognition by the recognition unit, the obtained processed image is in a state inappropriate as a preview on the display unit, and
When the imaging unit images the imaging target and the obtained captured image is in a state suitable for recognition by the recognition unit, the obtained processed image is in a state suitable as a preview on the display unit. An information terminal device characterized by the above.

The predetermined processing process is a process in which a part of the captured image is extracted as a processed image;
The state inappropriate for recognition is an inappropriate state due to the imaging unit being too close to the recognition target, and the state inappropriate as a preview in the display unit is that the recognition target is too large. The information terminal device according to claim 1, wherein the information terminal device is in an inappropriate state.

The predetermined processing is an enlargement process,
The state inappropriate for recognition is an inappropriate state due to the imaging unit being too close to the recognition target, and the state inappropriate as a preview in the display unit is that the recognition target is too large. The information terminal device according to claim 1, wherein the information terminal device is in an inappropriate state.

The predetermined processing is a reduction process;
The state inappropriate for recognition is an inappropriate state due to the imaging unit being too far away from the recognition target, and the state inappropriate as a preview in the display unit is due to the recognition target being too small. The information terminal device according to claim 1, wherein the information terminal device is in an inappropriate state.

The predetermined processing is a process of adding a pincushion type distortion,
The state inappropriate for recognition is an inappropriate state due to the imaging unit being too far away from the recognition target, and the state inappropriate as a preview in the display unit is due to the recognition target being too small. The information terminal device according to claim 1, wherein the information terminal device is in an inappropriate state.

The predetermined processing is processing for moving the center position of the processed image so as to be different from the center position of the captured image;
The state inappropriate for recognition is a state that is inappropriate because the imaging unit images the recognition target off-center, and the state inappropriate as a preview on the display unit is a state that is inappropriate because the recognition target is shifted from the center. The information terminal device according to claim 1.

The predetermined processing is processing for enhancing or reducing the gradation of the processed image more than the gradation of the captured image by brightness conversion,
The state inappropriate for recognition is a state that is inappropriate because the imaging unit images the recognition target with its gradation impaired, and the state inappropriate as a preview on the display unit is a state that is inappropriate because the gradation of the recognition target is impaired. The information terminal device according to claim 1.

The recognition unit calculates local image features from the captured image, compares them with reference local image features calculated in advance for each of a plurality of predetermined recognition targets, and recognizes the recognition target determined to be similar as corresponding to the imaging target in the captured image. The information terminal device according to claim 1.

The recognition unit calculates local image features from the captured image, compares them with reference local image features calculated in advance for each of a plurality of predetermined recognition targets, and recognizes the recognition target determined to be similar as corresponding to the imaging target in the captured image,
The area of the captured image and the area of the processed image serving as the part thereof are estimated based on the distribution of the feature point coordinates of the reference local image features of the recognition target determined to be similar and on the correspondence obtained with the local image features in the captured image. The information terminal device according to claim 2.

A program for causing a computer to function as the information terminal device according to any one of claims 1 to 9.