JP2021077283A

JP2021077283A - Information processing apparatus, information processing method, and program

Info

Publication number: JP2021077283A
Application number: JP2019205645A
Authority: JP
Inventors: 雄弘和田; Takehiro Wada
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-11-13
Filing date: 2019-11-13
Publication date: 2021-05-20

Abstract

To continue tracking with little deviation when tracking video images, while reducing the arithmetic load associated with matching processing.

SOLUTION: An information processing apparatus has: a feature point extraction unit that extracts feature points and feature quantities from a whole image of a subject and from captured images obtained by continuously photographing the subject; a feature point comparison processing unit that performs feature point comparison processing between the whole image and the captured images; a feature point tracking processing unit that performs feature point tracking processing on the feature points; a tracking processing unit that performs tracking processing on a predetermined area of the image by using a transformation matrix determined from the result of the feature point comparison processing or the feature point tracking processing; and a tracking error calculation unit that, when an image on which the tracking processing is not performed occurs, calculates and stores the resulting tracking error. When the error stored by the tracking error calculation unit exceeds a threshold, the information processing apparatus performs the feature point comparison processing and resets the stored error.

SELECTED DRAWING: Figure 3

Description

The present invention relates to an information processing apparatus, an information processing method, and a program.

Camera-equipped portable terminal devices (mobile terminals) have become commonplace. Users previously relied on scanners and similar devices to capture paper documents electronically, but the camera of a mobile terminal now makes this easy. Patent Document 1 proposes a technique for recognizing and tracking the display area and coordinates of an electronic document captured with a camera.

JP-A-2009-20890 (Japanese Unexamined Patent Application Publication No. 2009-20890)

Patent Document 1 describes a method of recognizing and tracking the display area and location of an electronic document by using invisible junction feature quantities. According to Patent Document 1, once the display area and location have been identified from the invisible junction feature quantities, feature points are tracked through the video frames, and the planar motion (projective transformation) between video frames is then estimated. However, Patent Document 1 says nothing about how to handle the error that accumulates as image tracking continues. Matching processing could be performed frequently to suppress the deviation caused by accumulated tracking error, but this raises the computational load of matching and makes high-speed processing difficult. An object of the present invention is therefore to make it possible, when tracking moving images, to continue tracking with little deviation while reducing the computational load of matching processing.

An information processing apparatus according to the present invention comprises: extraction means for extracting feature points and feature quantities of the feature points from a whole image of a subject and from captured images obtained by continuously photographing the subject; comparison processing means for comparing the feature quantities in a first image, extracted by the extraction means, with the feature quantities in a second image different from the first image to obtain combinations of matching feature points; feature tracking processing means for obtaining movement vectors, between images, of the feature points extracted by the extraction means; tracking processing means for performing tracking processing on a predetermined region of an image by using a transformation matrix obtained from the processing result of the comparison processing means or the feature tracking processing means; error calculation means for calculating and accumulating the tracking error that arises when an image on which the tracking processing means does not perform tracking processing occurs; and control means for, when the error accumulated by the error calculation means exceeds a threshold, causing the comparison processing means to execute its processing and resetting the error accumulated by the error calculation means.

According to the present invention, tracking with little deviation can be continued while the computational load of matching processing is reduced.

FIG. 1 is a diagram showing an example of the appearance of a mobile terminal.
FIG. 2 is a diagram showing a hardware configuration example of the mobile terminal.
FIG. 3 is a diagram showing a software configuration example of the mobile terminal.
FIG. 4 is a diagram showing an example of the UI of a mobile application.
FIG. 5 is a diagram showing an example of a whole image and a captured image.
FIG. 6 is a diagram explaining a data input area.
FIG. 7 is a diagram explaining tracking processing.
FIG. 8 is a flowchart showing an example of tracking processing.
FIG. 9 is a flowchart showing an example of processing for creating and updating a first transformation matrix and a second transformation matrix.
FIG. 10 is a diagram explaining tracking processing.

Embodiments of the present invention are described below with reference to the drawings. The embodiments described below do not limit the present invention, and not all of the configurations described in the embodiments are essential to the means for solving the problem addressed by the present invention.

Consider a configuration that acquires and uses the result of character recognition processing (OCR) on a local region of an image of a paper document captured with a camera. If the position coordinates of the region containing the information to be acquired (the data input area) are known, as with a form in a known format, the region to be processed by OCR can be identified, so it suffices to run OCR on that region and obtain the OCR result.

When the input from the camera is a moving image, comparing the feature points of the images and their feature quantities yields the combinations of feature points that match between images (matching). Feature point comparison is an accurate matching technique, but it generally has a high computational load and is slow. For reasonably fast moving-image processing at around 30 FPS (frames per second), a fast processing unit or the like is therefore needed to finish matching within one video frame interval.

On a slow mobile terminal, for example, matching processing cannot be performed frequently. The frequency of matching can be lowered by performing tracking processing, which estimates where the feature points have moved to in the image being compared. Tracking processing uses optical flow or a similar method to estimate movement vectors that describe how far the feature points have moved in the image being compared. This makes it possible to estimate where a feature point on the original image has moved to in the image being compared. Feature point tracking is faster than feature point comparison.

However, if error accumulated through continued image tracking shifts the position coordinates of the data input area, accurate OCR results become difficult to obtain. In the present embodiment, therefore, when tracking is continued using the tracking information between the images of a moving image, matching processing is performed at appropriate times, so that tracking with little deviation can be continued while the computational load of matching is reduced.

A mobile terminal is described below as an example of the information processing apparatus according to the present embodiment. A mobile terminal is an example of a portable terminal, and is a terminal that can be used anywhere because it is equipped with a wireless communication function or the like.

FIG. 1 is a diagram showing an example of the appearance of a mobile terminal. The mobile terminal 100 includes various units 101 to 104 and the like. The front side of the mobile terminal 100 is the mobile terminal front portion 101. The touch panel 102 is an example of a display unit such as a display, and has two functions: output (display) and input. The back side of the mobile terminal 100 is the mobile terminal back portion 103. The mobile terminal back portion 103 has a camera 104 for capturing images. In the present embodiment, the user of the mobile terminal 100 can start processing by capturing an image of the subject 105 with the camera 104 through a mobile application described later. The subject 105 is, for example, an A4-size paper document such as an order form. The subject 105 is not limited to paper documents and may be a business card, photograph, card, or the like of various sizes. The mobile application described later can capture an image of the subject 105 with the camera 104 and output the image to the touch panel 102.

<Hardware configuration>
Next, the hardware configuration of the mobile terminal 100 is described. FIG. 2 is a diagram showing a hardware configuration example of the mobile terminal 100. The mobile terminal 100 includes a CPU 201, a RAM 202, a ROM 203, an input/output interface 204, a NIC 205, and a camera unit 206. These components are communicably connected via a bus 207 so that they can transmit and receive data.

The CPU (Central Processing Unit) 201 executes various programs and realizes various functions. The RAM (Random Access Memory) 202 stores various types of information and is also used as a temporary working storage area for the CPU 201. The ROM (Read Only Memory) 203 stores various programs and the like.

For example, the CPU 201 loads a program stored in the ROM 203 into the RAM 202 and executes it. The CPU 201 also executes processing based on programs stored in an external storage device such as a flash memory, an HDD (Hard Disk Drive), or an SSD (Solid State Drive). This realizes the software configuration of the mobile terminal 100 shown in FIG. 3 and each process in the flowcharts shown in FIG. 8 and elsewhere. All or part of the functions of the mobile terminal 100 and of the processing described later may instead be realized with dedicated hardware.

The input/output interface 204 exchanges data with the touch panel 102. The NIC (Network Interface Card) 205 is an interface for connecting the mobile terminal 100 to a network (not shown). The camera unit 206 is connected to the camera 104 and captures images of the subject 105 into the mobile terminal 100.

<Software configuration>
Next, the software configuration of the mobile terminal 100 is described. FIG. 3 is a diagram showing a software configuration example of the mobile terminal 100. The mobile terminal 100 has a data management unit 301 and a mobile application 302. The programs that realize the functions of the software (application) shown in FIG. 3 are stored in the ROM 203 or the like.

The data management unit 301 manages images and application data. The OS provides a control API (Application Programming Interface) for using the data management unit 301. The mobile application 302 uses this control API to acquire and save the images and application data managed by the data management unit 301.

The mobile application 302 is an application that becomes executable once it has been downloaded and installed on the mobile terminal 100, for example by using the installation function of the OS of the mobile terminal 100. The mobile application 302 performs various kinds of data processing on, for example, images of the subject 105 captured via the camera unit 206. The mobile application 302 has a main control unit 303, an information display unit 304, an operation information acquisition unit 305, a captured image acquisition unit 306, a storage unit 307, and a database (DB) unit 308. It also has a feature point extraction unit 309, a feature point comparison processing unit 310, a feature point tracking processing unit 311, a coordinate conversion processing unit 312, a tracking processing unit 313, and a tracking error amount calculation unit 314.

The main control unit 303 controls the application for the mobile terminal 100 (the mobile application 302), and instructs and manages the module units 303 to 314 described later. The information display unit 304 provides the user interface (UI) of the mobile application 302 to the user in accordance with instructions from the main control unit 303. FIG. 4 is a diagram showing an example of a screen (mobile terminal screen 400) that provides the UI of the mobile application 302 (a UI for a portable terminal). The mobile terminal screen 400 is displayed on the touch panel 102 of the mobile terminal 100. On the mobile terminal screen 400, an image captured via the camera 104 is displayed in the display/operation area 401, and user operations on the image and the like are accepted through the displayed UI. The form of the UI of the mobile application 302 (position, size, range, arrangement, display content, and so on) is not limited to that shown in FIG. 4; any suitable configuration that can realize the functions of the mobile terminal 100 may be adopted.

The operation information acquisition unit 305 acquires information on user operations made through the UI of the mobile application 302 displayed by the information display unit 304, and notifies the main control unit 303 of the acquired information. For example, when the user touches the display/operation area 401 by hand, the operation information acquisition unit 305 acquires information indicating the touched position on the screen and sends it to the main control unit 303. The captured image acquisition unit 306 acquires each captured image of the moving image shot via the camera unit 206 and sends it to the storage unit 307.

The storage unit 307 stores, for example, the captured images acquired by the captured image acquisition unit 306. Stored captured images can be deleted from the storage unit 307 on instruction from the main control unit 303. The DB unit 308 has a database function and manages, among other things, the whole image 500 described later and the data input area information table 601, which describes the rectangular regions (data input areas) of the whole image 500 that contain the data to be extracted. The data in the DB unit 308 is sent to the storage unit 307 when the main control unit 303 starts the mobile application 302, and is retrieved when needed on instruction from the main control unit 303.

The feature point extraction unit 309 extracts feature points and feature quantities from images captured via the camera unit 206, from images held in advance in the DB unit 308 and sent to the storage unit 307, and the like. The feature point extraction unit 309 calculates, for example, distinctive pixel points (feature points), such as locations where the luminance of the image changes sharply (edges), together with data describing the characteristics of each feature point (feature quantities). Methods for obtaining feature points and feature quantities include SIFT (Scale-Invariant Feature Transform) and SURF (Speeded-Up Robust Features). In the present embodiment, a method is preferred that is robust to changes such as rotation, scaling, and image translation, and for which the matching feature points are uniquely determined in the feature point comparison processing described later.

The feature point comparison processing unit 310 performs feature point comparison processing on the feature points that the feature point extraction unit 309 has extracted from two different images. By comparing the feature points of the images and their feature quantities, feature point comparison processing obtains the combinations of feature points that match between the images (matching). Using a method that estimates the underlying regularity while rejecting outliers, such as RANSAC (Random Sample Consensus), removes combinations of feature points that are noise and enables more accurate matching. Feature point comparison processing is an accurate matching technique, but it is generally slow. In the present embodiment, the feature point comparison processing unit 310 performs feature point comparison processing between the single whole image and an arbitrary captured image. A captured image used for feature point comparison processing is hereinafter also called a feature point comparison image.
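The matching step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the embodiment's implementation: descriptors are plain tuples of floats, the function name `match_descriptors` is hypothetical, and ambiguous matches are filtered with a nearest-neighbour ratio test (the 0.75 ratio is an assumed value), which stands in for the outlier rejection that RANSAC would perform on the geometry.

```python
import math


def match_descriptors(desc_a, desc_b, ratio=0.75):
    """Return (index_a, index_b) pairs of nearest-neighbour matches that pass a ratio test.

    desc_a, desc_b: lists of equal-length float tuples (hypothetical feature quantities).
    ratio: assumed threshold; a match is kept only if its distance is clearly
           smaller than the distance to the second-best candidate.
    """
    matches = []
    for i, da in enumerate(desc_a):
        # Brute force: distance from descriptor i of image A to every descriptor of image B.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if not dists:
            continue
        best_dist, best_j = dists[0]
        # Keep the pair only when the best candidate is unambiguous.
        if len(dists) == 1 or best_dist < ratio * dists[1][0]:
            matches.append((i, best_j))
    return matches
```

The brute-force comparison costs one distance computation per descriptor pair, which illustrates why this kind of matching is slow relative to tracking.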

The feature point tracking processing unit 311 performs feature point tracking processing (optical flow), which estimates, for the feature points that the feature point extraction unit 309 extracted from an original image, where those feature points have moved to in the image being compared. The feature point tracking processing unit 311 estimates a movement vector describing how far each feature point on the original image has moved in the image being compared. This makes it possible to estimate where a feature point on the original image has moved to in the image being compared. Feature point tracking processing is faster than feature point comparison processing.

A movement vector is generally calculated as follows. Let P(x, y, t) be the luminance of the point (x, y) on image P at time t. If the time advances by Δt and the point moves by (Δx, Δy) in that interval, the luminance at the destination is P(x + Δx, y + Δy, t + Δt). For P(x, y, t) = P(x + Δx, y + Δy, t + Δt) to hold, the components $v_x$ and $v_y$ of the movement vector per unit time must satisfy $P_x v_x + P_y v_y + P_t = 0$, where $P_x$, $P_y$, and $P_t$ denote the partial derivatives of P with respect to x, y, and t. Because there are two unknowns, $v_x$ and $v_y$, at least one additional constraint equation is needed; the Lucas-Kanade method supplies it by assuming that neighboring points move in the same way. Calculating the movement vector accurately therefore requires that the amount of object movement between the two images be below a certain level (that the difference between the images be small).
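As a worked illustration of how the neighborhood assumption resolves the two unknowns: if every pixel in a small window shares the same (v_x, v_y), each pixel contributes one equation P_x v_x + P_y v_y + P_t = 0, and the over-determined system can be solved by least squares. The sketch below assumes the luminance gradients have already been computed; the function name `lucas_kanade_window` and its input format are hypothetical.

```python
def lucas_kanade_window(gradients):
    """Least-squares movement vector for one window.

    gradients: iterable of (Px, Py, Pt) tuples, one per pixel in the window,
               where Px, Py, Pt are the luminance derivatives in x, y and t.
    Returns (vx, vy), or None when the window has too little texture to
    determine the motion (singular normal equations).
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for px, py, pt in gradients:
        sxx += px * px
        sxy += px * py
        syy += py * py
        sxt += px * pt
        syt += py * pt
    # Normal equations: [[sxx, sxy], [sxy, syy]] @ [vx, vy] = [-sxt, -syt]
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None
    vx = (-sxt * syy + sxy * syt) / det
    vy = (sxy * sxt - sxx * syt) / det
    return vx, vy
```

The constraint is a first-order approximation, so the estimate degrades as the displacement between the two images grows, which is the reason small inter-frame differences are required.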

With the Lucas-Kanade method, a larger difference between images leads to a larger estimation error and a larger deviation from the actual position. The difference between images can become large when the camera is moved quickly between captured frames, when the luminance of the subject changes greatly because of external factors, or when the capture interval (frame interval) grows because of delays in video capture processing, dropped frames, and the like. Fast movement and luminance changes can be detected from a motion sensor, a luminance sensor, or the captured images, so deviation can be reduced by performing matching processing when they are detected and then resuming tracking processing. Other methods may be used to calculate the movement vector.

The coordinate conversion processing unit 312 maps points between the whole image and a captured image (or between one captured image and a different captured image). It does so by calculating a homography transformation matrix (hereinafter, transformation matrix) for performing a homography transform (planar projective transformation) between the images. A homography transform can move points on one planar coordinate system onto a different planar coordinate system while deforming them. The affine transform, which rotates, translates, and scales an image, is a similar transformation; a homography transform can additionally vary the scaling ratio with coordinate position, which makes trapezoidal deformation possible. The homography transform is expressed in terms of the coordinate point (x1, y1) on the current image, the coordinate point (x2, y2) on the transformed image, the transformation matrix H, and a constant s.
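A minimal sketch of the point mapping, assuming the common homogeneous convention s·(x2, y2, 1)ᵀ = H·(x1, y1, 1)ᵀ with H a 3×3 matrix; the placement of s is an assumption here, and the function name `apply_homography` is hypothetical. Dividing by s is what lets the scaling vary with position, unlike an affine transform, whose bottom row is fixed at (0, 0, 1).

```python
def apply_homography(h, point):
    """Map (x1, y1) to (x2, y2) using a 3x3 homography given as nested lists.

    Assumed convention: s * (x2, y2, 1)^T = H * (x1, y1, 1)^T.
    """
    x1, y1 = point
    xs = h[0][0] * x1 + h[0][1] * y1 + h[0][2]
    ys = h[1][0] * x1 + h[1][1] * y1 + h[1][2]
    s = h[2][0] * x1 + h[2][1] * y1 + h[2][2]
    if abs(s) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return xs / s, ys / s


# Pure translation: the bottom row is (0, 0, 1), so s is always 1.
translate = [[1.0, 0.0, 5.0], [0.0, 1.0, -3.0], [0.0, 0.0, 1.0]]
# Perspective term in the bottom row: s grows with x, so scaling depends on position.
perspective = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.0, 1.0]]
```

With these two matrices, `apply_homography(translate, (2.0, 2.0))` shifts the point by the translation column, while `apply_homography(perspective, (2.0, 4.0))` shrinks it by the position-dependent factor s = 2.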

When the same object appears in both images, the coordinate conversion processing unit 312 calculates the parameters of the transformation matrix H from the coordinates of corresponding points between the images, as obtained by the matching performed by the feature point comparison processing unit 310 or the feature point tracking processing unit 311. This makes it possible to obtain the transformation matrix between the whole image and a captured image and map coordinates in the whole image into the captured image, or to obtain the inverse of the transformation matrix and perform the reverse mapping. However, when the same object does not appear in both images (when the images differ greatly), the number of matching feature points is small and the calculation of the transformation matrix H fails.

The tracking processing unit 313 uses the tracking processing described later to track which part (region) of the whole image 500 is shown in the latest image captured by the camera 104. The tracking processing unit 313 then maps the data input areas onto the captured image, based on the data input area information stored in the data input area information table 601, draws them, and displays the result on the mobile terminal screen 400.

Tracking therefore requires waiting, between acquiring the latest captured image and displaying the mapped image on the mobile terminal screen 400, for the coordinate conversion processing unit 312 to obtain the transformation matrix between the whole image 500 and the latest captured image. Calculating the transformation matrix requires matching the feature points of two images, and there are two ways to do this: with the feature point comparison processing unit 310 or with the feature point tracking processing unit 311. Matching by the feature point comparison processing unit 310 takes time, so withholding drawing until feature point comparison processing completes lowers the frame rate. Feature point tracking processing by the feature point tracking processing unit 311 is fast, but when tracking is performed by multiplying together the transformation matrices obtained between successive captured images, the error in each matrix accumulates and the tracking result gradually drifts. The tracking processing of the present embodiment therefore combines feature point comparison processing and feature point tracking processing, which minimizes tracking drift while preventing a drop in the frame rate of drawing on the mobile terminal screen 400.
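The chaining of matrices mentioned above can be sketched as follows. This is an illustrative sketch only: the names `matmul3` and `compose_tracking`, and the split into one matching-derived matrix followed by per-frame tracking matrices, are assumptions made for the example. Each multiplication carries the estimation error of its factor into the product, which is why a long chain drifts until matching supplies a fresh starting matrix.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def compose_tracking(whole_to_matched, frame_to_frame):
    """Chain a matching result with per-frame tracking results.

    whole_to_matched: 3x3 matrix mapping the whole image to the frame used
                      for feature point comparison (hypothetical input).
    frame_to_frame:   list of 3x3 matrices, each mapping one captured frame
                      to the next, oldest first (hypothetical input).
    Returns the 3x3 matrix mapping the whole image to the latest frame.
    """
    total = whole_to_matched
    for step in frame_to_frame:
        # Left-multiply so that the newest frame-to-frame step is applied last.
        total = matmul3(step, total)
    return total
```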

The tracking error amount calculation unit 314 calculates the error amount of tracking processing for each captured image. In the present embodiment, when the accumulated tracking error exceeds a predetermined threshold, matching processing is performed and the tracking error amount is reset. When an image that has not undergone tracking processing occurs, the tracking error amount calculation unit 314 calculates the error amount of tracking processing from the error amount of the movement vectors calculated by the feature point tracking processing unit 311. Specifically, it determines whether the frames of the moving image were acquired correctly from the captured image acquisition unit 306.
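The accumulate, compare, and reset behavior described above can be sketched as a small state holder. The class name, method name, and the way a per-frame error value is supplied are all hypothetical; how the error amount is derived from the movement vectors is left to the caller.

```python
class TrackingErrorAccumulator:
    """Accumulates tracking error and signals when matching should be rerun."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed to be tuned per device and application
        self.accumulated = 0.0

    def add_untracked_frame(self, error):
        """Add the error for a frame that tracking did not process.

        Returns True when the accumulated error has exceeded the threshold,
        meaning the caller should run feature point comparison (matching).
        The accumulated error is reset at that point.
        """
        self.accumulated += error
        if self.accumulated > self.threshold:
            self.accumulated = 0.0
            return True
        return False
```

A caller would invoke `add_untracked_frame` whenever a frame is dropped or skipped, and trigger matching only on a `True` result, so the expensive comparison runs no more often than the drift requires.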

Whether an image was acquired from the captured image acquisition unit 306 is determined by checking whether frames were acquired from that unit at the specified interval. The specified interval can be obtained from the camera unit 206 and may vary with the image resolution of the camera unit 206. For example, if the camera unit 206 generates images at 30 FPS, it is determined whether one frame's worth of captured image was acquired within 1/30 second. Even when one frame was acquired within 1/30 second, if the tracking processing unit 313 could not perform the tracking processing on it, the deviation grows, so the tracking error is calculated in the same way as when the image could not be acquired.
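The per-frame check described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the `elapsed_seconds` measurement, and the `tracked` flag are hypothetical stand-ins for what the captured image acquisition unit 306 and the tracking processing unit 313 would report.

```python
def frame_interval_seconds(fps: float) -> float:
    """Nominal time between frames, e.g. 1/30 s at 30 FPS."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1.0 / fps


def tracking_error_needed(elapsed_seconds: float, fps: float, tracked: bool) -> bool:
    """Return True when a tracking error should be recorded for this frame.

    An error is recorded if the frame did not arrive within the specified
    interval, or if it arrived in time but tracking could not be run on it.
    """
    arrived_in_time = elapsed_seconds <= frame_interval_seconds(fps)
    return not (arrived_in_time and tracked)
```

At 30 FPS, a frame that arrives after 20 ms and is tracked needs no error update, whereas a frame that arrives after 50 ms, or one that arrives in time but is not tracked, does.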

<Whole image and captured images>
Next, the whole image and the captured images will be described with reference to FIG. 5. The whole image 500 is an example of a whole image: it is image data capturing the entirety of the subject 105, and is assumed to be stored in the DB unit 308 in advance. Alternatively, processing may be added to the application 302 that obtains the whole image by shaping image data containing the subject 105, captured and acquired by the captured image acquisition unit 306, through paper surface detection processing that excludes regions other than the subject and distortion correction processing that corrects distorted portions.

An image of part (or all) of the subject 105 acquired by moving the camera 104 relative to the subject 105 is called a captured image. Captured images 501, 502, 503, and 504 are examples. Shooting areas 505, 506, 507, and 508 indicate the areas of the whole image covered by the respective captured images. Captured images 501 to 504 are frames extracted from continuous video acquired from the captured image acquisition unit 306, and the figure shows shooting areas 505 to 508 moving as the camera 104 moves.

<Data input area information table>
Next, the data input area information table managed by the DB unit 308 will be described. FIG. 6 illustrates the data structure of the data input area information table in the present embodiment and the data input area information it holds. As shown in FIG. 6(A), the data input area information table 601 consists of an id column, a key column, a point column, a width column, and a height column.

The id column holds a value that increases by 1 each time a record is added to the data input area information table 601, and is the primary key of the table. The key column stores information indicating what each record's data input area information relates to. The point column stores the coordinates, in the coordinate system of the whole image 500, of the upper-left corner of the data input area. The width column stores the width of the data input area in pixels, and the height column stores its height in pixels. For example, display areas 608, 609, 610, 611, 612, and 613 drawn on the whole image 500 in FIG. 6(B) correspond to data input area information 602, 603, 604, 605, 606, and 607 of the data input area information table 601, respectively.
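One way to picture the table is as a list of small records. The sketch below mirrors the five columns named above; the class names, the starting id of 1, and the `corners` helper are assumptions made for illustration and do not come from the source.

```python
from dataclasses import dataclass
from itertools import count
from typing import List, Tuple


@dataclass(frozen=True)
class DataInputArea:
    id: int                 # primary key, increases by 1 per added record
    key: str                # what kind of information the area holds
    point: Tuple[int, int]  # upper-left corner in whole-image coordinates
    width: int              # pixels
    height: int             # pixels

    def corners(self) -> List[Tuple[int, int]]:
        """Upper-left, upper-right, lower-right, lower-left corners."""
        x, y = self.point
        return [(x, y), (x + self.width, y),
                (x + self.width, y + self.height), (x, y + self.height)]


class DataInputAreaTable:
    """In-memory stand-in for the data input area information table 601."""

    def __init__(self) -> None:
        self._ids = count(1)  # assumption: ids start at 1
        self.records: List[DataInputArea] = []

    def add(self, key: str, point: Tuple[int, int],
            width: int, height: int) -> DataInputArea:
        record = DataInputArea(next(self._ids), key, point, width, height)
        self.records.append(record)
        return record
```

The four corners are what a tracking step would push through the transformation matrix in order to draw the area on a captured image.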

<Tracking processing>
Next, the tracking processing by the tracking processing unit 313 will be described with reference to FIG. 7. In FIG. 7, captured images 700 to 708 are consecutive video frames acquired from the captured image acquisition unit 306, shot with the camera 104 brought close to the whole image 500. Of these, captured image 700 is the first one acquired after shooting started.

The first transformation matrix 709 is obtained by the coordinate conversion processing unit 312 from the feature point comparison result that the feature point comparison processing unit 310 produces with the whole image 500 and captured image 700 as inputs. Because the feature point comparison processing takes time, captured images 701 and 702 are acquired before the first transformation matrix 709 has been calculated. Since no transformation matrix exists yet, captured images 701 and 702 are displayed on the mobile terminal screen 400 unprocessed.

Suppose the coordinate conversion processing unit 312 finishes calculating the first transformation matrix 709 at the point when captured image 703 is acquired. The second transformation matrix 710 is then generated. It is obtained by the coordinate conversion processing unit 312 from the feature point tracking result that the feature point tracking processing unit 311 produces with captured image 700, which was used to calculate the first transformation matrix 709, and the latest captured image 703 as input images.

Multiplying the first transformation matrix 709 by the second transformation matrix 710 yields a transformation matrix that converts coordinates between the whole image 500 and captured image 703. Based on the data input area information stored in the data input area information table 601, the tracking processing unit 313 then maps and draws each data input area onto captured image 703 and displays the result on the mobile terminal screen 400.

Next, the coordinate conversion processing unit 312 obtains the third transformation matrix 711 from the feature point tracking result that the feature point tracking processing unit 311 produces with the latest captured image 704 and the preceding captured image 703 as input images. Multiplying together the first transformation matrix 709, the second transformation matrix 710, and the third transformation matrix 711 yields a transformation matrix that converts coordinates between the whole image 500 and captured image 704. In the same way, a third transformation matrix is obtained between each latest captured image and the one before it, and multiplying the single first transformation matrix, the single second transformation matrix, and the multiple third transformation matrices yields the transformation matrix that converts coordinates between the whole image 500 and the latest captured image.
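The chaining of matrices can be illustrated with plain 3x3 matrices. The sketch below assumes each matrix maps homogeneous column vectors from the earlier image to the later one, so later transforms are multiplied on the left. The source does not state the multiplication order or the matrix form, so both are assumptions, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product a @ b of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def compose(first: Matrix, second: Matrix, thirds: Sequence[Matrix]) -> Matrix:
    """Whole image -> latest frame: first, then second, then each third in order."""
    total = matmul(second, first)
    for third in thirds:
        total = matmul(third, total)
    return total


def apply(m: Matrix, point: Tuple[float, float]) -> Tuple[float, float]:
    """Map a 2D point through a 3x3 matrix using homogeneous coordinates."""
    x, y = point
    u = m[0][0] * x + m[0][1] * y + m[0][2]
    v = m[1][0] * x + m[1][1] * y + m[1][2]
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return (u / w, v / w)


def translation(dx: float, dy: float) -> Matrix:
    """Helper for the example: a pure shift by (dx, dy)."""
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]
```

Because every frame-to-frame matrix carries a small estimation error, a product of many of them drifts, which is why the number of third transformation matrices in the chain matters.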

The transformation matrices obtained by the coordinate conversion processing unit 312 are not perfectly accurate, owing for example to estimation errors in the feature point tracking processing by the feature point tracking processing unit 311, so multiplying multiple transformation matrices accumulates error. Therefore, when the tracking error amount calculation unit 314 determines that the accumulated error has exceeded the threshold, the first and second transformation matrices are updated and the accumulated error is reset.

The first transformation matrix 716 is obtained by the coordinate conversion processing unit 312 from the feature point comparison result that the feature point comparison processing unit 310 produces with the whole image 500 and captured image 704 as inputs. As with the first transformation matrix 709, the calculation takes time, so captured images 705 and 706 are acquired before the first transformation matrix 716 has been calculated. Each time one of these captured images is acquired, a third transformation matrix (712, 713) is calculated between it and the preceding captured image. In the meantime, the tracking processing unit 313 uses the already generated first transformation matrix 709, second transformation matrix 710, and third transformation matrices 711, 712, and 713 to obtain the transformation matrix that converts coordinates between the latest captured image and the whole image 500.

Then, suppose the coordinate conversion processing unit 312 finishes calculating the first transformation matrix 716 at the point when captured image 707 is acquired. The second transformation matrix 717 is then generated. It is obtained by the coordinate conversion processing unit 312 from the feature point tracking result that the feature point tracking processing unit 311 produces with captured image 704, which was used to calculate the first transformation matrix 716, and the latest captured image 707 as input images.

Once the second transformation matrix 717 has been obtained, the update of the first and second transformation matrices is complete. For subsequent captured images, the tracking processing unit 313 uses the updated first transformation matrix 716 and second transformation matrix 717, together with the third transformation matrices between captured images, to obtain the transformation matrix that converts coordinates between the whole image 500 and the latest captured image. Because the third transformation matrices 711 to 714 are no longer needed for tracking on the latest captured image, the error that had built up from multiplying them is reset. Updating the first and second transformation matrices whenever the accumulated error exceeds the predetermined threshold thus keeps the error during tracking to a minimum.
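The bookkeeping behind this reset can be sketched with labels in place of actual matrices. The class and method names below are hypothetical; the labels reuse the reference numerals of FIG. 7 purely to show which factors remain in the product before and after the update.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TransformChain:
    """Tracks which matrices currently make up the whole-image-to-frame product."""
    first: Optional[str] = None
    second: Optional[str] = None
    thirds: List[str] = field(default_factory=list)

    def add_third(self, label: str) -> None:
        """Append a frame-to-frame matrix for a newly acquired image."""
        self.thirds.append(label)

    def reanchor(self, first: str, second: str) -> None:
        """Swap in freshly computed first/second matrices and drop old thirds."""
        self.first, self.second = first, second
        self.thirds.clear()

    def factors(self) -> List[str]:
        """Labels of all matrices that would be multiplied together."""
        anchors = [m for m in (self.first, self.second) if m is not None]
        return anchors + self.thirds
```

Before the update the product has six factors (709, 710, 711 to 714); after re-anchoring on 716 and 717 it has two, and grows again only as new frames arrive.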

Next, the basic tracking processing executed by the mobile application 302 on the mobile terminal 100 will be described with reference to FIG. 8, a flowchart showing an example of the tracking processing in the present embodiment. The processing of the flowchart in FIG. 8 starts when the user launches the mobile application 302 on the mobile terminal 100 and brings the camera 104 close to the subject 105 to acquire an image.

In S801, the main control unit 303 transmits the whole image 500 stored in the DB unit 308 to the storage unit 307 so that it can be used.
Next, in S802, the main control unit 303 transmits the data input area information table 601 stored in the DB unit 308 to the storage unit 307 so that it can be used.

In S803, the main control unit 303 instructs the captured image acquisition unit 306 to acquire one frame of the latest video as a captured image.
Next, in S804, the main control unit 303 executes the creation/update processing of the first and second transformation matrices, described later.

In S805, the main control unit 303 determines whether tracking is possible. If the first and second transformation matrices have both been created, the main control unit 303 determines that tracking is possible (Yes in S805) and proceeds to S806. If either has not been created, it determines that tracking is not possible (No in S805) and proceeds to S807.

In S806, the main control unit 303 instructs the coordinate conversion processing unit 312 to generate a third transformation matrix between the latest input captured image and the captured image input immediately before it.
Next, in S807, the main control unit 303 instructs the coordinate conversion processing unit 312 to generate, from the generated first, second, and third transformation matrices, a transformation matrix that converts coordinates between the whole image and the latest captured image.

In S808, the main control unit 303 determines whether the tracking processing finished within the specified time. If it did (Yes in S808), the main control unit 303 proceeds to S812. If it did not (No in S808), the main control unit 303 proceeds to S809.

In S809, the main control unit 303 instructs the tracking error amount calculation unit 314 to calculate the tracking error amount from the acquisition result of the latest captured image. The accumulated error is then calculated from the tracking error amount calculated by the tracking error amount calculation unit 314.

In S810, the main control unit 303 obtains the accumulated error from the tracking error amount calculation unit 314 and compares it with the error threshold. The error threshold can be set through the user interface (UI) of the mobile application 302, or determined adaptively from the results of the tracking processing and the matching processing. If the main control unit 303 determines that the accumulated tracking error exceeds the threshold (Yes in S810), it proceeds to S811; otherwise (No in S810), it proceeds to S812.

In S811, the main control unit 303 sets a flag requesting creation of the first transformation matrix. This first transformation matrix creation flag is used in the creation/update processing of the first and second transformation matrices in S804.

In S812, the main control unit 303 uses the transformation matrix generated in S807 and the information in the data input area information table 601 stored in the storage unit 307 to map the data input areas onto the captured image, and displays the result on the mobile terminal screen 400 of the mobile terminal 100. Specifically, based on the transformation matrix generated in S807, it maps the data input areas, which the data input area information table 601 stores in the whole-image coordinate system, onto the captured image. If the tracking processing of S807 was not performed after the captured image was input, the captured image is displayed on the mobile terminal screen 400 as is, without the data input areas mapped onto it.

In S813, the main control unit 303 determines whether input of captured images by the captured image acquisition unit 306 has ended. If it has (Yes in S813), the processing shown in FIG. 8 ends. If captured images are still being input (No in S813), the processing returns to S803 and continues.
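The per-frame branching of S803 to S813 can be written down as a small function that lists the steps visited for one captured image. This is only a transcription of the transitions stated in the text. The function name and its three boolean inputs are hypothetical, and the actual work done in each step is omitted.

```python
from typing import List


def steps_for_frame(can_track: bool, finished_in_time: bool,
                    over_threshold: bool) -> List[str]:
    """Return the FIG. 8 step labels visited for a single captured image."""
    steps = ["S803", "S804", "S805"]
    if can_track:
        steps.append("S806")       # third matrix: previous frame -> latest frame
    steps.append("S807")           # combined matrix: whole image -> latest frame
    steps.append("S808")           # did tracking finish within the time limit?
    if not finished_in_time:
        steps += ["S809", "S810"]  # update, then check, the accumulated error
        if over_threshold:
            steps.append("S811")   # request a new first transformation matrix
    steps += ["S812", "S813"]      # draw, then check for end of input
    return steps
```

A frame that is tracked on time visits neither S809 nor S811, so the accumulated error only changes on frames where tracking was late or missing.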

<Creation/update processing of the first and second transformation matrices>
Next, the creation/update processing of the first and second transformation matrices executed in S804 of FIG. 8 will be described with reference to FIG. 9, a flowchart showing an example of this processing.

In S901, the main control unit 303 determines whether the first transformation matrix has already been generated. If it has (Yes in S901), the main control unit 303 proceeds to S902; if it has not (No in S901), it proceeds to S905.

In S902, the main control unit 303 determines whether to update the first transformation matrix. The first transformation matrix is updated, using the latest captured image, when the first and second transformation matrices have already been generated and the main control unit 303 has set the first transformation matrix creation flag in S811. If the main control unit 303 determines not to update the first transformation matrix (No in S902), it proceeds to S903; if it determines to update it (Yes in S902), it proceeds to S908.

In S903, the main control unit 303 determines whether the second transformation matrix has already been generated. If it has (Yes in S903), the main control unit 303 proceeds to S904; if it has not (No in S903), it proceeds to S909.

In S904, the main control unit 303 determines whether the first transformation matrix has been updated. If it has (Yes in S904), the main control unit 303 proceeds to S909; if it has not (No in S904), the creation/update processing of the first and second transformation matrices ends.

In S905, the main control unit 303 determines whether the feature point comparison processing by the feature point comparison processing unit 310 is in progress. If it is (Yes in S905), the main control unit 303 proceeds to S907; if it is not (No in S905), it proceeds to S906.

In S906, the main control unit 303 instructs the feature point comparison processing unit 310 to start the feature point comparison processing between the whole image and the latest captured image, and the creation/update processing of the first and second transformation matrices ends.

In S907, the main control unit 303 determines whether the feature point comparison processing by the feature point comparison processing unit 310 has completed. If it has (Yes in S907), the main control unit 303 proceeds to S908; if it has not (No in S907), the creation/update processing of the first and second transformation matrices ends.

In S908, the main control unit 303 instructs the coordinate conversion processing unit 312 to generate the first transformation matrix between the whole image and the latest captured image, using the result of the feature point comparison processing by the feature point comparison processing unit 310. When generation of the first transformation matrix is complete, the processing returns to S901 and continues.

In S909, the main control unit 303 instructs the coordinate conversion processing unit 312 to obtain the second transformation matrix between the captured image that was used in the feature point comparison for creating the first transformation matrix and the latest captured image. When generation of the second transformation matrix is complete, the creation/update processing of the first and second transformation matrices ends.
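The flow of S901 to S909 can be sketched as a small state machine that records which steps a single call visits. The state fields and function name are hypothetical. The source also does not say when the creation flag or the "first matrix updated" status is cleared, so clearing them in S908 and S909 here is an assumption made so that the loop back to S901 terminates.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MatrixState:
    first_ready: bool = False
    second_ready: bool = False
    update_requested: bool = False    # flag set in S811
    first_updated: bool = False
    comparison_running: bool = False
    comparison_done: bool = False


def create_or_update(state: MatrixState) -> List[str]:
    """Run one pass of the FIG. 9 flow and return the steps visited."""
    visited: List[str] = []
    step = "S901"
    while step != "END":
        visited.append(step)
        if step == "S901":
            step = "S902" if state.first_ready else "S905"
        elif step == "S902":
            step = "S908" if state.update_requested else "S903"
        elif step == "S903":
            step = "S904" if state.second_ready else "S909"
        elif step == "S904":
            step = "S909" if state.first_updated else "END"
        elif step == "S905":
            step = "S907" if state.comparison_running else "S906"
        elif step == "S906":
            state.comparison_running = True   # comparison started; result comes later
            step = "END"
        elif step == "S907":
            step = "S908" if state.comparison_done else "END"
        elif step == "S908":
            state.first_ready = True
            state.first_updated = True
            state.update_requested = False    # assumption: flag is consumed here
            state.comparison_running = False
            state.comparison_done = False
            step = "S901"
        else:  # S909
            state.second_ready = True
            state.first_updated = False       # assumption: second now matches first
            step = "END"
    return visited
```

Called once per captured image, the function starts the comparison on the first frame, idles while it runs, and builds both matrices on the frame where the comparison result is available.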

<Tracking error amount calculation>
The error calculation processing by the tracking error amount calculation unit 314 will be described with reference to FIG. 10. Descriptions of processing already covered in the tracking processing described with reference to FIG. 7 are omitted. Captured images 700 to 708 are consecutive video frames acquired from the captured image acquisition unit 306, shot with the camera 104 brought close to the whole image 500. Unused images 1001, 1002, 1003, and 1004 denote images that were not used or not acquired, either because the tracking processing unit 313 could not perform the tracking processing or because the captured image acquisition unit 306 could not acquire an image at the specified continuous-shooting interval obtainable from the camera unit 206.

For captured image 703, the tracking processing unit 313 performs matching processing using the feature point tracking result that the feature point tracking processing unit 311 produces with captured image 700 and the latest captured image 703 as input images. Captured image 700 is the captured image used to calculate the first transformation matrix 709 and the second transformation matrix 710. In the matching processing, coordinates on the image are identified using the transformation matrix, obtained by multiplying the first transformation matrix 709, the second transformation matrix 710, and the third transformation matrix 711, that converts coordinates between the whole image 500 and captured image 704. Because the coordinate deviation is at its minimum after the matching processing, the tracking error is reset.

Because unused image 1003 exists, the tracking processing unit 313 identifies it as unused, generates the third transformation matrix 1006 between captured image 704 and captured image 706, and has the tracking error amount calculation unit 314 calculate the tracking error amount. The tracking error amount calculation unit 314 stores the calculated tracking error amount in the DB unit 308 as the accumulated error.

Tracking error calculation states 1009 to 1016 show the state of the processing at each image. Tracking error calculation state 1014 shows that the accumulated error carried over from tracking error calculation state 1013 is 0, that the occurrence error is 1 because unused image 1003 accounts for one frame's worth of unused image, and that the accumulated error in state 1014 has consequently become 1. Since the current accumulated-error threshold is 2, the accumulated error has not exceeded the threshold.

Similarly, because unused image 1004 exists, the tracking processing unit 313 generates the third transformation matrix 1006 and has the tracking error amount calculation unit 314 calculate the tracking error amount.

When unused images such as 1003 and 1004 occur, the accuracy of the transformation matrix obtained by the coordinate conversion processing unit 312 degrades, because error accumulates as the transformation matrices computed across the unused images are multiplied together. Therefore, when the tracking error amount calculation unit 314 determines that the accumulated error has exceeded the threshold, the accumulated error is reset by performing the matching processing that updates the first and second transformation matrices.

For unused image 1004, tracking error calculation state 1016 shows an occurrence error of 1, because unused image 1004 accounts for one frame's worth of unused image, which brings the accumulated error in state 1016 to 2. Since the current accumulated-error threshold is 2 and the accumulated error of 2 is judged to have exceeded it, the main control unit 303 sets the flag for creating the first transformation matrix 1007 and the second transformation matrix 1008 through the matching processing (S811 in FIG. 8).
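The counting in this worked example can be sketched as a small accumulator. The class and method names are hypothetical. Because the example triggers the update when the accumulated error is 2 and the threshold is 2, the sketch compares with `>=`; reading "exceeds" this way is an assumption drawn from that example, not something the source states explicitly.

```python
class TrackingErrorAccumulator:
    """Counts frames that were not tracked and signals when to re-match."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.accumulated = 0

    def record_unused_frames(self, frames: int = 1) -> bool:
        """Add the occurrence error; return True when re-matching is due."""
        self.accumulated += frames
        # Assumption: reaching the threshold counts as exceeding it.
        return self.accumulated >= self.threshold

    def reset(self) -> None:
        """Called once the first and second matrices have been updated."""
        self.accumulated = 0
```

With a threshold of 2, the first unused frame leaves the accumulator at 1 with no action, and the second raises it to 2 and requests the update, matching states 1014 and 1016.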

According to the present embodiment, when an image is captured at close range to a subject such as a paper document, performing the matching processing at an appropriate timing makes it possible to track with little deviation while reducing the computational load of the matching processing.

In the embodiment described above, the tracking error is expected to grow when captured images that are not used for tracking occur consecutively. The tracking error may therefore be weighted so that, when captured images not used for tracking occur consecutively, the weight applied to the tracking error is larger than when such an image occurs in isolation.
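One possible weighting scheme is sketched below in Python. The text does not specify how the weight is chosen, so everything here is an assumption made for illustration: the function name `weighted_errors`, the base error of 1.0 for an isolated unused frame, and the weight of 2.0 for an unused frame that directly follows another unused frame.

```python
def weighted_errors(frame_used, consecutive_weight=2.0):
    """Return the error incurred at each frame.

    frame_used: list of booleans, True if the frame was used for tracking.
    An unused frame that follows a used frame costs 1.0; an unused frame that
    directly follows another unused frame costs `consecutive_weight`
    (an assumed value, not taken from the embodiment).
    """
    errors = []
    previous_unused = False
    for used in frame_used:
        if used:
            errors.append(0.0)
            previous_unused = False
        else:
            errors.append(consecutive_weight if previous_unused else 1.0)
            previous_unused = True
    return errors


# Two unused frames in isolation versus two unused frames in a row.
isolated = weighted_errors([True, False, True, False, True])
consecutive = weighted_errors([True, False, False, True, True])
```

With these assumed weights, two isolated unused frames accumulate less error than two consecutive ones, so a run of unused frames reaches the re-matching threshold sooner.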

(Other Embodiments of the Present Invention)
The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiment is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

Each of the above embodiments is merely one concrete example of carrying out the present invention, and the technical scope of the present invention should not be construed restrictively on the basis of these examples. That is, the present invention can be implemented in various forms without departing from its technical idea or its main features.

100: Portable terminal device (mobile terminal)
301: Data management unit
302: Mobile application
303: Main control unit
304: Information display unit
305: Operation information acquisition unit
306: Captured image acquisition unit
307: Storage unit
308: Database (DB) unit
309: Feature point extraction unit
310: Feature point comparison processing unit
311: Feature point tracking processing unit
312: Coordinate conversion processing unit
313: Tracking processing unit
314: Tracking error amount calculation unit

Claims

An information processing apparatus comprising:
an extraction means for extracting feature points and feature quantities of the feature points from a whole image of a subject and from captured images obtained by continuously photographing the subject;
a comparison processing means for comparing the feature quantities in a first image extracted by the extraction means with the feature quantities in a second image different from the first image, and obtaining combinations of matching feature points;
a feature point tracking processing means for obtaining movement vectors, between images, of the feature points extracted by the extraction means;
a tracking processing means for performing tracking processing on a predetermined region of an image by using a transformation matrix obtained based on a processing result of the comparison processing means or the feature point tracking processing means;
an error calculation means for calculating and accumulating a tracking error that occurs when an image on which the tracking processing means does not perform the tracking processing occurs; and
a control means for, when the error accumulated by the error calculation means exceeds a threshold, executing the processing by the comparison processing means and resetting the error accumulated by the error calculation means.

The information processing apparatus according to claim 1, wherein the first image is the whole image, and the second image is the captured image.

The information processing apparatus according to claim 1 or 2, further comprising a transformation processing means for obtaining the transformation matrix used for the tracking processing,
wherein the transformation processing means generates a first transformation matrix based on the first image and the second image each time the processing by the comparison processing means is executed, generates a second transformation matrix based on the latest captured image and the second image used for generating the first transformation matrix, and obtains the transformation matrix used for the tracking processing by using the first transformation matrix and the second transformation matrix.

The information processing apparatus according to any one of claims 1 to 3, wherein the error calculation means calculates the tracking error when the captured image cannot be acquired at a predetermined interval or when the tracking processing means cannot perform the tracking processing.

The information processing apparatus according to any one of claims 1 to 4, wherein the error calculation means increases the weight applied to the tracking error that occurs when the captured image cannot be acquired consecutively or when the tracking processing means cannot perform the tracking processing consecutively.

The information processing apparatus according to any one of claims 1 to 5, further comprising a holding means for holding information indicating the position of the predetermined region in the whole image,
wherein the tracking processing means tracks the predetermined region in the captured image based on the transformation matrix and the information held by the holding means.

The information processing apparatus according to any one of claims 1 to 6, wherein the predetermined region is a data input region in the whole image that contains data information to be extracted.

An information processing method for an information processing apparatus, the method comprising:
an extraction step of extracting feature points and feature quantities of the feature points from a whole image of a subject and from captured images obtained by continuously photographing the subject;
a comparison processing step of comparing the feature quantities in a first image extracted in the extraction step with the feature quantities in a second image different from the first image, and obtaining combinations of matching feature points;
a feature point tracking processing step of obtaining movement vectors, between images, of the feature points extracted in the extraction step;
a tracking processing step of performing tracking processing on a predetermined region of an image by using a transformation matrix obtained based on a processing result of the comparison processing step or the feature point tracking processing step;
an error calculation step of calculating and accumulating a tracking error that occurs when an image on which the tracking processing is not performed occurs; and
a control step of, when the error accumulated in the error calculation step exceeds a threshold, executing the processing of the comparison processing step and resetting the error accumulated in the error calculation step.

A program for causing a computer of an information processing apparatus to execute:
an extraction step of extracting feature points and feature quantities of the feature points from a whole image of a subject and from captured images obtained by continuously photographing the subject;
a comparison processing step of comparing the feature quantities in a first image extracted in the extraction step with the feature quantities in a second image different from the first image, and obtaining combinations of matching feature points;
a feature point tracking processing step of obtaining movement vectors, between images, of the feature points extracted in the extraction step;
a tracking processing step of performing tracking processing on a predetermined region of an image by using a transformation matrix obtained based on a processing result of the comparison processing step or the feature point tracking processing step;
an error calculation step of calculating and accumulating a tracking error that occurs when an image on which the tracking processing is not performed occurs; and
a control step of, when the error accumulated in the error calculation step exceeds a threshold, executing the processing of the comparison processing step and resetting the error accumulated in the error calculation step.