JP6031915B2

JP6031915B2 - Image processing apparatus and program

Info

Publication number: JP6031915B2
Application number: JP2012213196A
Authority: JP
Inventors: 茂樹木谷; 大隆久徳; 顕藤本; 昌也谷川; 匡弘土屋; 勇人加藤; 博章川▲崎▼; 賢治 ▲高▼橋
Original assignee: Buffalo Inc
Current assignee: Buffalo Inc
Priority date: 2012-09-26
Filing date: 2012-09-26
Publication date: 2016-11-24
Anticipated expiration: 2032-09-26
Also published as: JP2014067302A

Description

The present invention relates to an image processing apparatus and a program.

In recent years, means for acquiring geographic information, such as GPS (Global Positioning System), have become widespread, and some devices include information indicating the shooting location in a photograph when it is taken. For example, a camera-equipped smartphone determines its current position using GPS when a photograph is taken and records the positioning result as part of the image information of the photograph.

Patent Document 1: Japanese Translation of PCT International Application Publication No. 2009-526302

However, for devices centered on the camera function, such as digital single-lens reflex cameras, a GPS unit is available as an option but is not necessarily used. Consequently, the image information of photographs taken by users of such devices often contains no shooting location information.

On the other hand, a recorded shooting location can be expected to have various benefits, such as helping a viewer of the photograph recall memories associated with it. There is therefore a demand for a technique for adding shooting location information afterwards to the image information of a photograph in which no such information was recorded.

Patent Document 1 discloses, instead of adding such information manually, extracting information on position, time, and persons from a photograph using OCR, extracting position information from a road sign, and attaching the extracted information to the digital data as tags in order to manage the data. However, the place name shown on a photographed road sign is not necessarily the shooting location. This is because a road sign may be a display that provides road users with information on the route to a destination, a point along the way, and so on, such as "20 km to Nihonbashi".

The present invention has been made in view of the above circumstances, and one of its objects is to provide an image processing apparatus capable of adding shooting location information afterwards to the image information of a photograph in which no shooting location information is recorded.

To solve the problems of the conventional example described above, the present invention provides an image processing apparatus including: means for acquiring image information to be processed; recognition processing means for recognizing, in the acquired image information to be processed, a region containing a character string representing a place and a character string representing a distance; estimation means for performing character recognition of the character string representing the place and the character string representing the distance in the recognized region and estimating, based on the character recognition result, the shooting location of the image information to be processed; and means for outputting information on the estimated shooting location.

An image processing apparatus according to one aspect of the present invention includes: means for acquiring image information to be processed; recognition processing means for recognizing, in the acquired image information to be processed, a region containing a plurality of sets each consisting of a character string representing a place and a character string representing a distance; estimation means for performing character recognition of the character string representing the place and the character string representing the distance in each set in the recognized region and, based on the character recognition result, narrowing down and estimating the range of the shooting location of the image information to be processed; and means for outputting information on the estimated shooting location.

In these apparatuses, the recognition processing means may further recognize a character string representing a route in the acquired image information to be processed, and the estimation means may additionally use the character recognition result of the recognized character string representing the route to estimate the shooting location of the image information to be processed.

The recognition processing means may also recognize, as a candidate for the region containing a character string representing a place and a character string representing a distance, an image portion of the image information to be processed that contains a columnar body, and may search within the recognized image portion for a region containing a character string representing a place and a character string representing a distance.

Furthermore, when the image information to be processed contains an image indicating a direction, the recognition processing means may recognize the direction indicated by that image, and the apparatus may further include means for estimating the shooting direction based on the direction recognized by the recognition processing means.

The apparatus may further include means for referring to the shooting date and time of the image information to be processed and acquiring other image information captured within a predetermined time range including that shooting date and time. In this case, the recognition processing means recognizes, in the other image information, at least one region containing a character string representing a place and a character string representing a distance, and the estimation means performs character recognition of the character strings in the recognized region, estimates the shooting location of the other image information based on the character recognition result, and additionally uses that estimate to estimate the shooting location of the image information to be processed.

A program according to another aspect of the present invention causes a computer to function as: means for acquiring image information to be processed; recognition processing means for recognizing, in the acquired image information to be processed, a region containing a character string representing a place and a character string representing a distance; estimation means for performing character recognition of the character string representing the place and the character string representing the distance in the recognized region and estimating, based on the character recognition result, the shooting location of the image information to be processed; and means for outputting information on the estimated shooting location.

According to the present invention, shooting location information can be added afterwards to the image information of a photograph in which no shooting location information is recorded.

FIG. 1 is a block diagram showing a configuration example of an image processing apparatus according to an embodiment of the present invention.
FIG. 2 is an explanatory diagram showing an example of the contents of an image database held by the image processing apparatus according to the embodiment of the present invention.
FIG. 3 is a functional block diagram showing an example of the image processing apparatus according to the embodiment of the present invention.
FIG. 4 is an explanatory diagram showing examples of image portions recognized by the image processing apparatus according to the embodiment of the present invention.
FIG. 5 is a flowchart showing an operation example of the image processing apparatus according to the embodiment of the present invention.
FIG. 6 is a flow diagram showing an example of the process by which the image processing apparatus according to the embodiment of the present invention extracts a region containing character strings representing a place name and a distance.
FIG. 7 is an explanatory diagram illustrating an example of the shooting location estimation operation of the image processing apparatus according to the embodiment of the present invention.
FIG. 8 is another explanatory diagram illustrating an example of the shooting location estimation operation of the image processing apparatus according to the embodiment of the present invention.

An embodiment of the present invention will be described with reference to the drawings. As illustrated in FIG. 1, an image processing apparatus 1 according to the embodiment of the present invention includes a control unit 11, a storage unit 12, an operation unit 13, a display unit 14, a communication unit 15, and an input/output interface 16. The control unit 11 is a program-controlled device such as a CPU and operates according to a program stored in the storage unit 12.

Specifically, in the present embodiment, the control unit 11 accepts image information to be processed via the input/output interface 16 and accumulates and stores it in the storage unit 12. The image information to be processed in the present embodiment represents an image captured by a digital camera or the like and includes metadata such as the shooting date and camera identification information identifying the camera used. This metadata may be so-called Exif (Exchangeable Image File Format) information.

The control unit 11 of the present embodiment acquires image information to be processed from the image information accumulated in the storage unit 12 and recognizes, in the acquired image information, a region containing a character string representing a place and a character string representing a distance. It then performs character recognition of the character string representing the place and the character string representing the distance in the recognized region and, based on the character recognition result, estimates the shooting location of the image information being processed. The details of this processing by the control unit 11 are described later.

The storage unit 12 stores the program executed by the control unit 11. This program may be provided on a computer-readable recording medium such as a DVD-ROM (Digital Versatile Disc Read Only Memory) and then stored in the storage unit 12, or it may be distributed via a network or the like and then stored in the storage unit 12. The storage unit 12 also operates as a work memory for the control unit 11.

In the present embodiment, as illustrated in FIG. 2, the storage unit 12 accumulates and stores image information in association with tag information as an image database. The tag information, which serves as metadata, may include data extracted from the Exif data of the associated image information.

The operation unit 13 may be, for example, a mouse or a keyboard, or an input interface for an infrared remote controller or the like. In one example of the present embodiment, the operation unit 13 is an infrared input interface that receives information, transmitted by a remote controller operated by the user, representing the content of the user's instruction. The operation unit 13 then outputs the received information to the control unit 11.

The display unit 14 is an interface that outputs images to a built-in display or to an external display such as a home television set, in accordance with instructions input from the control unit 11. The communication unit 15 is, for example, a network interface; it is connected to a network by wire or wirelessly and outputs information received via the network to the control unit 11. The communication unit 15 also receives from the control unit 11 information to be transmitted and transmits it via the network.

The input/output interface 16 is, for example, an SD card slot or a USB (Universal Serial Bus) interface. In accordance with instructions input from the control unit 11, for example, the input/output interface 16 reads image information from an SD card, USB memory, USB hard disk drive, or the like connected to it and outputs the image information to the control unit 11.

Next, the processing performed by the control unit 11 of the present embodiment is described. By executing the program stored in the storage unit 12, the control unit 11 functionally operates, as illustrated in FIG. 3, as including an image information acquisition unit 21, a recognition processing unit 22, an estimation unit 23, and an information output unit 24.

The image information acquisition unit 21 refers, for example, to the tag information of the image information accumulated in the storage unit 12 and selects image information that contains no shooting location information as candidates for processing. The image information acquisition unit 21 then acquires one of the selected candidates as the processing target, that is, reads it from the image database in the storage unit 12.
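The candidate selection can be sketched as follows. This is a minimal illustration, not the apparatus's actual implementation: the image database is assumed to be a dictionary mapping an image ID to its tag information, and the tag key `location` and both function names are hypothetical.

```python
def select_candidates(image_db):
    """Return the IDs of images whose tag information lacks a shooting location."""
    return [
        image_id
        for image_id, tags in image_db.items()
        if not tags.get("location")  # "location" is a hypothetical tag key
    ]


def acquire_next(image_db):
    """Pick one candidate as the processing target, or None if none remain."""
    candidates = select_candidates(image_db)
    return candidates[0] if candidates else None
```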

The recognition processing unit 22 executes, on the image information read from the image database, a process of recognizing a region containing a character string representing a place and a character string representing a distance. Specifically, this is a process of recognizing road signs, and is performed, for example, as follows.

That is, as illustrated in FIG. 4, road signs generally include:
(a1) signs on ordinary roads indicating destination, direction, and distance (without a route marker),
(a2) signs on ordinary roads indicating destination, direction, and distance (with a route marker),
(b) signs on expressways indicating destination and distance,
(c) signs giving advance notice of destination and direction,
(d) signs indicating destinations or lanes,
(e) signs on expressways giving advance notice of exits, service areas, toll gates, and the like, and
(f) signs indicating famous or major places.
In Japan, for example, these are prescribed in Appended Table 2 of the Order on Road Signs, Lane Markings, and Road Markings. Other countries have similar provisions in manuals such as the Manual on Uniform Traffic Control Devices (MUTCD) provided by the U.S. Federal Highway Administration.

As these regulations specify, the background colors of road signs are predetermined. As shown in FIG. 5, the recognition processing unit 22 therefore extracts, from the pixels contained in the image information acquired for processing, those pixels whose pixel value P lies at a distance in the color space below a predetermined threshold from a pixel value representing one of the predetermined colors (there may be several such values, denoted Q1, Q2, ...) (S1). The extraction result is, for example, as shown at (S1) in FIG. 6. Here the color space is, for example, a three-dimensional space defined by RGB (Red Green Blue) values, and the distance between pixel values may be defined as the Euclidean distance in this color space.
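The extraction in (S1) can be sketched as below. This is a minimal illustration under stated assumptions: the image is a nested list of RGB tuples, and the function name, the reference colors, and the threshold are hypothetical values chosen for the example rather than values given in this description.

```python
import math


def extract_sign_colored_pixels(pixels, reference_colors, threshold):
    """Return a boolean mask marking pixels close to any reference color.

    pixels: 2D list of (R, G, B) tuples, the pixel values P.
    reference_colors: list of (R, G, B) tuples, the values Q1, Q2, ...
    threshold: maximum Euclidean distance in RGB space (exclusive).
    """
    def is_sign_color(p):
        return any(math.dist(p, q) < threshold for q in reference_colors)

    return [[is_sign_color(p) for p in row] for row in pixels]
```

A usage example with a single assumed reference color: `extract_sign_colored_pixels(image, [(0, 0, 255)], 30)` marks the pixels that are within a distance of 30 from pure blue.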

The recognition processing unit 22 also performs edge detection on the image information acquired for processing and obtains binarized contour image information ((S2); (S2) in FIG. 6). It then labels the pixel groups in the regions enclosed by the contours ((S3); (S3) in FIG. 6). This labeling can use a widely known process that traces the contours and associates different identification information with each enclosed region, so a detailed description is omitted here. For each region containing a labeled pixel group, the recognition processing unit 22 generates identification information unique to the region and information specifying the region ((S4); (S4) in FIG. 6). The information specifying the region is, for example, mask image information of the same size as the image information being processed, in which the labeled pixel group is set to significant pixels (for example, black) and all other pixels are set to non-significant pixels (for example, white).
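The labeling and mask generation of (S3) and (S4) can be sketched as follows. Note that this sketch swaps in a breadth-first flood fill over non-contour pixels (4-connectivity) in place of the contour-tracing method mentioned in the text; it also labels the area outside every contour as a region of its own. The input format (a nested list with 1 for contour pixels) and the function names are assumptions for illustration.

```python
from collections import deque


def label_regions(edge):
    """Label connected groups of non-contour pixels.

    edge: 2D list of 0/1 values, 1 marking a contour pixel.
    Returns (labels, count); labels holds 0 on contour pixels and a
    region ID from 1 to count elsewhere.
    """
    height, width = len(edge), len(edge[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if edge[y][x] or labels[y][x]:
                continue
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not edge[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


def region_mask(labels, region_id):
    """Mask image of the same size as the input: True inside the region."""
    return [[value == region_id for value in row] for row in labels]
```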

The recognition processing unit 22 further executes a process of recognizing the outer shape of each region generated in process (S4) (S5). This process can adopt a widely known method, such as classifying and recognizing outer shapes using SOM (Self-Organizing Maps), so a detailed description is omitted here. The recognition processing unit 22 classifies the outer shape of each region as a rectangle, inverted triangle, hexagon, arrow shape, or the like, and takes this classification as the recognition result for the outer shape of the region. Then, as shown at (S6) in FIG. 6, the recognition processing unit 22 stores, for each region, its unique identification information, the information specifying the region, and the recognition result for its outer shape in association with one another as a region database in the storage unit 12 (S6). For a region recognized as a rectangle, the recognition processing unit 22 also finds the coordinates of the pixels corresponding to its four corners and obtains a homography matrix from them.
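One way to obtain a homography matrix from four corner points is to solve the standard eight-equation linear system with the bottom-right matrix entry fixed to 1. The sketch below does this with plain Gauss-Jordan elimination. The function names, the corner ordering, and the choice of an upright rectangle as the target are assumptions for illustration; the text does not specify how the matrix is computed.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_corners(src, dst):
    """3x3 matrix mapping four src corners (x, y) onto four dst corners (u, v)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_homography(h, point):
    """Map one point through the homography (projective division included)."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

Mapping every pixel of a region through such a matrix, with the target rectangle chosen as the desired output size, yields a front-facing view of the sign.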

Next, for each region found in process (S4), the recognition processing unit 22 counts how many of the pixels contained in the region were extracted in process (S1) (S7). In other words, it examines the number of pixels within the contour-enclosed region whose pixel values correspond to a road sign background color. The recognition processing unit 22 then selects the regions in which this pixel count exceeds a predetermined threshold (S8). For the pixel portion of the image information being processed that lies within each selected region, it performs a projective transformation using the homography matrix obtained from the outer shape of that region (S9).
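The counting and selection in (S7) and (S8) can be sketched as below, assuming a label image (0 for unlabeled pixels, positive region IDs otherwise) and a boolean mask of sign-colored pixels of the same size. The function name and the data layout are hypothetical.

```python
def select_sign_regions(labels, color_mask, min_pixels):
    """Return IDs of regions containing more than min_pixels sign-colored pixels.

    labels: 2D list of region IDs (0 means no region).
    color_mask: 2D list of booleans from the color extraction step.
    """
    counts = {}
    for label_row, mask_row in zip(labels, color_mask):
        for region_id, is_sign_color in zip(label_row, mask_row):
            if region_id and is_sign_color:
                counts[region_id] = counts.get(region_id, 0) + 1
    return sorted(r for r, n in counts.items() if n > min_pixels)
```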

The recognition processing unit 22 stores the part of the image information obtained as a result of the projective transformation in the storage unit 12, in association with the identification information of the region from which it was extracted and information representing the outer shape of that region (S10). This part, hereinafter called a partial image ((S10) in FIG. 6), is an image of the object captured in the portion of the processed image information that has a road sign background color, converted so that the object appears as seen from the front. When the image information being processed contains multiple regions in which the number of pixels corresponding to a road sign background color exceeds the threshold, the partial image corresponding to each region is stored in the storage unit 12.

Note that when the character strings representing destination, direction, and distance are divided into multiple sections by white lines, as in sign (a1) of FIG. 4 (a sign on an ordinary road indicating destination, direction, and distance, without a route marker), an image of each section is output, as shown at (S10) in FIG. 6.

The estimation unit 23 executes character recognition (OCR) on the partial images Pi generated by the recognition processing unit 22, where Pi denotes the partial image for region i (i = 1, 2, ...). This yields a character recognition result for each partial image. The estimation unit 23 checks whether the character recognition result Ci (i = 1, 2, ...) for each partial image Pi contains a character string representing a distance. This may be done by checking whether the character string contains digits.
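The digit check can be sketched with a regular expression. The function names and the dictionary layout (region index i mapped to the recognized string Ci) are assumptions for illustration. In Python, `\d` on a `str` pattern matches any Unicode decimal digit, so full-width digits such as those printed by Japanese OCR output are also detected.

```python
import re


def contains_distance(text):
    """True if the recognized string contains at least one digit."""
    return re.search(r"\d", text) is not None


def filter_with_distance(ocr_results):
    """Keep only the recognition results Ci that contain a digit.

    ocr_results: dict mapping region index i to recognized string Ci.
    """
    return {i: c for i, c in ocr_results.items() if contains_distance(c)}
```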

The estimation unit 23 selects the partial images Pi whose character recognition results Ci contain character strings representing a place name (place) and a distance, and executes the following process. First, using the character string Ci recognized from the selected partial image Pi, the estimation unit 23 refers to a place name dictionary, in which place name character strings are listed in advance, and acquires the place name portion Li. The estimation unit 23 also extracts the numeric portion of the character string Ci as the character string Di representing the distance. The estimation unit 23 estimates the shooting location based on the pair consisting of the place name portion Li and the character string Di representing the distance.
When referring to the place name dictionary, the place name portion Li may be extracted on the condition that it partially matches the character string Ci recognized from the partial image Pi. For example, even if the whole of "日比谷" (Hibiya) is not recognized, the place name "日比谷" may be estimated and acquired as the place name portion Li from a partial match with "比谷".
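The dictionary lookup with partial matching and the extraction of the numeric portion can be sketched as follows. The dictionary contents, the minimum overlap of two characters, the preference for the longest shared substring, and both function names are assumptions for illustration; the text only states that a partial match may be used.

```python
import re

# Hypothetical place name dictionary for the example.
PLACE_NAMES = ["日比谷", "日本橋", "銀座"]


def find_place(text, dictionary=PLACE_NAMES, min_overlap=2):
    """Return the place name Li for recognized text Ci, or None.

    A full occurrence of a dictionary entry wins; otherwise the entry
    sharing the longest substring (at least min_overlap characters)
    with the text is returned.
    """
    for name in dictionary:
        if name in text:
            return name
    best, best_len = None, 0
    for name in dictionary:
        for size in range(len(name) - 1, min_overlap - 1, -1):
            if size <= best_len:
                break
            if any(name[i:i + size] in text
                   for i in range(len(name) - size + 1)):
                best, best_len = name, size
                break
    return best


def extract_distance(text):
    """Return the first number in the text as a float (Di), or None."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None
```

With this sketch, a recognition result in which the first character of "日比谷" was lost still resolves to the dictionary entry, which mirrors the example in the text.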

The estimation unit 23 refers to a database that associates character strings Li representing place names with geographic coordinate information (called a place name database) and acquires the geographic coordinate information (latitude and longitude) Ti associated with the character string Li. In Japan, an example of such a place name database is the Location Reference Information provided by the Ministry of Land, Infrastructure, Transport and Tourism.

The estimation unit 23 generates on a map a virtual circle centered on the latitude and longitude Ti whose extent corresponds to the distance Di. It refers to the place name database and acquires the character information of place names associated with latitude and longitude values lying within the area where the generated virtual circles overlap. In other words, the estimation unit 23 uses multiple virtual circles to narrow down and estimate the range of the shooting location of the image information being processed. If multiple place names associated with latitude and longitude values within the overlapping area are found, the estimation unit 23 acquires the character information of all of them. The place names represented by the acquired character information are the estimation result for the shooting location. Alternatively, the estimation unit 23 may output latitude and longitude values within the overlapping area directly as the estimation result; for example, the latitude and longitude at the centroid of the overlapping area may be used.
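The narrowing-down by overlapping virtual circles can be sketched as below. The sketch assumes distances in kilometers, a spherical Earth with a mean radius of 6371 km, great-circle (haversine) distance as the circle metric, and a place name database represented as a dictionary from name to (latitude, longitude). All names and the sample data are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def places_in_overlap(circles, place_db):
    """Names of places lying inside every virtual circle.

    circles: list of ((lat, lon), radius_km) pairs, one per (Ti, Di).
    place_db: dict mapping place name to (lat, lon).
    """
    return [
        name
        for name, position in place_db.items()
        if all(haversine_km(center, position) <= radius
               for center, radius in circles)
    ]
```

Each additional sign, that is, each additional pair (Ti, Di), adds a circle and can only shrink the set of candidate place names.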

The information output unit 24 records the place name character information obtained by the estimation unit 23 in the image database in the storage unit 12, as tag information representing the shooting location and in association with the image information that was processed, thereby updating the image database.

When the estimation unit 23 has extracted only one pair of character strings Li and Di representing a place name and a distance, only one virtual circle is obtained, so no overlapping area exists. In this case, the estimation unit 23 may refer to the place name database and acquire the character information of place names associated with latitude and longitude values lying within a predetermined range of the circumference of the virtual circle. Alternatively, the estimation unit 23 may output latitude and longitude values on the circumference of the virtual circle (possibly values at multiple points on the circumference) directly as information representing the shooting location of the image information.
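Sampling latitude and longitude values on the circumference of a single virtual circle can be sketched with the standard great-circle destination-point formula. The spherical Earth radius, the number of sample points, and the function names are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def point_on_circle(center, radius_km, bearing_deg):
    """Point reached from center (lat, lon in degrees) along a bearing."""
    lat1, lon1 = math.radians(center[0]), math.radians(center[1])
    angular = radius_km / EARTH_RADIUS_KM
    bearing = math.radians(bearing_deg)
    lat2 = math.asin(math.sin(lat1) * math.cos(angular)
                     + math.cos(lat1) * math.sin(angular) * math.cos(bearing))
    lon2 = lon1 + math.atan2(
        math.sin(bearing) * math.sin(angular) * math.cos(lat1),
        math.cos(angular) - math.sin(lat1) * math.sin(lat2))
    return math.degrees(lat2), math.degrees(lon2)


def circumference_points(center, radius_km, count=8):
    """Evenly spaced (lat, lon) points on the circumference of the circle."""
    return [point_on_circle(center, radius_km, 360.0 * i / count)
            for i in range(count)]
```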

Furthermore, the estimation unit 23 may ignore, without using them in the process, partial images Pi whose character recognition results Ci contain no character string (digits) representing a distance.

For a sign that carries a route marker (R), as illustrated in (a2) of FIG. 4, the estimation unit 23 may recognize the character string representing the route contained in the route marker and additionally use that recognition result to estimate the shooting location of the image information.

Specifically, when this processing is performed, the control unit 11 selects, from the regions in the region database generated by the recognition processing unit 22, those whose outer shape is one predetermined as denoting a route (in Japan, for example, an inverted triangle or a hexagon). The recognition processing unit 22 then counts, among the pixels in each such region, those extracted in step (S1) of FIG. 5; that is, it counts the pixels inside the contour whose values correspond to the background color of a road sign. For each region in which this count exceeds a predetermined threshold (which may differ from the threshold used for rectangles), the estimation unit 23 performs character recognition. In Japan, a numeral shown in white inside an inverted triangle is a national road number, and a numeral shown in white inside a hexagon is the number of a local road such as a prefectural road. For a prefectural road, the prefecture name is displayed as well.

The result of character recognition by the estimation unit 23 here is therefore either or both of:
(1) a numeric character string recognized from a region associated with information indicating an inverted-triangle shape; and
(2) a numeric character string and a prefecture name (region name) character string recognized from a region associated with information indicating a hexagonal shape.
Because there are sections where a national road and a local road coincide, if both (1) and (2) are recognized, the image was captured in such a shared section. The estimation unit 23 obtains latitude and longitude information within the shared section, acquires the place name character string associated with that latitude and longitude information or with the nearest latitude and longitude information, and uses it as the estimation result for the shooting location. This further improves the accuracy of the estimate.

The estimation unit 23 treats (1) a numeric character string recognized from a region associated with an inverted-triangle shape as a national road number, and (2) a numeric character string and prefecture name recognized from a region associated with a hexagonal shape as a road of the prefecture named by that character string, with the recognized number being its road number.
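
The shape-based interpretation of route markers can be written as a small lookup. The sketch below is illustrative only; the `Route` type, the shape labels, and `interpret_route_marker` are hypothetical names, and the shape classification itself is assumed to have been done upstream.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    kind: str                        # "national" or "prefectural"
    number: int
    prefecture: Optional[str] = None


def interpret_route_marker(shape, digits, prefecture=None):
    """Map a recognized marker shape and its OCR text to a road identity.

    shape:      "inverted_triangle" or "hexagon" (labels assumed for this sketch)
    digits:     numeric string recognized inside the marker
    prefecture: prefecture name recognized with a hexagonal marker, if any
    """
    if shape == "inverted_triangle":
        return Route("national", int(digits))
    if shape == "hexagon":
        return Route("prefectural", int(digits), prefecture)
    return None  # any other shape is not treated as a route marker here
```

If both a national and a prefectural `Route` are returned for the same image, the photograph would, following the text, have been taken on a section shared by the two roads.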

In this way the estimation unit 23 obtains, from the image information, information identifying a road. It then estimates the shooting location on the assumption that the photographer was on the road so identified. Specifically, as illustrated in FIG. 7, the estimation unit 23 uses the place names and distances recognized from the image information to generate a plurality of virtual circles on the map, and finds the segments (which may be curved) into which the recognized national or prefectural road is cut by the circumferences of those circles. Referring to the place name database, it acquires the character information of the place names associated with latitude and longitude values that lie on the shortest of these segments or within a predetermined distance of it. The shortest segment is chosen so as to keep the distance to the center of each virtual circle as small as possible. Character information for several place names may be acquired. The place names represented by the acquired character information are the estimation result for the shooting location.

In the example of FIG. 7, a virtual circle A with a radius of 10 km centered on “Nihonbashi” and a virtual circle B with a radius of 7 km centered on “Hibiya” are generated on the map. When the virtual circles A and B partially overlap as in FIG. 7, the road R on which the photographer is assumed to have been is divided into a portion r1 between one side of circle A and one side of circle B, a portion r2 between that side of circle B and the other side of circle A, and a portion r3 between the other side of circle A and the other side of circle B. The shortest segment here is r3, so the estimation unit 23 takes a point contained in r3 as the estimation result. As already described, a latitude and longitude value on this segment (for example, at its midpoint) may also be used as the estimation result.
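
The cutting of a road by circle circumferences can be sketched in a simplified planar setting. The code below assumes the road is a polyline and that all coordinates are already projected to kilometers on a flat plane, which is an assumption of this sketch and not something the text specifies. `cut_positions` and `shortest_cut_segment` are hypothetical names.

```python
import math


def cut_positions(road, circles):
    """Arc-length positions (km) at which the polyline `road` crosses a circumference.

    road:    list of (x, y) vertices in planar km coordinates
    circles: list of ((cx, cy), radius_km)
    """
    cuts, travelled = [], 0.0
    for (x1, y1), (x2, y2) in zip(road, road[1:]):
        dx, dy = x2 - x1, y2 - y1
        length = math.hypot(dx, dy)
        a = dx * dx + dy * dy
        for (cx, cy), r in circles:
            fx, fy = x1 - cx, y1 - cy
            b = 2 * (fx * dx + fy * dy)
            c = fx * fx + fy * fy - r * r
            disc = b * b - 4 * a * c
            if a == 0 or disc < 0:
                continue  # degenerate edge, or the edge's line misses the circle
            root = math.sqrt(disc)
            for t in ((-b - root) / (2 * a), (-b + root) / (2 * a)):
                if 0 <= t <= 1:
                    cuts.append(travelled + t * length)
        travelled += length
    return sorted(cuts)


def shortest_cut_segment(road, circles):
    """Return (start, end) arc-length positions of the shortest piece between cuts."""
    cuts = cut_positions(road, circles)
    pieces = [(s, e) for s, e in zip(cuts, cuts[1:]) if e - s > 1e-9]
    return min(pieces, key=lambda p: p[1] - p[0]) if pieces else None
```

With a straight road and two partially overlapping circles this yields three pieces, corresponding to r1, r2, and r3 in the text; the midpoint of the returned piece could then serve as the estimated position.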

When the image information being processed further contains a region whose outer shape is an image indicating a direction, such as an arrow, and the number of pixels in that region whose values correspond to the background color of a road sign exceeds a predetermined threshold (which may differ from the thresholds for other outer shapes), the control unit 11 may recognize the direction the image indicates (for an arrow, the direction in which it points) and estimate the shooting direction from it.

Specifically, when a region whose outer shape is an arrow is present and the number of pixels in it whose values correspond to the background color of a road sign exceeds a predetermined threshold (which may differ from the thresholds for other outer shapes), the recognition processing unit 22 of the control unit 11 determines the direction of the arrow represented by that region (called the arrow region of interest) as follows. First, the recognition processing unit 22 rotates the image information to the orientation it had at the time of shooting. Well-known processing can be used for this; for example, if the image information contains Exif information, its Orientation field can be consulted.

With the rotation corrected, the recognition processing unit 22 classifies the direction of the arrow region of interest (each of them, if there are several) into, for example, four directions (up, left, right, down) or eight directions (up, upper left, left, lower left, down, lower right, right, upper right). Well-known methods, such as a classifier obtained by a learning process, can be used for this classification.

The recognition processing unit 22 generates a rectangle circumscribing the arrow region of interest. For an arrow region classified as up, left, right, or down, it takes as the point-of-interest coordinates the midpoint of the side of the circumscribed rectangle that lies in the classified direction, that is, the side toward which the arrow points (for an arrow region classified as up, the top side of the rectangle). For an arrow region classified as upper left, lower left, lower right, or upper right, it takes as the point-of-interest coordinates the vertex of the circumscribed rectangle that lies in the classified direction, that is, the vertex toward which the arrow points.
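
The choice of point-of-interest coordinates reduces to a table lookup on the circumscribed rectangle. The following sketch assumes image coordinates with y increasing downward and a bounding box given as (left, top, right, bottom); the direction labels and the name `attention_point` are hypothetical.

```python
def attention_point(bbox, direction):
    """Point of the circumscribed rectangle that lies in the arrow's direction.

    bbox:      (left, top, right, bottom) in image coordinates, y growing downward
    direction: one of the eight labels used as keys below
    """
    left, top, right, bottom = bbox
    cx, cy = (left + right) / 2, (top + bottom) / 2
    points = {
        # four axis-aligned directions: midpoint of the facing side
        "up": (cx, top), "down": (cx, bottom),
        "left": (left, cy), "right": (right, cy),
        # four diagonal directions: the facing vertex
        "upper_left": (left, top), "upper_right": (right, top),
        "lower_left": (left, bottom), "lower_right": (right, bottom),
    }
    return points[direction]
```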

When the recognition processing unit 22 has extracted arrow regions of interest and their point-of-interest coordinates, the estimation unit 23 finds, for each character string obtained by character recognition of the regions stored in the region database, the arrow region of interest whose point-of-interest coordinates are closest to the centroid of that character string. It then stores the character string in the storage unit 12 in association with the direction into which the arrow region so found was classified.
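
Pairing each recognized character string with its nearest arrow is a nearest-neighbor search over point-of-interest coordinates. The sketch below assumes Euclidean distance in image pixels; the data layout and the name `associate_directions` are hypothetical.

```python
import math


def associate_directions(texts, arrows):
    """Attach to each recognized string the direction of its nearest arrow.

    texts:  dict mapping recognized string -> (x, y) centroid of the string
    arrows: list of (direction_label, (x, y) point-of-interest coordinates)
    """
    if not arrows:
        return {}
    result = {}
    for text, centroid in texts.items():
        direction, _ = min(arrows, key=lambda arrow: math.dist(centroid, arrow[1]))
        result[text] = direction
    return result
```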

The estimation unit 23 then refers to the place name database to look up the place name represented by, for example, the character string associated with the upward direction, and estimates that the photographer was facing toward that place (that is, that the shooting direction is the direction of that place).

In this case, the information output unit 24 records the estimated shooting direction, together with the place name character information obtained by the estimation unit 23, in the image database of the storage unit 12 as tag information representing the shooting location and shooting direction of the processed image information, in association with that image information, and updates the image database.

The image processing apparatus 1 of the present embodiment is basically configured as described above and operates as follows. From the image information imported from an SD card or the like and accumulated in the storage unit 12, the image processing apparatus 1 selects the image information of a photograph for which no shooting location is recorded, and performs road sign recognition on the selected image information. When a road sign is recognized, character strings representing place names and distances are acquired from it by character recognition. For example, if the road sign illustrated in (a1) of FIG. 4 has been photographed, pairs of character strings Li, Di representing place names and distances such as the following are acquired:
L1: “Kokubunji”, D1: “4”
L2: “Chofu”, D2: “5”
L3: “Tachikawa”, D3: “7”
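
Turning the recognized strings Li, Di into numeric input for the circle construction might look like the sketch below. It assumes the OCR stage already delivers (place, distance) string pairs and that sign distances are in kilometers; entries without a numeric distance are dropped, in line with the option of ignoring partial images that contain no distance. `parse_sign_entries` is a hypothetical name.

```python
import re


def parse_sign_entries(recognized):
    """Convert OCR string pairs into (place, distance_km) tuples.

    recognized: list of (place_text, distance_text) pairs from character recognition
    """
    entries = []
    for place, distance_text in recognized:
        match = re.search(r"\d+(?:\.\d+)?", distance_text or "")
        if place and match:
            entries.append((place.strip(), float(match.group())))
    return entries
```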

Referring to the place name database, the image processing apparatus 1 sets on the map a virtual circle of radius D1 (“4” kilometers) centered on the latitude and longitude corresponding to L1 (“Kokubunji”). If there are several places named “Kokubunji” in different regions, a virtual circle may be set around the latitude and longitude of each. The image processing apparatus 1 treats the other recognition results in the same way, setting a virtual circle of radius D2 (“5” kilometers) centered on the latitude and longitude corresponding to L2 (“Chofu”) and a virtual circle of radius D3 (“7” kilometers) centered on the latitude and longitude corresponding to L3 (“Tachikawa”).

The image processing apparatus 1 then finds the region where the virtual circles overlap; that is, for the area inside each virtual circle, it finds any part that is also inside another virtual circle. Here it finds the region where the virtual circles centered on “Kokubunji”, “Chofu”, and “Tachikawa” intersect one another (FIG. 8). Suppose, for example, that such a region exists near “Fuchu” in “Tokyo”. FIG. 8 shows the place names registered in the place name database. The image processing apparatus 1 therefore estimates that this image information was shot in “Fuchu, Tokyo”, and stores the resulting shooting location information in the storage unit 12 as tag information associated with the image information.

The present embodiment is not limited to the examples described so far. In the description above, the shooting location is estimated separately for each piece of image information. However, if the shooting location can be estimated for one of several pieces of image information shot within, say, about 15 minutes, the others can be presumed to have been shot at roughly the same location (on the assumption that the distance that can be covered in 15 minutes is not large). The control unit 11 of the image processing apparatus 1 of the present embodiment may therefore perform the following processing.

After selecting the image information to be processed, the control unit 11 refers to its shooting date and time, and acquires from the storage unit 12 other image information captured within a predetermined time range that includes that shooting date and time. The control unit 11 may restrict this to other image information that was captured within the time range and with the same camera as the image information being processed.
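
Selecting the companion images can be sketched as a filter on shooting time and, optionally, camera. The record layout (dictionaries with `taken_at` and `camera` keys), the symmetric 15-minute window, and the name `companions` are assumptions made for this illustration; the text only requires a predetermined time range that includes the shooting time.

```python
from datetime import datetime, timedelta


def companions(target, images, window=timedelta(minutes=15), same_camera=True):
    """Other images shot within `window` of `target`, optionally by the same camera.

    target, images[i]: dicts with "taken_at" (datetime) and "camera" (str) keys
    """
    return [img for img in images
            if img is not target
            and abs(img["taken_at"] - target["taken_at"]) <= window
            and (not same_camera or img["camera"] == target["camera"])]
```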

For the image information being processed and for each piece of other image information so acquired, the control unit 11 performs road sign recognition and acquires, by character recognition, the character strings representing place names and distances from the recognized road signs. For the character strings obtained from all of these pieces of image information, the control unit 11 sets on the map virtual circles centered on the latitude and longitude corresponding to each place name, and finds the region where the virtual circles overlap. Referring to the place name database, the control unit 11 acquires the place names associated with latitudes and longitudes inside that region and estimates that the acquired place name is the shooting location. The control unit 11 stores the resulting shooting location information in the storage unit 12 as tag information associated with the image information being processed. The same shooting location information may also be stored in the storage unit 12 as tag information associated with the other image information acquired at this time.

Alternatively, the control unit 11 may perform road sign recognition and character recognition of place names and distances on the image information being processed and on each piece of other acquired image information, and estimate a shooting location separately for each piece of image information from the place names and distances obtained from it. In this case a different shooting location may be estimated for each piece of image information. The control unit 11 may then, for example, compute the average of the latitudes and longitudes of the shooting locations estimated for the individual pieces of image information, acquire from the place name database the place name associated with the latitude and longitude closest to that average, and estimate that this place name is the shooting location.
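
Averaging the per-image estimates and snapping to the nearest registered place can be sketched as follows. A plain arithmetic mean of latitude and longitude is used, which is reasonable only for points that are close together and away from the poles and the 180° meridian; that restriction, the dictionary gazetteer, and the name `consensus_place` are assumptions of this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def consensus_place(estimates, gazetteer):
    """Place name nearest to the mean of several (lat, lon) estimates."""
    lat = sum(p[0] for p in estimates) / len(estimates)
    lon = sum(p[1] for p in estimates) / len(estimates)
    return min(gazetteer, key=lambda name: haversine_km((lat, lon), gazetteer[name]))
```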

In addition to the road signs illustrated so far, the image processing apparatus 1 of the present embodiment may recognize, for example, utility poles and station name boards, recognize the place names, station names, telephone numbers, and the like that appear on them, and use these in estimating the shooting location.

Some signboards attached to utility poles, for example, state a distance together with a character string representing a place, such as “XX Clinic, 30 m ahead”. The control unit 11 may therefore recognize, in the image information being processed, an image portion containing a columnar object, and acquire by character recognition a character string representing a place and a character string representing a distance from within that portion.

The character string representing a place may be, for example, the name of a facility such as “XX Clinic”, or a telephone number. The control unit 11 acquires the latitude and longitude corresponding to such a character string by referring to a database of the kind used in car navigation systems. The character string representing a distance is recognized together with its unit (meters “m” or kilometers “km”). Several such databases may exist, one per creation date. In that case the control unit 11 refers to the shooting date and time of the image information being processed, selects the database whose creation date is closest to it, and uses the selected database to obtain the latitude and longitude corresponding to the character string representing the place. This accommodates cases where, for example, the correspondence between the name of a shop serving as a landmark and its latitude and longitude changes over time.
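
Two small steps from this paragraph lend themselves to a sketch: reading a distance together with its unit, and picking the database snapshot closest in date to the photograph. Both functions below are illustrative; the snapshot layout (a list of `(created, table)` tuples) and the names `distance_to_km` and `select_database` are hypothetical. Full-width digits and letters are folded to ASCII with Unicode NFKC normalization before matching.

```python
import re
import unicodedata
from datetime import date


def distance_to_km(text):
    """Extract a distance with unit 'm' or 'km' from text and return it in km."""
    normalized = unicodedata.normalize("NFKC", text).lower()
    match = re.search(r"(\d+(?:\.\d+)?)\s*(km|m)", normalized)
    if match is None:
        return None
    value = float(match.group(1))
    return value if match.group(2) == "km" else value / 1000


def select_database(databases, shot_on):
    """Pick the (created, table) snapshot whose creation date is closest to shot_on."""
    return min(databases, key=lambda db: abs(db[0] - shot_on))
```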

Thereafter, as in the case where place name and distance character strings are recognized from a road sign, the control unit 11 sets on the map a virtual circle centered on the latitude and longitude corresponding to the recognized character string representing a place, with a radius determined by the character string representing a distance. Here too, when a plurality of virtual circles are obtained from the image information being processed (or from other image information captured within a predetermined time range including its shooting date and time), the region where the virtual circles overlap may be found. Because this region is likely to be smaller than the spacing at which place names are distributed in the place name database, the place name registered in the place name database in association with the latitude and longitude closest to the region may be acquired as the estimation result for the shooting location.

When several overlapping regions are found, the region in which the largest number of virtual circles overlap may be selected, and the place name registered in the place name database in association with the latitude and longitude closest to the selected region acquired as the estimation result. When no overlapping region exists, each virtual circle may be enlarged by a predetermined radius; if overlapping regions then arise, the region in which the largest number of virtual circles overlap is selected, and the place name registered in association with the latitude and longitude closest to it may be acquired as the estimation result. Alternatively, when no overlapping region exists, the place name registered in association with the latitude and longitude closest to the midpoint of the line segment joining the centers of the virtual circles, or to the centroid of the polygon formed by joining those centers, may be acquired as the estimation result. In these cases too, latitude and longitude information within the selected region, such as its centroid, may be acquired directly as the estimation result; and when the midpoint of the segment or the centroid of the polygon is used, the latitude and longitude of that point may be acquired as the estimation result.
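
The fallback chain described here (most-overlapped region, then enlarged circles, then the centroid of the centers) can be approximated as below. Note the simplification: instead of constructing overlap regions geometrically, the sketch only evaluates how many circles cover each gazetteer entry, and it uses the mean of the circle centers in place of the polygon centroid. These shortcuts, the dictionary gazetteer, and the names `estimate_place` and `grow_km` are assumptions of this sketch, not the method of the text.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def estimate_place(circles, gazetteer, grow_km=0.0):
    """Pick a place name using overlap count, then enlargement, then center mean.

    circles:   list of ((lat, lon), radius_km)
    gazetteer: dict mapping place name -> (lat, lon)
    grow_km:   amount added to every radius if nothing overlaps at first
    """
    def cover(pos, extra):
        # number of (possibly enlarged) circles that contain pos
        return sum(haversine_km(c, pos) <= r + extra for c, r in circles)

    for extra in (0.0, grow_km):
        best = max(gazetteer, key=lambda name: cover(gazetteer[name], extra))
        if cover(gazetteer[best], extra) >= 2:
            return best  # lies where at least two circles overlap

    # last resort: place nearest to the mean of the circle centers
    lat = sum(c[0] for c, _ in circles) / len(circles)
    lon = sum(c[1] for c, _ in circles) / len(circles)
    return min(gazetteer, key=lambda name: haversine_km((lat, lon), gazetteer[name]))
```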

When only one virtual circle is obtained from the image information being processed (or from other image information captured within a predetermined time range including its shooting date and time), the control unit 11 may acquire, as the estimation result for the shooting location, the place names registered in the place name database in association with latitudes and longitudes within a predetermined range of the circumference of that virtual circle.

The description so far has assumed that the image information being processed is a still image, but it may also be a moving image. The image information of a moving image is equivalent to still images arranged in order of shooting time, so still images may be extracted from the moving image at predetermined time intervals, each treated as image information to be processed, and a shooting location estimate obtained for each. Alternatively, every still image in the moving image may be examined, those containing at least a predetermined number of pixels whose values are close to the background color of a road sign extracted as image information to be processed, and a shooting location estimate obtained for the extracted image information.
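
The second frame-selection strategy, keeping only frames with enough pixels near the sign background color, can be sketched with plain Python lists. Frames are modeled as flat lists of (R, G, B) tuples; the reference color, tolerance, and minimum count are placeholder values chosen for illustration, and the function names are hypothetical.

```python
def looks_like_sign_frame(pixels, sign_color, tolerance, min_count):
    """True if at least `min_count` pixels are within `tolerance` of `sign_color`."""
    def close(pixel):
        return all(abs(a - b) <= tolerance for a, b in zip(pixel, sign_color))
    return sum(1 for pixel in pixels if close(pixel)) >= min_count


def select_frames(frames, sign_color=(0, 80, 160), tolerance=40, min_count=3):
    """Indices of frames worth passing on to road sign recognition.

    frames: list of frames, each a flat list of (r, g, b) tuples
    The default color and thresholds are illustrative assumptions only.
    """
    return [index for index, pixels in enumerate(frames)
            if looks_like_sign_frame(pixels, sign_color, tolerance, min_count)]
```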

In these cases, the union of the shooting location estimates obtained for the pieces of image information extracted from the moving image is generated, and the shooting location information contained in this union is stored in the storage unit 12 in association with the image information of the moving image. The speed of movement during shooting can also be estimated by using the shooting dates and times of the extracted pieces of image information. Furthermore, for image information that lies between pieces of image information extracted from the moving image and given a shooting location estimate (estimated image information) but was not itself extracted (an unestimated image), the estimated image information with the nearest shooting date and time before it and the nearest after it may be retrieved, and the shooting location of the unestimated image estimated, from the shooting locations estimated for those two, on the assumption that it lies between them.

For example, suppose that, among the estimated image information shot before the shooting time T of the unestimated image, the nearest has shooting time T1 and estimated latitude and longitude (LAT1, LON1), and that, among the estimated image information shot after T, the nearest has shooting time T2 and estimated latitude and longitude (LAT2, LON2), so that T1 < T < T2. The shooting location of the unestimated image information can then be estimated from these values.

[Expression omitted in the source text]
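
The expression itself does not survive in this text. One natural reading of "between the two estimated locations", given the quantities T1, T, T2, (LAT1, LON1), and (LAT2, LON2), is linear interpolation in time; the sketch below implements that reading as an assumption, not as the expression of the original publication. `interpolate_location` is a hypothetical name.

```python
from datetime import datetime


def interpolate_location(t, t1, pos1, t2, pos2):
    """Linearly interpolate (lat, lon) at time t between (t1, pos1) and (t2, pos2).

    Assumes t1 < t < t2 and that the two positions are close enough for
    straight-line interpolation of latitude and longitude to be acceptable.
    """
    if not t1 < t < t2:
        raise ValueError("t must lie strictly between t1 and t2")
    weight = (t - t1) / (t2 - t1)  # fraction of the interval elapsed at t
    lat = pos1[0] + weight * (pos2[0] - pos1[0])
    lon = pos1[1] + weight * (pos2[1] - pos1[1])
    return lat, lon
```

Under this assumption the interpolated point moves from (LAT1, LON1) toward (LAT2, LON2) in proportion to elapsed time, which corresponds to constant speed between the two estimated frames.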

DESCRIPTION OF SYMBOLS
1 image processing apparatus, 11 control unit, 12 storage unit, 13 operation unit, 14 display unit, 15 communication unit, 16 input/output interface, 21 image information acquisition unit, 22 recognition processing unit, 23 estimation unit, 24 information output unit

Claims

An image processing apparatus comprising:
means for acquiring image information to be processed;
recognition processing means for recognizing, in the acquired image information to be processed, a region containing a plurality of sets each consisting of a character string representing a place and a character string representing a distance;
estimating means for performing character recognition, on the recognized region, of the character string representing a place and the character string representing a distance included in each set, and for narrowing down and estimating the range of the shooting location of the image information to be processed based on the character recognition result; and
means for recording the estimated shooting location information in an image database in association with the image information to be processed,
wherein the estimating means sets on a map a plurality of virtual circles, each centered on the latitude and longitude corresponding to the character string representing a place included in one of the sets and having a radius determined by the corresponding character string representing a distance, and estimates, as the shooting location information, the place name registered in a place name database in association with the latitude and longitude closest to the region where the plurality of virtual circles overlap.

The image processing apparatus according to claim 1, wherein
the recognition processing means further recognizes a character string representing a route from the acquired image information to be processed, and
the estimating means estimates the shooting location of the image information to be processed by further using the character recognition result of the recognized character string representing the route.

An image processing apparatus comprising:
means for acquiring image information to be processed;
recognition processing means for recognizing, in the acquired image information to be processed, a region including a character string representing a place and a character string representing a distance;
estimating means for performing character recognition, from the recognized region, on the character string representing a place and the character string representing a distance, and for estimating the shooting location of the image information to be processed based on the character recognition result; and
means for outputting information on the estimated shooting location,
wherein the recognition processing means further recognizes a character string representing a route from the acquired image information to be processed, and
the estimating means estimates the shooting location of the image information to be processed by further using the character recognition result of the recognized character string representing the route.

An image processing apparatus comprising:
means for acquiring image information to be processed;
recognition processing means for recognizing, in the acquired image information to be processed, a region including a plurality of sets each consisting of a character string representing a place and a character string representing a distance;
estimating means for performing character recognition, from the recognized region, on the character string representing a place and the character string representing a distance included in each set, and for narrowing down and estimating the range of the shooting location of the image information to be processed based on the character recognition result; and
means for outputting information on the estimated shooting location,
wherein the recognition processing means further recognizes a character string representing a route from the acquired image information to be processed, and
the estimating means estimates the shooting location of the image information to be processed by further using the character recognition result of the recognized character string representing the route.

The image processing apparatus according to any one of claims 1 to 4, wherein
the recognition processing means recognizes, from the image information to be processed, an image portion including a columnar body as a candidate for a region including a character string representing a place and a character string representing a distance, and
searches the recognized image portion for a region including a character string representing a place and a character string representing a distance.

The image processing apparatus according to any one of claims 1 to 5, wherein
the recognition processing means further recognizes, when the image information to be processed includes an image indicating a direction, the direction indicated by that image,
the image processing apparatus further comprising means for estimating a shooting direction based on the direction recognized by the recognition processing means.

The image processing apparatus according to any one of claims 1 to 6, further comprising
means for acquiring, with reference to the shooting date and time of the image information to be processed, other image information captured within a predetermined time range including that shooting date and time, wherein
the recognition processing means recognizes, from the other image information, at least one region including a character string representing a place and a character string representing a distance, and
the estimating means performs character recognition on the character strings in the recognized region, estimates the shooting location of the other image information based on the character recognition result, and estimates the shooting location of the image information to be processed by further using the estimation result for the shooting location of the other image information.

A program for causing a computer to function as:
means for acquiring image information to be processed;
recognition processing means for recognizing, in the acquired image information to be processed, a region including a plurality of sets each consisting of a character string representing a place and a character string representing a distance;
estimating means for performing character recognition, from the recognized region, on the character string representing a place and the character string representing a distance included in each set, and for narrowing down and estimating the range of the shooting location of the image information to be processed based on the character recognition result; and
means for recording information on the estimated shooting location in an image database in association with the image information to be processed,
wherein, when functioning as the estimating means, the computer sets, on a map, a plurality of virtual circles, each centered on the latitude and longitude corresponding to the character string representing a place included in one of the sets of character strings and having a radius determined by the corresponding character string representing a distance, and estimates, as the shooting location information, the place name registered in a place name database in association with the latitude and longitude closest to the overlapping area of the plurality of virtual circles.
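
By way of illustration only, the virtual-circle estimation recited above might be sketched as follows. The sketch rests on assumptions that do not appear in the claims: each recognized set is given as a center (lat, lon) with a radius in km, the place name database is a plain dictionary, distances are great-circle (haversine) distances, and "closest to the overlapping area" is approximated by the largest distance by which a candidate falls outside any circle (zero when the candidate lies inside every circle). The names `haversine_km` and `estimate_place_name` are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1 = map(radians, p)
    lat2, lon2 = map(radians, q)
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def estimate_place_name(circles, place_names):
    """Pick the registered place name nearest the overlap of all virtual circles.

    circles:     list of ((lat, lon), radius_km), one per recognized place/distance set
    place_names: dict mapping place name -> (lat, lon)  (stand-in for the database)
    """
    def overshoot(point):
        # Largest distance (km) by which the point lies outside any circle;
        # 0.0 means the point is inside the overlap of all circles.
        return max(max(0.0, haversine_km(point, center) - radius)
                   for center, radius in circles)

    # Ties (several names inside the overlap) resolve to the first name encountered.
    return min(place_names, key=lambda name: overshoot(place_names[name]))
```

With two circles of radius 120 km centered at (0, 0) and (0, 2), a candidate at (0, 1) lies inside both circles and is selected over candidates at (0, -1) and (0, 3), each of which lies inside only one circle.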