JP2013108933A - Information terminal device

Info

Publication number: JP2013108933A
Application number: JP2011256091A
Authority: JP
Inventors: Haruhisa Kato; 晴久加藤; Akio Yoneyama; 暁夫米山
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2011-11-24
Filing date: 2011-11-24
Publication date: 2013-06-06
Anticipated expiration: 2031-11-24
Also published as: JP5773436B2

Abstract

PROBLEM TO BE SOLVED: To provide an information terminal device capable of controlling display information in accordance with the size of an imaging object by estimating the size simply and with high accuracy. SOLUTION: An information terminal device 1 includes an imaging section 2, an estimation section 3 for estimating the size of an imaging object, a display section 6 for displaying information, a storage section 5 for storing information to be displayed in the display section 6, and a control section 4 for reading information relating to the estimated size from the storage section 5 and controlling it. The estimation section 3 includes a tag detection section for detecting a tag of known size and shape provided on one face of the imaging object, a region extraction section for extracting the region of the imaging object formed around the tag, a posture estimation section for estimating the posture of the detected tag relative to the imaging section 2 on the basis of a comparison between the arrangement of the detected tag and a predetermined arrangement of the tag, and a size estimation section for estimating, based on the estimated posture, the size of the one face of the extracted imaging object and the size of the depth part obtained by excluding the one face.

Description

The present invention relates to an information terminal device that presents information, and more particularly to an information terminal device that can estimate the size of an imaging target and control related information.

A device that measures the size of an imaging target from an image can take the measurement simply and quickly and can present related information linked to the imaging target, which improves user convenience. The following techniques for realizing this have been published.

Patent Document 1 proposes a technique for measuring the three-dimensional coordinates of feature points of an article by stereo measurement using two cameras. Patent Document 2 proposes a technique for estimating the size of an object from the inputs of a plurality of sensors scattered according to a Poisson process.

Patent Document 1: JP 2011-123051 A
Patent Document 2: JP 2009-63554 A

The technique of Patent Document 1 requires a stereo camera, which limits the devices on which it can be used. Mounting a stereo camera also makes it difficult to make the device smaller, less power-hungry, and cheaper.

The technique of Patent Document 2 requires a plurality of sensors, which likewise limits the devices on which it can be used. Moreover, the sensors must be scattered in a distributed manner, and the accuracy depends on how they are scattered.

An object of the present invention is to solve the above problems and to provide an information terminal device that can estimate the size and the like of an imaging target simply and with high accuracy and can reliably control the information displayed on a display unit.

To achieve the above object, the present invention has, as a first feature, an information terminal device comprising an imaging unit that images an imaging target, an estimation unit that estimates the size of the imaging target, a display unit that displays information, a storage unit that stores information to be displayed on the display unit, and a control unit that reads information related to the estimated size from the storage unit and controls it, wherein the estimation unit includes: a tag detection unit that detects a tag of known size and shape provided on one face of the imaging target; a region extraction unit that extracts the region of the imaging target formed around the detected tag; a posture estimation unit that estimates the posture of the tag relative to the imaging unit based on a comparison between the arrangement of the detected tag and a predetermined arrangement of the known shape; and a size estimation unit that estimates, based on the known size and the estimated posture, the size of the one face in the extracted region of the imaging target and the size of the depth portion of the imaging target relative to the one face, the depth portion being the extracted region with the one face excluded.

As a second feature, the predetermined arrangement is the arrangement obtained when the tag is imaged by the imaging unit from directly in front at a predetermined distance, and the posture estimation unit estimates the posture as the planar projective transformation relating the arrangement of the detected tag to the predetermined arrangement.

According to the first feature, a tag of known size and shape is prepared in advance and provided on an arbitrary imaging target having one face that is a planar region; the size of the imaging target can then be estimated and the display information controlled merely by analyzing an image of the imaging target captured by an imaging unit configured as an ordinary camera. That is, the invention can be realized in software or the like without requiring a special camera or special sensors, so the size of the imaging target can be estimated and the display information controlled simply.

According to the second feature, using the planar projective transformation allows the posture to be estimated simply and with high accuracy, so the size of the imaging target can also be estimated simply and with high accuracy.

FIG. 1 is a diagram schematically showing one application example of the present invention.
FIG. 2 is a functional block diagram of an information terminal device according to an embodiment of the present invention.
FIG. 3 is a functional block diagram of the estimation unit.
FIG. 4 is a flowchart of the size estimation processing of the estimation unit according to one embodiment.
FIG. 5 is a diagram illustrating examples of tags.
FIG. 6 is a diagram illustrating posture estimation by the posture estimation unit.
FIG. 7 is a diagram illustrating size estimation by the size estimation unit.
FIG. 8 is a diagram illustrating a supplementary point on calculating the planar projective transformation matrices used to obtain faces of the imaging target other than the front face.
FIG. 9 is a flowchart of the size estimation processing of the estimation unit according to one embodiment.
FIG. 10 is a diagram supplementing the flow of FIG. 9.
FIG. 11 is a diagram illustrating acquisition of the distance between the imaging unit and the imaging target, which the size estimation unit can obtain as accompanying information.

The present invention is described in detail below with reference to the drawings. FIG. 1 schematically shows one application example of the present invention. According to the present invention, a rectangular tag T2 is provided in advance on one of the faces of a rectangular-parallelepiped imaging target T1, the target is imaged by the imaging unit of the information terminal device, and the captured image as shown in FIG. 1 (with everything other than the imaging target omitted) is analyzed, whereby the size of the imaging target T1, namely its vertical length L1, horizontal length L2, and depth L3, can be estimated automatically.

Furthermore, the information terminal device can display information associated with that size. In one example, the imaging target T1 is a cardboard box for pickup, the tag T2 is a pickup slip affixed to the box, and the information displayed in association with the size is the shipping fee corresponding to the size of the box.

The user of the information terminal device images the imaging target T1 so that the edges corresponding to L1, L2, and L3, that is, the vertical, horizontal, and depth edges, are visible in the image. In FIG. 1 the tag T2 is placed on the front face (a) of the rectangular parallelepiped, but it may instead be placed on either of the two side faces (b1) or (b2).

FIG. 2 is a functional block diagram of an information terminal device according to an embodiment of the present invention. The information terminal device 1 includes an imaging unit 2, an estimation unit 3, a storage unit 5, a control unit 4, and a display unit 6. FIG. 3 is a functional block diagram of the estimation unit 3. The estimation unit 3 includes a tag detection unit 31, a region extraction unit 32, a posture estimation unit 33, and a size estimation unit 34.

In FIG. 3, the tag detection unit 31 and the region extraction unit 32 are shown as a functional block group 30. This indicates that there are two embodiments with respect to processing order: one in which the tag detection unit 31 runs first and the region extraction unit 32 then uses its result, and, conversely, one in which the region extraction unit 32 runs first as the front stage and the tag detection unit 31 runs as the rear stage. The former is called Embodiment A and the latter Embodiment B. As described later, Embodiment A first detects the tag from the whole image and then detects the imaging-target region around the tag, whereas Embodiment B first detects the imaging-target region from the whole image and then detects the tag inside that region.

Under the user's operation, the imaging unit 2 captures an image in which the vertical, horizontal, and depth extents of the imaging target appear, as illustrated in FIG. 1, and passes the captured image to the estimation unit 3. The information terminal device 1 may be a mobile terminal, with the digital camera provided as standard on that terminal serving as the imaging unit 2. Alternatively, the information terminal device 1 may be a desktop or laptop computer, with a separately connected digital camera serving as the imaging unit 2. The connection may be made over a network.

The estimation unit 3 analyzes the captured image to estimate the size of the imaging target. For this purpose, the estimation unit 3 holds in advance information on a tag whose size, shape, and so on are known, as the reference for judging the size of the imaging target. In the estimation, the tag detection unit 31 detects the tag from the image, and the region extraction unit 32 extracts the region of the imaging target from the image. The two are related in that the region of the imaging target is formed around the detected tag.

In estimating the size, the posture estimation unit 33 estimates the posture of the tag and the imaging target relative to the imaging unit 2. This posture is estimated by comparing the shape of the detected tag in the image with the known shape of the tag.

The size estimation unit 34 uses the estimated posture to obtain images corresponding to a three-view drawing of the imaging target. For each of the three views, it divides the on-image length of the imaging target by the on-image length of the tag, which appears in one of the views, and multiplies this ratio by the known actual length of the tag to obtain the actual length of the imaging target in that view, thereby estimating the size of the imaging target. Images corresponding to a three-view drawing are obtained when the imaging target is formed of mutually perpendicular faces. The estimation processing is described in detail later for Embodiments A and B.

The storage unit 5 stores in advance a plurality of pieces of information related to the estimated size of the imaging target and/or to the detected tag. For example, with pickup slips as the tags, it can store information for distinguishing each carrier's pickup slip by shape and/or color features, and, with pickup cardboard boxes as the imaging targets, it can store each carrier's shipping fee for each box size.

When information is displayed on the display unit 6, the control unit 4 reads from the storage unit 5 the information corresponding to the size of the imaging target estimated by the estimation unit 3 and/or to the detected tag, and controls the display information so as to process it accordingly. The display information may include the image captured by the imaging unit 2. For example, on an image of a pickup cardboard box bearing a given carrier's pickup slip, the carrier's size class and shipping fee corresponding to the estimated size can be displayed as an overlay.
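
As a non-limiting sketch of how the control unit 4 might select stored information from an estimated size, the following Python fragment looks up a size class and fee for a box. The carrier key `carrier_x`, the size limits, the fees, and the rule of classifying by the sum L1 + L2 + L3 are all hypothetical placeholders introduced for illustration; none of them come from this document.

```python
from bisect import bisect_left

# Hypothetical contents of storage unit 5: per-carrier size classes.
# Each entry is (max of L1 + L2 + L3 in cm, class label, fee); all values are placeholders.
FEE_TABLE = {
    "carrier_x": [(60, "60", 800), (80, "80", 1000), (100, "100", 1300)],
}


def lookup_fee(carrier, l1_cm, l2_cm, l3_cm):
    """Return (size class, fee) for the smallest class that fits, or None if oversized."""
    classes = FEE_TABLE[carrier]
    total = l1_cm + l2_cm + l3_cm
    limits = [limit for limit, _, _ in classes]
    i = bisect_left(limits, total)  # first class whose limit is >= total
    if i == len(classes):
        return None
    _, label, fee = classes[i]
    return label, fee
```

The returned label and fee would then be overlaid on the captured image by the display unit 6.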

FIG. 4 is a flowchart of the estimation processing of the estimation unit 3 according to Embodiment A, in which tag detection (step S1), posture estimation (step S2), region extraction (step S3), and size estimation (step S4) are executed in that order. Step S2 requires the result of step S1, step S3 requires the result of step S1, and step S4 requires the results of steps S1, S2, and S3. Steps S2 and S3 may therefore be executed in either order.
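
The dependency relation among steps S1 to S4 stated above can be checked mechanically. The following sketch, whose names `DEPENDS_ON` and `is_valid_order` are introduced here only for illustration, encodes which results each step needs and confirms that S2 and S3 may be run in either order.

```python
# Results each step of Embodiment A needs before it can run.
DEPENDS_ON = {
    "S1": set(),
    "S2": {"S1"},
    "S3": {"S1"},
    "S4": {"S1", "S2", "S3"},
}


def is_valid_order(order):
    """Return True if every step in `order` runs after all steps it depends on."""
    done = set()
    for step in order:
        if not DEPENDS_ON[step] <= done:
            return False
        done.add(step)
    return len(done) == len(DEPENDS_ON)
```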

(Step S1)
The tag detection unit 31 detects the tag from the image of the imaging target. Tag detection consists of detecting the region in which the tag exists within the whole image and detecting the orientation of the tag in that region. FIG. 5 illustrates examples of tags.

As shown in (1), the tag may be a rectangle bounded by a region R1 whose color features are distinguishable from the background color of the imaging target, such as cardboard, shown hatched. Binarizing the image according to the color features of region R1 detects R1 as the region surrounding the tag, and line segments can then be obtained from the boundary of R1. In addition, an asymmetric pattern such as a character string is provided in the region R2 inside the detected tag, and the orientation of the tag can be detected by analyzing that pattern. Detecting the orientation makes it possible to identify which edge of the known tag each of the edges S1, S2, S3, and S4 in the image corresponds to. Alternatively, instead of the four edges, one may likewise identify which vertex of the known tag each of the four vertices P1, P2, P3, and P4, obtained as the intersections of the four edges, corresponds to.
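
A minimal sketch of the binarization described above is given below, assuming the image is available as rows of (R, G, B) tuples and that the color of region R1 is known. The helper names and the per-channel tolerance are assumptions, and picking vertices as extreme pixels is a simple heuristic substituted here for fitting line segments to the boundary; it is adequate only when the tag is not strongly rotated in the image.

```python
def binarize(image, target_rgb, tol):
    """Return a mask that is True where every channel is within tol of target_rgb."""
    return [[all(abs(c - t) <= tol for c, t in zip(px, target_rgb)) for px in row]
            for row in image]


def corner_candidates(mask):
    """Pick four extreme mask pixels as rough estimates of the vertices (x, y).

    Returned order: top-left, top-right, bottom-right, bottom-left in image
    coordinates (x to the right, y downward). Returns None for an empty mask.
    """
    pts = [(x, y) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
    if not pts:
        return None
    top_left = min(pts, key=lambda p: p[0] + p[1])
    bottom_right = max(pts, key=lambda p: p[0] + p[1])
    top_right = max(pts, key=lambda p: p[0] - p[1])
    bottom_left = min(pts, key=lambda p: p[0] - p[1])
    return [top_left, top_right, bottom_right, bottom_left]
```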

The asymmetric pattern for identifying the orientation may be asymmetric by shape, as shown in (2), asymmetric by color features, as shown in (3), or a combination of the two. As examples, (2) shows a compass symbol, and (3) shows red, orange, yellow, and green dots placed at the vertices of a rhombus. Such an arrangement makes it possible to identify the four edges S1 to S4 or the four vertices P1 to P4 in (1).
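
One way the colored dots of (3) could resolve the tag orientation is sketched below. It assumes the dots have already been detected and listed in clockwise order starting from an arbitrary image slot, and that the vertex list follows the same cyclic order; the reference order `REFERENCE` and both function names are hypothetical.

```python
# Assumed clockwise order of the dot colors on the upright tag.
REFERENCE = ["red", "orange", "yellow", "green"]


def rotation_steps(detected):
    """Return k such that detected[i] == REFERENCE[(i + k) % 4] for all i, else None."""
    for k in range(4):
        if all(detected[i] == REFERENCE[(i + k) % 4] for i in range(4)):
            return k
    return None


def relabel_vertices(vertices, detected):
    """Reorder image vertices so that index j corresponds to vertex j of the known tag."""
    k = rotation_steps(detected)
    if k is None:
        return None  # colors do not match any rotation (e.g. a mirrored view)
    return [vertices[(j - k) % 4] for j in range(4)]
```

For instance, if the dots are read as yellow, green, red, orange, the tag is turned by two steps, and the vertex found in image slot 2 is the one belonging to the red dot.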

Also, as shown in (4), so that region R1 can be detected reliably by the above binarization whatever the color features of the background of the imaging target, the tag may include a region R0 as a rectangular outer frame around region R1, with color features chosen to give strong contrast between R1 and R0, and region R1 may be detected on that basis.

More generally, any tag may be used as long as at least four line segments or four points in a known arrangement can be identified in the image, with the segments or the points distinguished from one another. The four edges S1 to S4 or the four points P1 to P4 of the rectangular tag shown in (1) are one example. The four edges S1 to S4 or the four points P1 to P4 may themselves be made distinguishable from one another, for example by having different colors.

More general examples are shown in (5) for line segments and in (6) for points. As shown in (5), a tag may consist of (at least) four line segments D1 to D4, distinguishable from one another through asymmetry in their arrangement and/or color, placed inside or on the boundary of a region R3 of arbitrary shape that is distinguishable from the background. The segments D1 to D4 need not be given as actual line segments; anything from which a line segment can be detected at that location will do, for example only the two end points of each segment. If they are distinguishable from the background, region R3 may be omitted and the four segments D1 to D4 placed directly on the imaging target.

Likewise, as shown in (6), a tag may consist of (at least) four points P10 to P40, distinguishable from one another through asymmetry in their arrangement and/or color, placed inside or on the boundary of a region R3 of arbitrary shape that is distinguishable from the background, as in (5).

(Step S2)
The posture estimation unit 33 estimates the posture of the tag relative to the imaging unit based on a comparison between the arrangement of the tag in the image, as detected by the tag detection unit 31, and a predetermined arrangement of the tag, whose shape is known in advance.

FIG. 6 illustrates the posture estimation. (1) is an example of the predetermined arrangement of a tag of known shape. Taking a rectangular tag as an example of the known shape, the predetermined arrangement is given as information specifying, in pixel coordinates, the positions of the tag's four edges SA1 to SD1 or four vertices PA1 to PD1 when the tag is imaged from directly in front at a predetermined distance from the imaging unit 2.

(2) is the arrangement of the tag in the image as detected by the tag detection unit 31; because the tag was imaged in a tilted posture, its shape is distorted relative to the shape in (1). Using the asymmetry described for (2) and (3) of FIG. 5, the four edges SA2 to SD2 corresponding respectively to the four edges SA1 to SD1 in (1), or the four points PA2 to PD2 corresponding respectively to the four points PA1 to PD1 in (1), have been individually identified in (2) of FIG. 6, and their positions in image coordinates have been obtained.

When line segments are used, as shown in (3), the posture estimation unit 33 estimates the posture of the imaged tag by obtaining the planar projective transformation matrix H that maps the four segments SA1 to SD1 of the known frontal arrangement in (1) to the four segments SA2 to SD2 in (2). How to obtain it is well known in the field of projection and transformation of figures and is not described here; as shown conceptually in (3), the matrix H is obtained as the correspondence between the parameters representing the segments. At least four segment correspondences are needed to obtain H, but more than four may be used.

When points are used, the posture estimation unit 33 likewise estimates the posture of the imaged tag, as shown in (4), by obtaining the planar projective transformation matrix H that maps the four points PA1 to PD1 of the known frontal arrangement in (1) to the four points PA2 to PD2 in (2). The parametric representation of points and the method of obtaining H are well known. At least four points are used, and more than four may be used.
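
Since the computation of H is left to well-known methods, the following is only one such method, sketched under stated assumptions: a direct linear solve from exactly four point correspondences with the normalization h33 = 1 (which excludes the rare case where the true h33 is zero). The function names are introduced for illustration; with more than four points a least-squares solution would be used instead.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [mr - f * mc for mr, mc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_points(src, dst):
    """Return the 3x3 matrix H (h33 = 1) mapping each src (x, y) to its dst (u, v)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u * (h31 x + h32 y + 1) = h11 x + h12 y + h13, and likewise for v
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def apply_homography(h, pt):
    """Map a 2D point through H using homogeneous coordinates."""
    x, y = pt
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

Here `src` would hold the points PA1 to PD1 of the frontal arrangement and `dst` the detected points PA2 to PD2, in matching order.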

(Step S3)
The region extraction unit 32 assumes that the region bordering the tag detected by the tag detection unit 31, for example a frame of predetermined width around the tag, is of roughly uniform color, and extracts as the region of the imaging target the region that matches that color feature under a predetermined criterion and contains the tag. In the example of FIG. 1, the imaging-target region T1 surrounding the tag region T2 is extracted starting from the detected tag region T2.
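
The region extraction of step S3 could be realized, for example, by region growing. The sketch below assumes the tag is given as an axis-aligned bounding box in an image stored as rows of (R, G, B) tuples, takes the mean color of a one-pixel ring around the tag as the color feature, and uses a per-channel tolerance `tol` as the predetermined criterion; these choices and the name `extract_region` are assumptions made for illustration.

```python
from collections import deque


def extract_region(image, tag_box, tol):
    """Grow the imaging-target region outward from the tag.

    image: rows of (r, g, b) tuples; tag_box: (x0, y0, x1, y1), inclusive.
    Returns the set of (x, y) pixels of the region, tag pixels included.
    """
    h, w = len(image), len(image[0])
    x0, y0, x1, y1 = tag_box
    region = {(x, y) for y in range(y0, y1 + 1) for x in range(x0, x1 + 1)}
    # One-pixel ring just outside the tag, clipped to the image.
    ring = [(x, y) for y in range(y0 - 1, y1 + 2) for x in range(x0 - 1, x1 + 2)
            if 0 <= x < w and 0 <= y < h and (x, y) not in region]
    if not ring:
        return region
    ref = [sum(image[y][x][c] for x, y in ring) / len(ring) for c in range(3)]

    def similar(x, y):
        return all(abs(image[y][x][c] - ref[c]) <= tol for c in range(3))

    queue = deque(p for p in ring if similar(*p))
    region.update(queue)
    while queue:  # breadth-first flood fill over 4-connected neighbors
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < w and 0 <= ny < h and (nx, ny) not in region and similar(nx, ny):
                region.add((nx, ny))
                queue.append((nx, ny))
    return region
```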

Under this uniform-color assumption, an imaging target of any color can be extracted, even if its actual color is unknown, as long as its color features are roughly uniform. If the color features of the imaging target are known, they may be used in the region extraction.

(Step S4)
The size estimation unit 34 estimates the size of the imaging target as seen from each direction, based on the known size of the tag and the posture of the tag relative to the imaging unit 2 estimated by the posture estimation unit 33. FIG. 7 illustrates the size estimation.

The imaging target appears in the image so that the front face (A1), on which the tag is placed, and the side faces (B1) and (C1) are visible. The region of the imaging target in the image, (A1)∪(B1)∪(C1), has been obtained by the region extraction unit 32. As shown in (1), the posture of the tag in the image relative to the tag in the frontal arrangement has been obtained by the posture estimation unit 33 as the planar projective transformation matrix H.

Because the tag lies on one face (A1) of the rectangular-parallelepiped imaging target, the size estimation unit 34 can, as shown in (2), apply the inverse H_a of the matrix H in (1) directly to the image to obtain the front-view image (A2) of the imaging target. That is, the front view (A2) is obtained as the image of the region (A1)∪(B1)∪(C1) transformed by the matrix H_a (= H⁻¹). In the front view (A2), the vertical length L1 and the horizontal length L2 of the imaging target can be measured as on-image lengths.
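
The following sketch illustrates measuring L1 and L2 in the front view, under the simplifying assumption that the four image corners of face (A1) have already been located and are listed in order around the face, starting with the two ends of a horizontal edge. Rather than warping the whole image, it maps only those corners with H_a = H⁻¹. The function names are hypothetical.

```python
import math


def invert_3x3(m):
    """Invert a 3x3 matrix via its adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]


def apply_homography(h, pt):
    """Map a 2D point through a 3x3 homography."""
    x, y = pt
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def front_view_lengths(h, face_corners_img):
    """Return (L1, L2) of face (A1) in front-view pixels, using H_a = inverse(H)."""
    h_a = invert_3x3(h)
    p = [apply_homography(h_a, c) for c in face_corners_img]
    l2 = math.dist(p[0], p[1])  # horizontal edge
    l1 = math.dist(p[1], p[2])  # vertical edge
    return l1, l2
```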

The size estimation unit 34 further obtains images (B2) and (C2), in which the two side faces (B1) and (C1) adjoining the tag-bearing front face (A1) are each seen from directly in front, and from these images obtains the depth L3 as an on-image length. The images (A2), (B2), and (C2) thus stand in the relationship of front view, side view, and plan view in third-angle projection.

In obtaining (B2) and (C2), which correspond to the side view and the plan view, the following relationship is used. As is well known mathematically, an arbitrary planar projective transformation matrix H' can in general be expressed, using the internal parameters A of the imaging unit 2 (the camera intrinsic matrix) and the matrix (r₁ r₂ t) formed from the first and second column vectors r₁, r₂ of a rotation matrix and a translation vector t, by the following matrix equation.
H' = A (r₁ r₂ t)

Accordingly, by expressing the planar projective transformation matrix H_a shown in (2) in the above form and additionally reflecting, in the rotation-matrix part, the rotation from the front view (A2) to the side view (B2), the matrix H_b that yields (B2) can be obtained as shown in (3). Similarly, by additionally reflecting the rotation from the front view (A2) to the plan view (C2), the matrix H_c that yields (C2) can be obtained as shown in (4).
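
One possible reading of this construction is sketched below; it is not necessarily the exact procedure intended here. It assumes that the intrinsic matrix A is known, that H maps metric coordinates (X, Y) on the tag plane to image pixels so that H equals A (r₁ r₂ t) up to scale, and that the side face lies in the plane X = x0 and extends in the direction r₃ = r₁ × r₂. Under these assumptions the side face has the homography A (r₃ r₂ t + x0·r₁), and its inverse recovers (Z, Y) on that face, from which the depth can be read. All function names are introduced for illustration.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def invert_3x3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0]]


def decompose(h, a):
    """Split H ~ A (r1 r2 t) into r1, r2 (unit length for an exact H) and t."""
    b = matmul(invert_3x3(a), h)
    cols = [[row[j] for row in b] for j in range(3)]
    scale = 1.0 / math.sqrt(sum(x * x for x in cols[0]))
    if cols[2][2] < 0:  # choose the sign that puts the plane in front of the camera
        scale = -scale
    return tuple([scale * x for x in col] for col in cols)


def side_face_homography(h_front, a, x0=0.0):
    """Homography of the face X = x0, mapping (Z, Y, 1) on that face to image pixels."""
    r1, r2, t = decompose(h_front, a)
    r3 = cross(r1, r2)
    t_side = [ti + x0 * ri for ti, ri in zip(t, r1)]
    m = [[r3[i], r2[i], t_side[i]] for i in range(3)]
    return matmul(a, m)


def apply_homography(h, pt):
    x, y = pt
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

Applying `invert_3x3(side_face_homography(H, A, x0))` to the image corners of the side face gives their (Z, Y) coordinates in the same units as the tag plane, so the extent in Z is the depth L3 under the stated assumptions.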

Finally, the size estimation unit 34 divides the actual size of a predetermined part of the known tag by the on-image length of that part as seen from the front in (A2), and multiplies L1, L2, and L3, obtained as on-image lengths in images (A2), (B2), and (C2), by this ratio to obtain the actual sizes of L1, L2, and L3. If the tag is rectangular, the predetermined part may be, for example, a specific one of its four edges or a diagonal.
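
The final conversion is simple arithmetic, sketched below with hypothetical names. It assumes that the three views share the image scale of the front view (A2) in which the tag is measured.

```python
def real_lengths(tag_real_len, tag_image_len, image_lengths):
    """Scale on-image lengths (L1, L2, L3) to real units using the tag as reference.

    tag_real_len:  known actual length of the chosen part of the tag (e.g. one edge)
    tag_image_len: length of that same part in the front view (A2), in pixels
    image_lengths: on-image lengths of L1, L2, L3, in pixels
    """
    if tag_image_len <= 0:
        raise ValueError("tag length on the image must be positive")
    ratio = tag_real_len / tag_image_len  # real units per pixel
    return [ratio * length for length in image_lengths]
```

For example, if a tag edge known to be 10 cm long spans 40 pixels in the front view, each pixel corresponds to 0.25 cm, and on-image lengths of 200, 160, and 120 pixels correspond to 50 cm, 40 cm, and 30 cm.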

FIG. 8 illustrates a supplementary point regarding how H_b and H_c are obtained. Suppose that applying the matrix H_a, which turns the tag to face the front, to an image of the imaging target such as (1) yields a "front view" in the form of a rectangle (A4) tilted in the xy coordinates of the image, as in (2). In that case, applying rotations about the x-axis and y-axis does not yield the side view (B3) and plan view (C3) in which the depth L3 appears correctly. When rotations are applied, the direction of the L2 edge is therefore used as the axis to obtain (B3), and the direction of the L1 edge is used as the axis to obtain (C3). Alternatively, the state in (2) may first be transformed into a state (A5), as in (3), in which L2 is parallel to the x-axis and L3 is parallel to the y-axis, and then rotated about the x-axis and y-axis.
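
A rotation about an edge direction, rather than about a coordinate axis, can be built with Rodrigues' rotation formula. The sketch below is a generic implementation offered for illustration; the function name is an assumption, and the axis would be taken from the direction of the L1 or L2 edge in the front view.

```python
import math


def rotation_about_axis(axis, angle):
    """Return the 3x3 rotation by `angle` radians about `axis` (Rodrigues' formula)."""
    n = math.sqrt(sum(a * a for a in axis))
    if n == 0:
        raise ValueError("axis must be non-zero")
    x, y, z = (a / n for a in axis)
    c, s = math.cos(angle), math.sin(angle)
    k = 1.0 - c
    return [[c + x * x * k, x * y * k - z * s, x * z * k + y * s],
            [y * x * k + z * s, c + y * y * k, y * z * k - x * s],
            [z * x * k - y * s, z * y * k + x * s, c + z * z * k]]
```

For a front view tilted by some angle in the image plane, the axis passed in would be the tilted edge direction, for example (cos φ, sin φ, 0) for an edge at angle φ to the x-axis.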

Furthermore, the imaging target in Example A is not limited to a rectangular parallelepiped as described with FIG. 7 and may be, for example, a pyramidal solid, including a cone. In this case, the tag is provided on the flat base, and the imaging target is imaged so that both the base and a side face appear. In addition, the angular relationship between the base and the side face (the angle they form with each other) is assumed to be known in advance so that H_b and H_c can be obtained from the matrix H_a of FIG. 7. Applying this angular relationship yields a figure of the side face viewed from the front. The rectangular parallelepiped shown in FIG. 7 is an example in which this angle is a right angle.

FIG. 9 is a flowchart of the estimation process of the estimation unit 3 according to Example B, in which region extraction (step S10), tag detection (step S20), posture estimation (step S30), and size estimation (step S40) are executed in sequence. Here, steps S30 and S40 are the same as steps S2 and S4 of Example A, respectively.

FIG. 10 is an explanatory diagram of Example B. In Example B, the imaging target has a roughly uniform color feature that is known in advance, and its shape is limited to a roughly rectangular parallelepiped. A tag with a predetermined pattern known in advance is attached to one face of the rectangular parallelepiped, but which face it is attached to, where on that face, and in which orientation are arbitrary.

(Step S10)
The region extraction unit 32 extracts, as the region of the imaging target, the region that matches the color feature of the imaging target under a predetermined criterion and is the largest such region in the captured image. The region extraction unit 32 then approximates the outer periphery of the extracted region with straight lines to obtain a hexagon, namely the hexagon x1 to x6 shown in (1) of FIG. 10. The premises here are that the imaging target appears sufficiently large in the image, so that it is extracted as the largest region, and that the imaging target is a rectangular parallelepiped whose edges corresponding to height, width, and depth are all visible, so that a hexagon is obtained as its outer periphery.

As shown in (2), the region extraction unit 32 further computes three ridge lines x1-x0, x3-x0, and x5-x0 by edge detection inside the extracted region, and obtains the vertex x0 as the intersection of these edges. Here, the constraint that x1, x3, and x5 are alternate vertices of the hexagon and that the ridge lines all reach the common vertex x0 is used to exclude detected edges that are not ridge lines and so obtain the correct ridge lines. The ridge lines divide the region of the imaging target into three regions (A1), (B1), and (C1), one for each face.
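One way to compute the common vertex x0 from three noisy ridge lines is a least-squares line intersection. The sketch below is an assumption for illustration, not the method stated in the specification: it supposes each ridge line is available as a hexagon vertex plus an edge direction, and the name `intersect_lines` and the sample coordinates are hypothetical.

```python
import math


def intersect_lines(lines):
    """Least-squares intersection of 2D lines.

    Each line is ((ax, ay), (dx, dy)): a point on the line and a direction.
    Returns the point minimizing the sum of squared distances to all lines.
    """
    s11 = s12 = s22 = b1 = b2 = 0.0
    for (ax, ay), (dx, dy) in lines:
        norm = math.hypot(dx, dy)
        if norm == 0:
            raise ValueError("direction vector must be non-zero")
        dx, dy = dx / norm, dy / norm
        # Projector onto the line's normal direction: I - d d^T
        m11, m12, m22 = 1.0 - dx * dx, -dx * dy, 1.0 - dy * dy
        s11 += m11
        s12 += m12
        s22 += m22
        b1 += m11 * ax + m12 * ay
        b2 += m12 * ax + m22 * ay
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("lines are parallel; no unique intersection")
    # Solve the symmetric 2x2 system by Cramer's rule.
    return ((s22 * b1 - s12 * b2) / det, (s11 * b2 - s12 * b1) / det)


# Hypothetical ridge lines from vertices x1, x3, x5, all passing through (1, 2).
x0 = intersect_lines([
    ((0.0, 0.0), (1.0, 2.0)),
    ((4.0, 2.0), (-1.0, 0.0)),
    ((1.0, 7.0), (0.0, -1.0)),
])
print(x0)
```

With real edge detections the three lines rarely meet exactly, which is why a least-squares point is used here rather than the intersection of any single pair.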

(Step S20)
If the tag is formed of line segments, the tag detection unit 31 applies a Hough transform to the edge components of each of the three divided regions and detects, in one of the regions, the set of line segments corresponding to the tag pattern. These are at least four line segments, each of which can be identified individually owing to the asymmetry of their arrangement. In the example of (2), a tag with a rectangular pattern is detected in region (A1).

(Step S30)
The posture estimation unit 33 estimates the posture of the detected tag in the same manner as in Example A.

(Step S40)
Using the correction by planar projective transformation that makes the face (A1) on which the tag was detected the front of the image, the size estimation unit 34 obtains the size of the imaging target in the same manner as in Example A. That is, it obtains the actual dimensions of the height L1, width L2, and depth L3 in (2) of FIG. 10.

In step S20, the tag in Example B need not be a combination of line segments; a tag like that of Example A, such as a combination of points, may be used instead. When a combination of line segments is used for the tag, the tag can be detected in step S20 from among the edges that were rejected as ridge lines in the edge detection result of step S10, so no separate processing based on color features or the like is needed, and the processing is simplified.

The following introduces supplementary matters of the present invention that apply to both Examples A and B.

The tag detection unit 31 can be configured to switch among and detect a plurality of different tags. In this case, a predetermined set of tags is prepared such that each tag can be distinguished by its shape, size, color, or a combination of these. Which tag is in use may be specified manually by the user, or the tag detection unit 31 may try the detection method for each tag and automatically select the tag that is detected successfully.

In this case, the storage unit 5 can store different information for each tag. For example, each tag may be the collection slip of a particular delivery company, and information on that company's fee structure can be stored. The control unit 4 can read the information corresponding to the type of the detected tag from the storage unit 5 and cause the display unit 6 to display it.

The tag itself may be something like a collection slip that is separately affixed, as a flat article, to the cardboard box being imaged, or it may be formed as a pattern or the like drawn on one face of the cardboard box. As long as its size, shape, and so on are known, it may also be a pattern or the like projected by a laser pointer or similar device.

FIG. 11 illustrates how the size estimation unit 34 can obtain the distance D between the imaging unit 2 and the imaging target as accompanying information other than the size of the imaging target. Based on the lens formula, the distance D can be obtained from the following equation.
D = f L / n

Here, L is the actual length of a predetermined part of the tag, and n is the length of the image of that part formed on the imaging surface of the imaging unit 2. That is, n is the length obtained by multiplying the number of pixels occupied by the image of the predetermined part of the tag by the predetermined pixel pitch (distance between adjacent pixels) of the imaging unit 2. f is the focal length of the imaging unit 2. The control unit 4 can also cause the display unit 6 to display the distance thus obtained.
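As a worked illustration of D = f L / n, the following Python sketch computes n from a pixel count and pixel pitch and then the distance. The function name, parameter names, and all numeric values are hypothetical, chosen only so that the arithmetic is easy to follow; all lengths are assumed to be in millimetres.

```python
def distance_to_target(focal_length_mm, tag_real_len_mm, tag_pixels, pixel_pitch_mm):
    """Estimate the camera-to-target distance D = f * L / n.

    n is the length of the tag's image on the sensor:
    the number of pixels it spans multiplied by the pixel pitch.
    """
    n = tag_pixels * pixel_pitch_mm
    if n <= 0:
        raise ValueError("tag image length must be positive")
    return focal_length_mm * tag_real_len_mm / n


# Hypothetical values: f = 4 mm, L = 100 mm, tag spans 200 px at 0.002 mm/px.
d = distance_to_target(4.0, 100.0, 200, 0.002)
print(f"D is about {d:.0f} mm")
```

Here n is 0.4 mm, so D comes out at roughly 1000 mm, that is, about one metre.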

DESCRIPTION OF SYMBOLS: 1 ... information terminal device, 2 ... imaging unit, 3 ... estimation unit, 4 ... control unit, 5 ... storage unit, 6 ... display unit, 31 ... tag detection unit, 32 ... region extraction unit, 33 ... posture estimation unit, 34 ... size estimation unit

Claims

1. An information terminal device comprising: an imaging unit that captures an imaging target; an estimation unit that estimates the size of the imaging target; a display unit that displays information; a storage unit that stores information to be displayed on the display unit; and a control unit that reads information related to the estimated size from the storage unit and controls it,
wherein the estimation unit includes:
a tag detection unit that detects a tag of known size and shape provided on one surface of the imaging target;
a region extraction unit that extracts a region of the imaging target formed around the detected tag;
a posture estimation unit that estimates a posture of the tag with respect to the imaging unit based on a comparison between the arrangement of the detected tag and a predetermined arrangement of the known shape; and
a size estimation unit that estimates, based on the known size and the estimated posture, the size of the one surface in the extracted region of the imaging target and the size of a depth portion, relative to the one surface, of the imaging target obtained by excluding the one surface from the extracted region.

2. The information terminal device according to claim 1, wherein the tag has a predetermined color feature at its boundary portion, and
the tag detection unit detects the tag by binarizing an image based on the predetermined color feature.

3. The information terminal device according to claim 1, wherein the imaging target has a substantially uniform predetermined color feature, and
the region extraction unit extracts, as the region of the imaging target, a region whose color feature is substantially the same as that of a predetermined region circumscribing the detected tag.

4. The information terminal device according to claim 1, wherein the predetermined arrangement is the arrangement obtained when the tag is imaged from the front at a predetermined distance, and
the posture estimation unit estimates the posture as a planar projective transformation relationship between the arrangement of the detected tag and the predetermined arrangement.

5. The information terminal device according to claim 4, wherein the size estimation unit estimates the size of the one surface after converting the imaging target, based on the estimated planar projective transformation relationship, into a state in which the one surface is imaged from the front.

6. The information terminal device according to claim 5, wherein, in the imaging target, the angular relationship between the one surface and the surface along the depth portion is known, and
the size estimation unit estimates the size of the depth portion after converting the imaging target, using the estimated planar projective transformation relationship and the known angular relationship, into a state in which the surface along the depth portion is imaged from the front.

7. The information terminal device according to any one of claims 1 to 6, wherein the imaging target is a substantially rectangular parallelepiped object having a substantially uniform predetermined color feature, and the tag is a predetermined pattern attached to one face of the rectangular parallelepiped,
the region extraction unit extracts, as the region of the imaging target, the largest region having a color feature substantially the same as the predetermined color feature, obtains a hexagon by approximating the outer periphery of the extracted region with straight lines, and detects three line segments extending from every other vertex of the hexagon to a single point inside the extracted region, and
the tag detection unit detects, as the tag, the predetermined pattern found in one of the three regions into which the three line segments divide the region of the imaging target.

8. The information terminal device according to claim 1, wherein the tag detection unit is configured to distinguish and detect a plurality of different predetermined tags,
the storage unit stores information related to each of the plurality of different tags, and
the control unit reads information corresponding to the detected tag from the storage unit and controls it.

9. The information terminal device according to claim 1, wherein the size estimation unit further estimates the distance between the imaging unit and the imaging target based on the number of pixels occupied by the detected tag in the imaging unit, the known size of the tag, and the pixel pitch and focal length of the imaging unit.