JP6814775B2

JP6814775B2 - Collection device, collection method and collection program

Info

Publication number: JP6814775B2
Application number: JP2018188870A
Authority: JP
Inventors: 秉哲 ▲裴▼; 翔太斎藤
Original assignee: 東芝インフォメーションシステムズ株式会社
Priority date: 2018-10-04
Filing date: 2018-10-04
Publication date: 2021-01-20
Anticipated expiration: 2038-10-04
Also published as: JP2020057299A

Description

The present invention relates to a collection device, a collection method, and a collection program.

Machine learning for image recognition uses images in which the object to be recognized fills the angle of view. Collecting such images requires photographing the object from various angles and then cropping each image.

In addition, AR (Augmented Reality) technology makes it possible to determine the tilt of a captured image and the position of an object appearing in the image (see, for example, Patent Document 1 or Patent Document 2).

Patent Document 1: Japanese Unexamined Patent Publication No. 2008-249407
Patent Document 2: International Publication No. 2014/061372

However, with the conventional techniques it can be difficult to collect images of a specific object efficiently. For example, because images are cropped manually, the work takes an enormous amount of time as the number of required images grows.

The collection device of the embodiment includes an imaging unit, an identification unit, and an acquisition unit. The imaging unit captures an image. The identification unit identifies the position and orientation of the imaging unit based on a marker in a captured image, which is an image captured by the imaging unit. The acquisition unit acquires the captured image in association with parameters indicating the position and orientation identified by the identification unit.

FIG. 1 is a diagram for explaining a usage scene of the collection device according to the first embodiment.
FIG. 2 is a diagram showing an example of the configuration of the collection device according to the first embodiment.
FIG. 3 is a diagram for explaining the collected data according to the first embodiment.
FIG. 4 is a diagram for explaining the virtual camera according to the first embodiment.
FIG. 5 is a diagram for explaining the orthographic views according to the first embodiment.
FIG. 6 is a diagram showing an example of collected images according to the first embodiment.
FIG. 7 is a flowchart showing the processing flow of the collection device according to the first embodiment.

Embodiments of the collection device, collection method, and collection program according to the present application are described in detail below with reference to the drawings. The present invention is not limited to the embodiments described below.

(First Embodiment)
First, a usage scene of the collection device according to the first embodiment is described with reference to FIG. 1. FIG. 1 is a diagram for explaining a usage scene of the collection device according to the first embodiment. The collection device 10 is assumed to have an imaging function for capturing images. The collection device 10 is, for example, a smartphone with a camera.

As shown in FIG. 1, the user uses the collection device 10 to capture images of an object 20, which is the imaging target. A marker 30 is placed near the object 20. The marker 30 is a marker that the collection device 10 uses to execute AR. For example, the marker 30 is a QR (Quick Response) code (registered trademark). The collection device 10 may also execute markerless AR.

The collection device 10 collects data in which a captured image, the internal parameters of the image, and the external parameters obtained by executing AR are associated with one another. The internal parameters are, for example, the focal length, image center, resolution, and distortion coefficient. The external parameters are the position and orientation of the camera identified by executing AR. In addition, the collection device 10 may acquire metadata such as the capture date and time, location, aperture, exposure time, and sensor sensitivity as parameters.

The collection device 10 may also collect images obtained by processing the captured images. For example, the collection device 10 collects cropped versions of the captured images. In this case, the collection device 10 can perform the cropping based on the external parameters.

In this way, the collection device 10 collects images together with the internal and external parameters. Processing such as cropping can then be performed using the internal or external parameters to extract, from a collected image, the region in which the target object appears. The collection device 10 can therefore collect images of a specific object efficiently.

(Configuration of the collection device of the first embodiment)
The configuration of the collection device 10 is described with reference to FIG. 2. FIG. 2 is a diagram showing an example of the configuration of the collection device according to the first embodiment. As shown in FIG. 2, the collection device 10 includes a camera 11, a storage unit 12, and a control unit 13. The camera 11 is an example of the imaging unit.

The storage unit 12 is a storage device such as an HDD (Hard Disk Drive), SSD (Solid State Drive), optical disk, RAM (Random Access Memory), flash memory, or NVSRAM (Non Volatile Static Random Access Memory). The storage unit 12 stores the OS (Operating System) and the programs executed by the collection device 10. The storage unit 12 also stores various information used in executing the programs. In addition, the storage unit 12 stores AR information 121 and collected data 122.

The AR information 121 is information for executing AR. The AR information 121 includes an image of the marker 30. The AR information 121 also includes information indicating the position, relative to the marker 30, of the imaging space, which is the place where the object to be imaged is placed. For example, the imaging space is the interior of a solid, such as a sphere, virtually placed at a predetermined position relative to the marker 30. The marker 30 is assumed to be installed so that the imaging space contains the object to be imaged.

The collected data 122 is the data collected by the collection device 10. FIG. 3 is a diagram for explaining the collected data according to the first embodiment. As shown in FIG. 3, the collected data 122 includes a captured image 122a and parameters 122b. The captured image 122a is an image captured by the camera 11. The parameters 122b include the date and time, the latitude and longitude, the external parameters, and the internal parameters. In the example of FIG. 3, the date and time and the latitude and longitude are "2018/8/28 11:58:00" and "35°N 139°E", respectively.

As shown in FIG. 3, the external parameters include a position and angles. The position is the coordinates of the camera 11 at the time of capture. The angles are the rotation angles of the camera 11 with respect to the x-axis, y-axis, and z-axis at the time of capture. The external parameters in the example of FIG. 3 indicate that, when the captured image 122a was captured, the camera 11 was at position (1.5, 2, 2) and its rotation angles with respect to the x-axis, y-axis, and z-axis were (0°, 30°, 30°).

As shown in FIG. 3, the internal parameters include the focal length, resolution, image center, and distortion coefficient. The internal parameters in the example of FIG. 3 indicate that the captured image 122a was captured at a focal length of 50 mm, that its resolution is 1920 × 1080, that its image center is (950, 550), and that its distortion coefficient is 0.01.
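
As a rough illustration, one entry of the collected data 122 in FIG. 3 could be held in a record like the one below. This is only a sketch: the class names, field names, and the file name are hypothetical and do not appear in the embodiment; only the values are taken from the FIG. 3 example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ExternalParameters:
    position: Tuple[float, float, float]      # camera position at capture time
    rotation_deg: Tuple[float, float, float]  # rotation w.r.t. the x, y, z axes


@dataclass(frozen=True)
class InternalParameters:
    focal_length_mm: float
    resolution: Tuple[int, int]               # (width, height) in pixels
    image_center: Tuple[float, float]         # in pixels
    distortion: float


@dataclass(frozen=True)
class CollectedRecord:
    image_path: str                           # hypothetical reference to captured image 122a
    date_time: str
    lat_lon: str
    external: ExternalParameters
    internal: InternalParameters


# Values from the FIG. 3 example; the file name is an assumption.
record = CollectedRecord(
    image_path="captured_122a.png",
    date_time="2018/8/28 11:58:00",
    lat_lon="35°N 139°E",
    external=ExternalParameters((1.5, 2.0, 2.0), (0.0, 30.0, 30.0)),
    internal=InternalParameters(50.0, (1920, 1080), (950.0, 550.0), 0.01),
)
```

Keeping the image reference and both parameter sets in a single immutable record mirrors the association between the image and its parameters that the acquisition step relies on.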

The control unit 13 controls the collection device 10. The control unit 13 is, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field Programmable Gate Array). The control unit 13 functions as, for example, an identification unit 131, an acquisition unit 132, and an extraction unit 133.

The identification unit 131 identifies the position and orientation of the camera 11 based on the marker 30 in the image captured by the camera 11. The identification unit 131 can identify the position and orientation using known AR techniques. The identification unit 131 can also calculate an external parameter matrix representing the identified position and orientation.
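
The embodiment does not state how the external parameter matrix is built. The sketch below shows one common construction, a 3 × 4 world-to-camera matrix [R | t], under clearly labeled assumptions: the three angles are treated as rotations about the x, y, and z axes composed as Rz·Ry·Rx (camera-to-world), and the position is treated as the camera center in marker coordinates. The function names are hypothetical.

```python
import math


def _rot_x(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def _rot_y(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def _rot_z(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def _matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def extrinsic_matrix(position, angles_deg):
    """Return a 3x4 world-to-camera matrix [R | t]."""
    rx, ry, rz = angles_deg
    # ASSUMPTION: camera-to-world rotation composed as Rz * Ry * Rx.
    r_c2w = _matmul(_rot_z(rz), _matmul(_rot_y(ry), _rot_x(rx)))
    # The inverse of a rotation matrix is its transpose.
    r = [[r_c2w[j][i] for j in range(3)] for i in range(3)]
    # t = -R * C, so the camera center C maps to the camera-frame origin.
    t = [-sum(r[i][j] * position[j] for j in range(3)) for i in range(3)]
    return [r[i] + [t[i]] for i in range(3)]


def world_to_camera(ext, point):
    """Apply [R | t] to a 3D point given in marker (world) coordinates."""
    return [sum(ext[i][j] * point[j] for j in range(3)) + ext[i][3]
            for i in range(3)]


# Position and angles from the FIG. 3 example.
ext = extrinsic_matrix((1.5, 2.0, 2.0), (0.0, 30.0, 30.0))
```

With this convention the camera position itself maps to the origin of the camera frame, which is a quick sanity check on any extrinsic matrix. A different AR library may use another angle order or axis convention.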

The acquisition unit 132 acquires the image captured by the camera 11 in association with the parameters indicating the position and orientation identified by the identification unit 131. For example, the acquisition unit 132 acquires the captured image 122a and the parameters 122b and stores them in the storage unit 12 as the collected data 122.

Based on the parameters, the extraction unit 133 extracts, from the image captured by the camera 11, an image of a predetermined region located at a predetermined position relative to the marker 30. The predetermined region here is, for example, the imaging space included in the AR information 121 described above.
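
One simple way to turn such a region into a crop rectangle is to project a virtual sphere into the image with a pinhole model. The sketch below is not the embodiment's method, only an illustration under stated assumptions: the sphere center is already expressed in camera coordinates with the camera looking along +z, the focal length is given in pixels (the FIG. 3 value of 50 mm would need the sensor size to convert, which the source does not give), and the projected outline is approximated by a square of half-width f·r/z. The function name and the numeric values are hypothetical.

```python
import math


def sphere_crop_box(center_cam, radius, f_px, principal_point, resolution):
    """Approximate pixel box (left, top, right, bottom) covering a sphere.

    center_cam: sphere center in camera coordinates (camera looks along +z).
    f_px: focal length in pixels (ASSUMPTION, not given in the source).
    """
    x, y, z = center_cam
    if z <= radius:
        raise ValueError("sphere must lie entirely in front of the camera")
    cx, cy = principal_point
    # Pinhole projection of the sphere center.
    u = cx + f_px * x / z
    v = cy + f_px * y / z
    # First-order approximation of the projected radius.
    half = f_px * radius / z
    width, height = resolution
    # Clamp the box to the image bounds.
    left = max(0, int(math.floor(u - half)))
    top = max(0, int(math.floor(v - half)))
    right = min(width, int(math.ceil(u + half)))
    bottom = min(height, int(math.ceil(v + half)))
    return left, top, right, bottom


# Hypothetical values: sphere of radius 0.5 on the optical axis, 3 units away.
box_near = sphere_crop_box((0.0, 0.0, 3.0), 0.5, 1500.0, (950.0, 550.0), (1920, 1080))
box_far = sphere_crop_box((0.0, 0.0, 6.0), 0.5, 1500.0, (950.0, 550.0), (1920, 1080))
```

The box shrinks as the camera moves away, which is why a plain crop alone does not equalize the apparent size of the object across captures.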

Furthermore, the imaging space can be regarded as the interior of a virtual solid. An orthographic image that frames the virtual solid containing the imaging space therefore shows the object to be imaged. The extraction unit 133 accordingly extracts an orthographic view centered on the solid virtually placed at a predetermined position relative to the marker 30. The extraction unit 133 may change the size and shape of the solid defining the extraction region according to the size and shape of the object to be imaged. The extraction unit 133 also stores the extracted orthographic view in the storage unit 12 as the collected data 122.

An example of the processing of the extraction unit 133 when the imaging space is the interior of a sphere is described with reference to FIG. 4. FIG. 4 is a diagram for explaining the virtual camera according to the first embodiment. As shown in FIG. 4, a virtual sphere 30a is virtually placed at a predetermined position relative to the marker 30. The virtual sphere 30a contains the object 20.

The virtual camera 11a is a virtual orthographic camera that captures an orthographic view centered on the virtual sphere 30a. The extraction unit 133 extracts the image captured by the virtual camera 11a. In other words, the extraction unit 133 extracts the image that the camera 11 would have captured had its position and orientation been the same as those of the virtual camera 11a.
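
The geometry of such an orthographic camera can be sketched as follows. This only maps 3D points to output pixels; it does not resample a photograph, and it is an illustration rather than the embodiment's procedure. Assumptions: the view direction runs from the camera position to the sphere center, the world "up" is the y axis, the output image is square, and the sphere exactly fills the frame. All names are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _normalize(v):
    n = math.sqrt(_dot(v, v))
    if n == 0.0:
        raise ValueError("zero-length vector")
    return tuple(x / n for x in v)


def ortho_project(point, sphere_center, radius, camera_position,
                  out_size, world_up=(0.0, 1.0, 0.0)):
    """Map a 3D point to pixel coordinates of a square orthographic view.

    Raises ValueError if the view direction is parallel to world_up.
    """
    forward = _normalize(_sub(sphere_center, camera_position))
    right = _normalize(_cross(forward, world_up))
    up = _cross(right, forward)
    rel = _sub(point, sphere_center)
    # The scale depends only on the sphere radius, not on camera distance.
    scale = (out_size / 2.0) / radius
    u = out_size / 2.0 + _dot(rel, right) * scale
    v = out_size / 2.0 - _dot(rel, up) * scale  # image v axis points down
    return u, v


# The same point seen from two distances along the same direction.
near = ortho_project((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), 1.0, (0.0, 0.0, 5.0), 200)
far = ortho_project((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), 1.0, (0.0, 0.0, 50.0), 200)
```

Because the scale is fixed by the sphere radius, the point lands on the same pixel regardless of how far away the camera is, while changing the viewing direction changes which side of the object is seen.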

The target object appears largest in the orthographic view when the virtual sphere 30a circumscribes the object 20. If the three-dimensional shape of the object 20 is known in advance, the virtual sphere 30a may be deformed to match that shape. For example, the virtual sphere 30a is replaced with a polyhedron that circumscribes the object 20 and touches it at as many points as possible.

FIG. 5 is a diagram for explaining the orthographic views according to the first embodiment. As shown in FIG. 5, the extraction unit 133 extracts an orthographic view 201b from a captured image 201a, an orthographic view 202b from a captured image 202a, and an orthographic view 203b from a captured image 203a.

In the example of FIG. 5, the object 20 appears at a different size in each of the captured images 201a, 202a, and 203a. In contrast, the object 20 appears at the same size in the orthographic views 201b, 202b, and 203b, while the angle from which it is seen differs in each. The collection device 10 can therefore collect images of the object 20 captured from different angles with the object size made uniform.

As shown in FIG. 6, the collection device 10 can also collect images of the object 20 in which the angle changes little by little. FIG. 6 is a diagram showing an example of collected images according to the first embodiment.

The more varied the angles from which the collected images were captured, the more effective they are likely to be as training data. Conversely, a large number of images captured from the same angle is likely to be of little value as training data relative to the number of images.

The collection device 10 may therefore guide the user holding it to a predetermined position by voice or on-screen display so that images are captured from as many different angles as possible.

Suppose, for example, that positions in the space around the object 20 are expressed, as in FIG. 4, as coordinates (x, y, z) on three mutually perpendicular axes whose origin is the center of the virtual sphere 30a. The collection device 10 counts, among the collected images, the number of orthographic views taken from each of the eight regions into which the space is divided by the planes x = 0, y = 0, and z = 0. The collection device 10 then guides the user to a position from which a captured image can be obtained that yields an orthographic view from a region whose count is at or below a predetermined value.

For example, if the collected images are biased toward views from the right side as seen when looking from the object 20 in FIG. 1 toward the user (the negative z direction), the collection device 10 guides the user to the left side.
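
The counting step above can be sketched as follows. The function names are hypothetical, the camera positions are assumed to be expressed relative to the center of the virtual sphere, and points lying exactly on a dividing plane are assigned to the non-negative side, which the embodiment does not specify.

```python
from collections import Counter

# The eight regions, identified by the sign of x, y, and z.
ALL_OCTANTS = [(sx, sy, sz)
               for sx in (1, -1) for sy in (1, -1) for sz in (1, -1)]


def octant(position, center=(0.0, 0.0, 0.0)):
    """Return the sign triple of the region containing a camera position.

    ASSUMPTION: a coordinate exactly on a dividing plane counts as positive.
    """
    return tuple(1 if p - c >= 0 else -1 for p, c in zip(position, center))


def underrepresented_octants(camera_positions, threshold,
                             center=(0.0, 0.0, 0.0)):
    """List regions whose view count is at or below the threshold."""
    counts = Counter(octant(p, center) for p in camera_positions)
    return [o for o in ALL_OCTANTS if counts[o] <= threshold]


# Hypothetical camera positions of four collected views.
positions = [(1.0, 1.0, 1.0), (2.0, 1.0, 1.0), (1.0, 2.0, 3.0), (-1.0, 1.0, 1.0)]
targets = underrepresented_octants(positions, threshold=1)
```

Here three of the four views come from the region with all coordinates positive, so every other region is returned as a candidate to guide the user toward.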

Using the collected data 122 also makes it easy to perform AR that uses the object 20 itself as a marker. For example, if any of the collected images appears in an image captured for AR, the position and orientation of the camera capturing that AR image, relative to the position directly in front of the object 20, can be identified from the collected images and the collected parameters.

(Processing of the first embodiment)
The processing of the collection device 10 is described with reference to FIG. 7. FIG. 7 is a flowchart showing the processing flow of the collection device according to the first embodiment. First, the collection device 10 captures an image of the object 20 (step S101). The marker 30 is assumed to be placed at a predetermined position near the object 20.

Next, the collection device 10 identifies the position and orientation of the camera 11 based on the marker 30 in the captured image (step S102). The collection device 10 then acquires the captured image and the parameters (step S103). The parameters include the external parameters and the internal parameters.

The collection device 10 further extracts, from the captured image, an image of the region in which the object 20 appears (step S104). For example, the collection device 10 extracts an orthographic view by the method using the virtual camera 11a described above. The collection device 10 stores the extracted image and the parameters in the storage unit 12 (step S105).

If imaging has ended, for example through a user operation (step S106, Yes), the collection device 10 ends the processing. If imaging continues (step S106, No), the collection device 10 returns to step S101 and repeats the processing.
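
The flow of steps S101 to S106 can be outlined as a loop. This is a minimal sketch in which the capture, pose identification, extraction, and end-of-imaging check are passed in as hypothetical callables; it says nothing about how each step is actually implemented, and a Python list stands in for the storage unit 12.

```python
def collect(capture, identify_pose, extract, is_finished, internal_params):
    """Run the S101-S106 loop and return the stored (image, params) pairs."""
    storage = []  # stands in for the storage unit 12
    while True:
        image = capture()                              # S101: capture an image
        external = identify_pose(image)                # S102: position and orientation
        params = {"external": external,
                  "internal": internal_params}         # S103: acquire image and parameters
        extracted = extract(image, params)             # S104: extract the object region
        storage.append((extracted, params))            # S105: store image and parameters
        if is_finished():                              # S106: has imaging ended?
            return storage


# Stub callables for illustration only; all values are made up.
_frames = iter(["img0", "img1", "img2"])
_done = iter([False, False, True])
result = collect(
    capture=lambda: next(_frames),
    identify_pose=lambda img: {"position": (0.0, 0.0, 1.0),
                               "rotation_deg": (0.0, 0.0, 0.0)},
    extract=lambda img, params: img.upper(),
    is_finished=lambda: next(_done),
    internal_params={"focal_length_mm": 50.0},
)
```

As in FIG. 7, the end check comes after storage, so at least one image is always processed before the loop can exit.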

(Effects of the first embodiment)
As described above, the collection device 10 captures an image using the camera 11. The collection device 10 identifies the position and orientation of the camera 11 based on the marker 30 in the captured image. The collection device 10 then acquires the captured image in association with the parameters indicating the identified position and orientation. According to this embodiment, images of a specific object can therefore be collected efficiently. For example, cropping of images, which has conventionally been done manually, can be performed automatically. In addition, because the parameters collected with the images can be included in the training data for machine learning, the collected data can be put to a wider range of uses.

Based on the parameters, the collection device 10 also extracts, from the captured image, an image of a predetermined region located at a predetermined position relative to the marker 30. As a result, the position of the object to be imaged can be identified using AR.

The collection device 10 also extracts an orthographic view centered on the solid virtually placed at a predetermined position relative to the marker 30. As a result, the object can be made to appear large in the extracted image.

The program executed by the collection device 10 of this embodiment is provided pre-installed in a ROM or the like. The program may instead be provided recorded on a computer-readable recording medium as a file in an installable or executable format.

Furthermore, the program executed by the collection device 10 of this embodiment may be stored on a computer connected to a network such as the Internet and provided by being downloaded over the network. The program may also be provided or distributed over a network such as the Internet.

The program executed by the collection device 10 of this embodiment has a module configuration including the units described above (the identification unit 131, the acquisition unit 132, and the extraction unit 133). In terms of actual hardware, the CPU reads the program from the ROM and executes it, whereby the units are loaded onto the main storage device and the identification unit 131, the acquisition unit 132, and the extraction unit 133 are generated on the main storage device.

Although an embodiment of the present invention has been described, the embodiment is presented as an example and is not intended to limit the scope of the invention. The embodiment can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the gist of the invention. The embodiment and its modifications are included in the scope and gist of the invention, and likewise in the invention described in the claims and its equivalents.

10 Collection device
11 Camera
12 Storage unit
13 Control unit
20 Object
30 Marker
121 AR information
122 Collected data
131 Identification unit
132 Acquisition unit
133 Extraction unit

Claims

1. A collection device comprising:
an imaging unit that captures an image;
an identification unit that identifies the position and orientation of the imaging unit based on a marker in a captured image, which is the image captured by the imaging unit; and
an acquisition unit that acquires the captured image in association with a parameter indicating the position and orientation identified by the identification unit.

2. The collection device according to claim 1, further comprising an extraction unit that extracts, from the captured image and based on the parameter, an image of a predetermined region located at a predetermined position relative to the marker.

3. The collection device according to claim 2, wherein the extraction unit extracts an orthographic view centered on a solid virtually placed at a predetermined position relative to the marker.

4. A collection method performed by a computer, the method comprising:
an identification step of identifying, based on a marker in a captured image, which is an image captured by a camera, the position and orientation of the camera at the time the captured image was captured; and
an acquisition step of acquiring the captured image in association with a parameter indicating the position and orientation identified in the identification step.

5. A collection program that causes a computer to function as:
an identification means for identifying, based on a marker in a captured image, which is an image captured by a camera, the position and orientation of the camera at the time the captured image was captured; and
an acquisition means for acquiring the captured image in association with a parameter indicating the position and orientation identified by the identification means.