WO2022102015A1 - Image information acquisition device, image information acquisition method, and computer program - Google Patents

Image information acquisition device, image information acquisition method, and computer program

Info

Publication number
WO2022102015A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
subject
information acquisition
information
unit
Prior art date
Application number
PCT/JP2020/042069
Other languages
French (fr)
Japanese (ja)
Inventor
勇 五十嵐
隆行 黒住
誠之 高村
英明 木全
Original Assignee
日本電信電話株式会社 (Nippon Telegraph and Telephone Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電信電話株式会社 filed Critical 日本電信電話株式会社
Priority to PCT/JP2020/042069 priority Critical patent/WO2022102015A1/en
Publication of WO2022102015A1 publication Critical patent/WO2022102015A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images

Definitions

  • The present invention relates to an image information acquisition device, an image information acquisition method, and a computer program.
  • As one such means, there is a method of acquiring and recording information on the three-dimensional appearance of a space.
  • Specific examples of such means include shooting stereo images using multiple cameras, shooting depth images using a depth sensor, and acquiring a three-dimensional point cloud using LiDAR (Light Detection and Ranging).
  • There are also techniques that use images or video whose pixel values are colors (luminance) captured by a single camera. Specific examples of such techniques include Structure from Motion (SfM) (Non-Patent Document 1) and Simultaneous Localization and Mapping (SLAM) (Non-Patent Document 2).
  • In these techniques, three-dimensional information is acquired by using a large number of images of the same subject.
  • In view of the above circumstances, an object of the present invention is to provide a technique capable of acquiring more accurate three-dimensional information by using fewer images.
  • One aspect of the present invention is an image information acquisition device including: a classification unit that classifies, from a target image that is the image to be processed, subjects appearing in the target image into subjects of the same type; and a three-dimensional information acquisition unit that acquires information indicating the three-dimensional shape of the subject based on a plurality of images of subjects classified as the same type by the classification unit.
  • One aspect of the present invention is an image information acquisition method including: a classification step of classifying, from a target image that is the image to be processed, subjects appearing in the target image into subjects of the same type; and a three-dimensional information acquisition step of acquiring information indicating the three-dimensional shape of the subject based on a plurality of images of subjects classified as the same type in the classification step.
  • One aspect of the present invention is a computer program for causing a computer to function as the above-described image information acquisition device.
  • FIG. 1 is a diagram showing a functional configuration example of the image information acquisition device 100 of the present invention.
  • the image information acquisition device 100 is configured by using information devices such as a personal computer, a server device, a game device, a smartphone, and an image pickup device.
  • the image information acquisition device 100 includes an image input unit 10, an output unit 20, a storage unit 30, and a control unit 40.
  • the image input unit 10 receives image data input to the image information acquisition device 100.
  • the image data input by the image input unit 10 may be still image data or moving image data.
  • The image input unit 10 may read image data recorded on a recording medium such as a CD-ROM or a USB memory (Universal Serial Bus Memory). Further, the image input unit 10 may receive an image captured by a still camera or a video camera from the camera. Further, when the image information acquisition device 100 is built into a still camera, a video camera, or an information processing device provided with a camera, the image input unit 10 may receive the captured image, or the image before imaging, from the bus. Further, the image input unit 10 may receive image data from another information processing device via a network.
  • the image input unit 10 may be configured in a different manner as long as it can receive input of image data.
  • the output unit 20 outputs image information and image data generated by the control unit 40.
  • the output unit 20 may write image information or image data to a recording medium such as a CD-ROM or a USB memory (Universal Serial Bus Memory).
  • When the image information acquisition device 100 is built into a still camera, a video camera, or an information processing device provided with a camera, the output unit 20 may record the generated image information and image data on a recording medium provided in these devices, or display them as a preview image on a display device provided in these devices. Further, the output unit 20 may transmit image information or image data to another information processing device via a network.
  • the output unit 20 may be configured in a different manner as long as it can output image information and image data.
  • the storage unit 30 is configured by using a storage device such as a magnetic hard disk device or a semiconductor storage device.
  • the storage unit 30 functions as, for example, an image storage unit 301 and an image information storage unit 302.
  • the image storage unit 301 stores image data input by the image input unit 10.
  • the image storage unit 301 may store still image data or moving image data.
  • the image information storage unit 302 stores image information generated by the control unit 40.
  • FIG. 2 is a diagram showing a specific example of an image information table stored in the image information storage unit 302.
  • the image information table has a record for each combination of an image to be processed (hereinafter referred to as "target image") and a subject in the target image.
  • Each record has, for example, identification information indicating a target image (hereinafter referred to as "target image identification information"), identification information indicating a subject (hereinafter referred to as "subject identification information"), and image information in association with each other.
  • the image information is information about the image of the subject in the corresponding target image.
  • The image information includes, for example, area information indicating the subject area of the subject, information indicating the three-dimensional shape of the subject (hereinafter referred to as "3D model"), and state parameters indicating the position and posture of the subject.
  • the control unit 40 is configured by using a processor such as a CPU (Central Processing Unit) and a memory.
  • When the processor executes a program, the control unit 40 functions as an input / output control unit 401, an area information acquisition unit 402, a classification unit 403, a three-dimensional information acquisition unit 404, a state parameter acquisition unit 405, an additional information acquisition unit 406, and an image generation unit 407. All or part of each function of the control unit 40 may be realized by using hardware such as an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array).
  • the above program may be recorded on a computer-readable recording medium.
  • Computer-readable recording media include, for example, portable media such as flexible disks, magneto-optical disks, ROMs, CD-ROMs, and semiconductor storage devices (for example, SSD: Solid State Drive), and storage devices such as hard disks and semiconductor storage devices built into computer systems.
  • the above program may be transmitted over a telecommunication line.
  • the input / output control unit 401 controls the input / output of data.
  • the input / output control unit 401 acquires image data by controlling the operation of the image input unit 10.
  • the input / output control unit 401 records the input image data in the image storage unit 301.
  • the input / output control unit 401 may temporarily record the input image data in a storage device such as a memory, if necessary.
  • The input / output control unit 401 outputs the image information recorded in the image information storage unit 302 and the image data generated by the image generation unit 407 to an external device by controlling the output unit 20.
  • The area information acquisition unit 402 acquires, for each subject in the target image, information (hereinafter referred to as "area information") indicating the area occupied by that subject (hereinafter referred to as "subject area").
  • the target image may be an image stored as a still image in the image storage unit 301, or may be an image of a frame of a moving image stored as a moving image in the image storage unit 301.
  • the target image may be one still image or frame, or may be a plurality of still images or frames.
  • the target image may be a combination of a still image and a frame. When a plurality of frames are used as the target image, a plurality of frames may be acquired from one moving image.
  • the time interval of each frame may be configured to be equal to or larger than a predetermined threshold value so that frames from different viewpoints can be obtained.
  • the frame from which the area information is acquired may be determined by the area information acquisition unit 402 based on a predetermined criterion.
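As one possible criterion for obtaining frames from sufficiently different viewpoints, the time-interval threshold mentioned above can be sketched as follows (a minimal illustration; the function name and the greedy selection strategy are assumptions, not from the patent):

```python
def select_frames(timestamps, min_interval):
    """Greedily pick frame timestamps so that consecutive picks are
    at least `min_interval` seconds apart (one possible criterion for
    obtaining frames from different viewpoints)."""
    selected = []
    for t in timestamps:
        if not selected or t - selected[-1] >= min_interval:
            selected.append(t)
    return selected

# A 30 fps clip, 2 seconds long, sampled so picks are >= 0.5 s apart.
frames = [i / 30.0 for i in range(60)]
picked = select_frames(frames, 0.5)  # -> [0.0, 0.5, 1.0, 1.5]
```

Any other criterion (for example, estimated camera motion between frames) could be substituted for the time interval.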
  • It is desirable that each still image or frame used as a target image be obtained by photographing the same subject or subjects of the same type.
  • the position of each subject in the three-dimensional space may be the same or different in the frame of each still image or moving image.
  • the subject area is an area surrounded by the outline of the subject.
  • FIG. 3 is a diagram showing a specific example of the target image.
  • A plurality of subjects are shown in the target image of FIG. 3.
  • the subject 81 and the subject 86 are heart-shaped objects.
  • The subject 81 and the subject 86 are objects of the same type, or objects having shapes similar to each other.
  • the subject 82, the subject 83, the subject 84, and the subject 85 are star-shaped objects.
  • The subject 82, the subject 83, the subject 84, and the subject 85 are objects of the same type, or objects having shapes similar to each other.
  • Each of the subjects 81 to 86 is photographed at its own position and tilted at its own angle.
  • FIG. 4 is a diagram showing a specific example of the subject area. Each shape shown in a different pattern in FIG. 4 indicates a subject area. The subject areas 91 to 96 indicate the areas of the subjects 81 to 86, respectively.
  • The area information acquisition unit 402 may estimate, for example, for each pixel in the target image, which subject the pixel corresponds to, or that it corresponds to no subject.
  • The techniques applied to this estimation need not be limited to specific ones. For example, techniques based on deep learning, such as Mask R-CNN and GAN, may be applied. Alternatively, the subject area of each subject may be specified manually.
  • the area information acquisition unit 402 records the generated area information data of each subject area in the image information storage unit 302 as image information associated with the identification information of the target image and the identification information of each subject.
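In the simplest representation, the per-pixel estimation above yields a label map from which each subject area can be read off as a mask. A minimal sketch (the label-map representation and the function name are illustrative assumptions):

```python
import numpy as np

def extract_subject_areas(label_map):
    """Given a per-pixel label map (0 = no subject, k > 0 = subject k),
    return a dict mapping each subject id to its boolean area mask."""
    areas = {}
    for k in np.unique(label_map):
        if k == 0:
            continue  # pixels belonging to no subject
        areas[int(k)] = (label_map == k)
    return areas

# Toy 4x4 label map with two subjects.
labels = np.array([
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [2, 0, 0, 0],
    [2, 2, 0, 0],
])
areas = extract_subject_areas(labels)  # masks for subjects 1 and 2
```

Each mask here plays the role of the area information recorded per subject identifier.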
  • the classification unit 403 classifies each subject area for each subject of the same type.
  • The subject areas to be classified are not limited to those obtained from one target image (the same target image); the subjects of a plurality of subject areas obtained from each of a plurality of target images may be classified together.
  • For example, when n target images, from each of which m subject areas have been acquired, are used (m and n are both integers of 1 or more), m × n subject areas may be the target of classification.
  • The classification unit 403 classifies, into the same group, subject areas whose subjects have the same or similar appearance.
  • the technique applied to the classification unit 403 need not be limited to a specific one.
  • the classification unit 403 may classify the subject areas of the same category into the same group.
  • the classification unit 403 may calculate the similarity between subject areas of the same category based on the feature amount, and classify the subject areas having high similarity into the same group. By performing the processing in this way, a more subdivided classification can be realized.
  • the classification unit 403 may determine which reference image is most similar to the subject for each subject area obtained from the target image, and may generate a group for each reference image. Further, each subject area may be manually classified.
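The similarity-based grouping described above can be sketched as a greedy assignment over feature vectors (the cosine-similarity measure, the threshold, and the greedy strategy are illustrative assumptions; the patent does not fix a specific algorithm):

```python
import numpy as np

def group_by_similarity(features, threshold=0.9):
    """Greedy grouping: assign each subject-area feature vector to the
    first group whose representative has cosine similarity >= threshold;
    otherwise start a new group."""
    reps, groups = [], []
    for idx, f in enumerate(features):
        f = f / np.linalg.norm(f)  # normalize so dot product = cosine
        for g, r in enumerate(reps):
            if float(f @ r) >= threshold:
                groups[g].append(idx)
                break
        else:
            reps.append(f)
            groups.append([idx])
    return groups

# Two heart-like and two star-like feature vectors (toy 2-D features).
feats = [np.array([1.0, 0.0]), np.array([0.99, 0.1]),
         np.array([0.0, 1.0]), np.array([0.05, 1.0])]
groups = group_by_similarity(feats, threshold=0.9)  # -> [[0, 1], [2, 3]]
```

In practice the feature vectors would come from, for example, a deep feature extractor applied to each subject area.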
  • the three-dimensional information acquisition unit 404 generates a 3D model of the subject of each group based on the information obtained from a plurality of subject areas belonging to each group.
  • the 3D model may be represented by, for example, a three-dimensional point group, a polygon, or another model. Further, the 3D model may be stored in the storage unit 30 in advance as known information.
  • the technique applied to the three-dimensional information acquisition unit 404 does not have to be limited to a specific one.
  • The three-dimensional information acquisition unit 404 may handle the images of the subject areas belonging to one group as a plurality of images of the same individual taken at different positions and in different postures.
  • the three-dimensional information acquisition unit 404 may generate a 3D model by executing Structure from Motion (SfM) using the images of the plurality of subject areas described above.
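The core geometric step that SfM performs on the images of one group — recovering a 3D point from its projections in multiple views — can be sketched with linear (DLT) triangulation. This is a minimal two-view illustration with synthetic projection matrices; a full SfM pipeline would also estimate the camera poses themselves:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from its pixel
    projections x1, x2 in two views with 3x4 projection matrices
    P1, P2. The null vector of A gives the homogeneous 3D point."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

# Two synthetic cameras observing the point (0, 0, 5).
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])              # at origin
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0], [0]])])  # shifted
X_true = np.array([0.0, 0.0, 5.0])
x1 = P1 @ np.append(X_true, 1.0); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1.0); x2 = x2[:2] / x2[2]
X = triangulate(P1, P2, x1, x2)  # recovers (0, 0, 5)
```

Repeating this over many matched points across the subject areas of a group yields the point cloud of the 3D model.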
  • the three-dimensional information acquisition unit 404 records the generated 3D model data in the image information storage unit 302 as image information associated with the identification information of the target image and the identification information of the subject indicated by the 3D model.
  • the state parameter acquisition unit 405 generates information (hereinafter referred to as "state parameter") indicating the positional relationship with the camera, the posture, and the like for the subject in each subject area.
  • the technique applied to the state parameter acquisition unit 405 does not have to be limited to a specific one.
  • the state parameter acquisition unit 405 may acquire the state parameter for each subject area by using SfM.
  • Three-dimensional world coordinates are given to the 3D model.
  • the coordinates of each point are represented by world coordinates.
  • the 3D model of the subject is represented by polygons
  • each point forming the polygon is represented by world coordinates.
  • Using the camera's internal parameters for converting world coordinates to image coordinates (for example, focal length, optical center, and distortion coefficients), a coordinate transformation matrix representing the position and orientation of the camera is estimated as a state parameter.
  • The conversion from world coordinates to camera coordinates is expressed as in Equation 1 below, where R is a coordinate transformation matrix:

    $$\begin{pmatrix} x_c \\ y_c \\ z_c \\ 1 \end{pmatrix} = R \begin{pmatrix} x_w \\ y_w \\ z_w \\ 1 \end{pmatrix} \qquad \text{(Equation 1)}$$

  • R is expressed as in Equation 2 below, where $R_{11}$ to $R_{33}$ are values corresponding to the rotation matrix and $(t_1, t_2, t_3)$ is the translation:

    $$R = \begin{pmatrix} R_{11} & R_{12} & R_{13} & t_1 \\ R_{21} & R_{22} & R_{23} & t_2 \\ R_{31} & R_{32} & R_{33} & t_3 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad \text{(Equation 2)}$$

  • $R_{11}$ to $R_{33}$ can also be expressed as in Equation 3 below by interpreting them as rotations about each coordinate axis in the order of, for example, the y-axis, z-axis, and x-axis:

    $$\begin{pmatrix} R_{11} & R_{12} & R_{13} \\ R_{21} & R_{22} & R_{23} \\ R_{31} & R_{32} & R_{33} \end{pmatrix} = R_x(\theta_x)\, R_z(\theta_z)\, R_y(\theta_y) \qquad \text{(Equation 3)}$$

  • The coordinates of the camera coordinate system can be converted to the coordinates $(i, j)$ of the image coordinate system by projection as shown below, where $f$ and $(c_x, c_y)$ are the focal length and optical center of the camera, respectively:

    $$i = f\,\frac{x_c}{z_c} + c_x, \qquad j = f\,\frac{y_c}{z_c} + c_y$$
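The world-to-image mapping under this pinhole model can be sketched as follows (the y-, z-, then x-axis rotation order follows the text, but the exact sign and composition conventions of the patent's equations are assumptions):

```python
import numpy as np

def rotation_yzx(ry, rz, rx):
    """Rotation composed about the y-, z-, then x-axis (one reading of
    the axis order described in the text; the convention is assumed)."""
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    cx, sx = np.cos(rx), np.sin(rx)
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    return Rx @ Rz @ Ry  # y applied first, then z, then x

def project(X_world, R, t, f, cx, cy):
    """World -> camera -> image coordinates (i, j), pinhole model with
    focal length f and optical center (cx, cy)."""
    Xc = R @ X_world + t
    i = f * Xc[0] / Xc[2] + cx
    j = f * Xc[1] / Xc[2] + cy
    return i, j

R = rotation_yzx(0.0, 0.0, 0.0)  # identity orientation
i, j = project(np.array([0.0, 0.0, 2.0]), R, np.zeros(3),
               500.0, 320.0, 240.0)  # point on the optical axis
```

A point on the optical axis lands at the optical center (320, 240), as expected.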
  • FIG. 5 is a diagram showing specific examples of positions and postures of each subject in a three-dimensional space. Each subject shown in the image of FIG. 3 is arranged at each position in a three-dimensional space in each posture. The position and state of each subject are represented by state parameters.
  • the processing of the state parameter acquisition unit 405 may be provided with a constraint condition that the subjects in each subject area do not overlap three-dimensionally (they do not overlap in the same space). By providing such a constraint condition, it is possible to improve the acquisition accuracy of the state parameter.
  • the state parameter acquisition unit 405 records the generated state parameter data in the image information storage unit 302 as image information associated with the identification information of the target image and the identification information of the subject indicated by the state parameter.
  • the additional information acquisition unit 406 acquires additional information for each subject area.
  • One example of the additional information is information about the relative three-dimensional position with respect to the subjects in the other subject areas of the same group.
  • the image of each subject area can be considered to match the appearance of the subject represented by the 3D model when viewed from a specific position and posture. Therefore, if a reference position that serves as a reference for the viewpoint of the 3D model is arbitrarily specified, the position relative to the reference position can be calculated. This calculation may be performed using, for example, the coordinate transformation matrix R in each subject area.
  • the relative position of the subject in each subject region can be represented by the world coordinate system.
  • the coordinates of the 3D model obtained by performing such coordinate conversion have the same positional relationship as in real space. This makes it possible to represent the relative position between subjects in each subject area.
  • Another example of the additional information is information on how the surface of the subject appears in each subject area. For example, it may be information about the texture of the surface of the subject (for example, its color, shape, and material) or information about the reflectance of light on the surface of the subject.
  • Still another example is information about the light source that illuminates the subject in each subject area (for example, the color of the light, its intensity, the position of the light source, and the number of light sources).
  • Such information may be obtained, for example, by using a trained model by deep learning or machine learning regarding a light source.
  • Another example of the additional information is information on whether or not the subject in each subject area is in contact with another subject. Such information may be acquired, using the 3D model and the state parameters, based on whether or not the distance between the closest portions of the surfaces of the subjects is smaller than a predetermined value.
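This contact test can be sketched over sampled surface points of two subjects (a brute-force nearest-distance check; the sampling and the function name are illustrative):

```python
import numpy as np

def in_contact(points_a, points_b, eps):
    """Judge contact between two subjects by whether the distance
    between the closest points of their (sampled) surfaces is below
    the predetermined value eps."""
    # Pairwise distances between the two point sets, shape (|a|, |b|).
    d = np.linalg.norm(points_a[:, None, :] - points_b[None, :, :], axis=2)
    return bool(d.min() < eps)

a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = np.array([[1.05, 0.0, 0.0], [3.0, 0.0, 0.0]])
touching = in_contact(a, b, eps=0.1)   # closest gap is 0.05 -> True
apart = in_contact(a, b, eps=0.01)     # -> False
```

For dense models, a spatial index (k-d tree) would replace the brute-force pairwise distances.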
  • the additional information acquisition unit 406 records the generated additional information data in the image information storage unit 302 as image information associated with the identification information of the target image and the identification information of the subject indicated by the additional information.
  • When the image generation unit 407 is given information on the camera parameters required for generating an image, it generates an image according to those camera parameters.
  • The information regarding the camera parameters may be given by the user operating an input device (keyboard, pointing device, touch panel, etc.) included in the image information acquisition device 100, or by receiving information transmitted from another information processing device.
  • When target image identification information is given together with the camera parameters, a new image is generated, based on the given camera parameters, for each subject shown in the identified target image.
  • The information regarding the camera parameters may be, for example, information indicating a focal length, an optical center, and distortion coefficients.
  • the information regarding the camera parameters may be, for example, information indicating the position of the viewpoint, the direction of the line of sight, and the size of the screen (image).
  • the information regarding the camera parameters may be any information as long as it is possible to generate an image.
  • When the target image identification information is given, the image is generated based on the image information associated with that target image.
  • The image generation unit 407 may generate an image by further performing coordinate conversion, using, for example, a transformation matrix representing a change in viewpoint, on the 3D model whose coordinates have been converted so as to represent the position of each subject. For example, an image similar to the target image may be generated based on the same viewpoint as the target image, as shown in FIG. 3, or a new image may be generated based on a viewpoint different from that of the target image.
  • FIG. 6 is a diagram showing an outline of image generation from a viewpoint different from the target image.
  • The viewpoint 86_1 is the viewpoint of the target image of FIG. 3.
  • In the target image, a value related to the subject 82 is given to the pixel at the coordinates (i_1, j_1).
  • the viewpoint 86_2 is a new viewpoint different from the target image.
  • From the viewpoint 86_2, the value for the same portion of the subject 82 falls at the pixel of the coordinates (i_2, j_2).
  • a new image is generated by performing processing such as coordinate conversion on each subject and each pixel.
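The per-pixel coordinate conversion between viewpoints can be sketched as back-projection, pose change, and re-projection (the depth and pose values here are synthetic, and the function name is illustrative):

```python
import numpy as np

def reproject_pixel(i1, j1, depth, f, cx, cy, R12, t12):
    """Carry one pixel from viewpoint 1 to viewpoint 2: back-project
    (i1, j1) with its depth into 3D, apply the relative pose
    X2 = R12 @ X1 + t12, and project again with the same intrinsics."""
    X1 = np.array([(i1 - cx) * depth / f, (j1 - cy) * depth / f, depth])
    X2 = R12 @ X1 + t12
    i2 = f * X2[0] / X2[2] + cx
    j2 = f * X2[1] / X2[2] + cy
    return i2, j2

# Pure sideways camera shift of 0.1 units, point at depth 2.
i2, j2 = reproject_pixel(320.0, 240.0, 2.0, 500.0, 320.0, 240.0,
                         np.eye(3), np.array([0.1, 0.0, 0.0]))
```

Applying this conversion to every subject and every pixel, with the depth taken from the 3D model and state parameters, yields the new image for viewpoint 86_2.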
  • FIG. 7 is a diagram showing a specific example of a newly generated image with respect to the new viewpoint 86_2.
  • FIG. 8 is a diagram showing a specific example of processing of the image information acquisition device 100.
  • the input / output control unit 401 inputs the target image to be processed and records it in the image storage unit 301 (step S101).
  • the area information acquisition unit 402 acquires area information indicating the subject area in the target image for each subject and records it in the image information storage unit 302 (step S102).
  • the classification unit 403 classifies each subject (step S103).
  • the three-dimensional information acquisition unit 404 generates a 3D model of the subject by using the information of a plurality of subjects classified into the same group, and records the generated 3D model data in the image information storage unit 302. (Step S104).
  • the state parameter acquisition unit 405 generates a state parameter for each subject and records the state parameter in the image information storage unit 302 (step S105).
  • the additional information acquisition unit 406 acquires additional information for each subject and records it in the image information storage unit 302 (step S106).
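The flow of steps S101 to S106 can be sketched as a pipeline with the per-step processing injected as callables (all names and the stand-in callables here are illustrative, not from the patent):

```python
def process_target_images(images,
                          acquire_area_info, classify, build_3d_model,
                          acquire_state_params, acquire_additional_info):
    """Sketch of steps S101-S106: record input, acquire area info,
    classify, build 3D models per group, then state parameters and
    additional information."""
    store = {"images": images}                                   # S101
    store["areas"] = [acquire_area_info(im) for im in images]    # S102
    store["groups"] = classify(store["areas"])                   # S103
    store["models"] = {g: build_3d_model(members)                # S104
                       for g, members in store["groups"].items()}
    store["state"] = acquire_state_params(store)                 # S105
    store["extra"] = acquire_additional_info(store)              # S106
    return store

# Trivial stand-in callables just to exercise the flow.
result = process_target_images(
    ["img0"],
    acquire_area_info=lambda im: [f"{im}:area0"],
    classify=lambda areas: {"group0": areas},
    build_3d_model=lambda members: {"points": len(members)},
    acquire_state_params=lambda s: "params",
    acquire_additional_info=lambda s: "extra",
)
```

In the actual device, the `store` dictionary corresponds to the image storage unit 301 and the image information storage unit 302.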
  • FIG. 9 is a diagram showing a specific example of the hardware configuration of the image information acquisition device 100.
  • the image information acquisition device 100 includes, for example, an input / output device 1, an auxiliary storage device 2, a memory 3, and a processor 4 as shown in FIG.
  • the input / output device 1 inputs / outputs information (including data) to and from the outside (including the user) in the image information acquisition device 100.
  • the input / output device 1 functions as, for example, an image input unit 10 or an output unit 20.
  • the auxiliary storage device 2 is configured by using a magnetic hard disk device or a semiconductor storage device.
  • the auxiliary storage device 2 functions as, for example, a storage unit 30.
  • the memory 3 and the processor 4 function as, for example, a control unit 40.
  • In the image information acquisition device 100 configured in this way, subjects in the image to be processed that are actually separate objects may be classified as the same type by the classification unit 403, and three-dimensional information (for example, a 3D model) is then acquired using the images of those subjects. Therefore, when multiple subjects of the same type appear in one image, it is possible to acquire more accurate three-dimensional information even from only a small number of images of each individual subject.
  • the image information acquisition device 100 may be configured not to include the image generation unit 407.
  • the image generation unit 407 may be mounted on another information processing device.
  • the image information storage unit 302 may be further mounted on the information processing apparatus on which the image generation unit 407 is mounted. With such a configuration, it becomes possible to easily generate an image of a subject in another information processing apparatus based on the image information acquired in the image information acquisition apparatus 100.
  • the image information acquisition device 100 may be mounted separately in a plurality of devices.
  • the image information acquisition device 100 may be implemented as an image information acquisition system including a plurality of devices.
  • the information processing device having the control unit 40 and the information processing device having the storage unit 30 may be mounted as different devices, or the functions of the storage unit 30 may be duplicated and mounted on a plurality of information processing devices.
  • the function of the control unit 40 may be implemented separately in a plurality of information processing devices.
  • the present invention is applicable to an apparatus for acquiring image information.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

An image information acquisition device comprising: a classification unit for classifying, from relevant images which are the images to be processed, the subjects seen in the relevant images into a subject of the same kind; and a three-dimensional information acquisition unit for acquiring information that indicates the three-dimensional shapes of the subjects, on the basis of a plurality of images of subjects classified into the subject of the same kind by the classification unit.

Description

Image information acquisition device, image information acquisition method, and computer program
 The present invention relates to an image information acquisition device, an image information acquisition method, and a computer program.
 In recent years, there has been demand for techniques for generating an image that reproduces the appearance of a subject from an arbitrary viewpoint. As one such means, there is a method of acquiring and recording information on the three-dimensional appearance of a space. Specific examples of such means include shooting stereo images using multiple cameras, shooting depth images using a depth sensor, and acquiring a three-dimensional point cloud using LiDAR (Light Detection and Ranging). There are also techniques that use images or video whose pixel values are colors (luminance) captured by a single camera. Specific examples of such techniques include Structure from Motion (SfM) (Non-Patent Document 1) and Simultaneous Localization and Mapping (SLAM) (Non-Patent Document 2). In these techniques, three-dimensional information is acquired by using a large number of images of the same subject.
 However, the conventional techniques require a large number of images, and the accuracy of the obtained three-dimensional information decreases when the number of images is small. Therefore, when obtaining three-dimensional information from a moving image, a long video is also required.
 In view of the above circumstances, an object of the present invention is to provide a technique capable of acquiring more accurate three-dimensional information by using fewer images.
 One aspect of the present invention is an image information acquisition device including: a classification unit that classifies, from a target image that is the image to be processed, subjects appearing in the target image into subjects of the same type; and a three-dimensional information acquisition unit that acquires information indicating the three-dimensional shape of the subject based on a plurality of images of subjects classified as the same type by the classification unit.
 One aspect of the present invention is an image information acquisition method including: a classification step of classifying, from a target image that is the image to be processed, subjects appearing in the target image into subjects of the same type; and a three-dimensional information acquisition step of acquiring information indicating the three-dimensional shape of the subject based on a plurality of images of subjects classified as the same type in the classification step.
 One aspect of the present invention is a computer program for causing a computer to function as the above-described image information acquisition device.
 According to the present invention, it is possible to acquire more accurate three-dimensional information by using fewer images.
 FIG. 1 is a diagram showing a functional configuration example of the image information acquisition device 100 of the present invention. FIG. 2 is a diagram showing a specific example of the image information table stored in the image information storage unit 302. FIG. 3 is a diagram showing a specific example of the target image. FIG. 4 is a diagram showing a specific example of the subject area. FIG. 5 is a diagram showing specific examples of the position and posture of each subject in three-dimensional space. FIG. 6 is a diagram showing an outline of image generation from a viewpoint different from that of the target image. FIG. 7 is a diagram showing a specific example of an image newly generated for the new viewpoint 86_2. FIG. 8 is a diagram showing a specific example of the processing of the image information acquisition device 100. FIG. 9 is a diagram showing a specific example of the hardware configuration of the image information acquisition device 100.
 Embodiments of the present invention will be described in detail with reference to the drawings.
 FIG. 1 is a diagram showing a functional configuration example of the image information acquisition device 100 of the present invention. The image information acquisition device 100 is configured using an information device such as a personal computer, a server device, a game console, a smartphone, or an image pickup device. The image information acquisition device 100 includes an image input unit 10, an output unit 20, a storage unit 30, and a control unit 40.
 The image input unit 10 receives image data input to the image information acquisition device 100. The input image data may be still image data or moving image data. The image input unit 10 may read image data recorded on a recording medium such as a CD-ROM or a USB (Universal Serial Bus) memory. The image input unit 10 may also receive images captured by a still camera or a video camera from the camera. When the image information acquisition device 100 is built into a still camera, a video camera, or an information processing device equipped with a camera, the image input unit 10 may receive the captured image, or the image before capture, over a bus. The image input unit 10 may also receive image data from another information processing device via a network. The image input unit 10 may be configured in yet other ways as long as it can receive input of image data.
 The output unit 20 outputs the image information and image data generated by the control unit 40. The output unit 20 may write image information and image data to a recording medium such as a CD-ROM or a USB (Universal Serial Bus) memory. When the image information acquisition device 100 is built into a still camera, a video camera, or an information processing device equipped with a camera, the output unit 20 may record the generated image information and image data on a recording medium provided in such a device, or display them as a preview image on a display device provided in such a device. The output unit 20 may also transmit image information and image data to another information processing device via a network. The output unit 20 may be configured in yet other ways as long as it can output image information and image data.
 The storage unit 30 is configured using a storage device such as a magnetic hard disk device or a semiconductor storage device. The storage unit 30 functions, for example, as an image storage unit 301 and an image information storage unit 302. The image storage unit 301 stores the image data input by the image input unit 10, and may store still image data or moving image data. The image information storage unit 302 stores the image information generated by the control unit 40.
 FIG. 2 is a diagram showing a specific example of the image information table stored in the image information storage unit 302. The image information table has a record for each combination of an image to be processed (hereinafter "target image") and a subject in that target image. Each record associates, for example, identification information indicating the target image (hereinafter "target image identification information"), identification information indicating the subject (hereinafter "subject identification information"), and image information. The image information is information about the image of the subject in the corresponding target image, and includes, for example, area information indicating the subject area of the subject, information indicating the three-dimensional shape of the subject (hereinafter "3D model"), and state parameters indicating the position and posture of the subject.
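As an illustrative sketch only (not part of the claimed configuration), one record of the image information table described above could be modeled as follows; all class and field names are assumptions made for this example.

```python
from dataclasses import dataclass, field

@dataclass
class ImageInfoRecord:
    """One row of the image information table: one record per
    (target image, subject) combination, as described above."""
    target_image_id: str          # target image identification information
    subject_id: str               # subject identification information
    region: list = field(default_factory=list)        # area information (e.g. mask/polygon)
    model_3d: object = None                           # 3D model (point cloud, polygons, ...)
    state_params: dict = field(default_factory=dict)  # position / posture parameters

# The table holds one record per (target image, subject) pair.
table = {}
rec = ImageInfoRecord(target_image_id="img001", subject_id="subj81")
table[(rec.target_image_id, rec.subject_id)] = rec
```

A lookup keyed by the (target image, subject) pair mirrors the table structure of FIG. 2.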
 The control unit 40 is configured using a processor such as a CPU (Central Processing Unit) and a memory. When the processor executes a program, the control unit 40 functions as an input/output control unit 401, an area information acquisition unit 402, a classification unit 403, a three-dimensional information acquisition unit 404, a state parameter acquisition unit 405, an additional information acquisition unit 406, and an image generation unit 407. All or part of the functions of the control unit 40 may be realized using hardware such as an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array). The above program may be recorded on a computer-readable recording medium, for example a portable medium such as a flexible disk, a magneto-optical disk, a ROM, a CD-ROM, or a semiconductor storage device (for example, an SSD: Solid State Drive), or a storage device built into a computer system such as a hard disk or a semiconductor storage device. The above program may also be transmitted via a telecommunication line.
 The input/output control unit 401 controls the input and output of data. For example, the input/output control unit 401 acquires image data by controlling the operation of the image input unit 10, records the input image data in the image storage unit 301, and may temporarily record the input image data in a storage device such as a memory as necessary. The input/output control unit 401 outputs the image information recorded in the image information storage unit 302 and the image data generated by the image generation unit 407 to external devices by controlling the output unit 20.
 The area information acquisition unit 402 acquires, for each subject, information (hereinafter "area information") indicating the area of that subject in the target image (hereinafter "subject area"). The target image may be an image stored as a still image in the image storage unit 301, or a frame of a moving image stored there as a moving image. The target image may be a single still image or frame, a plurality of still images or frames, or a combination of still images and frames. When a plurality of frames are used as target images, the frames may be acquired from a single moving image. In that case, the time interval between frames may be set to a predetermined threshold or more so that frames from different viewpoints are obtained, and the area information acquisition unit 402 may decide, based on a predetermined criterion, from which frames area information is acquired.
 In any case, when the images are used in the processing of the three-dimensional information acquisition unit 404 and the state parameter acquisition unit 405, they are desirably still images or moving-image frames obtained by photographing the same subject or subjects of the same type. When they are used in the processing of the state parameter acquisition unit 405, the position of each subject in three-dimensional space may be the same or different across the still images or frames. The subject area is the area enclosed by the outline of the subject.
 FIG. 3 is a diagram showing a specific example of a target image. A plurality of subjects appear in the target image of FIG. 3. The subjects 81 and 86 are heart-shaped objects; they are objects of the same type or have similar shapes. The subjects 82, 83, 84, and 85 are star-shaped objects; they are likewise objects of the same type or have similar shapes. Each of the subjects 81 to 86 appears at its own position, tilted at its own angle.
 FIG. 4 is a diagram showing a specific example of subject areas. Each shape drawn with a different pattern in FIG. 4 indicates a subject area. The subject areas 91 to 96 correspond to the subjects 81 to 86, respectively.
 A specific example of the processing of the area information acquisition unit 402 is as follows. The area information acquisition unit 402 may estimate, for each pixel in the target image, which subject it corresponds to, or whether it corresponds to no subject at all. The technique applied to this estimation need not be limited to any particular one; for example, techniques based on deep learning such as Mask-RCNN or GANs may be applied, or the subject area of each subject may be specified manually. The area information acquisition unit 402 records the area information of each generated subject area in the image information storage unit 302 as image information associated with the identification information of the target image and the identification information of each subject.
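As a hedged illustration of this per-pixel estimation step, the following sketch converts a per-pixel subject label map (such as one a Mask-RCNN-style model might output) into per-subject area information; the function name and data layout are assumptions of this example, not part of the described device.

```python
import numpy as np

def regions_from_label_map(label_map):
    """Given a per-pixel label map (0 = no subject, k > 0 = subject k),
    return a bounding box and pixel mask for each subject area."""
    regions = {}
    for k in np.unique(label_map):
        if k == 0:
            continue  # pixel belongs to no subject
        mask = label_map == k
        ys, xs = np.nonzero(mask)
        regions[int(k)] = {
            "bbox": (int(ys.min()), int(xs.min()), int(ys.max()), int(xs.max())),
            "mask": mask,
        }
    return regions

# Tiny synthetic label map with two subjects.
label_map = np.zeros((4, 6), dtype=int)
label_map[1:3, 1:3] = 1   # subject 1
label_map[0:2, 4:6] = 2   # subject 2
regions = regions_from_label_map(label_map)
```

The resulting dictionary plays the role of the area information recorded per subject.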
 The classification unit 403 classifies the subject areas by subject type. The subject areas to be classified are not limited to those of a single target image; subjects in subject areas obtained from a plurality of target images may be classified together. For example, when n target images are used and m subject areas are acquired from each (m and n both being integers of 1 or more), m × n subject areas may be the target of classification.
 The classification unit 403 classifies into the same group, for example, the subject areas of subjects whose appearance is identical or more similar than a predetermined criterion. The technique applied in the classification unit 403 need not be limited to any particular one. For example, when the deep learning used in the area information acquisition unit 402 can estimate the category of a subject, the classification unit 403 may classify subject areas of the same category into the same group. The classification unit 403 may also calculate, among subject areas of the same category, a similarity based on their feature amounts and classify highly similar subject areas into the same group; processing in this way realizes a finer-grained classification. When the candidate subjects are known, images in which the subjects were photographed (hereinafter "reference images") may be used in addition to the target image. In this case, the classification unit 403 may determine, for each subject area obtained from the target image, which reference image's subject it most resembles, and generate one group per reference image. The subject areas may also be classified manually.
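The feature-based grouping described here could be sketched, under the assumption of precomputed feature vectors, as a simple greedy clustering by cosine similarity; all names and the threshold value are purely illustrative.

```python
import numpy as np

def group_by_similarity(features, threshold=0.9):
    """Greedily group subject regions whose normalized feature vectors
    have cosine similarity >= threshold with a group representative.
    Feature extraction itself is assumed to happen elsewhere."""
    groups = []   # each group: list of region indices
    reps = []     # representative (first member's) feature per group
    for idx, f in enumerate(features):
        f = f / np.linalg.norm(f)
        for g, r in zip(groups, reps):
            if float(f @ r) >= threshold:
                g.append(idx)  # similar enough: same subject type
                break
        else:
            groups.append([idx])  # start a new group
            reps.append(f)
    return groups

feats = [np.array([1.0, 0.0]), np.array([0.98, 0.1]), np.array([0.0, 1.0])]
groups = group_by_similarity(feats, threshold=0.9)
```

Here the first two regions end up in one group and the third in another, mirroring the heart-shaped vs. star-shaped grouping of FIG. 3.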
 The three-dimensional information acquisition unit 404 generates a 3D model of the subject of each group based on information obtained from the plurality of subject areas belonging to that group. The 3D model may be represented, for example, by a three-dimensional point cloud, by polygons, or by another model, and may also be stored in advance in the storage unit 30 as known information. The technique applied in the three-dimensional information acquisition unit 404 need not be limited to any particular one. For example, the three-dimensional information acquisition unit 404 may treat the images of the subject areas as a plurality of images of the same individual photographed from different positions and in different postures, and may generate a 3D model by executing Structure from Motion (SfM) using those images. The three-dimensional information acquisition unit 404 records the generated 3D model in the image information storage unit 302 as image information associated with the identification information of the target image and the identification information of the subject that the 3D model represents.
 The state parameter acquisition unit 405 generates, for the subject of each subject area, information representing its positional relationship with the camera, its posture, and the like (hereinafter "state parameters"). The technique applied in the state parameter acquisition unit 405 need not be limited to any particular one; for example, the state parameter acquisition unit 405 may acquire the state parameters for each subject area by using SfM.
 A specific example of the processing of the state parameter acquisition unit 405 is as follows. Three-dimensional world coordinates are given to the 3D model: when the 3D model of a subject is represented by a three-dimensional point cloud, the coordinates of each point are expressed in world coordinates, and when it is represented by polygons, each point forming the polygons is expressed in world coordinates. The internal parameters of the camera for converting world coordinates into image coordinates (for example, the focal length, optical center, and distortion coefficients) are estimated. Furthermore, for each subject area, a coordinate transformation matrix representing the position and orientation of the camera such that the appearance of the 3D model matches the appearance of the subject in that subject area is estimated as a state parameter.
 A concrete example of expressing the coordinate transformation with the estimated state parameters is as follows. First, consider the transformation from the world coordinate system to the camera coordinate system corresponding to a certain subject area. The world-coordinate-system coordinates (X, Y, Z) can be converted into the camera-coordinate-system coordinates (X', Y', Z') by Equation 1 below.
  (X', Y', Z')^T = R (X, Y, Z, 1)^T    (Equation 1)
 In Equation 1, R is a coordinate transformation matrix, expressed as in Equation 2 below.
      | R11 R12 R13 tx |
  R = | R21 R22 R23 ty |    (Equation 2)
      | R31 R32 R33 tz |
 Of the components of R, (tx, ty, tz) represents a translation, and R11 to R33 are the values corresponding to a rotation matrix. By interpreting them as rotations performed about each coordinate axis in the order of the y-axis, z-axis, and x-axis, R11 to R33 can also be expressed as in Equation 3 below. In addition, camera-coordinate-system coordinates can be converted into image-coordinate-system coordinates (i, j) by the projection transformation of Equation 4 below, where f is the focal length of the camera and (cx, cy) is its optical center.
  | R11 R12 R13 |
  | R21 R22 R23 | = Rx(θx) Rz(θz) Ry(θy)    (Equation 3)
  | R31 R32 R33 |
  i = f · X'/Z' + cx,   j = f · Y'/Z' + cy    (Equation 4)
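Equations 1, 2, and 4 together map a world-coordinate point to image coordinates. A minimal numerical sketch, with all matrix and parameter values chosen purely for illustration, is:

```python
import numpy as np

def project_point(Xw, R, f, cx, cy):
    """Project a world-coordinate point to image coordinates:
    camera coords (X', Y', Z') = R @ (X, Y, Z, 1)   (Equations 1-2),
    then i = f*X'/Z' + cx, j = f*Y'/Z' + cy         (Equation 4)."""
    Xc = R @ np.append(Xw, 1.0)   # 3x4 extrinsic matrix times homogeneous point
    Xp, Yp, Zp = Xc
    return f * Xp / Zp + cx, f * Yp / Zp + cy

# Identity rotation; tz = 2 places the scene 2 units in front of the camera.
R = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 2.0]])
i, j = project_point(np.array([1.0, 0.5, 0.0]), R, f=100.0, cx=320.0, cy=240.0)
```

With these illustrative values the point projects to (i, j) = (370, 265).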
 R itself, or the components of R, obtained as described above are acquired as state parameters. FIG. 5 is a diagram showing specific examples of the position and posture of each subject in three-dimensional space. The subjects appearing in the image of FIG. 3 are arranged at their respective positions, in their respective postures, in three-dimensional space. The position and posture of each subject are represented by the state parameters.
 The processing of the state parameter acquisition unit 405 may be given the constraint that the subjects of the subject areas do not overlap three-dimensionally (they do not occupy the same space). Imposing such a constraint makes it possible to improve the accuracy with which the state parameters are acquired. The state parameter acquisition unit 405 records the generated state parameters in the image information storage unit 302 as image information associated with the identification information of the target image and the identification information of the subject that the state parameters describe.
 The additional information acquisition unit 406 acquires additional information for each subject area. One specific example of additional information is information on the relative three-dimensional position with respect to the subjects of the other subject areas of the same group. The image of each subject area can be regarded as matching the appearance of the subject represented by the 3D model when viewed from a specific position and posture. Therefore, if a reference position serving as the basis of the 3D model's viewpoint is specified arbitrarily, the position relative to that reference position can be calculated, for example by using the coordinate transformation matrix R of each subject area. By performing a coordinate transformation using the inverse of the matrix R obtained for each subject area, the relative position of the subject of each subject area can be expressed in the world coordinate system. The 3D model coordinates obtained by such transformations have the same positional relationships with one another as in real space, so the relative positions between the subjects of the subject areas can be represented.
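The inverse transformation described here can be sketched as follows; the 3×4 matrix R is promoted to a 4×4 homogeneous matrix before inversion, and the numeric values are illustrative only.

```python
import numpy as np

def camera_to_world(Xc, R):
    """Map a camera-coordinate point back to world coordinates by
    inverting the 3x4 extrinsic matrix R (rotation + translation),
    as used above to place each region's subject in a shared world frame."""
    R4 = np.vstack([R, [0.0, 0.0, 0.0, 1.0]])      # promote to 4x4
    Xw = np.linalg.inv(R4) @ np.append(Xc, 1.0)    # apply the inverse transform
    return Xw[:3]

# Pure translation for illustration: the camera frame is offset by (1, 2, 3).
R = np.array([[1.0, 0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0, 2.0],
              [0.0, 0.0, 1.0, 3.0]])
Xw = camera_to_world(np.array([1.0, 2.0, 3.0]), R)
```

Applying each subject area's inverse transform in this way expresses every subject in the common world coordinate system.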
 Another specific example of additional information is information on how the surface of the subject appears in each subject area, for example information on the texture of the subject's surface (such as its color, shape, and material) or on the light reflectance of the surface. Yet another example is information on the light sources illuminating the subject of each subject area (for example, the color and intensity of the light, the positions of the light sources, and their number). Such information may be acquired, for example, by using a model trained by deep learning or machine learning with respect to light sources.
 Another specific example of additional information is information on whether the subject of each subject area is in contact with another subject. Such information may be acquired, by using the 3D models and the state parameters, based on whether the distance between the closest portions of the subjects' surfaces is smaller than a predetermined value. The additional information acquisition unit 406 records the generated additional information in the image information storage unit 302 as image information associated with the identification information of the target image and the identification information of the subject that the additional information describes.
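The contact determination described here could be sketched as a minimum-distance test between two point sets; the threshold and point coordinates below are illustrative assumptions.

```python
import numpy as np

def in_contact(points_a, points_b, eps=0.05):
    """Treat two subjects as touching when the smallest distance between
    their (state-parameter-transformed) surface points is below eps."""
    # Pairwise distances via broadcasting: shape (len(a), len(b)).
    d = np.linalg.norm(points_a[:, None, :] - points_b[None, :, :], axis=2)
    return bool(d.min() < eps)

a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = np.array([[1.02, 0.0, 0.0], [5.0, 5.0, 5.0]])
touching = in_contact(a, b)   # closest pair is 0.02 apart, below eps
```

For large point clouds a spatial index (e.g. a k-d tree) would replace the brute-force distance matrix, but the decision rule is the same.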
 When given information on the camera parameters needed to generate an image, the image generation unit 407 generates an image according to those camera parameters. The camera parameter information may be given by a user operating an input device of the image information acquisition device 100 (such as a keyboard, pointing device, or touch panel), or by receiving information transmitted from another information processing device.
 For example, when target image identification information is given together with the camera parameters, a new image is generated, based on the given camera parameters, for each subject appearing in that target image. The camera parameter information may be, for example, information indicating a focal length, optical center, and distortion coefficients, or information indicating the position of the viewpoint, the direction of the line of sight, and the size of the screen (image); it may be any information that makes it possible to generate an image. When target image identification information is given, the image is generated based on the image information associated with that target image.
 The image generation unit 407 may generate the image by, for example, taking the 3D models whose coordinates have been transformed so as to represent the position of each subject and applying a further coordinate transformation with a transformation matrix representing the change of viewpoint. For example, an image similar to the target image may be generated from the same viewpoint as the target image shown in FIG. 3, or a new image may be generated from a viewpoint different from that of the target image.
 FIG. 6 is a diagram outlining the generation of an image from a viewpoint different from that of the target image. In FIG. 6, the viewpoint 86_1 is the viewpoint of the target image of FIG. 3; the pixel at coordinates (i_1, j_1) of the target image takes a value derived from the subject 82. The viewpoint 86_2 is a new viewpoint different from that of the target image. In the new image generated from the new viewpoint 86_2, the value for the same portion of the subject 82 falls on the pixel at coordinates (i_2, j_2). The new image is generated by performing processing such as coordinate transformation on each subject and each pixel with respect to the new viewpoint 86_2. FIG. 7 is a diagram showing a specific example of the image newly generated for the new viewpoint 86_2.
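How one point of a subject lands at different pixel coordinates under the two viewpoints (86_1 and 86_2) can be illustrated numerically; the matrices and values below are arbitrary assumptions for this sketch.

```python
import numpy as np

def image_coords(Xw, R, f=100.0):
    """Project a world point through extrinsics R (optical center omitted
    for simplicity): the analogue of (i_1, j_1) vs (i_2, j_2) in FIG. 6."""
    Xc = R @ np.append(Xw, 1.0)
    return f * Xc[0] / Xc[2], f * Xc[1] / Xc[2]

p = np.array([0.5, 0.0, 0.0])   # one point on the subject

# Viewpoint 86_1: camera looking at the scene from 4 units away.
view1 = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 4.0]])
# Viewpoint 86_2: camera shifted sideways and moved closer.
view2 = np.array([[1.0, 0.0, 0.0, -1.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 2.0]])
i1, j1 = image_coords(p, view1)
i2, j2 = image_coords(p, view2)
```

The same world point thus maps to different image coordinates for each viewpoint, which is exactly the correspondence the image generation unit 407 exploits.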
 FIG. 8 is a diagram showing a specific example of the processing of the image information acquisition device 100. First, the input/output control unit 401 inputs the target image to be processed and records it in the image storage unit 301 (step S101). The area information acquisition unit 402 acquires, for each subject, area information indicating the subject area in the target image and records it in the image information storage unit 302 (step S102). The classification unit 403 classifies the subjects (step S103). The three-dimensional information acquisition unit 404 generates a 3D model of each subject by using the information of the plurality of subjects classified into the same group, and records the generated 3D model in the image information storage unit 302 (step S104). The state parameter acquisition unit 405 generates state parameters for each subject and records them in the image information storage unit 302 (step S105). The additional information acquisition unit 406 acquires additional information for each subject and records it in the image information storage unit 302 (step S106).
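The flow of steps S101 to S106 can be sketched as a single pipeline; every helper function below is a trivial stub standing in for the corresponding unit described above, not a real implementation.

```python
# Minimal stubs so the flow runs end to end; real implementations would use
# segmentation (S102), similarity grouping (S103), SfM (S104), and so on.
def acquire_regions(image):                              # S102
    return {"r1": image, "r2": image}

def classify(regions):                                   # S103
    return {"g1": list(regions)}

def build_3d_model(region_ids):                          # S104
    return {"points": len(region_ids)}

def estimate_state(region_id, models):                   # S105
    return {"R": "identity"}

def acquire_additional_info(region_id, models, params):  # S106
    return {}

def process_target_image(image, store):
    store["image"] = image                               # S101: record the input
    regions = acquire_regions(image)
    groups = classify(regions)
    models = {g: build_3d_model(ids) for g, ids in groups.items()}
    params = {r: estimate_state(r, models) for r in regions}
    extras = {r: acquire_additional_info(r, models, params) for r in regions}
    store.update(regions=regions, groups=groups, models=models,
                 params=params, extras=extras)
    return store

result = process_target_image("img001", {})
```

The point of the sketch is the ordering and data flow between the units, matching FIG. 8.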
 FIG. 9 is a diagram showing a specific example of the hardware configuration of the image information acquisition device 100. As shown in FIG. 9, the image information acquisition device 100 includes, for example, an input/output device 1, an auxiliary storage device 2, a memory 3, and a processor 4. The input/output device 1 inputs and outputs information (including data) between the image information acquisition device 100 and the outside (including the user), and functions, for example, as the image input unit 10 and the output unit 20. The auxiliary storage device 2 is configured using a magnetic hard disk device or a semiconductor storage device and functions, for example, as the storage unit 30. The memory 3 and the processor 4 function, for example, as the control unit 40.
 In the image information acquisition device 100 configured as described above, subjects in the image to be processed that the classification unit 403 has classified as subjects of the same type are treated together even if they are in fact separate objects, and three-dimensional information about the subject (for example, a 3D model) is acquired using their images. Therefore, if a plurality of subjects of the same type appear in one image, highly accurate three-dimensional information can be acquired even when only a few images exist for each individual subject.
 (Modification examples)
 The image information acquisition device 100 may be configured without the image generation unit 407. The image generation unit 407 may instead be implemented in another information processing device; in this case, the image information storage unit 302 may also be implemented in that information processing device. With this configuration, an image of a subject can easily be generated in the other information processing device based on the image information acquired by the image information acquisition device 100.
 The image information acquisition device 100 may be implemented as a plurality of devices; for example, it may be implemented as an image information acquisition system including multiple devices. For instance, an information processing device having the control unit 40 and an information processing device having the storage unit 30 may be implemented as separate devices, the functions of the storage unit 30 may be duplicated across multiple information processing devices, or the functions of the control unit 40 may be divided among multiple information processing devices.
 Although an embodiment of the present invention has been described in detail above with reference to the drawings, the specific configuration is not limited to this embodiment, and designs and the like within a range that does not depart from the gist of the present invention are also included.
 The present invention is applicable to devices that acquire image information.
100 ... image information acquisition device, 10 ... image input unit, 20 ... output unit, 30 ... storage unit, 301 ... image storage unit, 302 ... image information storage unit, 40 ... control unit, 401 ... input/output control unit, 402 ... area information acquisition unit, 403 ... classification unit, 404 ... three-dimensional information acquisition unit, 405 ... state parameter acquisition unit, 406 ... additional information acquisition unit, 407 ... image generation unit, 81-86 ... subject, 91-96 ... subject area

Claims (7)

  1.  An image information acquisition device comprising:
     a classification unit that classifies subjects appearing in a target image, which is an image to be processed, into subjects of the same type; and
     a three-dimensional information acquisition unit that acquires information indicating a three-dimensional shape of a subject based on images of the plurality of subjects classified as subjects of the same type by the classification unit.
  2.  The image information acquisition device according to claim 1, further comprising a state parameter acquisition unit that acquires, for each subject, a state parameter that is information indicating a position and an orientation in a three-dimensional space of the target image.
  3.  The image information acquisition device according to claim 2, wherein the state parameter acquisition unit performs processing based on a constraint condition that subjects do not overlap one another three-dimensionally.
  4.  The image information acquisition device according to any one of claims 1 to 3, further comprising an area information acquisition unit that acquires, from the target image and for each subject, area information that is information indicating an area occupied by the image of that subject.
  5.  The image information acquisition device according to any one of claims 1 to 4, further comprising an additional information acquisition unit that acquires information on the relative three-dimensional positions of the subjects in the target image.
  6.  An image information acquisition method comprising:
     a classification step of classifying subjects appearing in a target image, which is an image to be processed, into subjects of the same type; and
     a three-dimensional information acquisition step of acquiring information indicating a three-dimensional shape of a subject based on images of the plurality of subjects classified as subjects of the same type in the classification step.
  7.  A computer program for causing a computer to function as the image information acquisition device according to any one of claims 1 to 5.
PCT/JP2020/042069 2020-11-11 2020-11-11 Image information acquisition device, image information acquisition method, and computer program WO2022102015A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/042069 WO2022102015A1 (en) 2020-11-11 2020-11-11 Image information acquisition device, image information acquisition method, and computer program


Publications (1)

Publication Number Publication Date
WO2022102015A1 true WO2022102015A1 (en) 2022-05-19

Family

ID=81600853

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/042069 WO2022102015A1 (en) 2020-11-11 2020-11-11 Image information acquisition device, image information acquisition method, and computer program

Country Status (1)

Country Link
WO (1) WO2022102015A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2020135679A (en) * 2019-02-25 2020-08-31 富士通株式会社 Data set creation method, data set creation device, and data set creation program



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20961543; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20961543; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)