WO2022191010A1 - Information processing device and information processing method - Google Patents

Information processing device and information processing method Download PDF

Info

Publication number
WO2022191010A1
WO2022191010A1 (PCT/JP2022/008967)
Authority
WO
WIPO (PCT)
Prior art keywords
unit
information
dimensional
imaging
information processing
Prior art date
Application number
PCT/JP2022/008967
Other languages
French (fr)
Japanese (ja)
Inventor
剛也 小林
Original Assignee
ソニーグループ株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ソニーグループ株式会社
Publication of WO2022191010A1 publication Critical patent/WO2022191010A1/en

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01BMEASURING LENGTH, THICKNESS OR SIMILAR LINEAR DIMENSIONS; MEASURING ANGLES; MEASURING AREAS; MEASURING IRREGULARITIES OF SURFACES OR CONTOURS
    • G01B11/00Measuring arrangements characterised by the use of optical techniques
    • G01B11/24Measuring arrangements characterised by the use of optical techniques for measuring contours or curvatures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/04Texture mapping
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106Processing image signals
    • H04N13/111Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/271Image signal generators wherein the generated image signals comprise depth maps or disparity maps
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/275Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals

Definitions

  • the present disclosure relates to an information processing device and an information processing method.
  • Volumetric capture is a technique that generates a 3D model of a subject using captured images of a real subject, and generates a high-quality 3D image of the subject based on the generated 3D model and the captured images of the subject.
  • An object of the present disclosure is to provide an information processing apparatus and an information processing method capable of generating a three-dimensional image of higher quality in volumetric capture.
  • An information processing apparatus according to the present disclosure includes a generation unit that generates an image by applying a texture image to a three-dimensional model included in three-dimensional data, and a selection unit that selects, from one or more imaging cameras that capture images of a subject in real space, an imaging camera whose captured image of the subject is to be used as the texture image, based on a first position of a virtual camera that acquires an image of a virtual space, a second position of the three-dimensional model, and a third position of the one or more imaging cameras.
  • Another information processing apparatus according to the present disclosure includes a generation unit that generates three-dimensional data based on captured images captured by one or more imaging cameras, and a separation unit that separates, from the three-dimensional data, a three-dimensional model corresponding to a subject included in the captured images and generates position information indicating the position of the separated three-dimensional model.
  • FIG. 1 is a diagram showing basic processing of volumetric capture based on captured images of a real subject, which is applicable to the embodiment.
  • FIG. 2 is a diagram for explaining a problem of an example of existing technology.
  • FIG. 3 is a diagram for explaining a problem of another example of existing technology.
  • FIG. 4 is a diagram for explaining an example of a method of selecting an imaging camera that acquires a texture image to be applied to a 3D model, according to existing technology.
  • FIGS. 5A and 5B are schematic diagrams for explaining a first example of imaging camera selection according to existing technology.
  • FIGS. 6A and 6B are schematic diagrams for explaining a second example of imaging camera selection according to existing technology.
  • FIG. 7 is a functional block diagram showing an example of functions of an information processing system according to the embodiment.
  • FIG. 8 is a schematic diagram showing an example configuration for acquiring image data of a subject, which is applicable to the embodiment.
  • FIG. 9 is a block diagram showing a hardware configuration of an example of an information processing device applicable to an information processing system according to the embodiment.
  • FIG. 10 is an exemplary flowchart schematically showing processing in the information processing system according to the embodiment.
  • FIG. 11 is a schematic diagram schematically showing a three-dimensional model generation process applicable to the embodiment.
  • FIG. 12 is a block diagram showing an example of the configuration of a 3D model generation unit according to the embodiment.
  • FIG. 13 is a schematic diagram for explaining subject separation processing according to the embodiment.
  • FIG. 14 is an exemplary flowchart illustrating subject separation processing according to the embodiment.
  • FIG. 15 is a schematic diagram for explaining selection of an imaging camera according to the embodiment.
  • FIG. 16 is a block diagram showing an example configuration of a rendering unit according to the embodiment.
  • FIG. 17 is an exemplary flowchart illustrating a first example of imaging camera selection processing in rendering processing according to the embodiment.
  • FIG. 18 is a schematic diagram for explaining the relationship between an object and a virtual camera according to the embodiment.
  • Further figures include a schematic diagram for explaining processing for calculating an average value of reference positions of objects according to the embodiment, an exemplary flowchart illustrating a second example of imaging camera selection processing in rendering processing according to the embodiment, an exemplary flowchart illustrating rendering processing according to the embodiment, and schematic diagrams for explaining post-effect processing according to the embodiment.
  • FIG. 1 is a diagram showing basic processing of volumetric capture based on captured images of a real subject, which is applicable to the embodiment.
  • In step S1, the system surrounds an object (subject) in real space with a large number of cameras and captures images of the subject.
  • A camera that captures an image of a subject in real space is hereinafter referred to as an imaging camera.
  • In step S2, the system converts the subject into three-dimensional data and generates a three-dimensional model of the subject based on the plurality of captured images captured by the imaging cameras (3D modeling processing).
  • In step S3, the system renders the three-dimensional model generated in step S2 to generate an image.
  • More specifically, in step S3 the system places the three-dimensional model in a virtual space and renders it from a camera that can move freely in the virtual space (hereinafter referred to as a virtual camera) to generate an image. That is, the system renders according to the position and orientation of the virtual camera with respect to the three-dimensional model. For example, a user who operates the virtual camera can observe an image of the three-dimensional model viewed from a position according to his or her own operation.
  • As formats for expressing a three-dimensional model, a format combining mesh information and a UV texture and a format combining mesh information and multi-textures are generally used.
  • Mesh information is a set of vertices and edges of a three-dimensional model made up of polygons.
  • A UV texture is a texture image to which UV coordinates, which are coordinates on the texture, are assigned so that it can be mapped onto the mesh.
  • A multi-texture is used to overlap and paste a plurality of texture images onto the polygons of the three-dimensional model.
  • the format that combines mesh information and UV texture covers all directions of the 3D model with one UV texture, so the amount of data is relatively small and lightweight, and the rendering load is low.
  • This format is suitable for use in the View Independent method (hereinafter abbreviated as the VI method), which is a rendering method in which the geometry is fixed with respect to the viewpoint movement of the virtual camera.
  • the format that combines mesh information and multi-textures increases the amount of data and the rendering load, but it can provide high image quality.
  • This format is suitable for use in the View Dependent method (hereinafter abbreviated as the VD method) in which the geometric shape changes as the viewpoint of the virtual camera moves.
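  • As a rough, non-normative illustration of the two formats described above, the following sketch contrasts a mesh-plus-UV-texture representation with a mesh-plus-multi-texture representation; all type and field names are our own and do not appear in the publication.

```python
from dataclasses import dataclass, field
from typing import List
import numpy as np

@dataclass
class Mesh:
    vertices: np.ndarray   # (V, 3) float vertex positions
    faces: np.ndarray      # (F, 3) int indices into vertices

@dataclass
class UVTexturedModel:
    """Mesh plus a single UV texture: compact, suited to View Independent (VI) rendering."""
    mesh: Mesh
    uv_coords: np.ndarray  # (V, 2) per-vertex UV coordinates into the texture
    texture: np.ndarray    # (H, W, 3) uint8, one texture covering all directions

@dataclass
class MultiTexturedModel:
    """Mesh plus one texture per imaging camera: heavier, suited to View Dependent (VD) rendering."""
    mesh: Mesh
    camera_textures: List[np.ndarray] = field(default_factory=list)  # per-camera captured images
    camera_params: List[dict] = field(default_factory=list)          # per-camera pose and intrinsics
```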
  • FIG. 2 is a diagram for explaining a problem of an example of existing technology.
  • As shown in section (a) of FIG. 2, a single set of three-dimensional data 50 may include multiple three-dimensional models 51 1 to 51 3.
  • the three-dimensional models 51 1 to 51 3 are objects in virtual space obtained by giving three-dimensional information to the image of the subject in real space included in the captured image.
  • In existing technology, the three-dimensional models 51 1 to 51 3 could not be separated and recognized individually, so it was difficult to obtain sufficient quality when rendering them. That is, in order to render each three-dimensional model 51 1 to 51 3 with sufficient quality, each must be treated as independent data 52 1 to 52 3, as shown in section (b) of FIG. 2.
  • FIG. 3 is a diagram for explaining a problem of another example of existing technology.
  • As shown in section (a) of FIG. 3, a single set of three-dimensional data 50 includes a plurality of three-dimensional models 51 1 to 51 3.
  • Section (b) of FIG. 3 shows three-dimensional data 500 after post-effect processing. With existing technology, it was difficult to selectively apply effect processing (in this example, non-display processing) to an individual three-dimensional model.
  • To apply such selective processing, separation processing for separating each of the three-dimensional models 51 1 to 51 3 is required.
  • However, the existing technology does not consider such separation of the plurality of three-dimensional models 51 1 to 51 3.
  • FIG. 4 is a diagram for explaining an example of a selection method of an imaging camera that acquires a texture image to be applied to a subject according to existing technology.
  • a subject 80 in real space and a plurality of imaging cameras 60 1 to 60 8 surrounding the subject 80 in real space are shown.
  • a reference position 81 is shown as a reference position of the subject 80 .
  • FIG. 4 also shows a virtual camera 70 arranged in the virtual space.
  • the coordinates in the real space and the coordinates in the virtual space match, and unless otherwise specified, the description will be made without distinguishing between the real space and the virtual space.
  • the real space and the virtual space have the same scale, and the position of an object (object, imaging camera, etc.) placed in the real space can be directly replaced with the position in the virtual space.
  • the positions of, for example, the three-dimensional model and the virtual camera 70 in the virtual space can be directly replaced with the positions in the real space.
  • As the reference position 81 of the subject 80, the position corresponding to the point in the subject 80 that is closest to the optical axes of all the imaging cameras 60 1 to 60 8 can be applied.
  • Alternatively, the reference position 81 of the subject 80 may be an intermediate position between the maximum and minimum values of the vertex coordinates of the subject 80, or the most important position in the subject 80 (for example, the position of the face if the subject corresponding to the subject 80 is a person).
  • It is known to select the optimum imaging camera for acquiring the texture to be applied to the three-dimensional model based on the degree of importance of each imaging camera 60 1 to 60 8. The degree of importance can be determined, for example, based on the angle formed between the position of the virtual camera 70 and the position of each imaging camera 60 1 to 60 8, with the reference position 81 as the vertex.
  • In the example of FIG. 4, the angle θ1 formed by the position of the virtual camera 70 and the imaging camera 60 1 with respect to the reference position 81 is the smallest, and the angle θ2 formed with the imaging camera 60 2 is the next smallest. Therefore, with respect to the position of the virtual camera 70, the imaging camera 60 1 has the highest importance, and the imaging camera 60 2 has the next highest importance after the imaging camera 60 1.
  • The importance P(i) of each imaging camera 60 1 to 60 8 can be calculated by the following equation (1), and a smaller value (smaller angle) corresponds to a higher importance:
  • P(i) = arccos(C i · C v)   (1)
  • In equation (1), the value i identifies each of the imaging cameras 60 1 to 60 8, the value C i represents the vector from each imaging camera 60 1 to 60 8 to the reference position 81, and the value C v represents the vector from the virtual camera 70 to the reference position 81. That is, equation (1) obtains the importance P(i) of each imaging camera 60 1 to 60 8 based on the inner product of the vector from that imaging camera to the reference position 81 and the vector from the virtual camera 70 to the reference position 81.
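  • As a minimal sketch (ours, not the publication's) of evaluating equation (1), the vectors can be normalized before taking the inner product, which the arccosine implicitly assumes:

```python
import numpy as np

def camera_importance(imaging_cam_pos, virtual_cam_pos, reference_pos):
    """P(i) = arccos(C_i . C_v): the angle between the imaging camera's and the
    virtual camera's directions toward the reference position (smaller = more important)."""
    c_i = np.asarray(reference_pos, dtype=float) - np.asarray(imaging_cam_pos, dtype=float)
    c_v = np.asarray(reference_pos, dtype=float) - np.asarray(virtual_cam_pos, dtype=float)
    c_i /= np.linalg.norm(c_i)
    c_v /= np.linalg.norm(c_v)
    # Clip to guard against values just outside [-1, 1] due to rounding.
    return np.arccos(np.clip(np.dot(c_i, c_v), -1.0, 1.0))
```

  • The optimum imaging camera under this criterion is then simply the one with the smallest P(i) for the current virtual camera position.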
  • an unintended imaging camera may be selected as the optimal imaging camera.
  • FIGS. 5A and 5B are schematic diagrams for explaining a first example of image pickup camera selection according to the existing technology.
  • This first example selects an imaging camera based on vectors toward a reference position. That is, FIGS. 5A and 5B illustrate the case described with reference to FIG. 4 in which a plurality of subjects 82 1 and 82 2 are included, using the vectors C i from the imaging cameras to the reference position and the vector C v from the virtual camera 70 to the reference position.
  • an imaging range 84 includes two subjects 82 1 and 82 2 .
  • the subject 82 1 is positioned at the upper left corner of the imaging range 84 in the figure, and the subject 82 2 is positioned at the lower right corner of the imaging range 84 in the figure.
  • 16 imaging cameras 60 1 to 60 16 are arranged surrounding the imaging range 84 with their imaging directions facing the center of the imaging range 84 .
  • The virtual camera 70 has an angle of view α, and the three-dimensional model corresponding to the subject 82 1 is assumed to fit within the angle of view α. Since the positions of the subjects 82 1 and 82 2 are unknown, the center of the imaging range 84 or the center of gravity of the subjects 82 1 and 82 2 is adopted as the reference position 83.
  • In the following, the three-dimensional models corresponding to the subjects 82 1 and 82 2 are simply referred to as the subjects 82 1 and 82 2 unless otherwise specified.
  • FIG. 5A shows an example in which the virtual camera 70 is on the front side of the reference position 83 with respect to the subject 82 1 .
  • In this case, ideally, the imaging camera 60 1 located on the straight line 93a extending from the subject 82 1 through the virtual camera 70 is the optimum imaging camera.
  • In this first example, however, the direction of the vector 91a from the virtual camera 70 to the reference position 83 and the direction of the vector 90a from the imaging camera 60 16 to the reference position 83 substantially match, so the imaging camera 60 16 is selected as the optimum imaging camera.
  • The imaging camera 60 16 differs from the ideal optimum imaging camera 60 1 in position and orientation with respect to the subject 82 1. Therefore, the quality of a texture based on the image captured by the imaging camera 60 16 is lower than that of a texture based on the image captured by the imaging camera 60 1.
  • FIG. 5B shows an example in which the virtual camera 70 is positioned between the subject 82 1 and the reference position 83.
  • In this case, ideally, the imaging camera 60 2 located on the straight line 93b passing from the subject 82 2 through the virtual camera 70 is the optimum imaging camera.
  • In this first example, however, the reference position 83 is on the side opposite to the subject 82 1 with respect to the virtual camera 70, so the vector 91b from the virtual camera 70 to the reference position 83 points away from the subject 82 1. Therefore, the direction of the vector 90b from the imaging camera 60 11, located on the opposite side of the subject 82 1 as viewed from the virtual camera 70, to the reference position 83 is close to the direction of the vector 91b, and the imaging camera 60 11 is selected as the optimum imaging camera.
  • However, the imaging camera 60 11 images a surface of the subject 82 1 that cannot be seen from the virtual camera 70. Therefore, the quality of a texture based on the image captured by the imaging camera 60 11 is greatly reduced compared with a texture based on the image captured by the ideal imaging camera 60 2.
  • the selection method of the optimum imaging camera is not limited to the selection method based on the vector for the reference position described above.
  • A second example of imaging camera selection by existing technology selects the optimum imaging camera from the imaging cameras 60 1 to 60 16 based on the angle between the optical axis of the virtual camera 70 and the vector from each imaging camera 60 1 to 60 16 to the reference position.
  • FIGS. 6A and 6B are schematic diagrams for explaining a second example of image pickup camera selection according to existing technology.
  • In FIGS. 6A and 6B, the subjects 82 1 and 82 2, the reference position 83, and the imaging range 84 are the same as in FIGS. 5A and 5B described above, so descriptions thereof are omitted here.
  • FIG. 6A corresponds to FIG. 5A described above, and shows an example in which the virtual camera 70 is positioned closer to the subject 82 1 than the reference position 83 .
  • the imaging camera 60 1 located on a straight line 93a passing through the virtual camera 70 from the object 82 1 is the optimum imaging camera.
  • the virtual camera 70 faces upward in the figure, and the optical axis 94a is upward.
  • the angle between the direction of the vector 90c from the imaging camera 60 1 to the reference position 83 and the optical axis 94a of the virtual camera 70 is the smallest. Therefore, the same imaging camera 60 1 as the ideal optimum imaging camera is selected as the optimum imaging camera, and high-quality texture can be obtained.
  • FIG. 6B corresponds to FIG. 5B described above, and shows an example in which the virtual camera 70 is positioned between the object 82 1 and the reference position 83 .
  • the imaging camera 60 2 located on a straight line 93c passing through the virtual camera 70 from the object 82 2 is the optimum imaging camera.
  • the virtual camera 70 faces upward in the drawing, and the optical axis 94b is directed upward.
  • the angle between the direction of the vector 90c from the imaging camera 60 1 to the reference position 83 and the optical axis 94b of the virtual camera 70 is the smallest. Therefore, as the optimum imaging camera, the imaging camera 60 1 different from the optimum imaging camera 60 2 in the ideal case is selected. Therefore, the quality of the texture based on the captured image of the imaging camera 60 1 is lower than that of the texture based on the captured image of the imaging camera 60 2 .
  • In the embodiment, the information processing system obtains the position of each subject when generating each three-dimensional model. Then, when rendering the three-dimensional model of each subject, the information processing system uses the position of that subject, obtained when the three-dimensional model was generated, to select the imaging camera from which the texture to be applied to the three-dimensional model is acquired.
  • As a result, the imaging camera used for acquiring the texture to be applied to the three-dimensional model can be appropriately selected, and a high-quality texture can be obtained. Also, by using the position information added to each three-dimensional model, it is possible to apply post-effect processing to each three-dimensional model individually.
  • FIG. 7 is an exemplary functional block diagram illustrating functions of the information processing system according to the embodiment.
  • the information processing system 100 includes a data acquisition unit 110, a 3D (3-Dimensional) model generation unit 111, a formatting unit 112, a transmission unit 113, a reception unit 120, and a rendering unit 121. , and a display unit 122 .
  • The information processing system 100 can be configured with, for example, an information processing device for outputting a 3D model, which includes the data acquisition unit 110, the 3D model generation unit 111, the formatting unit 112, and the transmission unit 113, and an information processing device for outputting display information, which includes the reception unit 120, the rendering unit 121, and the display unit 122.
  • the information processing system 100 can also be configured by a single computer device (information processing device).
  • The data acquisition unit 110, the 3D model generation unit 111, the formatting unit 112, the transmission unit 113, the reception unit 120, the rendering unit 121, and the display unit 122 are realized by executing the information processing program according to the embodiment on, for example, a CPU (Central Processing Unit). Not limited to this, some or all of these units may be realized by hardware circuits that operate in cooperation with one another.
  • the data acquisition unit 110 acquires image data for generating a 3D model of a subject.
  • FIG. 8 is a schematic diagram showing an example configuration for acquiring image data of a subject, which is applicable to the embodiment.
  • In the example of FIG. 8, a plurality of captured images captured from a plurality of viewpoints by a plurality of imaging cameras 60 1, 60 2, 60 3, ..., 60 n surrounding the subject 80 are acquired as image data.
  • the captured images from multiple viewpoints are preferably images captured in synchronism by the plurality of imaging cameras 60 1 to 60 n .
  • the data acquisition unit 110 may acquire, as image data, a plurality of captured images obtained by capturing the subject 80 from a plurality of viewpoints with a single imaging camera.
  • this image data acquisition method is applicable when the position of the subject 80 is fixed.
  • the data acquisition unit 110 may perform calibration based on the image data and acquire the internal parameters and external parameters of each imaging camera 60 1 to 60 n . Also, the data acquisition unit 110 may acquire a plurality of pieces of depth information indicating distances from a plurality of viewpoints to the subject 80, for example.
  • the 3D model generation unit 111 generates a 3D model having 3D information of the subject 80 based on image data obtained by the data acquisition unit 110 and obtained by imaging the subject 80 from multiple viewpoints.
  • The 3D model generation unit 111 generates a three-dimensional model of the subject 80 by carving out the three-dimensional shape of the subject 80 using images from multiple viewpoints (for example, silhouette images from multiple viewpoints), for example with the so-called Visual Hull technique. In this case, the 3D model generation unit 111 can further deform the three-dimensional model generated using Visual Hull with a high degree of accuracy, using a plurality of pieces of depth information indicating distances from the multiple viewpoints to the subject 80.
  • the 3D model generated by the 3D model generation unit 111 is generated using captured images captured by the imaging cameras 60 1 to 60 n in the real space, and therefore can be said to be a real 3D model.
  • the 3D model generation unit 111 can express the generated 3D model, for example, in the form of mesh data.
  • the mesh data is data representing shape information representing the surface shape of the subject 80 by connections between vertices called polygon meshes.
  • the method of expressing the three-dimensional model generated by the 3D model generation unit 111 is not limited to mesh data.
  • the 3D model generation unit 111 may describe the generated 3D model in a so-called point cloud representation method represented by point position information.
  • the 3D model generation unit 111 also generates color information data of the subject 80 as a texture in association with the three-dimensional model of the subject 80 .
  • For example, the 3D model generation unit 111 can generate a View Independent (VI) texture that has a constant color when viewed from any direction. Not limited to this, the 3D model generation unit 111 may generate a View Dependent (VD) texture whose color changes depending on the viewing direction.
  • the formatting unit 112 converts the 3D model data generated by the 3D model generation unit 111 into data in a format suitable for transmission and storage.
  • the formatting unit 112 can convert the 3D model generated by the 3D model generating unit 111 into a plurality of two-dimensional images by perspectively projecting the model from a plurality of directions.
  • the formatting unit 112 may generate depth information, which is two-dimensional depth images from multiple viewpoints, using the three-dimensional model.
  • The formatting unit 112 compresses and encodes the depth information and the color information in the form of two-dimensional images, and outputs them to the transmission unit 113.
  • the formatting unit 112 may transmit the depth information and the color information side by side as one image, or may transmit them as two separate images.
  • the formatting unit 112 can compress and encode the data using a compression technique for two-dimensional images such as AVC (Advanced Video Coding).
  • the formatting unit 112 may also convert the three-dimensional model into a point cloud format. Furthermore, the formatting unit 112 may output the 3D model to the transmission unit 113 as 3D data. In this case, the formatting unit 112 can use, for example, the Geometry-based-Approach three-dimensional compression technology discussed in MPEG (Moving Picture Experts Group).
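  • As a rough illustration only (the 8-bit depth quantization and the helper name are our assumptions, not part of the publication), packing the color and depth images side by side into one image could look like this:

```python
import numpy as np

def pack_color_and_depth(color: np.ndarray, depth: np.ndarray,
                         max_depth: float = 10.0) -> np.ndarray:
    """Pack an HxWx3 color image and an HxW depth map side by side into a
    single 8-bit image, one possible layout for transmitting both as one image."""
    # Quantize depth to 8 bits for illustration; real systems often keep more precision.
    depth_norm = np.clip(depth / max_depth, 0.0, 1.0)
    depth_8 = (depth_norm * 255).astype(np.uint8)
    depth_rgb = np.repeat(depth_8[..., None], 3, axis=-1)   # replicate depth to 3 channels
    return np.hstack([color.astype(np.uint8), depth_rgb])   # left: color, right: depth

# Example with dummy data: a 480x640 color frame and depth map become one 480x1280 image.
packed = pack_color_and_depth(np.zeros((480, 640, 3), dtype=np.uint8),
                              np.ones((480, 640), dtype=np.float32))
```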
  • the transmission unit 113 transmits transmission data generated by the formatting unit 112 .
  • the transmission unit 113 transmits transmission data after performing a series of processes of the data acquisition unit 110, the 3D model generation unit 111, and the formatting unit 112 offline. Further, the transmission unit 113 may transmit the transmission data generated from the series of processes described above in real time.
  • the receiving section 120 receives transmission data transmitted from the transmitting section 113 .
  • the rendering unit 121 performs rendering according to the position of the virtual camera 70 using the transmission data received by the receiving unit 120 .
  • mesh data of a three-dimensional model is projected from the viewpoint of the virtual camera 70 that performs drawing, and texture mapping is performed to paste textures representing colors and patterns.
  • the drawn image can be viewed from a freely set viewpoint by means of the virtual camera 70 regardless of the positions of the imaging cameras 60 1 to 60 n at the time of photographing.
  • the rendering unit 121 performs texture mapping to paste textures representing the color, pattern, and texture of the mesh according to the position of the mesh of the three-dimensional model.
  • the rendering unit 121 may perform texture mapping using a VD method that considers the viewpoint from the user (virtual camera 70). Not limited to this, the rendering unit 121 may perform texture mapping by a VI method that does not consider the viewpoint of the user.
  • the VD method changes the texture to be pasted on the 3D model according to the position of the viewpoint from the user (the viewpoint from the virtual camera 70). Therefore, the VD method has the advantage of realizing higher quality rendering than the VI method. On the other hand, since the VI method does not consider the position of the viewpoint from the user, it has the advantage of reducing the amount of processing compared to the VD method.
  • the user's viewpoint data may be input to the rendering unit 121 from the display device, for example, by detecting the region of interest of the user.
  • The rendering unit 121 may employ billboard rendering, which renders an object so that it is always kept perpendicular to the viewing direction of the user, that is, always facing the viewpoint.
  • the rendering unit 121 may render an object that the user is less interested in using billboard rendering, and render other objects using another rendering method.
  • The display unit 122 displays the image rendered by the rendering unit 121 on the display device.
  • the display device may be, for example, a head-mounted display or a spatial display, or may be a display device of an information device such as a smartphone, a television receiver, or a personal computer. Also, the display device may be a 2D monitor for two-dimensional display, or a 3D monitor for three-dimensional display.
  • The information processing system 100 shown in FIG. 7 shows a series of flows from the data acquisition unit 110, which acquires the captured images used as the material for generating content, to the display control unit that controls the display device observed by the user.
  • The transmission unit 113 and the reception unit 120 are provided to show a series of flows from the content (three-dimensional model) creation side to the content observation side through distribution of the content data.
  • When content creation and observation are performed by the same information processing apparatus (for example, a personal computer), the information processing system 100 can omit the formatting unit 112, the transmission unit 113, and the reception unit 120.
  • When the information processing system 100 is implemented, the same implementer may implement all of the functional blocks, or each functional block may be implemented by a different implementer.
  • For example, operator A generates 3D content (a three-dimensional model) using the data acquisition unit 110, the 3D model generation unit 111, and the formatting unit 112.
  • The 3D content is then distributed through the transmission unit 113 (platform) of operator B, and the display device of operator C receives, renders, and displays the 3D content.
  • each functional block shown in FIG. 7 can be implemented on a cloud network.
  • the rendering unit 121 may be implemented within a display device, or may be implemented in a server on a cloud network. In that case, information is exchanged between the display device and the server.
  • In this description, the data acquisition unit 110, the 3D model generation unit 111, the formatting unit 112, the transmission unit 113, the reception unit 120, the rendering unit 121, and the display unit 122 are collectively described as the information processing system 100. However, a configuration excluding the display unit 122, that is, the data acquisition unit 110, the 3D model generation unit 111, the formatting unit 112, the transmission unit 113, the reception unit 120, and the rendering unit 121, may also be collectively referred to as the information processing system 100.
  • FIG. 9 is a block diagram showing a hardware configuration of an example of an information processing device applicable to the information processing system 100 according to the embodiment.
  • the information processing apparatus 2000 shown in FIG. 9 can be applied to both the information processing apparatus for outputting the 3D model and the information processing apparatus for outputting the display information described above.
  • the information processing apparatus 2000 shown in FIG. 9 can also be applied to a configuration including the entire information processing system 100 shown in FIG.
  • The information processing device 2000 includes a CPU (Central Processing Unit) 2100, a ROM (Read Only Memory) 2101, a RAM (Random Access Memory) 2102, an interface (I/F) 2103, an input unit 2104, an output unit 2105, a storage device 2106, a communication I/F 2107, and a drive device 2108.
  • the CPU 2100, ROM 2101, RAM 2102 and I/F 2103 are communicably connected to each other via a bus 2110.
  • An input unit 2104 , an output unit 2105 , a storage device 2106 , a communication I/F 2107 and a drive device 2108 are connected to the I/F 2103 .
  • These input unit 2104 , output unit 2105 , storage device 2106 , communication I/F 2107 and drive device 2108 can communicate with CPU 2100 and the like via I/F 2103 and bus 2110 .
  • the storage device 2106 is a non-volatile storage medium such as a hard disk drive or flash memory.
  • the CPU 2100 controls the overall operation of the information processing apparatus 2000 according to programs stored in the ROM 2101 and storage device 2106 and using the RAM 2102 as a work memory.
  • the input unit 2104 accepts data input to the information processing device 2000 .
  • an input device for inputting data according to user operation such as a pointing device such as a mouse, a keyboard, a touch panel, a joystick, or a controller, can be applied.
  • the input unit 2104 can include various input terminals for inputting data from an external device.
  • the input section 2104 can include a sound pickup device such as a microphone.
  • the output unit 2105 is responsible for outputting information from the information processing device 2000 .
  • a display device such as a display can be applied as the output unit 2105 .
  • the output unit 2105 can include a sound output device such as a speaker.
  • the output unit 2105 can include various output terminals for outputting data to external devices.
  • the output unit 2105 preferably includes a GPU (Graphics Processing Unit).
  • the GPU has a memory (GPU memory) for graphics processing.
  • a communication I/F 2107 controls communication via a network such as a LAN (Local Area Network) or the Internet.
  • a drive device 2108 drives removable media such as optical discs, magneto-optical discs, flexible discs, and semiconductor memories to read and write data.
  • In the information processing device for outputting the 3D model, the CPU 2100 executes the information processing program according to the embodiment, so that the data acquisition unit 110, the 3D model generation unit 111, the formatting unit 112, and the transmission unit 113 described above are configured as modules on, for example, the main storage area of the RAM 2102.
  • Likewise, in the information processing device for outputting display information, the CPU 2100 executes the information processing program according to the embodiment, so that the reception unit 120, the rendering unit 121, and the display unit 122 are configured as modules on, for example, the main storage area of the RAM 2102.
  • These information processing programs can be acquired from the outside (for example, a server device) via a network such as a LAN or the Internet through communication via the communication I/F 2107 and installed on the information processing device 2000. Not limited to this, the information processing programs may be provided stored in a removable storage medium such as a CD (Compact Disc), a DVD (Digital Versatile Disc), or a USB (Universal Serial Bus) memory.
  • FIG. 10 is an exemplary flowchart schematically showing processing in the information processing system 100 according to the embodiment. Prior to the processing according to the flowchart of FIG. 10, the imaging cameras 60 1 to 60 n are arranged around the subject 80 as described with reference to FIG. 8.
  • In step S10, the information processing system 100 uses the data acquisition unit 110 to acquire captured image data for generating a three-dimensional model of the subject 80.
  • In the next step S11, the information processing system 100 uses the 3D model generation unit 111 to generate a three-dimensional model having three-dimensional information of the subject 80, based on the captured image data acquired in step S10.
  • In the next step S12, the information processing system 100 causes the formatting unit 112 to encode the shape and texture data of the three-dimensional model generated in step S11 into a format suitable for transmission and storage.
  • In the next step S13, the information processing system 100 causes the transmission unit 113 to transmit the data encoded in step S12.
  • In the next step S14, the information processing system 100 receives, by the reception unit 120, the data transmitted in step S13.
  • The reception unit 120 decodes the received data and restores the shape and texture data of the three-dimensional model.
  • In the next step S15, the information processing system 100 causes the rendering unit 121 to perform rendering using the shape and texture data passed from the reception unit 120, and to generate image data for displaying the three-dimensional model.
  • In the next step S16, the information processing system 100 causes the display unit 122 to display the image data generated by the rendering on the display device.
  • When step S16 ends, the series of processes in the flowchart of FIG. 10 ends.
  • FIG. 11 is a schematic diagram that schematically shows a three-dimensional model generation process that can be applied to the embodiment.
  • As shown in section (a) of FIG. 11, the 3D model generation unit 111 generates, based on captured images captured from different viewpoints, three-dimensional data 50 including a plurality of three-dimensional models 51 1 to 51 3, each of which corresponds to an object in real space. Various methods are conceivable for adding position information to each of the three-dimensional models 51 1 to 51 3.
  • position information is added to each of the three-dimensional models 51 1 to 51 3 using bounding boxes.
  • Section (b) of FIG. 11 shows an example of a bounding box. Rectangular parallelepipeds circumscribing the three-dimensional models 51 1 , 51 2 and 51 3 are determined as three-dimensional bounding boxes 200 1 , 200 2 and 200 3 . Each vertex of these three-dimensional bounding boxes 200 1 to 200 3 is used as position information indicating the position of the corresponding three-dimensional models 51 1 to 51 3 .
  • For example, the bounding box BoundingBox[0] of the three-dimensional bounding box 200 1 for the three-dimensional model 51 1 can be represented by the minimum and maximum coordinate values on each axis, as in the following equation (2). BoundingBox[1] and BoundingBox[2] of the three-dimensional bounding boxes 200 2 and 200 3 for the three-dimensional models 51 2 and 51 3 are similarly represented by the following equations (3) and (4).
  • BoundingBox[0] = (x min0, x max0, y min0, y max0, z min0, z max0)   (2)
  • BoundingBox[1] = (x min1, x max1, y min1, y max1, z min1, z max1)   (3)
  • BoundingBox[2] = (x min2, x max2, y min2, y max2, z min2, z max2)   (4)
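  • A minimal sketch (ours, not the publication's) of deriving such a bounding box and its vertices from a model's vertex array, matching the (x min, x max, y min, y max, z min, z max) layout above:

```python
import numpy as np

def bounding_box(vertices: np.ndarray) -> tuple:
    """Axis-aligned 3D bounding box of an (N, 3) vertex array,
    returned as (x_min, x_max, y_min, y_max, z_min, z_max)."""
    mins = vertices.min(axis=0)
    maxs = vertices.max(axis=0)
    return (mins[0], maxs[0], mins[1], maxs[1], mins[2], maxs[2])

def box_vertices(box: tuple) -> np.ndarray:
    """The eight corner vertices of the bounding box, usable as position information."""
    x_min, x_max, y_min, y_max, z_min, z_max = box
    return np.array([[x, y, z] for x in (x_min, x_max)
                               for y in (y_min, y_max)
                               for z in (z_min, z_max)])
```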
  • FIG. 12 is a block diagram showing an example configuration of the 3D model generation unit 111 according to the embodiment.
  • the 3D model generation unit 111 includes a 3D model processing unit 1110 and a 3D model separation unit 1111 .
  • the image data captured by each of the imaging cameras 60 1 to 60 n and the imaging camera information output from the data acquisition unit 110 are input to the 3D model generation unit 111 .
  • the imaging camera information may include color information, depth information, camera parameter information, and the like.
  • The camera parameter information includes, for example, information on the position, direction, and angle of view α of each imaging camera 60 1 to 60 n.
  • Camera parameter information may further include zoom information, shutter speed information, aperture information, and the like.
  • the imaging camera information of each of the imaging cameras 60 1 to 60 n is passed to the 3D model processing section 1110 and output from the 3D model generation section 111 .
  • Based on the image data captured by each of the imaging cameras 60 1 to 60 n and the imaging camera information, the 3D model processing unit 1110 carves out the three-dimensional shape of the subject included in the imaging range using the above-described Visual Hull, and generates vertex and surface data of the subject. More specifically, the 3D model processing unit 1110 acquires in advance, for each of the imaging cameras 60 1 to 60 n, an image of the background of the space in which the subject is placed in the real space. A silhouette image of the subject is then generated based on the difference between each image of the subject captured by each of the imaging cameras 60 1 to 60 n and the corresponding background image. By carving the three-dimensional space according to these silhouette images, the three-dimensional shape of the subject can be obtained as vertex and surface data.
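  • As an illustrative sketch only (the threshold and the per-channel difference are our assumptions), a per-camera silhouette image could be derived from the background difference as follows; these silhouettes are the inputs to the Visual Hull carving:

```python
import numpy as np

def silhouette_from_background(frame: np.ndarray, background: np.ndarray,
                               threshold: float = 30.0) -> np.ndarray:
    """Binary silhouette of the subject: pixels whose color differs from the
    pre-captured background image by more than a threshold."""
    diff = np.abs(frame.astype(np.float32) - background.astype(np.float32))
    per_pixel = diff.max(axis=-1)   # largest channel difference per pixel
    return per_pixel > threshold    # True where the subject is assumed to be

# One silhouette per imaging camera:
# silhouettes = [silhouette_from_background(f, b) for f, b in zip(frames, backgrounds)]
```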
  • the 3D model processing unit 1110 functions as a generation unit that generates three-dimensional data based on captured images captured by one or more imaging cameras.
  • the 3D model processing unit 1110 outputs the generated vertex and surface data of the subject as mesh information.
  • the mesh information output from the 3D model processing unit 1110 is output from the 3D model generation unit 111 and passed to the 3D model separation unit 1111 .
  • the 3D model separation unit 1111 separates each subject based on the mesh information passed from the 3D model processing unit 1110 and generates position information of each subject.
  • FIG. 13 is a schematic diagram for explaining subject separation processing according to the embodiment.
  • FIG. 14 is a flowchart of an example showing subject separation processing according to the embodiment.
  • As shown in section (a) of FIG. 13, the 3D model processing unit 1110 generates, based on captured images captured from different viewpoints, three-dimensional data 50 containing a plurality of three-dimensional models 51 1 to 51 3.
  • In step S100 of FIG. 14, the 3D model separation unit 1111 projects the three-dimensional data 50 in the height direction (y-axis direction) to generate two-dimensional silhouette information for each of the three-dimensional models 51 1 to 51 3.
  • Section (b) of FIG. 13 shows examples of two-dimensional silhouettes 52 1 -52 3 based on respective three-dimensional models 51 1 -51 3 .
  • In the next step S101, the 3D model separation unit 1111 performs clustering on the two-dimensional plane based on the silhouettes 52 1 to 52 3 to detect blobs.
  • Subsequent steps S103 to S105 are processed for each blob detected in step S101.
  • In step S103, as shown in section (c) of FIG. 13, the 3D model separation unit 1111 obtains the rectangle circumscribing the detected blob as a two-dimensional bounding box; these correspond to the two-dimensional bounding boxes 53 1 to 53 3 for the three-dimensional models 51 1 to 51 3.
  • In the next step S104, the 3D model separation unit 1111 adds height information to the two-dimensional bounding box 53 1 obtained in step S103 to generate a three-dimensional bounding box 200 1, as shown in section (d) of FIG. 13.
  • Height information to be added to the two-dimensional bounding box 53 1 can be obtained based on the three-dimensional data 50 shown in section (a) of FIG. 13, for example.
  • Similarly, the 3D model separation unit 1111 adds height information to the two-dimensional bounding boxes 53 2 and 53 3 to generate the three-dimensional bounding boxes 200 2 and 200 3.
  • That is, for the blob corresponding to the three-dimensional model 51 2, the two-dimensional bounding box 53 2 is obtained in step S103 and height information is added to it in step S104, generating the three-dimensional bounding box 200 2. Likewise, for the blob corresponding to the three-dimensional model 51 3, the two-dimensional bounding box 53 3 is obtained in step S103 and height information is added to it in step S104, generating the three-dimensional bounding box 200 3.
  • In step S105, when the 3D model separation unit 1111 determines that the processing for all blobs has ended (step S105, "Yes"), it ends the series of processing according to the flowchart of FIG. 14.
  • In this way, three-dimensional bounding boxes 200 1 to 200 3 corresponding to the three-dimensional models 51 1 to 51 3 are generated. Then, based on the vertex coordinates of each of these three-dimensional bounding boxes 200 1 to 200 3, position information indicating the position of each of the three-dimensional models 51 1 to 51 3 is obtained.
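  • A compact sketch of this separation idea (our own simplification: the three-dimensional data is represented as a boolean voxel occupancy grid, and scipy's connected-component labelling stands in for the clustering step):

```python
import numpy as np
from scipy import ndimage

def separate_subjects(occupancy: np.ndarray):
    """occupancy: boolean voxel grid indexed as (x, y, z), with y as the height axis.
    Returns one 3D bounding box (x_min, x_max, y_min, y_max, z_min, z_max) per subject,
    in voxel units."""
    # Project along the height (y) axis to obtain a 2D silhouette map on the x-z plane.
    silhouette_2d = occupancy.any(axis=1)
    # Cluster the 2D silhouettes into blobs (steps S101 and S103).
    labels, num_blobs = ndimage.label(silhouette_2d)
    boxes = []
    for blob_id in range(1, num_blobs + 1):
        xs, zs = np.where(labels == blob_id)
        x_min, x_max, z_min, z_max = xs.min(), xs.max(), zs.min(), zs.max()
        # Add height information from the original 3D data (step S104).
        column = occupancy[x_min:x_max + 1, :, z_min:z_max + 1]
        ys = np.where(column.any(axis=(0, 2)))[0]
        boxes.append((x_min, x_max, ys.min(), ys.max(), z_min, z_max))
    return boxes
```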
  • In this way, the 3D model separation unit 1111 functions as a separation unit that separates the three-dimensional model corresponding to the subject included in the captured image from the three-dimensional data and generates position information indicating the position of the separated three-dimensional model.
  • the 3D model generation unit 111 adds position information indicating the position of the 3D model to the 3D model separated by the 3D model separation unit 1111 and outputs the 3D model.
  • rendering processing by the rendering unit 121 uses the position information indicating the position of the subject acquired by the 3D model generation unit 111 as described above to select the optimum imaging camera for acquiring the texture to be applied to the subject.
  • FIG. 15 is a schematic diagram for explaining selection of an imaging camera according to the embodiment.
  • section (a) shows an example of imaging camera selection according to the embodiment.
  • section (b) is the same diagram as FIG. 6B according to the existing technology described above, and is reprinted for comparison with the embodiment.
  • two subjects 82 1 and 82 2 are included in an imaging range 84 to be imaged, as in FIG. 5A and the like described above.
  • Subject 82 1 is positioned at the upper left corner of imaging range 84 in section (a) of FIG. 15, and subject 82 2 is positioned at the lower right corner of imaging range 84 in section (a) of FIG.
  • 16 imaging cameras 60 1 to 60 16 each having an angle of view ⁇ are arranged to surround an imaging range 84 with their imaging directions facing the center of the imaging range 84.
  • the virtual camera 70 has an angle of view ⁇ , and the three-dimensional model corresponding to the subject 82 1 is assumed to fit within the angle of view ⁇ .
  • the virtual camera 70 is arranged closer to the subject 82 1 than the center of the imaging range 84, and the subject 82 1 is included within the angle of view ⁇ of the virtual camera 70.
  • the reference position 83 is set based on position information indicating the position of the subject 82 1 obtained by the 3D model generation unit 111 .
  • the reference position 83 is set at the center of the subject 82 1 .
  • In the embodiment, the rendering unit 121 obtains, from the positions of the imaging cameras 60 1 to 60 16 and the position of the subject 82 1 included within the angle of view α of the virtual camera 70, the vector from each imaging camera 60 1 to 60 16 to the reference position 83. The rendering unit 121 also obtains the vector 91e from the virtual camera 70 to the reference position 83, based on the position of the virtual camera 70 and the position of the subject 82 1 included within the angle of view α of the virtual camera 70.
  • The rendering unit 121 then obtains the degree of importance P(i) of each imaging camera 60 1 to 60 16, for example in accordance with the above-described equation (1), based on the angle formed by each vector (vector C i) from each imaging camera 60 1 to 60 16 to the reference position 83 and the vector 91e (vector C v).
  • the rendering unit 121 selects the optimum imaging camera for acquiring the texture to be applied to the subject 82 1 based on the importance P(i) obtained for each of the imaging cameras 60 1 to 60 16 .
  • Ideally, the imaging camera 60 2, which is on the straight line 93c passing from the subject 82 1 (reference position 83) through the virtual camera 70, is the optimum imaging camera.
  • In the embodiment, the position of the subject 82 1 is obtained and set as the reference position 83. Therefore, by selecting, among the vectors from the imaging cameras 60 1 to 60 16 to the reference position 83, the vector that forms the smallest angle with the vector 91e from the virtual camera 70 to the reference position 83, an imaging camera close to the ideal one can be selected as the optimum imaging camera.
  • As a result, the imaging camera 60 2, which is the above-described ideal imaging camera, is selected as the optimum imaging camera.
  • the imaging camera 60 2 can be said to be a camera viewing the subject 82 1 from substantially the same direction as the virtual camera 70 . Therefore, according to the imaging camera selection method according to the embodiment, it is possible to obtain textures of higher quality.
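  • As a simplified sketch of the embodiment's selection (ours, not the publication's; the reference position is taken as the mean of the subject's bounding-box vertices, and the angle criterion is the same as in equation (1)):

```python
import numpy as np

def select_optimal_camera(imaging_cam_positions, virtual_cam_pos, subject_box_vertices):
    """Pick the imaging camera whose direction toward the subject's reference position
    forms the smallest angle with the virtual camera's direction toward it."""
    reference = np.asarray(subject_box_vertices, dtype=float).mean(axis=0)  # per-subject reference
    c_v = reference - np.asarray(virtual_cam_pos, dtype=float)
    c_v /= np.linalg.norm(c_v)
    best_index, best_angle = None, np.inf
    for i, cam_pos in enumerate(imaging_cam_positions):
        c_i = reference - np.asarray(cam_pos, dtype=float)
        c_i /= np.linalg.norm(c_i)
        angle = np.arccos(np.clip(np.dot(c_i, c_v), -1.0, 1.0))
        if angle < best_angle:
            best_index, best_angle = i, angle
    return best_index
```

  • Because the reference position here is tied to the subject itself rather than to the center of the imaging range, the selected camera views the subject from roughly the same direction as the virtual camera, which is what yields the higher-quality texture.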
  • Section (b) of FIG. 15 shows an example, according to the existing technology, of selecting the optimum imaging camera from the imaging cameras 60 1 to 60 16 based on the angle between the optical axis of the virtual camera 70 and the vector from each imaging camera 60 1 to 60 16 to the reference position 83.
  • In this case, the reference position 83 does not match the position of the subject 82 1, so it is not guaranteed that the virtual camera 70 and the selected optimum imaging camera view the subject 82 1 from substantially the same direction.
  • As a result, the imaging camera 60 1, which differs from the ideal imaging camera 60 2, is selected as the optimum imaging camera. Therefore, compared with the selection method using the position information indicating the position of the subject 82 1 according to the embodiment, the quality of the acquired texture is degraded.
  • FIG. 16 is a block diagram showing an example configuration of the rendering unit 121 according to the embodiment.
  • the rendering unit 121 includes a mesh transfer unit 1210, an imaging camera selection unit 1211, an imaging viewpoint depth generation unit 1212, an imaging camera information transfer unit 1213, and a virtual viewpoint texture generation unit 1214.
  • the mesh information, imaging camera information, and subject position information generated by the 3D model generation unit 111 are input to the rendering unit 121 .
  • virtual viewpoint position information indicating the position and direction of the virtual camera 70 is input to the rendering unit 121 .
  • This virtual viewpoint position information is input by the user, for example, using a controller (corresponding to the input unit 2104).
  • the rendering unit 121 generates a texture at the virtual viewpoint of the virtual camera 70 based on the mesh information, the imaging camera information, the virtual viewpoint position information, and the subject position information.
  • the mesh information is transferred to the mesh transfer unit 1210.
  • the mesh transfer unit 1210 transfers the passed mesh information to the imaging viewpoint depth generation unit 1212 and the virtual viewpoint texture generation unit 1214 .
  • the mesh transfer processing by the mesh transfer unit 1210 is processing for transferring mesh information to the GPU memory.
  • the virtual viewpoint texture generation unit 1214 may access the GPU memory to acquire mesh information. Note that if the mesh information is on the GPU memory when the reception unit 120 receives the mesh information, the mesh transfer unit 1210 can be omitted.
  • the imaging camera information is transferred to the imaging camera information transfer unit 1213.
  • camera parameter information in the imaging camera information is transferred to the imaging camera selection unit 1211 and the imaging viewpoint depth generation unit 1212 .
  • the imaging viewpoint depth generation unit 1212 selects an imaging camera from the imaging cameras 60 1 to 60 n according to camera selection information passed from the imaging camera selection unit 1211, which will be described later. Based on the mesh information transferred from the mesh transfer unit 1210, the imaging viewpoint depth generation unit 1212 generates selected imaging viewpoint depth information, which is depth information corresponding to the image captured by the selected imaging camera.
  • Alternatively, the depth information included in the imaging camera information input to the rendering unit 121 may be transferred to the imaging viewpoint depth generation unit 1212.
  • depth generation processing by the imaging viewpoint depth generation unit 1212 is unnecessary, and the imaging viewpoint depth generation unit 1212 transfers the depth information to the virtual viewpoint texture generation unit 1214 as selected imaging viewpoint depth information.
  • the virtual viewpoint texture generation unit 1214 may access the GPU memory and acquire the selected imaging viewpoint depth information.
  • the virtual viewpoint position information and the subject position information are transferred to the imaging camera selection section 1211 and the imaging camera information transfer section 1213 .
  • the imaging camera selection unit 1211 selects one or more imaging cameras to be used in subsequent processing from the imaging cameras 60 1 to 60 n based on the camera parameter information, the virtual viewpoint position information, and the subject position information.
  • Camera selection information is generated that indicates one or more imaging cameras.
  • the imaging camera selection unit 1211 transfers the generated camera selection information to the imaging viewpoint depth generation unit 1212 and the imaging camera information transfer unit 1213 .
  • That is, the imaging camera selection unit 1211 functions as a selection unit that selects, from one or more imaging cameras that capture images of the subject in real space, an imaging camera whose captured image of the subject is to be used as the texture image, based on the first position of the virtual camera that acquires the image of the virtual space, the second position of the three-dimensional model, and the third position of the one or more imaging cameras.
  • the imaging camera information transfer section 1213 transfers imaging camera information indicating the selected imaging camera to the virtual viewpoint texture generation section 1214 as selected camera information. Even in this case, if the imaging camera information is already on the GPU memory, the process of transferring the selected camera information can be omitted. In this case, the virtual viewpoint texture generation unit 1214 may access the GPU memory and acquire the imaging camera information.
  • The virtual viewpoint texture generation unit 1214 receives the mesh information from the mesh transfer unit 1210, the selected imaging viewpoint depth information from the imaging viewpoint depth generation unit 1212, and the selected camera information from the imaging camera information transfer unit 1213. Also, the virtual viewpoint position information and the subject position information input to the rendering unit 121 are transferred to the virtual viewpoint texture generation unit 1214. The virtual viewpoint texture generation unit 1214 generates the texture of the virtual viewpoint, which is the viewpoint from the virtual camera 70, based on the information transferred from each of these units.
  • the virtual viewpoint texture generation unit 1214 functions as a generation unit that generates an image by applying a texture image to the 3D model included in the 3D data.
  • FIG. 17 is an exemplary flowchart illustrating a first example of imaging camera selection processing in rendering processing according to the embodiment.
  • In this first example, one reference position is set collectively for one or more subjects.
  • Each process in the flowchart of FIG. 17 is a process executed by the imaging camera selection unit 1211 included in the rendering unit 121 .
  • the subsequent processing from step S201 to step S205 is processing for each object (subject). Note that the number of objects input to the rendering unit 121 can be obtained from subject position information.
  • the subsequent processing in steps S202 and S203 is processing for each vertex of the bounding box of the i-th object.
  • the imaging camera selection unit 1211 projects the j-th vertex of the bounding box of the i-th object onto the virtual camera 70 based on the virtual viewpoint position information and the subject position information related to the object.
  • If the imaging camera selection unit 1211 determines that processing has been completed for all vertices of the target bounding box, or that the j-th vertex of the target bounding box has been projected within the angle of view α of the virtual camera 70 (step S203, "Yes"), the process proceeds to step S204. In step S204, if even one of the vertices of the target bounding box exists within the angle of view α of the virtual camera 70, the imaging camera selection unit 1211 adds a reference position based on that bounding box.
  • That is, in steps S203 and S204, if at least one of the vertices of the bounding box projected onto the virtual camera 70 is included within the angle of view α of the virtual camera 70, the imaging camera selection unit 1211 assumes that the object (subject) related to the bounding box exists within the angle of view α of the virtual camera 70. The imaging camera selection unit 1211 then obtains the reference position based on the bounding box assumed to exist within the angle of view α of the virtual camera 70.
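  • As a concrete illustration of this vertex test, the following is a minimal sketch in Python. It is not the disclosed implementation: the camera is modeled as a simple pinhole with a circular angle of view, and all names (object_in_view, cam_pos, cam_dir, fov_deg) are assumptions introduced for the example.

```python
import numpy as np

def object_in_view(bbox_vertices, cam_pos, cam_dir, fov_deg):
    """Return True if at least one bounding-box vertex falls inside the
    (simplified, circular) angle of view of the virtual camera.

    bbox_vertices: (8, 3) array of the circumscribing box's vertex coordinates.
    cam_pos, cam_dir: virtual camera position and viewing direction.
    fov_deg: full angle of view in degrees.
    """
    cam_dir = np.asarray(cam_dir, dtype=float)
    cam_dir /= np.linalg.norm(cam_dir)
    half_fov = np.deg2rad(fov_deg) / 2.0

    for v in np.asarray(bbox_vertices, dtype=float):
        to_vertex = v - np.asarray(cam_pos, dtype=float)
        dist = np.linalg.norm(to_vertex)
        if dist == 0.0:
            return True
        # Angle between the viewing direction and the direction to the vertex.
        cos_angle = np.clip(np.dot(cam_dir, to_vertex / dist), -1.0, 1.0)
        if np.arccos(cos_angle) <= half_fov:
            return True  # at least one vertex is within the angle of view
    return False

# Example: a unit box in front of a camera looking along +Y.
box = np.array([[x, y, z] for x in (0, 1) for y in (4, 5) for z in (0, 1)])
print(object_in_view(box, cam_pos=(0.5, 0.0, 0.5), cam_dir=(0, 1, 0), fov_deg=60))
```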
  • FIG. 18 is a schematic diagram for explaining the relationship between an object (subject) and the virtual camera 70 according to the embodiment.
  • In the example of FIG. 18, the vertex 201a of the bounding box 200a related to the three-dimensional model 51a is outside the angle of view α of the virtual camera 70.
  • However, since at least one other vertex of the bounding box 200a is projected within the angle of view α, the imaging camera selection unit 1211 assumes that the three-dimensional model 51a exists within the angle of view α of the virtual camera 70.
  • the imaging camera selection unit 1211 obtains the reference position 84a for the three-dimensional model 51a related to the bounding box 200a based on the coordinates of each vertex of the three-dimensional bounding box 200a. For example, the imaging camera selection unit 1211 obtains the average value of the coordinates of each vertex of the three-dimensional bounding box 200a as the reference position 84a for the three-dimensional model 51a related to the bounding box 200a.
  • When it is determined in step S203 that any vertex of the bounding box 200a other than the vertex 201a exists within the angle of view α of the virtual camera 70, the process proceeds to step S204.
  • the imaging camera selection unit 1211 determines whether the processes of steps S202 to S204 have been completed for all objects input to the rendering unit 121.
  • If the imaging camera selection unit 1211 determines that the processing has not been completed for all objects input to the rendering unit 121 (step S205, "No"), the process returns to step S201 and the next object is processed.
  • If the imaging camera selection unit 1211 determines that the processing has been completed for all objects input to the rendering unit 121 (step S205, "Yes"), the process proceeds to step S206.
  • In step S206, the imaging camera selection unit 1211 calculates a representative reference position for all the objects for which the processing has been completed up to step S205. More specifically, in step S206, the imaging camera selection unit 1211 calculates the average value of the reference positions of all the objects for which the processing has been completed up to step S205, and uses the calculated average value as the representative reference position for all the objects.
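  • The reference position computation described in steps S204 and S206 can be pictured with the following minimal sketch; the function names are assumptions introduced here, and only the averaging of bounding-box vertices and of per-object reference positions follows the description above.

```python
import numpy as np

def object_reference_position(bbox_vertices):
    """Reference position of one object: average of the eight vertex
    coordinates of its circumscribing rectangular parallelepiped."""
    return np.asarray(bbox_vertices, dtype=float).mean(axis=0)

def representative_reference_position(bbox_list):
    """Common reference position for all objects within the angle of view:
    average of the per-object reference positions (step S206)."""
    refs = [object_reference_position(b) for b in bbox_list]
    return np.mean(refs, axis=0)

# Example with two arbitrary illustrative boxes.
box_a = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 2)])
box_b = box_a + np.array([3.0, 0.0, 0.0])
print(object_reference_position(box_a))                    # per-object reference
print(representative_reference_position([box_a, box_b]))   # common reference
```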
  • FIG. 19 is a schematic diagram for explaining the process of calculating the average value of the reference positions of the objects according to the embodiment.
  • the angle of view ⁇ of the virtual camera 70 includes a bounding box 200a for the three-dimensional model 51a and a bounding box 200b for the three-dimensional model 51b.
  • Reference positions 84a and 84b are set for the three-dimensional models 51a and 51b, respectively.
  • In FIG. 19, the imaging camera selection unit 1211 sets the reference position 85 at the coordinates obtained by averaging the coordinates of the reference positions 84a and 84b.
  • This reference position 85 serves as a common reference position for the three-dimensional models 51a and 51b.
  • That is, the reference position 85 is set so as to be used for selecting the imaging camera optimum for the three-dimensional models 51a and 51b in common.
  • the three-dimensional models 51a and 51b form one group.
  • the subsequent processing of steps S208 to S210 is processing for each of the imaging cameras 60 1 to 60 n . Also, the processing target in the loop among the imaging cameras 60 1 to 60 n is assumed to be the imaging camera 60 k .
  • In step S208, for the k-th imaging camera 60 k , the imaging camera selection unit 1211 obtains the angle between the vector directed from the imaging camera 60 k to the reference position 85 and the vector directed from the virtual camera 70 to the reference position 85.
  • In step S209, the imaging camera selection unit 1211 sorts the imaging cameras 60 k in ascending order of the angles obtained in step S208 over the loop based on the loop variable k. In other words, in step S209, the imaging cameras 60 k are sorted in descending order of importance.
  • In step S210, the imaging camera selection unit 1211 determines whether or not the processing has been completed for all of the imaging cameras 60 1 to 60 n in the array.
  • If the imaging camera selection unit 1211 determines that the processing has not been completed for all the imaging cameras 60 1 to 60 n (step S210, "No"), the process returns to step S208 and the next imaging camera is processed.
  • If the imaging camera selection unit 1211 determines that the processing has been completed for all the imaging cameras 60 1 to 60 n (step S210, "Yes"), the process proceeds to step S211.
  • In step S211, the imaging camera selection unit 1211 selects the camera information indicating the top m imaging cameras from the array of imaging cameras 60 1 to 60 n sorted in ascending order of angle.
  • the imaging camera selection unit 1211 transfers information indicating each selected imaging camera to the imaging viewpoint depth generation unit 1212 and the imaging camera information transfer unit 1213 as camera selection information.
  • step S211 ends, the imaging camera selection unit 1211 ends the series of processes according to the flowchart of FIG.
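  • The angle computation, sorting, and top-m selection of steps S208 to S211 can be sketched as follows. This is a hypothetical Python illustration, not the disclosed implementation; select_cameras and the example camera layout are names and values introduced here.

```python
import numpy as np

def select_cameras(ref_pos, virtual_cam_pos, imaging_cam_positions, m):
    """For each imaging camera, compute the angle between the vector
    (imaging camera -> reference position) and the vector
    (virtual camera -> reference position), sort the cameras by that angle
    in ascending order (descending importance), and keep the top m."""
    ref = np.asarray(ref_pos, dtype=float)
    v = ref - np.asarray(virtual_cam_pos, dtype=float)
    v /= np.linalg.norm(v)

    angles = []
    for k, cam_pos in enumerate(imaging_cam_positions):
        c = ref - np.asarray(cam_pos, dtype=float)
        c /= np.linalg.norm(c)
        angle = np.arccos(np.clip(np.dot(c, v), -1.0, 1.0))  # step S208
        angles.append((angle, k))

    angles.sort(key=lambda t: t[0])                           # step S209
    return [k for _, k in angles[:m]]                         # step S211

# Example: 8 cameras on a circle around the origin, virtual camera at (0, -2, 0).
cams = [(3 * np.cos(a), 3 * np.sin(a), 0.0)
        for a in np.linspace(0, 2 * np.pi, 8, endpoint=False)]
print(select_cameras((0, 0, 0), (0, -2, 0), cams, m=3))
```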
  • As described above, in the first example, one reference position 85 is set collectively for the plurality of three-dimensional models 51a and 51b. Therefore, in the post-effect processing and the like described later, common effect processing is applied to the three-dimensional models 51a and 51b at the same time.
  • FIG. 20 is an exemplary flowchart illustrating a second example of imaging camera selection processing in rendering processing according to the embodiment.
  • In this second example, one reference position is set individually for each of the one or more subjects.
  • Each process in the flowchart of FIG. 20 is a process executed by the imaging camera selection unit 1211 included in the rendering unit 121.
  • The processing of steps S200 to S205 is the same as the processing of steps S200 to S205 in the flowchart of FIG. 17 described above, so the description is omitted here.
  • the imaging camera selection unit 1211 advances the process to step S2060 when the reference position addition processing for all objects is completed in step S205.
  • the subsequent processing from step S208 to step S2101 is processing for each object.
  • The subsequent processing of steps S208 to S210 is processing for each of the imaging cameras 60 1 to 60 n . Note that the processing of steps S208 to S210 is the same as the processing of steps S208 to S210 in the flowchart of FIG. 17.
  • When the imaging camera selection unit 1211 determines in step S210 that the processing for all the imaging cameras 60 1 to 60 n has been completed (step S210, "Yes"), the process proceeds to step S2101.
  • In step S2101, the imaging camera selection unit 1211 determines whether or not the processing for all objects included within the angle of view α of the virtual camera 70 has been completed.
  • If the imaging camera selection unit 1211 determines that the processing for all objects included within the angle of view α of the virtual camera 70 has not ended (step S2101, "No"), the process returns to the processing for the next object.
  • If the processing for all objects has ended (step S2101, "Yes"), the process proceeds to step S211.
  • In step S211, the imaging camera selection unit 1211 selects the camera information indicating the top m imaging cameras from the array of imaging cameras 60 1 to 60 n sorted in ascending order of angle, as in step S211 of the flowchart of FIG. 17.
  • the imaging camera selection unit 1211 transfers information indicating each selected imaging camera to the imaging viewpoint depth generation unit 1212 and the imaging camera information transfer unit 1213 as camera selection information.
  • step S211 ends, the imaging camera selection unit 1211 ends the series of processes according to the flowchart of FIG.
  • In the second example, the reference position 85 in FIG. 19 described above is not set; instead, the reference positions 84a and 84b are set for the three-dimensional models 51a and 51b, respectively.
  • reference positions 84a and 84b are individually set for each of the plurality of three-dimensional models 51a and 51b. Therefore, in post-effect processing, etc., which will be described later, it is possible to apply effect processing to each of the three-dimensional models 51a and 51b individually.
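  • For the second example, the same angle-based selection can be repeated per object. The sketch below reuses the select_cameras helper and the cams list from the sketch given after the description of FIG. 17 above; how the per-object results are merged is not spelled out in the flowchart, so the union used here is only an assumption for illustration.

```python
def select_cameras_per_object(object_ref_positions, virtual_cam_pos,
                              imaging_cam_positions, m):
    """Per-object variant (FIG. 20): repeat the angle-based selection for the
    reference position of each object.  Merging the per-object results into a
    single set is an assumption; a simple union is used here."""
    per_object = {obj_id: select_cameras(ref, virtual_cam_pos,
                                         imaging_cam_positions, m)
                  for obj_id, ref in object_ref_positions.items()}
    union = sorted({cam for ids in per_object.values() for cam in ids})
    return per_object, union

# Example with two objects and the camera ring from the previous sketch.
refs = {"51a": (1.0, 0.5, 0.0), "51b": (-1.0, -0.5, 0.0)}
per_obj, selected = select_cameras_per_object(refs, (0, -2, 0), cams, m=2)
print(per_obj, selected)
```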
  • FIG. 21 is an exemplary flowchart illustrating rendering processing according to the embodiment. Each process according to the flowchart of FIG. 21 is a process executed by the virtual viewpoint texture generation unit 1214 included in the rendering unit 121 .
  • the mesh information may include mesh information of a plurality of subjects.
  • In step S301, the virtual viewpoint texture generation unit 1214 selects, from the mesh information, the vertices to be projected onto the virtual viewpoint of the virtual camera 70, based on the subject position information.
  • The virtual viewpoint texture generation unit 1214 then rasterizes based on the vertices selected in step S301. That is, the vertices not selected in step S301 are not rasterized and are not projected onto the virtual viewpoint, i.e., the virtual camera 70. Therefore, the virtual viewpoint texture generation unit 1214 can selectively set display/non-display for each of a plurality of subjects.
  • In step S305, the virtual viewpoint texture generation unit 1214 obtains the vertex of the mesh corresponding to the pixel q of the virtual viewpoint.
  • the subsequent processing of steps S307 to S313 is processing for each of the imaging cameras 60 1 to 60 n . Also, the imaging camera 60 r is assumed to be the object of processing in the loop among the imaging cameras 60 1 to 60 n .
  • In step S307, the virtual viewpoint texture generation unit 1214 projects the vertex coordinates of the vertex obtained in step S305 onto the imaging camera 60 r , and obtains the corresponding UV coordinates in the image of the imaging camera 60 r .
  • In step S308, the virtual viewpoint texture generation unit 1214 compares the depth of the mesh as seen from the imaging camera 60 r with the depth of the vertex coordinates of the vertex obtained in step S305, and obtains the difference between the two.
  • In step S309, the virtual viewpoint texture generation unit 1214 determines whether the difference obtained in step S308 is equal to or greater than a threshold. If the difference is determined to be equal to or greater than the threshold (step S309, "Yes"), the process proceeds to step S310, and the imaging camera information (selected camera information) of the imaging camera 60 r is not used.
  • On the other hand, if the virtual viewpoint texture generation unit 1214 determines that the difference obtained in step S308 is less than the threshold (step S309, "No"), the process proceeds to step S311, and the imaging camera information of the imaging camera 60 r is used.
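  • The depth test of steps S308 to S311 amounts to a simple threshold comparison. A minimal sketch, with names that are assumptions introduced for the example:

```python
def camera_visible(vertex_depth_from_camera, depth_map_value, threshold):
    """A camera's color is used for a vertex only when the depth rendered for
    that camera at the projected pixel and the depth of the vertex itself
    differ by less than a threshold; a large difference means the vertex is
    occluded in that camera's view."""
    return abs(vertex_depth_from_camera - depth_map_value) < threshold

# The vertex lies 2.50 units from the camera, but the camera's depth map says
# the nearest surface along that pixel is at 1.20 units: something occludes it.
print(camera_visible(2.50, 1.20, threshold=0.05))   # False -> do not use
print(camera_visible(2.50, 2.48, threshold=0.05))   # True  -> use this camera
```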
  • the virtual viewpoint texture generation unit 1214 acquires color information at the UV coordinates obtained in step S307 from the imaging camera information.
  • the virtual viewpoint texture generation unit 1214 then obtains a blend coefficient for the color information.
  • In step S312, the virtual viewpoint texture generation unit 1214 obtains the blend coefficient for the captured image (texture image) of the imaging camera 60 r , based on the imaging camera information selected by the processing of steps S208 to S211 in the flowchart of FIG. 17 or FIG. 20.
  • In step S313, the virtual viewpoint texture generation unit 1214 determines whether or not the processing for all of the imaging cameras 60 1 to 60 n in the array has been completed.
  • If the processing has been completed for all the imaging cameras (step S313, "Yes"), the process proceeds to step S314.
  • In step S314, the virtual viewpoint texture generation unit 1214 blends the color information of the imaging camera information used in step S311, among the imaging cameras 60 1 to 60 n , according to the blend coefficients obtained in step S312, thereby determining the color information for the pixel q.
  • When the virtual viewpoint texture generation unit 1214 determines in step S315 that the processing has been completed for all pixels, it terminates the series of processes according to the flowchart of FIG. 21.
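  • Putting steps S305 to S314 together, the per-pixel processing can be sketched as follows. This is a simplified, hypothetical illustration: each camera is represented by a few callables, the blend weights are assumed to be given (for example, derived from the camera importance), and none of the names come from the disclosure.

```python
import numpy as np

def shade_pixel(vertex_pos, cameras, threshold=0.05):
    """Sketch of steps S307 to S314 for one output pixel q.

    vertex_pos: 3D position of the mesh vertex hit by the pixel (step S305).
    cameras: list of dicts per selected imaging camera with
             'project'   -> maps a 3D point to (u, v, depth),
             'depth_map' -> rendered depth at (u, v),
             'color'     -> RGB color at (u, v),
             'weight'    -> blend coefficient.
    All of these field names are assumptions introduced for the example.
    """
    colors, weights = [], []
    for cam in cameras:
        u, v, depth = cam["project"](vertex_pos)               # step S307
        diff = abs(cam["depth_map"](u, v) - depth)             # step S308
        if diff >= threshold:                                   # steps S309/S310
            continue                                            # occluded: skip
        colors.append(np.asarray(cam["color"](u, v), float))    # step S311
        weights.append(cam["weight"])                            # step S312
    if not colors:
        return np.zeros(3)                                       # no usable camera
    w = np.asarray(weights, float)
    w /= w.sum()
    return (w[:, None] * np.asarray(colors)).sum(axis=0)        # step S314

# Toy usage with two fake cameras that both see the same green surface.
def fake_cam(weight):
    return {
        "project":   lambda p: (0, 0, float(np.linalg.norm(p))),
        "depth_map": lambda u, v: 1.0,
        "color":     lambda u, v: (0.1, 0.8, 0.1),
        "weight":    weight,
    }
print(shade_pixel(np.array([1.0, 0.0, 0.0]), [fake_cam(0.7), fake_cam(0.3)]))
```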
  • FIG. 22 is a schematic diagram for explaining post-effect processing according to the embodiment.
  • Section (a) of FIG. 22 shows an example of rendering processing according to the embodiment, and section (b) shows an example of rendering processing by the existing technology.
  • The virtual viewpoint texture generation unit 1214 traces, from the position of the pixel 72 of the image acquired by the virtual camera 70 (the output pixel of the virtual camera 70), the virtual optical path 95 and the optical paths 96 1 to 96 4 . It then obtains the position of the pixel (input pixel) corresponding to the pixel 72 in each of the plurality of imaging cameras 60 1 to 60 4 (FIG. 21, steps S305 to S307).
  • the virtual viewpoint texture generation unit 1214 obtains the color information of the pixels 72 of the virtual camera 70 by blending the obtained color information of the respective pixels of the plurality of imaging cameras 60 1 to 60 4 according to the blend coefficients (Fig. 21, step S312).
  • The subject 87 is on the near side of the subject 86 as viewed from the virtual camera 70 , lies on the optical path 96 4 from the imaging camera 60 4 to the subject 86 , and therefore appears in the image captured by the imaging camera 60 4 .
  • the virtual viewpoint texture generation unit 1214 selects vertices to be projected onto the virtual viewpoint by the virtual camera 70 from mesh information based on subject position information. Therefore, the virtual viewpoint texture generation unit 1214 can selectively set display/non-display for each of a plurality of subjects. Specifically, like the subject 87 indicated by the dotted line in section (a) of FIG. 22, the subject 87 can be hidden based on the subject position information indicating the position of the subject 87 .
  • Meanwhile, the imaging camera 60 4 in real space images the subject 87 . Therefore, the virtual viewpoint texture generation unit 1214 does not use the captured image of the imaging camera 60 4 as the texture image of the subject 86 . Also, the surface of the subject 87 facing the imaging camera (not shown) located beyond the subject 87 as viewed from the virtual camera 70 (indicated by the arrow 97 ) cannot be seen from the virtual camera 70 . Therefore, the virtual viewpoint texture generation unit 1214 does not acquire the imaging camera information of that imaging camera. As a result, the processing load of the virtual viewpoint texture generation unit 1214 can be reduced.
  • In the above, the processing for switching display/non-display of a specific subject among the subjects included in the angle of view α of the virtual camera 70 has been described as an example, but the present technology is not limited to this example and can also be applied to other post-effect processing.
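  • The display/non-display switching itself reduces to filtering the mesh vertices by subject before rasterization (FIG. 21, step S301). A minimal sketch, with illustrative names that are assumptions:

```python
import numpy as np

def select_vertices_for_display(mesh_vertices_by_subject, hidden_subject_ids):
    """Only the vertices of subjects that are not designated as hidden are
    passed on to rasterization; the per-subject grouping is taken from the
    subject position information."""
    kept = [v for sid, v in mesh_vertices_by_subject.items()
            if sid not in hidden_subject_ids]
    return np.concatenate(kept) if kept else np.empty((0, 3))

# Subject 87 is hidden; only the vertices of subject 86 are rasterized.
meshes = {86: np.random.rand(100, 3), 87: np.random.rand(80, 3) + 2.0}
print(select_vertices_for_display(meshes, hidden_subject_ids={87}).shape)  # (100, 3)
```

In addition, as described above, captured images from imaging cameras whose line of sight to a displayed subject passes through the hidden subject would not be used as texture images, which further reduces the processing load.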
  • FIG. 23 is a schematic diagram showing more specifically the post-effect processing according to the embodiment.
  • Section (a) of FIG. 23 shows an example of an output image 300a output from the virtual camera 70, in which the three-dimensional models 51c and 51d are included in the angle of view α of the virtual camera 70.
  • The three-dimensional models 51c and 51d are associated with the bounding boxes 200c and 200d, respectively. Note that in sections (a) and (b) of FIG. 23, the frame lines of the bounding boxes 200c and 200d are shown for explanation and are not displayed in the actual image.
  • Section (b) of FIG. 23 shows an example of an output image 300b when the position of the virtual camera 70 is moved from section (a) and the three-dimensional model 51c is moved.
  • the three-dimensional model 51d associated with the bounding box 200d is specified based on subject position information indicating the position of the three-dimensional model 51d, and is hidden by post-effect processing.
  • Note that the three-dimensional model 51d itself is within the angle of view α of the virtual camera 70, and the related bounding box 200d still exists.
  • For example, a 3D model of a subject generated by the information processing system 100 according to the embodiment and 3D data managed by another device may be combined to produce new video content.
  • For example, by combining the three-dimensional model of the subject generated by the information processing system 100 according to the embodiment with background data, it is possible to create video content in which the subject appears as if it exists in the location indicated by the background data.
  • the video content to be produced may be video content having three-dimensional information, or video content obtained by converting three-dimensional information into two-dimensional information.
  • The 3D model of the subject generated by the information processing system 100 according to the embodiment includes, for example, a 3D model generated by the 3D model generation unit 111 and a 3D model reconstructed by the rendering unit 121.
  • For example, a subject (for example, a performer) generated by the information processing system 100 according to the embodiment can be placed in a virtual space where users communicate as avatars.
  • In this case, a user, as an avatar, can observe the captured live-action subject in the virtual space.
  • the user at the remote location can observe the 3D model of the subject through a playback device at the remote location.
  • real-time communication between the subject and a remote user can be realized by transmitting the three-dimensional model of the subject in real time.
  • a case where the subject is a teacher and the user is a student, or a case where the subject is a doctor and the user is a patient can be assumed.
  • the information processing program according to the embodiment described above may be executed in another device having a CPU, a ROM, a RAM, etc. and having functions as an information processing device.
  • the device should have the necessary functional blocks and be able to obtain the necessary information.
  • each step of one flowchart may be executed by one device, or may be shared by a plurality of devices.
  • Similarly, when a plurality of processes are included in one step, the plurality of processes may be executed by one device, or may be shared and executed by a plurality of devices.
  • a plurality of processes included in one step of the flowchart can also be executed as a process of a plurality of steps.
  • the processing described as a plurality of steps in the flowchart can also be collectively executed as one step.
  • In addition, the processing of the steps describing the information processing program may be executed in chronological order according to the order shown in each flowchart described above, may be executed in parallel, or may be executed individually as needed, for example when a call is made. That is, as long as there is no contradiction, the processing of each step may be executed in an order different from the order described above. Furthermore, the processing of the steps describing the information processing program according to the embodiment may be executed in parallel with the processing of other programs, or may be executed in combination with the processing of other programs.
  • a plurality of technologies related to the present disclosure can be implemented independently, or a plurality of technologies related to the present disclosure can be implemented in combination, as long as there is no contradiction. Also, part or all of the techniques according to the above-described embodiments can be implemented in combination with other techniques not described above.
  • the present technology can also take the following configuration.
  • (1) An information processing apparatus comprising: a generation unit that generates an image by applying a texture image to a three-dimensional model included in three-dimensional data; and a selection unit that selects, from one or more imaging cameras that capture a subject in a real space, an imaging camera that acquires a captured image of the subject to be used as the texture image, based on a first position of a virtual camera that acquires an image of a virtual space, a second position of the three-dimensional model, and a third position of the one or more imaging cameras.
  • (2) The information processing apparatus according to (1), wherein the generation unit generates the texture image according to the viewpoint from the virtual camera, based on the captured image acquired by the imaging camera selected by the selection unit from the one or more imaging cameras.
  • (3) The information processing apparatus according to (1) or (2), wherein the selection unit selects the imaging camera that acquires the captured image of the subject according to the importance of each of the one or more imaging cameras, the importance being obtained based on the first position, the second position, and the third position.
  • (4) The information processing apparatus according to (3), wherein the selection unit obtains the importance based on an angle formed by the first position and the third position with the second position as the vertex.
  • (5) The information processing apparatus according to (3) or (4), wherein the generation unit generates the texture image by blending the captured images captured by the one or more imaging cameras according to the importance.
  • (6) The information processing apparatus according to any one of (1) to (5), wherein the generation unit applies the texture image to the three-dimensional model when at least one of the vertex coordinates of a rectangular parallelepiped circumscribing the three-dimensional model is within the angle of view of the virtual camera.
  • (7) The information processing apparatus according to any one of (1) to (6), wherein the generation unit designates, based on the second position, the three-dimensional model to which a predetermined effect is to be given.
  • (8) The information processing apparatus according to (7), wherein the predetermined effect is an effect of hiding the designated three-dimensional model from the virtual camera.
  • (9) The information processing apparatus according to any one of (1) to (8), wherein the selection unit deselects, from among the one or more imaging cameras, an imaging camera that images the subject from a direction outside the angle of view of the virtual camera with respect to the three-dimensional model.
  • (10) The information processing apparatus according to any one of (1) to (9), wherein the selection unit uses the average coordinates of the vertex coordinates of a rectangular parallelepiped circumscribing the three-dimensional model as the second position.
  • (11) The information processing apparatus according to (10), wherein, when a plurality of three-dimensional models included in the three-dimensional data are included within the angle of view of the virtual camera, the selection unit uses the average of the second positions of the plurality of three-dimensional models as the second position for the plurality of three-dimensional models.
  • (12) An information processing method comprising: a generation step of generating an image by applying a texture image to a three-dimensional model included in three-dimensional data; and a selection step of selecting, from one or more imaging cameras that capture a subject in a real space, an imaging camera that acquires a captured image of the subject to be used as the texture image, based on a first position of a virtual camera that acquires an image of a virtual space, a second position of the three-dimensional model, and a third position of the one or more imaging cameras.
  • (13) An information processing apparatus comprising: a generation unit that generates three-dimensional data based on captured images captured by one or more imaging cameras; and a separation unit that separates, from the three-dimensional data, a three-dimensional model corresponding to a subject included in the captured images, and generates position information indicating the position of the separated three-dimensional model.
  • (14) The information processing apparatus according to (13), wherein the separation unit separates the three-dimensional model by specifying a region of the subject on a two-dimensional plane based on information of the two-dimensional plane obtained by projecting the three-dimensional data in the height direction, and by giving the region information in the height direction.
  • (15) The information processing apparatus according to (14), wherein the separation unit generates the position information including the coordinates of each vertex of a rectangular parallelepiped circumscribing the three-dimensional model that is generated by giving the information in the height direction to the region.
  • (16) The information processing apparatus according to any one of (13) to (15), further comprising an output unit that adds the position information to the three-dimensional model separated from the three-dimensional data by the separation unit and outputs the three-dimensional model.
  • (17) The information processing apparatus according to (16), wherein the output unit outputs the information of the three-dimensional model as multi-viewpoint captured images obtained by capturing the subject corresponding to the three-dimensional model with the one or more imaging cameras, and depth information for each of the multi-viewpoint captured images.
  • (18) The information processing apparatus according to (16), wherein the output unit outputs the information of the three-dimensional model as mesh information.
  • (19) An information processing method executed by a processor, the method comprising: a generation step of generating three-dimensional data based on captured images captured by one or more imaging cameras; and a separation step of separating, from the three-dimensional data, a three-dimensional model corresponding to a subject included in the captured images, and generating position information indicating the position of the separated three-dimensional model.


Abstract

An information processing device according to the present disclosure is provided with: a generation unit (1214) that generates an image obtained by applying a texture image to a three-dimensional model included in three-dimensional data; and a selection unit (1211) that selects, on the basis of a first position of a virtual camera for acquiring an image of a virtual space, a second position of the three-dimensional model, and a third position of one or more imaging cameras each for capturing an image of a subject in a real space, an imaging camera for acquiring the image which is captured of the subject and which is to be used as the texture image, from among the one or more imaging cameras.

Description

Information processing device and information processing method

 The present disclosure relates to an information processing device and an information processing method.

 A technique called volumetric capture is known, in which a three-dimensional model of a subject is generated from captured images of a real subject, and a high-quality three-dimensional image of the subject is generated based on the generated three-dimensional model and the captured images of the subject (for example, Non-Patent Document 1).

WO 2017/082076

 With conventional volumetric capture techniques, when a plurality of subjects are included in a captured image, the individual subjects cannot be separated or recognized, and therefore sufficient quality may not be obtained in the generated three-dimensional image of each subject.

 An object of the present disclosure is to provide an information processing apparatus and an information processing method capable of generating a three-dimensional image of higher quality in volumetric capture.

 An information processing apparatus according to the present disclosure includes: a generation unit that generates an image by applying a texture image to a three-dimensional model included in three-dimensional data; and a selection unit that selects, from one or more imaging cameras that capture a subject in a real space, an imaging camera that acquires a captured image of the subject to be used as the texture image, based on a first position of a virtual camera that acquires an image of a virtual space, a second position of the three-dimensional model, and a third position of the one or more imaging cameras.

 An information processing apparatus according to the present disclosure also includes: a generation unit that generates three-dimensional data based on captured images captured by one or more imaging cameras; and a separation unit that separates, from the three-dimensional data, a three-dimensional model corresponding to a subject included in the captured images and generates position information indicating the position of the separated three-dimensional model.
[Brief description of drawings]
 FIG. 1 is a diagram showing basic processing of volumetric capture based on actually captured images, applicable to the embodiment.
 FIG. 2 is a diagram for explaining a problem of an example of existing technology.
 FIG. 3 is a diagram for explaining a problem of another example of existing technology.
 FIG. 4 is a diagram for explaining an example of a method of selecting an imaging camera that acquires a texture image to be applied to a three-dimensional model, according to existing technology.
 FIGS. 5A and 5B are schematic diagrams for explaining a first example of imaging camera selection according to existing technology.
 FIGS. 6A and 6B are schematic diagrams for explaining a second example of imaging camera selection according to existing technology.
 FIG. 7 is a functional block diagram showing an example of the functions of the information processing system according to the embodiment.
 FIG. 8 is a schematic diagram showing an example configuration for acquiring image data of a subject, applicable to the embodiment.
 FIG. 9 is a block diagram showing a hardware configuration of an example of an information processing device applicable to the information processing system according to the embodiment.
 FIG. 10 is an exemplary flowchart schematically showing processing in the information processing system according to the embodiment.
 FIG. 11 is a schematic diagram schematically showing three-dimensional model generation processing applicable to the embodiment.
 FIG. 12 is a block diagram showing an example configuration of the 3D model generation unit according to the embodiment.
 FIG. 13 is a schematic diagram for explaining subject separation processing according to the embodiment.
 FIG. 14 is an exemplary flowchart illustrating subject separation processing according to the embodiment.
 FIG. 15 is a schematic diagram for explaining imaging camera selection according to the embodiment.
 FIG. 16 is a block diagram showing an example configuration of the rendering unit according to the embodiment.
 FIG. 17 is an exemplary flowchart illustrating a first example of imaging camera selection processing in rendering processing according to the embodiment.
 FIG. 18 is a schematic diagram for explaining the relationship between an object and the virtual camera according to the embodiment.
 FIG. 19 is a schematic diagram for explaining processing for calculating the average value of the reference positions of objects according to the embodiment.
 FIG. 20 is an exemplary flowchart illustrating a second example of imaging camera selection processing in rendering processing according to the embodiment.
 FIG. 21 is an exemplary flowchart illustrating rendering processing according to the embodiment.
 FIG. 22 is a schematic diagram for explaining post-effect processing according to the embodiment.
 FIG. 23 is a schematic diagram showing post-effect processing according to the embodiment more specifically.
 Hereinafter, embodiments of the present disclosure will be described in detail with reference to the drawings. In the following embodiments, the same parts are denoted by the same reference numerals, and redundant description is omitted.

 Hereinafter, the embodiments of the present disclosure will be described in the following order.
1. Overview of volumetric capture
2. Existing technology
3. Embodiment of the present disclosure
 3-1. Configuration example of the information processing system according to the embodiment
 3-2. 3D model generation processing according to the embodiment
  3-2-1. Overview of the 3D model generation processing according to the embodiment
  3-2-2. Configuration example of the 3D model generation unit according to the embodiment
  3-2-3. Specific example of the 3D model generation processing according to the embodiment
 3-3. Rendering processing according to the embodiment
  3-3-1. Overview of the rendering processing according to the embodiment
  3-3-2. Configuration example of the rendering unit according to the embodiment
  3-3-3. Specific example of the rendering processing according to the embodiment
   3-3-3-1. First example of the imaging camera selection processing
   3-3-3-2. Second example of the imaging camera selection processing
   3-3-3-3. Details of the rendering processing
   3-3-3-4. Post-effect processing
4. Application examples of the embodiment of the present disclosure
5. Other embodiments
[1. Overview of volumetric capture]
 First, prior to describing the embodiments of the present disclosure, an overview of volumetric capture is given to facilitate understanding. FIG. 1 is a diagram showing the basic processing of volumetric capture based on actually captured images, which is applicable to the embodiment.
 In FIG. 1, first, in step S1, the system surrounds an object (subject) with a large number of cameras in real space and captures images of the subject. Hereinafter, a camera that captures images of a subject in real space is referred to as an imaging camera. Next, in step S2, the system converts the subject into three-dimensional data and generates a three-dimensional model of the subject based on the plurality of captured images captured by the multiple imaging cameras (3D modeling processing). Next, in step S3, the system renders the three-dimensional model generated in step S2 to generate an image.

 In step S3, the system places the three-dimensional model in a virtual space and performs rendering from the viewpoint of a virtual camera (hereinafter referred to as the virtual camera) that can move freely in the virtual space, thereby generating an image. That is, the system performs rendering according to the position and orientation of the virtual camera with respect to the three-dimensional model. For example, a user who operates the virtual camera can observe an image of the three-dimensional model viewed from a position corresponding to his or her operation.

 As formats for three-dimensional model data, a format combining mesh information and a UV texture and a format combining mesh information and multi-textures are generally used. Mesh information is a set of the vertices and edges of a three-dimensional model made up of polygons. A UV texture is a texture obtained by assigning UV coordinates, which are coordinates on the texture, to a texture image. Multi-texture, on the other hand, pastes a plurality of texture images, overlapping one another, onto the polygons of the three-dimensional model.

 Of these, the format combining mesh information and a UV texture covers all directions of the three-dimensional model with a single UV texture, so the amount of data is relatively small and the rendering load is low. This format is suitable for the View Independent method (hereinafter abbreviated as the VI method), a rendering method in which the geometry is fixed with respect to movement of the virtual camera viewpoint.

 On the other hand, the format combining mesh information and multi-textures involves a larger amount of data and a higher rendering load, but can provide high image quality. This format is suitable for the View Dependent method (hereinafter abbreviated as the VD method), in which the geometry changes as the viewpoint of the virtual camera moves.
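 The two data layouts described above can be pictured with the following minimal sketch in Python; the field names are assumptions introduced for the example and do not describe the actual format used by the embodiment.

```python
from dataclasses import dataclass
from typing import List, Tuple
import numpy as np

@dataclass
class MeshUVModel:
    """Mesh + single UV texture: compact, suited to View Independent (VI) rendering."""
    vertices: np.ndarray   # (V, 3) vertex positions
    faces: np.ndarray      # (F, 3) vertex indices per triangle
    uv: np.ndarray         # (V, 2) UV coordinates into one texture atlas
    texture: np.ndarray    # (H, W, 3) single texture image

@dataclass
class MeshMultiTextureModel:
    """Mesh + multi-texture: one captured image per imaging camera, blended at
    render time depending on the virtual viewpoint (View Dependent (VD) rendering)."""
    vertices: np.ndarray                                  # (V, 3)
    faces: np.ndarray                                     # (F, 3)
    camera_images: List[np.ndarray]                       # one (H, W, 3) image per camera
    camera_params: List[Tuple[np.ndarray, np.ndarray]]    # (intrinsics, extrinsics) per camera
```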
[2. Existing technology]
 Next, existing techniques related to volumetric capture as described above, and their problems, will be described.
(Example in which a plurality of subjects are included)
 FIG. 2 is a diagram for explaining a problem of an example of the existing technology. In the existing technology, a plurality of three-dimensional models 51 1 to 51 3 are included in a single piece of three-dimensional data 50, as shown in section (a) of FIG. 2. The three-dimensional models 51 1 to 51 3 are objects in the virtual space obtained by giving three-dimensional information to the images of the subjects in real space included in the captured images.
 In this case, since the three-dimensional models 51 1 to 51 3 cannot be separated and recognized, it is difficult to obtain sufficient quality when rendering each of the three-dimensional models 51 1 to 51 3. That is, in order to render each of the three-dimensional models 51 1 to 51 3 with sufficient quality, each of the three-dimensional models 51 1 to 51 3 must be treated as independent data 52 1 to 52 3, as shown in section (b) of FIG. 2.

 In addition, when a plurality of three-dimensional models 51 1 to 51 3 are included in a single piece of three-dimensional data 50, post-effect processing that applies an effect to each of the rendered three-dimensional models 51 1 to 51 3 may become difficult.

 FIG. 3 is a diagram for explaining a problem of another example of the existing technology. Consider a case where a single piece of three-dimensional data 50 includes a plurality of three-dimensional models 51 1 to 51 3, as shown in section (a) of FIG. 3. In this case, for example, as shown by the three-dimensional data 500 after post-effect processing in section (b) of FIG. 3, it was difficult to selectively apply effect processing (non-display processing in this example) to the three-dimensional models 51 2 and 51 3 among the three-dimensional models 51 1 to 51 3.

 In order to apply effect processing to a specific subject among the plurality of three-dimensional models 51 1 to 51 3 in this way, separation processing for separating the individual three-dimensional models 51 1 to 51 3 is required. The existing technology does not consider such separation of the plurality of three-dimensional models 51 1 to 51 3.
(Selection of the imaging camera according to the position of the virtual camera)
 Next, a problem of still another example of the existing technology will be described. In the existing technology, when a plurality of three-dimensional models based on a plurality of subjects are included in one piece of data, an optimum texture may not be selected, depending on the position of the virtual camera, when selecting the imaging camera from which the texture to be applied to each three-dimensional model is obtained.
 FIG. 4 is a diagram for explaining an example of a method, according to the existing technology, of selecting an imaging camera that acquires a texture image to be applied to a subject. The example of FIG. 4 shows a subject 80 in real space and a plurality of imaging cameras 60 1 to 60 8 surrounding the subject 80 in real space. In the example of FIG. 4, a reference position 81 is shown as the reference position of the subject 80. FIG. 4 also shows a virtual camera 70 arranged in the virtual space.

 In the following description, it is assumed that the coordinates in the real space and the coordinates in the virtual space coincide, and unless otherwise specified, the description is given without distinguishing between the real space and the virtual space. For example, it is assumed that the real space and the virtual space have the same scale, and the position of an object (a subject, an imaging camera, and so on) placed in the real space can be directly replaced with a position in the virtual space. Similarly, the positions of, for example, the three-dimensional model and the virtual camera 70 in the virtual space can be directly replaced with positions in the real space.

 As the reference position 81 of the subject 80, the position corresponding to the point of the subject 80 closest to the optical axes of all the imaging cameras 60 1 to 60 8 can be used. The reference position 81 of the subject 80 is not limited to this, and may be a position intermediate between the maximum value and the minimum value of the vertex coordinates of the subject 80, or the position regarded as most important in the subject 80 (for example, the position of the face if the subject corresponding to the subject 80 is a person).

 A method is known in which, when the three-dimensional model corresponding to the subject 80 is viewed from the position of the virtual camera 70, the imaging camera optimum for acquiring the texture to be applied to the three-dimensional model is selected based on the importance of each of the imaging cameras 60 1 to 60 8. The importance can be determined, for example, based on the angle formed by the position of the virtual camera 70 and the position of each of the imaging cameras 60 1 to 60 8, with the reference position 81 as the vertex.

 In the example of FIG. 4, the angle θ 1 formed with respect to the reference position 81 by the position of the virtual camera 70 and the imaging camera 60 1 is the smallest, and the angle θ 2 formed with the imaging camera 60 2 is the next smallest. Therefore, with respect to the position of the virtual camera 70, the imaging camera 60 1 has the highest importance, and the imaging camera 60 2 has the next highest importance after the imaging camera 60 1.
 Specifically, the importance P(i) of each of the imaging cameras 60 1 to 60 8 can be calculated by the following equation (1).

P(i) = arccos(C i · C v)   ... (1)

 In equation (1), the value i represents each of the imaging cameras 60 1 to 60 8. The value C i represents the vector from the imaging camera i to the reference position 81, and the value C v represents the vector from the virtual camera 70 to the reference position 81. That is, equation (1) obtains the importance P(i) of the imaging cameras 60 1 to 60 8 based on the inner product of the vectors from each of the imaging cameras 60 1 to 60 8 and from the virtual camera 70 to the reference position 81.
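 As a concrete illustration, equation (1) can be computed as follows, under the assumption (implied by the use of arccos) that the vectors C i and C v are normalized before taking the inner product. The camera positions in the example are arbitrary values introduced only for illustration.

```python
import numpy as np

def importance(cam_pos, virtual_cam_pos, ref_pos):
    """P(i) = arccos(C_i . C_v): the angle between the (normalized) vector from
    imaging camera i to the reference position and the vector from the virtual
    camera to the reference position.  A smaller value means a smaller angle
    and therefore a higher importance."""
    ci = np.asarray(ref_pos, float) - np.asarray(cam_pos, float)
    cv = np.asarray(ref_pos, float) - np.asarray(virtual_cam_pos, float)
    ci /= np.linalg.norm(ci)
    cv /= np.linalg.norm(cv)
    return float(np.arccos(np.clip(np.dot(ci, cv), -1.0, 1.0)))

# The imaging camera whose viewing direction toward the reference position is
# closest to that of the virtual camera gets the smallest P(i).
ref = (0.0, 0.0, 0.0)
virtual = (0.0, -2.0, 0.0)
cams = {"60_1": (0.0, -3.0, 0.0), "60_2": (3.0, 0.0, 0.0), "60_5": (0.0, 3.0, 0.0)}
print({name: round(importance(p, virtual, ref), 3) for name, p in cams.items()})
```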
 According to the existing technology, when selecting the optimum imaging camera from a plurality of imaging cameras that capture a plurality of subjects, an unintended imaging camera may be selected as the optimum imaging camera.
(First example of imaging camera selection by the existing technology)
 First, a first example of imaging camera selection by the existing technology will be described. FIGS. 5A and 5B are schematic diagrams for explaining the first example of imaging camera selection according to the existing technology. The first example is an example in which the imaging camera is selected based on vectors toward a reference position. That is, FIGS. 5A and 5B show an example in which, as described with reference to FIG. 4, when a plurality of subjects 82 1 and 82 2 are included, the optimum imaging camera is selected based on the vector C i from each of the imaging cameras 60 1 to 60 16 to the reference position and the vector C v from the virtual camera 70 to the reference position.
 In FIGS. 5A and 5B, two subjects 82 1 and 82 2 are included in an imaging range 84. The subject 82 1 is positioned at the upper left corner of the imaging range 84 in the figure, and the subject 82 2 is positioned at the lower right corner of the imaging range 84 in the figure. Sixteen imaging cameras 60 1 to 60 16, each having an angle of view β, are arranged surrounding the imaging range 84 with their imaging directions facing the center of the imaging range 84. The virtual camera 70 has an angle of view α, and the three-dimensional model corresponding to the subject 82 1 is assumed to fit within the angle of view α. Since the positions of the subjects 82 1 and 82 2 are unknown, the center of the imaging range 84, or the center of gravity of the subjects 82 1 and 82 2, is adopted as the reference position 83.

 In the following description, unless otherwise specified, the three-dimensional models corresponding to the subjects 82 1 and 82 2 are simply referred to as the subjects 82 1 and 82 2.

 FIG. 5A shows an example in which the virtual camera 70 is on the near side of the reference position 83 with respect to the subject 82 1. When the subject 82 1 is imaged by the virtual camera 70, ideally, the imaging camera 60 1 located on the straight line 93a passing from the subject 82 1 through the virtual camera 70 is the optimum imaging camera.

 In this example, however, the direction of the vector 91a from the virtual camera 70 to the reference position 83 and the direction of the vector 90a from the imaging camera 60 16 to the reference position 83 substantially coincide, so the imaging camera 60 16 is selected as the optimum camera. The imaging camera 60 16 differs from the ideal optimum imaging camera 60 1 in position and in direction with respect to the subject 82 1. Therefore, the texture based on the image captured by the imaging camera 60 16 is lower in quality than the texture based on the image captured by the imaging camera 60 1.

 FIG. 5B shows an example in which the virtual camera 70 is positioned between the subject 82 1 and the reference position 83. Also in this case, ideally, the imaging camera 60 1 located on the straight line 93b passing from the subject 82 2 through the virtual camera 70 is the optimum imaging camera.

 In this example, however, the reference position 83 is on the side opposite to the subject 82 1 with respect to the virtual camera 70, so the vector 91b from the virtual camera 70 to the reference position 83 points away from the subject 82 1. As a result, the direction of the vector 90b from the imaging camera 60 11, which is located beyond the subject 82 1 as viewed from the virtual camera 70, to the reference position 83 becomes close to the direction of the vector 91b, and the imaging camera 60 11 is selected as the optimum imaging camera. The imaging camera 60 11 images a surface of the subject 82 1 that cannot be seen from the virtual camera 70. Therefore, the texture based on the image captured by the imaging camera 60 11 is greatly degraded in quality compared to the texture based on the image captured by the ideal imaging camera 60 2.
(Second example of imaging camera selection by the existing technology)
 Next, a second example of imaging camera selection by the existing technology will be described. The method of selecting the optimum imaging camera is not limited to the selection method based on vectors toward the reference position described above. In the second example of imaging camera selection by the existing technology, the optimum imaging camera is selected from the imaging cameras 60 1 to 60 16 based on the angle between the optical axis of the virtual camera 70 and the vector of each of the imaging cameras 60 1 to 60 16 with respect to the subject 82 1.
FIGS. 6A and 6B are schematic diagrams for explaining the second example of imaging camera selection by the existing technology. In FIGS. 6A and 6B, the subjects 82 1 and 82 2, the reference position 83, and the imaging range 84 are the same as in FIGS. 5A and 5B described above, so their description is omitted here.
FIG. 6A corresponds to FIG. 5A described above and shows an example in which the virtual camera 70 is positioned closer to the subject 82 1 than the reference position 83 is. When the subject 82 1 is imaged by the virtual camera 70, ideally, the imaging camera 60 1 located on the straight line 93a passing from the subject 82 1 through the virtual camera 70 is the optimum imaging camera.
In the example of FIG. 6A, the virtual camera 70 faces upward in the figure, and its optical axis 94a points upward. In FIG. 6A, among the imaging cameras 60 1 to 60 16, the angle between the direction of the vector 90c from the imaging camera 60 1 to the reference position 83 and the optical axis 94a of the virtual camera 70 is the smallest. Therefore, the imaging camera 60 1, which is the same as the ideal optimum imaging camera, is selected as the optimum imaging camera, and a high-quality texture can be obtained.
FIG. 6B corresponds to FIG. 5B described above and shows an example in which the virtual camera 70 is positioned between the subject 82 1 and the reference position 83. In this case, ideally, the imaging camera 60 2 located on the straight line 93c passing from the subject 82 2 through the virtual camera 70 is the optimum imaging camera.
In the example of FIG. 6B, as in FIG. 6A, the virtual camera 70 faces upward in the figure, and its optical axis 94b points upward. In FIG. 6B, among the imaging cameras 60 1 to 60 16, the angle between the direction of the vector 90c from the imaging camera 60 1 to the reference position 83 and the optical axis 94b of the virtual camera 70 is the smallest. Therefore, the imaging camera 60 1, which differs from the optimum imaging camera 60 2 in the ideal case, is selected as the optimum imaging camera. As a result, the texture based on the image captured by the imaging camera 60 1 is lower in quality than the texture based on the image captured by the imaging camera 60 2.
The information processing system according to the embodiment of the present disclosure obtains the position of each subject at the time of generating the corresponding three-dimensional model. Then, when rendering the three-dimensional model based on each subject, the information processing system uses the subject positions obtained at model generation time to select the imaging camera used to acquire the texture to be applied to that three-dimensional model.
Therefore, by applying the information processing system according to the embodiment, even when a plurality of subjects are included in the input, the imaging camera used to acquire the texture to be applied to each three-dimensional model can be selected appropriately, and a high-quality texture can be obtained. Furthermore, by using the position information added to each three-dimensional model, post-effect processing can be applied to each three-dimensional model individually.
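For concreteness, the per-model position information described here can be pictured as a small record attached to each separated model and handed from the model generation side to the rendering side. The following Python sketch is only an illustration under assumed names and fields (SubjectModel, center, and the (xmin, xmax, ymin, ymax, zmin, zmax) tuple are not the data format of the present disclosure); the bounding-box representation itself is described in section 3-2-1 below.

from dataclasses import dataclass
import numpy as np

@dataclass
class SubjectModel:
    """A separated three-dimensional model with the position information attached to it."""
    vertices: np.ndarray   # (N, 3) mesh vertices of this subject
    faces: np.ndarray      # (M, 3) vertex indices of the mesh faces
    bounding_box: tuple    # (xmin, xmax, ymin, ymax, zmin, zmax) position information

    def center(self) -> np.ndarray:
        """Candidate reference position for camera selection, e.g. the box center."""
        xmin, xmax, ymin, ymax, zmin, zmax = self.bounding_box
        return np.array([(xmin + xmax) / 2.0, (ymin + ymax) / 2.0, (zmin + zmax) / 2.0])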
[3. Embodiment of the Present Disclosure]
Next, an embodiment of the present disclosure will be described.
(3-1. Configuration example of the information processing system according to the embodiment)
First, a configuration example of the information processing system according to the embodiment will be described. FIG. 7 is a functional block diagram showing an example of the functions of the information processing system according to the embodiment. In FIG. 7, the information processing system 100 includes a data acquisition unit 110, a 3D (three-dimensional) model generation unit 111, a formatting unit 112, a transmission unit 113, a reception unit 120, a rendering unit 121, and a display unit 122.
The information processing system 100 may be configured by, for example, an information processing device for 3D model output including the data acquisition unit 110, the 3D model generation unit 111, the formatting unit 112, and the transmission unit 113, and an information processing device for display information output including the reception unit 120, the rendering unit 121, and the display unit 122. Alternatively, the information processing system 100 can be configured as a single computer device (information processing device).
The data acquisition unit 110, the 3D model generation unit 111, the formatting unit 112, the transmission unit 113, the reception unit 120, the rendering unit 121, and the display unit 122 are realized, for example, by executing the information processing program according to the embodiment on a CPU (Central Processing Unit). Alternatively, some or all of these units may be realized by hardware circuits that operate in cooperation with one another.
The data acquisition unit 110 acquires image data for generating a 3D model of the subject. FIG. 8 is a schematic diagram showing an example configuration, applicable to the embodiment, for acquiring image data of the subject. As shown in FIG. 8, a plurality of captured images taken from a plurality of viewpoints by a plurality of imaging cameras 60 1, 60 2, 60 3, ..., 60 n-1, 60 n arranged so as to surround the subject 80 are acquired as image data. In this case, the captured images from the plurality of viewpoints are preferably images captured in synchronization by the plurality of imaging cameras 60 1 to 60 n.
Alternatively, the data acquisition unit 110 may acquire, as image data, a plurality of captured images obtained by imaging the subject 80 from a plurality of viewpoints with a single imaging camera. As an example, this image data acquisition method is applicable when the position of the subject 80 is fixed.
Note that the data acquisition unit 110 may perform calibration based on the image data and acquire the internal parameters and external parameters of each of the imaging cameras 60 1 to 60 n. The data acquisition unit 110 may also acquire, for example, a plurality of pieces of depth information indicating the distances from a plurality of viewpoints to the subject 80.
The 3D model generation unit 111 generates a three-dimensional model having three-dimensional information of the subject 80 based on the image data acquired by the data acquisition unit 110, that is, the images of the subject 80 captured from a plurality of viewpoints.
The 3D model generation unit 111 generates a three-dimensional model of the subject 80 by carving the three-dimensional shape of the subject 80 using images from a plurality of viewpoints (for example, silhouette images from a plurality of viewpoints), for example by the so-called Visual Hull method. In this case, the 3D model generation unit 111 can further deform the three-dimensional model generated by Visual Hull with high accuracy, using a plurality of pieces of depth information indicating the distances from a plurality of viewpoints to the subject 80.
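As a rough illustration of the Visual Hull idea referred to here (a sketch under assumed camera conventions, not the actual implementation of the 3D model generation unit 111), the following Python function carves a voxel grid with per-camera silhouette masks: a voxel is kept only if it projects inside the silhouette in every camera. The 'K' and 'RT' camera fields are hypothetical placeholders for intrinsic and extrinsic parameters.

import numpy as np

def visual_hull(voxel_centers, cameras, silhouettes):
    """Keep only voxels whose projection falls inside every silhouette.

    voxel_centers: (N, 3) candidate voxel centers in world coordinates.
    cameras: list of dicts with 'K' (3x3 intrinsics) and 'RT' (3x4 extrinsics);
             these field names are placeholders, not the notation of the disclosure.
    silhouettes: list of (H, W) boolean masks, one per camera (True = subject).
    """
    keep = np.ones(len(voxel_centers), dtype=bool)
    homog = np.hstack([voxel_centers, np.ones((len(voxel_centers), 1))])
    for cam, sil in zip(cameras, silhouettes):
        # Project all voxel centers into this camera.
        proj = (cam['K'] @ (cam['RT'] @ homog.T)).T          # (N, 3) homogeneous pixels
        in_front = proj[:, 2] > 0
        uv = np.zeros((len(proj), 2))
        uv[in_front] = proj[in_front, :2] / proj[in_front, 2:3]
        u = np.round(uv[:, 0]).astype(int)
        v = np.round(uv[:, 1]).astype(int)
        h, w = sil.shape
        inside = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(len(voxel_centers), dtype=bool)
        hit[inside] = sil[v[inside], u[inside]]
        # A voxel outside any silhouette cannot belong to the subject, so carve it away.
        keep &= hit
    return voxel_centers[keep]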
Since the three-dimensional model generated by the 3D model generation unit 111 is generated from images captured by the imaging cameras 60 1 to 60 n in the real space, it can be said to be a live-action three-dimensional model.
The 3D model generation unit 111 can express the generated three-dimensional model in the form of, for example, mesh data. Mesh data expresses the shape information representing the surface shape of the subject 80 as connections between vertices, called a polygon mesh. The method of expressing the three-dimensional model generated by the 3D model generation unit 111 is not limited to mesh data. For example, the 3D model generation unit 111 may describe the generated three-dimensional model by a so-called point cloud representation, which expresses the model by the position information of points.
In association with the three-dimensional model of the subject 80, the 3D model generation unit 111 also generates color information data of the subject 80 as a texture. For example, the 3D model generation unit 111 can generate a View Independent (VI) texture, which has a constant color regardless of the direction from which it is observed. Alternatively, the 3D model generation unit 111 may generate a View Dependent (VD) texture, whose color changes depending on the viewing direction.
The formatting unit 112 converts the data of the three-dimensional model generated by the 3D model generation unit 111 into data in a format suitable for transmission and storage. For example, the formatting unit 112 can convert the three-dimensional model generated by the 3D model generation unit 111 into a plurality of two-dimensional images by perspective projection from a plurality of directions. In this case, the formatting unit 112 may use the three-dimensional model to generate depth information in the form of two-dimensional depth images from a plurality of viewpoints.
The formatting unit 112 compresses and encodes the depth information in this two-dimensional image form together with the color information, and outputs the result to the transmission unit 113. The formatting unit 112 may transmit the depth information and the color information side by side as a single image, or as two separate images. In this case, since the transmitted data takes the form of two-dimensional image data, the formatting unit 112 can compress and encode it using a compression technique for two-dimensional images such as AVC (Advanced Video Coding).
The formatting unit 112 may also convert the three-dimensional model into a point cloud format. Furthermore, the formatting unit 112 may output the three-dimensional model to the transmission unit 113 as three-dimensional data. In this case, the formatting unit 112 can use, for example, the Geometry-based Approach three-dimensional compression technology discussed in MPEG (Moving Picture Experts Group).
The transmission unit 113 transmits the transmission data generated by the formatting unit 112. The transmission unit 113 may transmit the transmission data after the series of processes by the data acquisition unit 110, the 3D model generation unit 111, and the formatting unit 112 has been performed offline, or may transmit the transmission data generated by that series of processes in real time.
The reception unit 120 receives the transmission data transmitted from the transmission unit 113.
The rendering unit 121 performs rendering according to the position of the virtual camera 70 using the transmission data received by the reception unit 120. For example, the mesh data of the three-dimensional model is projected from the viewpoint of the virtual camera 70 that performs the drawing, and texture mapping is performed to paste textures representing colors and patterns onto the mesh. The rendered image can be viewed from a freely and arbitrarily set viewpoint of the virtual camera 70, regardless of the positions of the imaging cameras 60 1 to 60 n at the time of capture.
For example, the rendering unit 121 performs texture mapping in which a texture representing the color, pattern, or material appearance of the mesh is pasted according to the position of the mesh of the three-dimensional model. The rendering unit 121 may perform texture mapping by the VD method, which takes the viewpoint of the user (virtual camera 70) into account, or by the VI method, which does not take the user's viewpoint into account.
The VD method changes the texture pasted onto the three-dimensional model according to the position of the user's viewpoint (the viewpoint of the virtual camera 70). The VD method therefore has the advantage of realizing higher-quality rendering than the VI method. On the other hand, since the VI method does not take the position of the user's viewpoint into account, it has the advantage of requiring less processing than the VD method.
Note that the user's viewpoint data may be input from the display device to the rendering unit 121, with the display device detecting, for example, the user's gaze point (region of interest). The rendering unit 121 may also employ billboard rendering, in which an object is rendered so as to maintain a posture perpendicular to the user's viewing direction. For example, when rendering a plurality of objects, the rendering unit 121 may render objects of low interest to the user by billboard rendering and render the other objects by another rendering method.
The display unit 122 causes the display device to display the image rendered by the rendering unit 121. The display device may be, for example, a head-mounted display or a spatial display, or a display device of an information apparatus such as a smartphone, a television receiver, or a personal computer. The display device may be a 2D monitor for two-dimensional display or a 3D monitor for three-dimensional display.
(Another configuration example of the information processing system according to the embodiment)
The information processing system 100 shown in FIG. 7 illustrates the series of flows from the data acquisition unit 110, which acquires the captured images used as the material for generating content, to the display control unit that controls the display device observed by the user. However, this does not mean that all the functional blocks are required to implement the embodiment; the embodiment can be implemented for each functional block or for a combination of functional blocks. For example, in FIG. 7, the transmission unit 113 and the reception unit 120 are provided to show the series of flows from the side that creates the content (three-dimensional model) to the side that observes the content through distribution of the content data. However, when everything from content production to observation is performed by the same information processing device (for example, a personal computer), the information processing system 100 can omit the formatting unit 112, the transmission unit 113, and the reception unit 120.
The information processing system 100 according to the embodiment may be implemented entirely by a single implementer, or each functional block may be implemented by a different implementer. As one example, a business operator A generates 3D content (a three-dimensional model) using the data acquisition unit 110, the 3D model generation unit 111, and the formatting unit 112. The 3D content is then distributed through the transmission unit 113 (platform) of a business operator B, and the display device of a business operator C receives, renders, and controls the display of the 3D content.
Each functional block shown in FIG. 7 can also be implemented on a cloud network. For example, the rendering unit 121 may be implemented in the display device or in a server on a cloud network. In the latter case, information is exchanged between the display device and the server.
In FIG. 7, the data acquisition unit 110, the 3D model generation unit 111, the formatting unit 112, the transmission unit 113, the reception unit 120, the rendering unit 121, and the display unit 122 have been collectively described as the information processing system 100. However, this example is not limiting: in the embodiment, any configuration in which two or more of these functional blocks are involved is referred to as the information processing system 100. For example, the data acquisition unit 110, the 3D model generation unit 111, the formatting unit 112, the transmission unit 113, the reception unit 120, and the rendering unit 121, excluding the display unit 122, may collectively be referred to as the information processing system 100.
(Hardware configuration example of an information processing device applicable to the embodiment)
FIG. 9 is a block diagram showing an example hardware configuration of an information processing device applicable to the information processing system 100 according to the embodiment. The information processing device 2000 shown in FIG. 9 is applicable to both the information processing device for 3D model output and the information processing device for display information output described above. The information processing device 2000 shown in FIG. 9 is also applicable to a configuration that includes the whole of the information processing system 100 shown in FIG. 7.
In FIG. 9, the information processing device 2000 includes a CPU (Central Processing Unit) 2100, a ROM (Read Only Memory) 2101, a RAM (Random Access Memory) 2102, an interface (I/F) 2103, an input unit 2104, an output unit 2105, a storage device 2106, a communication I/F 2107, and a drive device 2108.
The CPU 2100, the ROM 2101, the RAM 2102, and the I/F 2103 are communicably connected to one another via a bus 2110. The input unit 2104, the output unit 2105, the storage device 2106, the communication I/F 2107, and the drive device 2108 are connected to the I/F 2103 and can communicate with the CPU 2100 and the other components via the I/F 2103 and the bus 2110.
The storage device 2106 is a nonvolatile storage medium such as a hard disk drive or a flash memory. The CPU 2100 controls the overall operation of the information processing device 2000 in accordance with programs stored in the ROM 2101 and the storage device 2106, using the RAM 2102 as a work memory.
The input unit 2104 accepts input of data to the information processing device 2000. As the input unit 2104, an input device for inputting data in response to user operation, such as a pointing device (for example, a mouse), a keyboard, a touch panel, a joystick, or a controller, can be used. The input unit 2104 can also include various input terminals for inputting data from external devices, and can further include a sound pickup device such as a microphone.
The output unit 2105 is responsible for outputting information from the information processing device 2000. As the output unit 2105, a display device such as a display can be used. The output unit 2105 can also include a sound output device such as a speaker, and can further include various output terminals for outputting data to external devices.
When the information processing device 2000 executes the processing of the rendering unit 121, the output unit 2105 preferably includes a GPU (Graphics Processing Unit). The GPU has a memory for graphics processing (GPU memory).
The communication I/F 2107 controls communication via a network such as a LAN (Local Area Network) or the Internet. The drive device 2108 drives removable media such as optical disks, magneto-optical disks, flexible disks, and semiconductor memories to read and write data.
For example, when the information processing device 2000 is used as the information processing device for 3D model output, the CPU 2100 executes the information processing program according to the embodiment to configure the data acquisition unit 110, the 3D model generation unit 111, the formatting unit 112, and the transmission unit 113 described above, for example as modules, in the main storage area of the RAM 2102.
Similarly, when the information processing device 2000 is used as the information processing device for display information output, the CPU 2100 executes the information processing program according to the embodiment to configure the reception unit 120, the rendering unit 121, and the display unit 122, for example as modules, in the main storage area of the RAM 2102.
These information processing programs can be acquired from the outside (for example, from a server device) via a network such as a LAN or the Internet through communication via the communication I/F 2107 and installed on the information processing device 2000. Alternatively, the information processing programs may be provided stored in a removable storage medium such as a CD (Compact Disc), a DVD (Digital Versatile Disc), or a USB (Universal Serial Bus) memory.
(Outline of processing according to the embodiment)
FIG. 10 is an example flowchart schematically showing the processing in the information processing system 100 according to the embodiment. Prior to the processing of the flowchart of FIG. 10, the subject 80 is imaged by the multiple imaging cameras 60 1 to 60 n, as described with reference to FIG. 8.
When the processing of the flowchart of FIG. 10 is started, in step S10 the information processing system 100 causes the data acquisition unit 110 to acquire captured image data for generating a three-dimensional model of the subject 80. In the next step S11, the information processing system 100 causes the 3D model generation unit 111 to generate a three-dimensional model having three-dimensional information of the subject 80 based on the captured image data acquired in step S10.
In the next step S12, the information processing system 100 causes the formatting unit 112 to encode the shape and texture data of the three-dimensional model generated in step S11 into a format suitable for transmission and storage. In the next step S13, the information processing system 100 causes the transmission unit 113 to transmit the data encoded in step S12.
In the next step S14, the information processing system 100 causes the reception unit 120 to receive the data transmitted in step S13. The reception unit 120 decodes the received data and restores the shape and texture data of the three-dimensional model.
In the next step S15, the information processing system 100 causes the rendering unit 121 to perform rendering using the shape and texture data passed from the reception unit 120 and to generate image data for displaying the three-dimensional model. In the next step S16, the information processing system 100 causes the display unit 122 to display the image data generated by the rendering on the display device.
When the processing of step S16 ends, the series of processes of the flowchart of FIG. 10 ends.
(3-2. 3D model generation processing according to the embodiment)
Next, the three-dimensional model generation processing performed by the 3D model generation unit 111 according to the embodiment in step S11 of FIG. 10 will be described in more detail.
(3-2-1. Overview of the 3D model generation processing according to the embodiment)
FIG. 11 is a schematic diagram schematically showing three-dimensional model generation processing applicable to the embodiment. As shown in section (a) of FIG. 11, the 3D model generation unit 111 generates, based on captured images taken from different viewpoints, three-dimensional data 50 including a plurality of three-dimensional models 51 1 to 51 3, each based on, for example, a subject in the real space. Various methods are conceivable for adding position information to each of the three-dimensional models 51 1 to 51 3. Here, position information is added to each of the three-dimensional models 51 1 to 51 3 using bounding boxes.
Section (b) of FIG. 11 shows an example of the bounding boxes. For each of the three-dimensional models 51 1, 51 2, and 51 3, the circumscribing rectangular parallelepiped is obtained as a three-dimensional bounding box 200 1, 200 2, or 200 3, respectively. The vertices of these three-dimensional bounding boxes 200 1 to 200 3 are used as the position information indicating the positions of the corresponding three-dimensional models 51 1 to 51 3.
In the example of section (b), the position BoundingBox[0] of the three-dimensional bounding box 200 1 for the three-dimensional model 51 1 is expressed by the following formula (2), where the x axis is the horizontal direction of the figure, the y axis is the height direction of the figure, and the z axis is the depth direction of the figure. Here, "min" and "max" denote the minimum and maximum values of the bounding box in the corresponding direction, respectively.
BoundingBox[0] = (xmin0, xmax0, ymin0, ymax0, zmin0, zmax0)  …(2)
The positions BoundingBox[1] and BoundingBox[2] of the three-dimensional bounding boxes 200 2 and 200 3 for the three-dimensional models 51 2 and 51 3 are similarly expressed by the following formulas (3) and (4).
BoundingBox[1] = (xmin1, xmax1, ymin1, ymax1, zmin1, zmax1)  …(3)
BoundingBox[2] = (xmin2, xmax2, ymin2, ymax2, zmin2, zmax2)  …(4)
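As a minimal sketch of this representation (not the unit's implementation), the following Python function computes an axis-aligned bounding box in the (xmin, xmax, ymin, ymax, zmin, zmax) form of formulas (2) to (4) from the vertex array of one separated model; the example vertex values are arbitrary.

import numpy as np

def bounding_box(vertices):
    """Return (xmin, xmax, ymin, ymax, zmin, zmax) for an (N, 3) vertex array."""
    mins = vertices.min(axis=0)
    maxs = vertices.max(axis=0)
    return (mins[0], maxs[0], mins[1], maxs[1], mins[2], maxs[2])

# Example: position information for a hypothetical model corresponding to BoundingBox[0].
vertices_51_1 = np.array([[0.2, 0.0, 1.5], [0.8, 1.7, 1.9], [0.5, 0.9, 1.2]])
bounding_box_0 = bounding_box(vertices_51_1)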
(3-2-2. Configuration example of the 3D model generation unit according to the embodiment)
FIG. 12 is a block diagram showing an example configuration of the 3D model generation unit 111 according to the embodiment. In FIG. 12, the 3D model generation unit 111 includes a 3D model processing unit 1110 and a 3D model separation unit 1111.
The captured image data of the imaging cameras 60 1 to 60 n and the imaging camera information output from the data acquisition unit 110 are input to the 3D model generation unit 111. The imaging camera information may include color information, depth information, camera parameter information, and the like. The camera parameter information includes, for example, information on the position, direction, and angle of view β of each of the imaging cameras 60 1 to 60 n, and may further include zoom information, shutter speed information, aperture information, and the like. The imaging camera information of each of the imaging cameras 60 1 to 60 n is passed to the 3D model processing unit 1110 and is also output from the 3D model generation unit 111.
Based on the captured image data and the imaging camera information of the imaging cameras 60 1 to 60 n, the 3D model processing unit 1110 generates the vertex and face data of each subject by carving the three-dimensional shape of the subjects included in the imaging range by the Visual Hull method described above. More specifically, the 3D model processing unit 1110 acquires in advance, for each of the imaging cameras 60 1 to 60 n, an image of the background of the real space in which the subjects are placed. A silhouette image of the subjects is generated based on the difference between each image captured by the imaging cameras 60 1 to 60 n and the corresponding background image. By carving the three-dimensional space with these silhouette images, the three-dimensional shape of the subjects can be obtained as vertex and face data.
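One simple way to obtain such a silhouette image is background differencing. The following Python sketch assumes a static background and a fixed camera and simply thresholds the per-pixel difference between a captured frame and the pre-acquired background image; the threshold value is an arbitrary example, not a value taken from the present disclosure.

import numpy as np

def silhouette(frame, background, threshold=30.0):
    """Return a boolean mask that is True where the frame differs from the background.

    frame, background: (H, W, 3) uint8 color images from the same imaging camera.
    threshold: per-pixel difference (in gray levels) above which a pixel is treated
               as foreground; 30.0 is an illustrative value only.
    """
    diff = np.abs(frame.astype(np.float32) - background.astype(np.float32))
    # Use the maximum difference over the color channels as the foreground measure.
    return diff.max(axis=2) > threshold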
In this way, the 3D model processing unit 1110 functions as a generation unit that generates three-dimensional data based on captured images captured by one or more imaging cameras.
The 3D model processing unit 1110 outputs the generated vertex and face data of the subjects as mesh information. The mesh information output from the 3D model processing unit 1110 is output from the 3D model generation unit 111 and is also passed to the 3D model separation unit 1111. The 3D model separation unit 1111 separates the individual subjects based on the mesh information passed from the 3D model processing unit 1110 and generates position information for each subject.
(3-2-3. Specific example of the 3D model generation processing according to the embodiment)
The subject separation processing by the 3D model separation unit 1111 will now be described in more detail. FIG. 13 is a schematic diagram for explaining the subject separation processing according to the embodiment, and FIG. 14 is an example flowchart showing the subject separation processing according to the embodiment.
First, as shown in section (a) of FIG. 13, in the 3D model generation unit 111, the 3D model processing unit 1110 generates, by Visual Hull, three-dimensional data 50 including a plurality of three-dimensional models 51 1 to 51 3 based on captured images taken from different viewpoints.
In step S100 of FIG. 14, the 3D model separation unit 1111 projects the three-dimensional data 50 in the height direction (y-axis direction) to generate two-dimensional silhouette information for each of the three-dimensional models 51 1 to 51 3. Section (b) of FIG. 13 shows an example of the two-dimensional silhouettes 52 1 to 52 3 based on the three-dimensional models 51 1 to 51 3.
In the next step S101, the 3D model separation unit 1111 performs clustering on the two-dimensional plane based on the silhouettes 52 1 to 52 3 and detects blobs (clusters). In the next step S102, the 3D model separation unit 1111 sets the loop variable i for the number of detected blobs to i = 0. The subsequent processing of steps S103 to S105 is performed for each blob detected in step S101.
In step S103, as shown in section (c) of FIG. 13, the 3D model separation unit 1111 obtains the rectangle circumscribing the i-th detected blob as a two-dimensional bounding box; the two-dimensional bounding boxes 53 1 to 53 3 correspond to the three-dimensional models 51 1 to 51 3, respectively.
In the next step S104, the 3D model separation unit 1111 adds height information to the two-dimensional bounding box 53 1 obtained in step S103 to generate a three-dimensional bounding box 200 1, as shown in section (d) of FIG. 13. The height information added to the two-dimensional bounding box 53 1 can be obtained, for example, based on the three-dimensional data 50 shown in section (a) of FIG. 13.
The 3D model separation unit 1111 similarly gives height information to the two-dimensional bounding boxes 53 2 and 53 3 to generate the three-dimensional bounding boxes 200 2 and 200 3, respectively.
In the next step S105, the 3D model separation unit 1111 determines whether or not all the blobs detected in step S101 have been processed. For example, when m blobs are detected in step S101, the 3D model separation unit 1111 determines that the processing for all blobs has not been completed if the loop variable i satisfies i < m - 1. When the 3D model separation unit 1111 determines that the processing for all blobs has not been completed (step S105, "No"), it sets the loop variable i to i = i + 1 and returns the processing to step S103.
In the example of FIG. 13, when the loop variable i is i = 1, the two-dimensional bounding box 53 2 is obtained in step S103, and height information is added to it in the next step S104 to generate the three-dimensional bounding box 200 2. Similarly, when the loop variable i is i = 2, the two-dimensional bounding box 53 3 is obtained in step S103, and height information is added to it in the next step S104 to generate the three-dimensional bounding box 200 3.
When the 3D model separation unit 1111 determines in step S105 that the processing for all blobs has been completed (step S105, "Yes"), it ends the series of processes of the flowchart of FIG. 14.
In this way, the three-dimensional bounding boxes 200 1 to 200 3 corresponding to the three-dimensional models 51 1 to 51 3 are generated. Then, based on the vertex coordinates of each of these three-dimensional bounding boxes 200 1 to 200 3, position information indicating the position of each of the three-dimensional models 51 1 to 51 3 is obtained.
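The following Python sketch traces steps S100 to S105 under simplifying assumptions: the three-dimensional data is given as a single vertex array, the projection along the y axis is rasterized onto a coarse x-z occupancy grid, and the blobs are found by connected-component labeling (scipy.ndimage.label). The grid cell size and the helper names are illustrative and not taken from the present disclosure.

import numpy as np
from scipy import ndimage

def separate_subjects(vertices, cell=0.05):
    """Split the vertices of the 3D data into per-subject 3D bounding boxes.

    vertices: (N, 3) array of (x, y, z) vertex positions of the whole scene.
    cell: edge length of one cell of the ground-plane raster (illustrative value).
    Returns a list of (xmin, xmax, ymin, ymax, zmin, zmax) tuples.
    """
    x, y, z = vertices[:, 0], vertices[:, 1], vertices[:, 2]
    # Step S100: project along the height (y) axis onto an x-z occupancy grid.
    xi = ((x - x.min()) / cell).astype(int)
    zi = ((z - z.min()) / cell).astype(int)
    grid = np.zeros((xi.max() + 1, zi.max() + 1), dtype=bool)
    grid[xi, zi] = True
    # Step S101: cluster the occupied cells into blobs (8-connected components).
    labels, num_blobs = ndimage.label(grid, structure=np.ones((3, 3)))
    boxes = []
    for i in range(1, num_blobs + 1):          # steps S102 to S105: one pass per blob
        member = labels[xi, zi] == i           # vertices whose cell belongs to blob i
        bx, by, bz = x[member], y[member], z[member]
        # Step S103: 2D bounding box on the x-z plane; step S104: add the height from y.
        boxes.append((bx.min(), bx.max(), by.min(), by.max(), bz.min(), bz.max()))
    return boxes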
In this way, the 3D model separation unit 1111 functions as a separation unit that separates, from the three-dimensional data, the three-dimensional models corresponding to the subjects included in the captured images and generates position information indicating the positions of the separated three-dimensional models. The 3D model generation unit 111 adds, to each three-dimensional model separated by the 3D model separation unit 1111, the position information indicating the position of that three-dimensional model, and outputs the result.
(3-3. Rendering processing according to the embodiment)
Next, the rendering processing by the rendering unit 121 according to the embodiment will be described in more detail. The rendering unit 121 uses the position information indicating the position of each subject, acquired by the 3D model generation unit 111 as described above, to select the imaging camera that is optimum for acquiring the texture to be applied to that subject.
(3-3-1. Overview of the rendering processing according to the embodiment)
FIG. 15 is a schematic diagram for explaining the imaging camera selection according to the embodiment. In FIG. 15, section (a) shows an example of imaging camera selection according to the embodiment, and section (b) is the same diagram as FIG. 6B of the existing technology described above, shown again for comparison with the embodiment.
In section (a) of FIG. 15, two subjects 82 1 and 82 2 are included in the imaging range 84 to be imaged, as in FIG. 5A and the like described above. The subject 82 1 is located at the upper left corner of the imaging range 84 in section (a) of FIG. 15, and the subject 82 2 is located at the lower right corner. In section (a) of FIG. 15, 16 imaging cameras 60 1 to 60 16, each having an angle of view β, are arranged so as to surround the imaging range 84, with their imaging directions facing the center of the imaging range 84. The virtual camera 70 has an angle of view α, and it is assumed that the three-dimensional model corresponding to the subject 82 1 is contained within the angle of view α.
In the example of section (a) of FIG. 15, the virtual camera 70 is arranged closer to the subject 82 1 than the center of the imaging range 84, and the subject 82 1 is included within the angle of view α of the virtual camera 70. The reference position 83 is set based on the position information indicating the position of the subject 82 1 obtained by the 3D model generation unit 111. Here, the reference position 83 is assumed to be set at the center of the subject 82 1.
As shown in section (a) of FIG. 15, in the embodiment, the rendering unit 121 obtains the vector from each of the imaging cameras 60 1 to 60 16 to the reference position 83 from the position of that imaging camera and the position of the subject 82 1 included within the angle of view α of the virtual camera 70. The rendering unit 121 also obtains the vector 91e from the virtual camera 70 to the reference position 83 from the position of the virtual camera 70 and the position of the subject 82 1 included within the angle of view α of the virtual camera 70.
The rendering unit 121 obtains the importance P(i) of each of the imaging cameras 60 1 to 60 16, for example in accordance with formula (1) described above, based on the angle between each vector (vector C i) from each imaging camera 60 1 to 60 16 to the reference position 83 and the vector 91e (vector C v). The rendering unit 121 then selects the imaging camera that is optimum for acquiring the texture to be applied to the subject 82 1, based on the importance P(i) obtained for each of the imaging cameras 60 1 to 60 16.
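Formula (1) itself is defined earlier in this document and is not reproduced here; the following Python sketch therefore only assumes that the importance P(i) increases as the angle between the vector C i (imaging camera to reference position) and the vector C v (virtual camera to reference position) decreases, and uses the cosine of that angle as a simple stand-in score.

import numpy as np

def select_optimal_camera(camera_positions, virtual_camera_position, reference_position):
    """Pick the imaging camera whose view direction toward the reference position best
    matches the virtual camera's view direction toward the same reference position.

    camera_positions: (M, 3) array of imaging camera positions (e.g. cameras 60_1 to 60_16).
    virtual_camera_position: (3,) position of the virtual camera 70.
    reference_position: (3,) reference position 83, here the subject position.
    Returns the index of the selected camera.
    """
    c_v = reference_position - virtual_camera_position          # vector 91e
    c_v = c_v / np.linalg.norm(c_v)
    best_index, best_score = -1, -np.inf
    for i, cam_pos in enumerate(camera_positions):
        c_i = reference_position - cam_pos                      # vector from camera i
        c_i = c_i / np.linalg.norm(c_i)
        score = float(np.dot(c_i, c_v))   # cosine of the angle; assumed stand-in for P(i)
        if score > best_score:
            best_index, best_score = i, score
    return best_index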
In the example of section (a) of FIG. 15, when the subject 82 1 is imaged by the virtual camera 70, ideally, the imaging camera 60 2 located on the straight line 93c passing from the subject 82 1 (reference position 83) through the virtual camera 70 is the optimum imaging camera.
In the embodiment, the position of the subject 82 1 is acquired and used as the reference position 83. Therefore, by selecting, from among the vectors from the imaging cameras 60 1 to 60 16 to the reference position 83, the vector whose angle with the vector 91e from the virtual camera 70 to the reference position 83 is the smallest, an imaging camera close to the ideal one can be selected as the optimum imaging camera. In the example of section (a) of FIG. 15, the imaging camera 60 2, which is the ideal imaging camera described above, is selected as the optimum imaging camera. In other words, the imaging camera 60 2 views the subject 82 1 from substantially the same direction as the virtual camera 70. Therefore, according to the imaging camera selection method of the embodiment, a higher-quality texture can be obtained.
Section (b) of FIG. 15 shows the example in which the optimum imaging camera is selected from the imaging cameras 60 1 to 60 16 by the existing technology, based on the angle between the optical axis of the virtual camera 70 and the vector of each of the imaging cameras 60 1 to 60 16 with respect to the subject 82 1. In this selection method of the existing technology, the reference position 83 does not coincide with the position of the subject 82 1, and the virtual camera 70 and the selected optimum imaging camera do not necessarily view the subject 82 1 from substantially the same direction. In the example of section (b) of FIG. 15, the imaging camera 60 1, which differs from the ideal imaging camera 60 2, is selected as the optimum imaging camera. Therefore, compared with the selection method according to the embodiment, which uses the position information indicating the position of the subject 82 1, the quality of the acquired texture is degraded.
(3-3-2. Configuration example of the rendering unit according to the embodiment)
FIG. 16 is a block diagram showing an example configuration of the rendering unit 121 according to the embodiment. In FIG. 16, the rendering unit 121 includes a mesh transfer unit 1210, an imaging camera selection unit 1211, an imaging viewpoint depth generation unit 1212, an imaging camera information transfer unit 1213, and a virtual viewpoint texture generation unit 1214.
The mesh information, the imaging camera information, and the subject position information generated by the 3D model generation unit 111 are input to the rendering unit 121, together with virtual viewpoint position information indicating the position and direction of the virtual camera 70. The virtual viewpoint position information is input by the user, for example, using a controller (corresponding to the input unit 2104). Based on the mesh information, the imaging camera information, the virtual viewpoint position information, and the subject position information, the rendering unit 121 generates a texture at the virtual viewpoint of the virtual camera 70.
The mesh information is transferred to the mesh transfer unit 1210. The mesh transfer unit 1210 transfers the received mesh information to the imaging viewpoint depth generation unit 1212 and the virtual viewpoint texture generation unit 1214. For example, when the information processing device 2000 in which the rendering unit 121 is configured has a GPU, the mesh transfer processing by the mesh transfer unit 1210 is processing for transferring the mesh information to the GPU memory. In this case, the virtual viewpoint texture generation unit 1214 may access the GPU memory to acquire the mesh information. Note that if the mesh information is already on the GPU memory when it is received by the reception unit 120, the mesh transfer unit 1210 can be omitted.
The imaging camera information is transferred to the imaging camera information transfer unit 1213. Of the imaging camera information, the camera parameter information is transferred to the imaging camera selection unit 1211 and the imaging viewpoint depth generation unit 1212.
The imaging viewpoint depth generation unit 1212 selects imaging cameras from the imaging cameras 60 1 to 60 n in accordance with camera selection information passed from the imaging camera selection unit 1211 described later. Based on the mesh information transferred from the mesh transfer unit 1210, the imaging viewpoint depth generation unit 1212 generates selected imaging viewpoint depth information, which is depth information corresponding to the images captured by the selected imaging cameras.
Note that the depth information included in the imaging camera information input to the rendering unit 121 can also be transferred to the imaging viewpoint depth generation unit 1212. In this case, the depth generation processing by the imaging viewpoint depth generation unit 1212 is unnecessary, and the imaging viewpoint depth generation unit 1212 transfers that depth information to the virtual viewpoint texture generation unit 1214 as the selected imaging viewpoint depth information. When the depth information is already on the GPU memory, the transfer of the selected imaging viewpoint depth information can be omitted; in this case, the virtual viewpoint texture generation unit 1214 may access the GPU memory to acquire the selected imaging viewpoint depth information.
 仮想視点位置情報および被写体位置情報は、撮像カメラ選択部1211と、撮像カメラ情報転送部1213に転送される。撮像カメラ選択部1211は、カメラパラメータ情報と、仮想視点位置情報および被写体位置情報とに基づき、各撮像カメラ601~60nから後段の処理で用いる1以上の撮像カメラを選択し、選択された1以上の撮像カメラを示すカメラ選択情報を生成する。撮像カメラ選択部1211は、生成したカメラ選択情報を撮像視点デプス生成部1212と、撮像カメラ情報転送部1213と、に転送する。 The virtual viewpoint position information and the subject position information are transferred to the imaging camera selection section 1211 and the imaging camera information transfer section 1213 . The imaging camera selection unit 1211 selects one or more imaging cameras to be used in subsequent processing from the imaging cameras 60 1 to 60 n based on the camera parameter information, the virtual viewpoint position information, and the subject position information. Camera selection information is generated that indicates one or more imaging cameras. The imaging camera selection unit 1211 transfers the generated camera selection information to the imaging viewpoint depth generation unit 1212 and the imaging camera information transfer unit 1213 .
 このように、撮像カメラ選択部1211は、仮想空間の画像を取得する仮想カメラの第1の位置と、3次元モデルの第2の位置と、実空間の被写体を撮像する1以上の撮像カメラの第3の位置と、に基づき、1以上の撮像カメラから、テクスチャ画像として用いる被写体の撮像画像を取得する撮像カメラを選択する選択部として機能する。 In this way, the imaging camera selection unit 1211 selects the first position of the virtual camera that acquires the image of the virtual space, the second position of the three-dimensional model, and one or more imaging cameras that capture the subject in the real space. It functions as a selection unit that selects, from one or more imaging cameras, an imaging camera that acquires an imaging image of a subject to be used as a texture image based on the third position.
 撮像カメラ情報転送部1213は、撮像カメラ選択部1211から渡されたカメラ選択情報に基づき、選択された撮像カメラを示す撮像カメラ情報を、選択カメラ情報として仮想視点テクスチャ生成部1214に転送する。この場合においても、撮像カメラ情報が既にGPUメモリ上にある場合は、選択カメラ情報の転送処理を省略可能である。この場合、仮想視点テクスチャ生成部1214は、GPUメモリにアクセスして、撮像カメラ情報を取得してよい。 Based on the camera selection information passed from the imaging camera selection section 1211, the imaging camera information transfer section 1213 transfers imaging camera information indicating the selected imaging camera to the virtual viewpoint texture generation section 1214 as selected camera information. Even in this case, if the imaging camera information is already on the GPU memory, the process of transferring the selected camera information can be omitted. In this case, the virtual viewpoint texture generation unit 1214 may access the GPU memory and acquire the imaging camera information.
 仮想視点テクスチャ生成部1214は、上述したように、メッシュ転送部1210からメッシュ情報が転送され、撮像視点デプス生成部1212から選択撮像視点デプス情報が転送され、撮像カメラ情報転送部1213から選択カメラ情報が転送される。また、仮想視点テクスチャ生成部1214は、レンダリング部121に入力された仮想視点位置情報および被写体位置情報が転送される。仮想視点テクスチャ生成部1214は、これら各部から転送された各情報に基づき、仮想カメラ70からの視点である仮想視点のテクスチャを生成する。 As described above, the virtual viewpoint texture generation unit 1214 receives the mesh information transferred from the mesh transfer unit 1210, the selected imaging viewpoint depth information transferred from the imaging viewpoint depth generation unit 1212, and the selected camera information transferred from the imaging camera information transfer unit 1213. Also, the virtual viewpoint position information and the subject position information input to the rendering unit 121 are transferred to the virtual viewpoint texture generation unit 1214. The virtual viewpoint texture generation unit 1214 generates the texture of the virtual viewpoint, which is the viewpoint from the virtual camera 70, based on the information transferred from each of these units.
 このように、仮想視点テクスチャ生成部1214は、3次元データに含まれる3次元モデルに対してテクスチャ画像を適用した画像を生成する生成部として機能する。 Thus, the virtual viewpoint texture generation unit 1214 functions as a generation unit that generates an image by applying a texture image to the 3D model included in the 3D data.
(3-3-3.実施形態に係るレンダリング処理の具体例)
 次に、実施形態に係るレンダリング処理について、より具体的に説明する。
(3-3-3. Specific example of rendering processing according to the embodiment)
Next, rendering processing according to the embodiment will be described more specifically.
(3-3-3-1.撮像カメラ選択処理の第1の例)
 図17は、実施形態に係る、レンダリング処理における撮像カメラ選択処理の第1の例を示す一例のフローチャートである。この第1の例では、1以上の被写体に対して1つの基準位置を設定する。この図17のフローチャートの各処理は、レンダリング部121に含まれる撮像カメラ選択部1211により実行される処理となる。
(3-3-3-1. First example of imaging camera selection processing)
FIG. 17 is an exemplary flowchart illustrating a first example of imaging camera selection processing in rendering processing according to the embodiment. In this first example, one reference position is set for one or more subjects. Each process in the flowchart of FIG. 17 is a process executed by the imaging camera selection unit 1211 included in the rendering unit 121 .
 ステップS200で、撮像カメラ選択部1211は、レンダリング部121に入力されたオブジェクト(被写体)の数に係るループ変数iをi=0とする。以降のステップS201~ステップS205の処理は、当該オブジェクト(被写体)毎の処理となる。なお、レンダリング部121に入力されたオブジェクトの数は、被写体位置情報から求めることができる。 In step S200, the imaging camera selection unit 1211 sets the loop variable i related to the number of objects (subjects) input to the rendering unit 121 to i=0. The subsequent processing from step S201 to step S205 is processing for each object (subject). Note that the number of objects input to the rendering unit 121 can be obtained from subject position information.
 次のステップS201で、撮像カメラ選択部1211は、バウンディングボックスの頂点の数に係るループ変数jをj=0とする。以降のステップS202およびステップS203の処理は、i番目のオブジェクトのバウンディングボックスの頂点毎の処理となる。 In the next step S201, the imaging camera selection unit 1211 sets the loop variable j related to the number of vertices of the bounding box to j=0. The subsequent processing in steps S202 and S203 is processing for each vertex of the bounding box of the i-th object.
 次のステップS202で、撮像カメラ選択部1211は、i番目のオブジェクトのバウンディングボックスのj番目の頂点を、仮想視点位置情報および当該オブジェクトに係る被写体位置情報に基づき、仮想カメラ70に投影する。 In the next step S202, the imaging camera selection unit 1211 projects the j-th vertex of the bounding box of the i-th object onto the virtual camera 70 based on the virtual viewpoint position information and the subject position information related to the object.
 次のステップS203で、撮像カメラ選択部1211は、対象のバウンディングボックスの全頂点について処理が終了した、あるいは、対象のバウンディングボックスのj番目の頂点が仮想カメラ70の画角α内に投影されたか否かを判定する。撮像カメラ選択部1211は、対象のバウンディングボックスの全頂点について処理が終了しておらず、かつ、対象のバウンディングボックスのj番目の頂点が仮想カメラ70の画角α内に投影されていない、と判定した場合(ステップS203、「No」)、ループ変数jをj=j+1として、処理をステップS202に戻す。 In the next step S203, the imaging camera selection unit 1211 determines whether the processing has been completed for all vertices of the target bounding box, or whether the j-th vertex of the target bounding box has been projected within the angle of view α of the virtual camera 70. If the imaging camera selection unit 1211 determines that the processing has not been completed for all vertices of the target bounding box and that the j-th vertex of the target bounding box has not been projected within the angle of view α of the virtual camera 70 (step S203, "No"), the loop variable j is set to j=j+1, and the process returns to step S202.
 一方、撮像カメラ選択部1211は、対象のバウンディングボックスの全頂点について処理が終了した、あるいは、対象のバウンディングボックスのj番目の頂点が仮想カメラ70の画角α内に投影された、と判定した場合(ステップS203、「Yes」)、処理をステップS204に移行させる。ステップS204で、撮像カメラ選択部1211は、対象のバウンディングボックスの全頂点のうち1つでも仮想カメラ70の画角α内に存在すれば、当該バウンディングボックスに基づき基準位置を追加する。 On the other hand, the imaging camera selection unit 1211 has determined that processing has been completed for all vertices of the target bounding box, or that the j-th vertex of the target bounding box has been projected within the angle of view α of the virtual camera 70. If so (step S203, "Yes"), the process proceeds to step S204. In step S204, if even one of all the vertices of the target bounding box exists within the angle of view α of the virtual camera 70, the imaging camera selection unit 1211 adds a reference position based on the bounding box.
 ステップS203およびステップS204において、撮像カメラ選択部1211は、仮想カメラ70に投影されたバウンディングボックスの各頂点のうち、少なくとも1つの頂点が仮想カメラ70の画角α内に含まれていれば、当該バウンディングボックスに係るオブジェクト(被写体)が仮想カメラ70の画角α内に存在していると見做す。そして、撮像カメラ選択部1211は、仮想カメラ70の画角α内に存在していると見做されたバウンディングボックスに基づき、基準位置を求める。 In steps S203 and S204, if at least one of the vertices of the bounding box projected onto the virtual camera 70 is included within the angle of view α of the virtual camera 70, the imaging camera selection unit 1211 regards the object (subject) related to that bounding box as existing within the angle of view α of the virtual camera 70. Then, the imaging camera selection unit 1211 obtains the reference position based on the bounding box regarded as existing within the angle of view α of the virtual camera 70.
 図18は、実施形態に係る、オブジェクト(被写体)と仮想カメラ70との関係を説明するための模式図である。図18の例では、3次元モデル51aに係る3次元のバウンディングボックス200aの各頂点のうち、頂点201aが仮想カメラ70の画角αの外にある。撮像カメラ選択部1211は、このような場合であっても、当該3次元モデル51aが仮想カメラ70の画角α内に存在するものと見做す。すなわち、撮像カメラ選択部1211は、3次元モデル51aに係る3次元のバウンディングボックス200aの各頂点のうち少なくとも1つの頂点が仮想カメラ70の画角α内にある場合に、当該3次元モデル51aが仮想カメラ70の画角α内に存在するものと見做す。撮像カメラ選択部1211は、3次元のバウンディングボックス200aの各頂点の座標に基づき、当該バウンディングボックス200aに係る3次元モデル51aに対する基準位置84aを求める。例えば、撮像カメラ選択部1211は、3次元のバウンディングボックス200aの各頂点の座標の平均値を、当該バウンディングボックス200aに係る3次元モデル51aに対する基準位置84aとして求める。 FIG. 18 is a schematic diagram for explaining the relationship between an object (subject) and the virtual camera 70 according to the embodiment. In the example of FIG. 18, among the vertices of the three-dimensional bounding box 200a related to the three-dimensional model 51a, the vertex 201a is outside the angle of view α of the virtual camera 70. Even in such a case, the imaging camera selection unit 1211 assumes that the three-dimensional model 51a exists within the angle of view α of the virtual camera 70. That is, when at least one of the vertices of the three-dimensional bounding box 200a related to the three-dimensional model 51a is within the angle of view α of the virtual camera 70, the imaging camera selection unit 1211 regards the three-dimensional model 51a as existing within the angle of view α of the virtual camera 70. The imaging camera selection unit 1211 obtains the reference position 84a for the three-dimensional model 51a related to the bounding box 200a based on the coordinates of each vertex of the three-dimensional bounding box 200a. For example, the imaging camera selection unit 1211 obtains the average value of the coordinates of the vertices of the three-dimensional bounding box 200a as the reference position 84a for the three-dimensional model 51a related to the bounding box 200a.
 なお、図18の例では、バウンディングボックス200aの頂点201a以外の何れかの頂点に対するステップS203での判定において、その頂点が仮想カメラ70の画角α内に存在していると判定された場合に、処理がステップS204に移行されることになる。 In the example of FIG. 18, when any vertex other than the vertex 201a of the bounding box 200a is determined in step S203 to exist within the angle of view α of the virtual camera 70, the process proceeds to step S204.
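The visibility test and the reference-position calculation described above can be summarized in a short sketch. The following Python fragment is purely illustrative and not taken from the publication: the pinhole projection model, the image-bound check used to stand in for the angle-of-view test, and all function names are assumptions; only the "one vertex inside the view is enough" rule and the vertex average come from the description.

```python
import numpy as np

def project_point(point_world, extrinsic, intrinsic):
    """Project a 3D world point with a 3x4 extrinsic and 3x3 intrinsic matrix.
    Returns pixel coordinates (u, v) and the depth along the optical axis."""
    p_cam = extrinsic @ np.append(point_world, 1.0)   # world -> camera coordinates
    u, v = (intrinsic @ p_cam)[:2] / p_cam[2]         # perspective division
    return u, v, p_cam[2]

def bbox_visible_and_reference(bbox_vertices, extrinsic, intrinsic, width, height):
    """bbox_vertices: (8, 3) vertices of the box circumscribing one 3D model.
    The object is treated as visible if at least one vertex projects inside the
    virtual camera image (steps S202-S204); the reference position is the
    average of the eight vertex coordinates."""
    visible = False
    for vertex in bbox_vertices:
        u, v, depth = project_point(vertex, extrinsic, intrinsic)
        if depth > 0 and 0 <= u < width and 0 <= v < height:
            visible = True
            break                                     # one vertex inside is enough
    reference_position = bbox_vertices.mean(axis=0)
    return visible, reference_position
```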
 次のステップS205で、撮像カメラ選択部1211は、レンダリング部121に入力された全てのオブジェクトについて、ステップS202~ステップS204の処理が終了したか否かを判定する。撮像カメラ選択部1211は、レンダリング部121に入力された全てのオブジェクトについて処理が終了していないと判定した場合(ステップS205、「No」)、ループ変数iをi=i+1として、処理をステップS201に戻す。一方、撮像カメラ選択部1211は、レンダリング部121に入力された全てのオブジェクトについて処理が終了したと判定した場合(ステップS205、「Yes」)、処理をステップS206に移行させる。 In the next step S205, the imaging camera selection unit 1211 determines whether the processes of steps S202 to S204 have been completed for all objects input to the rendering unit 121. When the imaging camera selection unit 1211 determines that processing has not been completed for all objects input to the rendering unit 121 (step S205, "No"), the loop variable i is set to i=i+1, and the process returns to step S201. On the other hand, when the imaging camera selection unit 1211 determines that the processing has been completed for all objects input to the rendering unit 121 (step S205, "Yes"), the process proceeds to step S206.
 ステップS206で、撮像カメラ選択部1211は、ステップS205までで処理が終了した全てのオブジェクトに対する代表の基準位置を算出する。より具体的には、撮像カメラ選択部1211は、ステップS206で、ステップS205までで処理が終了した全てのオブジェクトそれぞれの基準位置の平均値を算出する。撮像カメラ選択部1211は、算出した平均値を、当該全てのオブジェクトに対する代表の基準位置とする。 In step S206, the imaging camera selection unit 1211 calculates a representative reference position for all objects for which processing has been completed up to step S205. More specifically, in step S206, the imaging camera selection unit 1211 calculates the average value of the reference positions of all the objects for which the processing has been completed up to step S205. The imaging camera selection unit 1211 uses the calculated average value as a representative reference position for all the objects.
 図19は、実施形態に係る、オブジェクトの基準位置の平均値を算出する処理を説明するための模式図である。図19の例では、仮想カメラ70の画角α内に、3次元モデル51aに係るバウンディングボックス200aと、3次元モデル51bに係るバウンディングボックス200bとが含まれている。3次元モデル51aおよび51bは、それぞれ基準位置84aおよび84bが設定されている。 FIG. 19 is a schematic diagram for explaining the process of calculating the average value of the reference positions of the objects according to the embodiment. In the example of FIG. 19, the angle of view α of the virtual camera 70 includes a bounding box 200a for the three-dimensional model 51a and a bounding box 200b for the three-dimensional model 51b. Reference positions 84a and 84b are set for the three- dimensional models 51a and 51b, respectively.
 撮像カメラ選択部1211は、これら基準位置84aおよび84bそれぞれの座標の平均値の座標に対して、基準位置85を設定する。この基準位置85が、3次元モデル51aおよび51bに対する共通の基準位置となる。例えば、3次元モデル51aおよび51bが分離する必要のない1つの被写体として見做される場合、この基準位置85を用いて、当該3次元モデル51aおよび51bに対して共通して最適な撮像カメラを設定する。このような、3次元モデル51aおよび51bが分離する必要のない1つの被写体として見做される場合の例として、例えば3次元モデル51aおよび51bが1つのグループを形成する場合が考えられる。 The imaging camera selection unit 1211 sets the reference position 85 at the coordinates given by averaging the coordinates of the reference positions 84a and 84b. This reference position 85 serves as a common reference position for the three-dimensional models 51a and 51b. For example, when the three-dimensional models 51a and 51b are regarded as one subject that does not need to be separated, this reference position 85 is used to set an imaging camera that is optimal for the three-dimensional models 51a and 51b in common. As an example of such a case where the three-dimensional models 51a and 51b are regarded as one subject that does not need to be separated, a case where the three-dimensional models 51a and 51b form one group is conceivable.
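A minimal sketch of step S206 under the same assumptions (a plain arithmetic mean of the per-object reference positions; the function name and data layout are illustrative only):

```python
import numpy as np

def representative_reference_position(reference_positions):
    """reference_positions: list of (3,) arrays, one per object whose bounding
    box was judged to be inside the angle of view (e.g. positions 84a and 84b).
    Returns the common reference position (e.g. position 85)."""
    return np.mean(np.stack(reference_positions), axis=0)

# Usage with two hypothetical reference positions:
# representative_reference_position([np.array([0.0, 0.0, 1.0]),
#                                    np.array([1.0, 0.0, 1.0])])  # -> [0.5, 0., 1.]
```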
 次のステップS207で、撮像カメラ選択部1211は、撮像カメラ601~60nに係るループ変数kをk=0とする。以降のステップS208~ステップS210の処理は、撮像カメラ601~60n毎の処理となる。また、撮像カメラ601~60nのうち当該ループにおける処理対象を撮像カメラ60kとして説明を行う。 In the next step S207, the imaging camera selection unit 1211 sets the loop variable k related to the imaging cameras 60 1 to 60 n to k=0. The subsequent processing of steps S208 to S210 is processing for each of the imaging cameras 60 1 to 60 n . Also, the processing target in the loop among the imaging cameras 60 1 to 60 n is assumed to be the imaging camera 60 k .
 ステップS208で、撮像カメラ選択部1211は、k番目の撮像カメラ60kについて、当該撮像カメラ60kから基準位置85に向かうベクトルと、仮想カメラ70から基準位置85に向かうベクトルと、の角度を求める。次のステップS209で、撮像カメラ選択部1211は、ループ変数kに基づくループ処理におけるステップS208で求めた各角度の小さい順に、撮像カメラ60kをソートする。すなわち、撮像カメラ選択部1211は、ステップS209で、撮像カメラ60kを重要度の高い順にソートする。 In step S208, for the k-th imaging camera 60 k , the imaging camera selection unit 1211 obtains an angle between a vector directed from the imaging camera 60 k to the reference position 85 and a vector directed from the virtual camera 70 to the reference position 85. . In the next step S209, the imaging camera selection unit 1211 sorts the imaging cameras 60 k in ascending order of angles obtained in step S208 in the loop processing based on the loop variable k. That is, in step S209, the imaging camera selection unit 1211 sorts the imaging cameras 60 k in descending order of importance.
 次のステップS210で、撮像カメラ選択部1211は、配置された全ての撮像カメラ601~60nについて処理が終了したか否かを判定する。撮像カメラ選択部1211は、全ての撮像カメラ601~60nについて処理が終了していないと判定した場合(ステップS210、「No」)、ループ変数kをk=k+1として、処理をステップS208に戻す。一方、撮像カメラ選択部1211は、全ての撮像カメラ601~60nについて処理が終了したと判定した場合(ステップS210、「Yes」)、処理をステップS211に移行させる。 In the next step S210, the imaging camera selection unit 1211 determines whether or not processing has been completed for all of the arranged imaging cameras 60 1 to 60 n. When the imaging camera selection unit 1211 determines that the processing has not been completed for all the imaging cameras 60 1 to 60 n (step S210, "No"), the loop variable k is set to k=k+1, and the process returns to step S208. On the other hand, when the imaging camera selection unit 1211 determines that the processing has been completed for all the imaging cameras 60 1 to 60 n (step S210, "Yes"), the process proceeds to step S211.
 ステップS211で、撮像カメラ選択部1211は、角度の小さい順にソートされた各撮像カメラ601~60nの配列から、上位m個の撮像カメラを示すカメラ情報を選択する。撮像カメラ選択部1211は、選択した各撮像カメラを示す情報を、カメラ選択情報として撮像視点デプス生成部1212および撮像カメラ情報転送部1213に転送する。 In step S211, the imaging camera selection unit 1211 selects camera information indicating top m imaging cameras from the array of imaging cameras 60 1 to 60 n sorted in ascending order of angle. The imaging camera selection unit 1211 transfers information indicating each selected imaging camera to the imaging viewpoint depth generation unit 1212 and the imaging camera information transfer unit 1213 as camera selection information.
 撮像カメラ選択部1211は、ステップS211の処理が終了すると、この図17のフローチャートによる一連の処理を終了させる。 When the process of step S211 ends, the imaging camera selection unit 1211 ends the series of processes according to the flowchart of FIG.
 この第1の例では、複数の3次元モデル51aおよび51bに対して1の基準位置85を纏めて設定している。そのため、後述するポストエフェクト処理などにおいては、3次元モデル51aおよび51bに対して共通するエフェクト処理が同時に施されることになる。 In this first example, one reference position 85 is collectively set for a plurality of three- dimensional models 51a and 51b. Therefore, in post-effect processing and the like, which will be described later, the three- dimensional models 51a and 51b are subjected to common effect processing at the same time.
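As a concrete illustration of steps S207 to S211, camera selection by the angle criterion could be sketched as follows. This is a sketch under stated assumptions, not the publication's implementation: the cosine-based angle computation, the data layout, and the function name are assumptions; only the "sort by angle to the reference position and keep the top m cameras" behaviour follows the description above.

```python
import numpy as np

def select_cameras(camera_positions, virtual_camera_position, reference_position, m):
    """camera_positions: (n, 3) positions of the imaging cameras 60_1 .. 60_n.
    Returns the indices of the m cameras whose viewing direction toward the
    reference position is closest to that of the virtual camera, i.e. the
    cameras with the smallest angle and hence the highest importance."""
    v_virtual = reference_position - virtual_camera_position
    v_virtual = v_virtual / np.linalg.norm(v_virtual)

    angles = []
    for k, cam_pos in enumerate(camera_positions):
        v_cam = reference_position - cam_pos
        v_cam = v_cam / np.linalg.norm(v_cam)
        # Angle between the two vectors that point toward the reference position.
        cos_angle = np.clip(np.dot(v_cam, v_virtual), -1.0, 1.0)
        angles.append((np.arccos(cos_angle), k))

    angles.sort()                    # ascending angle = descending importance
    return [k for _, k in angles[:m]]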
(3-3-3-2.撮像カメラ選択処理の第2の例)
 次に、実施形態に係るレンダリング処理における撮像カメラ選択処理の第2の例について説明する。図20は、実施形態に係る、レンダリング処理における撮像カメラ選択処理の第2の例を示す一例のフローチャートである。この第2の例では、1以上の被写体それぞれに対して1ずつ基準位置を設定する。この図20のフローチャートの各処理は、レンダリング部121に含まれる撮像カメラ選択部1211により実行される処理となる。
(3-3-3-2. Second example of imaging camera selection processing)
Next, a second example of imaging camera selection processing in rendering processing according to the embodiment will be described. FIG. 20 is an exemplary flowchart illustrating a second example of imaging camera selection processing in rendering processing according to the embodiment. In this second example, one reference position is set for each of one or more subjects. Each process in the flowchart of FIG. 20 is a process executed by the imaging camera selection unit 1211 included in the rendering unit 121. In FIG.
 図20のフローチャートにおいて、ステップS200~ステップS205の処理は、上述した図17のフローチャートにおけるステップS200~ステップS205の処理と同一であるので、ここでの説明を省略する。 In the flowchart of FIG. 20, the processing of steps S200 to S205 is the same as the processing of steps S200 to S205 in the flowchart of FIG. 17 described above, so the description is omitted here.
 撮像カメラ選択部1211は、ステップS205で、全オブジェクトについて基準位置の追加処理が終了すると、処理をステップS2060に移行させる。ステップS2060で、撮像カメラ選択部1211は、仮想カメラ70の画角α内に含まれるオブジェクトに係るループ変数lをl=0とする。以降のステップS208~ステップS2101の処理は、当該オブジェクト毎の処理となる。 When the reference position addition processing for all objects is completed in step S205, the imaging camera selection unit 1211 shifts the process to step S2060. In step S2060, the imaging camera selection unit 1211 sets the loop variable l related to the objects included within the angle of view α of the virtual camera 70 to l=0. The subsequent processing from step S208 to step S2101 is processing for each of these objects.
 次のステップS207で、撮像カメラ選択部1211は、撮像カメラ601~60nに係るループ変数kをk=0とする。以降のステップS208~ステップS210の処理は、撮像カメラ601~60n毎の処理となる。なお、ステップS208~ステップS210の処理は、上述した図17のフローチャートにおけるステップS208~ステップS210の処理と同一であるので、ここでの説明を省略する。 In the next step S207, the imaging camera selection unit 1211 sets the loop variable k related to the imaging cameras 60 1 to 60 n to k=0. The subsequent processing of steps S208 to S210 is processing for each of the imaging cameras 60 1 to 60 n. Note that the processing of steps S208 to S210 is the same as the processing of steps S208 to S210 in the flowchart of FIG. 17 described above, so the description is omitted here.
 撮像カメラ選択部1211は、ステップS210で配置された全ての撮像カメラ601~60nについて処理が終了したと判定した場合(ステップS210、「Yes」)、処理をステップS2101に移行させる。 When the imaging camera selection unit 1211 determines in step S210 that the processing has been completed for all of the arranged imaging cameras 60 1 to 60 n (step S210, "Yes"), the process proceeds to step S2101.
 ステップS2101で、撮像カメラ選択部1211は、仮想カメラ70の画角α内に含まれる全てのオブジェクトに対する処理が終了したか否かを判定する。撮像カメラ選択部1211は、仮想カメラ70の画角α内に含まれる全てのオブジェクトに対する処理が終了していないと判定した場合(ステップS2101、「No」)、ループ変数lをl=l+1として、処理をステップS207に戻す。一方、撮像カメラ選択部1211は、当該処理が終了したと判定した場合(ステップS2101、「Yes」)、処理をステップS211に移行させる。 In step S2101, the imaging camera selection unit 1211 determines whether or not the processing for all objects included within the angle of view α of the virtual camera 70 has been completed. When the imaging camera selection unit 1211 determines that the processing for all objects included within the angle of view α of the virtual camera 70 has not been completed (step S2101, "No"), the loop variable l is set to l=l+1, and the process returns to step S207. On the other hand, if the imaging camera selection unit 1211 determines that the processing has been completed (step S2101, "Yes"), the process proceeds to step S211.
 ステップS211で、撮像カメラ選択部1211は、上述した図17のフローチャートのステップS211と同様に、角度の小さい順にソートされた各撮像カメラ601~60nの配列から、上位m個の撮像カメラを示すカメラ情報を選択する。撮像カメラ選択部1211は、選択した各撮像カメラを示す情報を、カメラ選択情報として撮像視点デプス生成部1212および撮像カメラ情報転送部1213に転送する。 In step S211, similarly to step S211 in the flowchart of FIG. 17 described above, the imaging camera selection unit 1211 selects camera information indicating the top m imaging cameras from the array of imaging cameras 60 1 to 60 n sorted in ascending order of angle. The imaging camera selection unit 1211 transfers information indicating each selected imaging camera to the imaging viewpoint depth generation unit 1212 and the imaging camera information transfer unit 1213 as camera selection information.
 撮像カメラ選択部1211は、ステップS211の処理が終了すると、この図20のフローチャートによる一連の処理を終了させる。 When the process of step S211 ends, the imaging camera selection unit 1211 ends the series of processes according to the flowchart of FIG.
 このレンダリング処理における撮像カメラ選択処理の第2の例においては、例えば上述した図19における基準位置85が設定されず、各3次元モデル51aおよび51bに対してそれぞれ基準位置84aおよび84bが設定される。 In the second example of the imaging camera selection processing in this rendering processing, for example, the reference position 85 in FIG. 19 described above is not set, and the reference positions 84a and 84b are set for the three-dimensional models 51a and 51b, respectively.
 この第2の例では、複数の3次元モデル51aおよび51bそれぞれに対して個別に基準位置84aおよび84bが設定されている。そのため、後述するポストエフェクト処理などにおいては、各3次元モデル51aおよび51bに対して個別にエフェクト処理を施すことが可能である。 In this second example, reference positions 84a and 84b are individually set for each of the plurality of three- dimensional models 51a and 51b. Therefore, in post-effect processing, etc., which will be described later, it is possible to apply effect processing to each of the three- dimensional models 51a and 51b individually.
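Under the same assumptions as the previous sketch, the second example differs only in that the selection runs once per object, using each object's own reference position (e.g. 84a and 84b) instead of the common position 85. The fragment below reuses the hypothetical select_cameras function from the earlier sketch and is illustrative only:

```python
def select_cameras_per_object(camera_positions, virtual_camera_position,
                              object_reference_positions, m):
    """object_reference_positions: dict mapping an object id to that object's
    own reference position. Returns, per object, the indices of the m imaging
    cameras with the smallest angle for that object (highest importance)."""
    return {obj_id: select_cameras(camera_positions, virtual_camera_position,
                                   ref_pos, m)
            for obj_id, ref_pos in object_reference_positions.items()}
```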
(3-3-3-3.レンダリング処理の詳細)
 次に、実施形態に係るレンダリング処理について、より詳細に説明する。図21は、実施形態に係るレンダリング処理を示す一例のフローチャートである。図21のフローチャートによる各処理は、レンダリング部121に含まれる仮想視点テクスチャ生成部1214において実行される処理となる。
(3-3-3-3. Details of rendering processing)
Next, rendering processing according to the embodiment will be described in more detail. FIG. 21 is an exemplary flowchart illustrating rendering processing according to the embodiment. Each process according to the flowchart of FIG. 21 is a process executed by the virtual viewpoint texture generation unit 1214 included in the rendering unit 121 .
 ステップS300で、仮想視点テクスチャ生成部1214は、被写体に基づくメッシュ情報の頂点に係るループ変数pをp=0とする。以降、ステップS301~ステップS303の処理は、当該メッシュ情報に示される頂点毎の処理となる。なお、メッシュ情報は、複数の被写体のメッシュ情報を含んでよい。 At step S300, the virtual viewpoint texture generation unit 1214 sets the loop variable p related to the vertices of the mesh information based on the subject to p=0. Thereafter, the processing of steps S301 to S303 is performed for each vertex indicated in the mesh information. Note that the mesh information may include mesh information of a plurality of subjects.
 次のステップS301で、仮想視点テクスチャ生成部1214は、被写体位置情報に基づき、メッシュ情報から仮想カメラ70による仮想視点に投影する頂点を選択する。次のステップS302で、仮想視点テクスチャ生成部1214は、ステップS301で選択された頂点に基づきラスタライズを行う。すなわち、ステップS301で選択されなかった頂点は、ラスタライズが行われず、仮想視点すなわち仮想カメラ70に投影されない。したがって、仮想視点テクスチャ生成部1214は、複数の被写体のそれぞれに対して選択的に、表示/非表示を設定することができる。 In the next step S301, the virtual viewpoint texture generation unit 1214 selects vertices to be projected onto the virtual viewpoint by the virtual camera 70 from the mesh information based on the subject position information. In the next step S302, the virtual viewpoint texture generation unit 1214 rasterizes based on the vertices selected in step S301. That is, the vertices not selected in step S301 are not rasterized and are not projected onto the virtual viewpoint, ie, the virtual camera 70 . Therefore, the virtual viewpoint texture generation unit 1214 can selectively set display/non-display for each of a plurality of subjects.
 次のステップS303で、仮想視点テクスチャ生成部1214は、メッシュ情報に示されるメッシュの全頂点について処理が終了したか否かを判定する。仮想視点テクスチャ生成部1214は、ステップS303でメッシュ情報に示されるメッシュの全頂点について処理が終了していないと判定した場合(ステップS303、「No」)、ループ変数pをp=p+1として、処理をステップS301に戻す。一方、仮想視点テクスチャ生成部1214は、メッシュ情報に示されるメッシュの全頂点について処理が終了したと判定した場合(ステップS303、「Yes」)、処理をステップS304に移行させる。 In the next step S303, the virtual viewpoint texture generation unit 1214 determines whether or not processing has been completed for all vertices of the mesh indicated by the mesh information. If the virtual viewpoint texture generation unit 1214 determines in step S303 that processing has not been completed for all vertices of the mesh indicated by the mesh information (step S303, "No"), the loop variable p is set to p=p+1, and the process returns to step S301. On the other hand, if the virtual viewpoint texture generation unit 1214 determines that processing has been completed for all vertices of the mesh indicated by the mesh information (step S303, "Yes"), the process proceeds to step S304.
 ステップS304で、仮想視点テクスチャ生成部1214は、仮想カメラ70の仮想視点に係るループ変数qをq=0とする。以降、ステップS305~ステップS315の処理は、当該仮想視点の画素毎の処理となる。 In step S304, the virtual viewpoint texture generation unit 1214 sets the loop variable q related to the virtual viewpoint of the virtual camera 70 to q=0. After that, the processing from step S305 to step S315 is processing for each pixel of the virtual viewpoint.
 ステップS305で、仮想視点テクスチャ生成部1214は、仮想視点の画素qに対応するメッシュの頂点を求める。 In step S305, the virtual viewpoint texture generation unit 1214 obtains the vertex of the mesh corresponding to the pixel q of the virtual viewpoint.
 次のステップS306で、仮想視点テクスチャ生成部1214は、撮像カメラ601~60nに係るループ変数rをr=0とする。以降のステップS307~ステップS313の処理は、撮像カメラ601~60n毎の処理となる。また、撮像カメラ601~60nのうち当該ループにおける処理対象を撮像カメラ60rとして説明を行う。 In the next step S306, the virtual viewpoint texture generation unit 1214 sets the loop variable r related to the imaging cameras 60 1 to 60 n to r=0. The subsequent processing of steps S307 to S313 is processing for each of the imaging cameras 60 1 to 60 n . Also, the imaging camera 60 r is assumed to be the object of processing in the loop among the imaging cameras 60 1 to 60 n .
 ステップS307で、仮想視点テクスチャ生成部1214は、ステップS305で求めた頂点の頂点座標から撮像カメラ60rに投影し、当該頂点座標の撮像カメラ60rにおけるUV座標を求める。次のステップS308で、仮想視点テクスチャ生成部1214は、撮像カメラ60rにおけるメッシュの各頂点のデプスと、ステップS305で求めた頂点の頂点座標のデプスとを比較し、両者の差分を求める。 In step S307, the virtual viewpoint texture generation unit 1214 projects the vertex coordinates of the vertices obtained in step S305 onto the imaging camera 60r , and obtains the UV coordinates of the vertex coordinates of the imaging camera 60r . In the next step S308, the virtual viewpoint texture generation unit 1214 compares the depth of each vertex of the mesh in the imaging camera 60 r with the depth of the vertex coordinates of the vertex obtained in step S305, and obtains the difference between the two.
 ステップS309で、仮想視点テクスチャ生成部1214は、ステップS308で求めた差分が閾値以上であるか否かを判定する。仮想視点テクスチャ生成部1214は、差分が閾値以上であると判定した場合(ステップS309、「Yes」)、処理をステップS310に移行させ、当該撮像カメラ60rによる撮像カメラ情報(選択カメラ情報)を用いないとする。 In step S309, the virtual viewpoint texture generation unit 1214 determines whether the difference obtained in step S308 is equal to or greater than a threshold. If the virtual viewpoint texture generation unit 1214 determines that the difference is equal to or greater than the threshold (step S309, "Yes"), the process proceeds to step S310, and the imaging camera information (selected camera information) from the imaging camera 60 r is determined not to be used.
 一方、仮想視点テクスチャ生成部1214は、ステップS308で求めた差分が閾値未満であると判定した場合(ステップS309、「No」)、処理をステップS311に移行させ、当該撮像カメラ60rによる撮像カメラ情報を用いるとする。 On the other hand, if the virtual viewpoint texture generation unit 1214 determines that the difference obtained in step S308 is less than the threshold (step S309, "No"), the process proceeds to step S311, and the imaging camera information from the imaging camera 60 r is determined to be used.
 次のステップS312で、仮想視点テクスチャ生成部1214は、当該撮像カメラ情報から、ステップS307で求めたUV座標における色情報を取得する。そして、仮想視点テクスチャ生成部1214は、色情報に対するブレンド係数を求める。例えば、仮想視点テクスチャ生成部1214は、図17または図20のフローチャートのステップS208~ステップS211の処理により選択された撮像カメラ情報に基づく当該撮像カメラ60rの重要度に応じて、当該撮像カメラ60rの撮像画像(テクスチャ画像)に対するブレンド係数を求める。 In the next step S312, the virtual viewpoint texture generation unit 1214 acquires, from the imaging camera information, the color information at the UV coordinates obtained in step S307. The virtual viewpoint texture generation unit 1214 then obtains a blend coefficient for the color information. For example, the virtual viewpoint texture generation unit 1214 obtains the blend coefficient for the captured image (texture image) of the imaging camera 60 r according to the importance of the imaging camera 60 r based on the imaging camera information selected by the processing of steps S208 to S211 in the flowchart of FIG. 17 or FIG. 20.
 仮想視点テクスチャ生成部1214は、ステップS310の処理、あるいは、ステップS312の処理の後、処理をステップS313に移行させる。ステップS313で、仮想視点テクスチャ生成部1214は、配置された全ての撮像カメラ601~60nに対する処理が終了したか否かを判定する。仮想視点テクスチャ生成部1214は、全ての撮像カメラ601~60nに対する処理が終了していないと判定した場合(ステップS313、「No」)、ループ変数rをr=r+1として、処理をステップS307に戻す。一方、仮想視点テクスチャ生成部1214は、全ての撮像カメラ601~60nに対する処理が終了したと判定した場合(ステップS313、「Yes」)、処理をステップS314に移行させる。 After the process of step S310 or the process of step S312, the virtual viewpoint texture generation unit 1214 shifts the process to step S313. In step S313, the virtual viewpoint texture generation unit 1214 determines whether or not the processing for all of the arranged imaging cameras 60 1 to 60 n has been completed. When the virtual viewpoint texture generation unit 1214 determines that the processing for all the imaging cameras 60 1 to 60 n has not been completed (step S313, "No"), the loop variable r is set to r=r+1, and the process returns to step S307. On the other hand, when the virtual viewpoint texture generation unit 1214 determines that the processing for all the imaging cameras 60 1 to 60 n has been completed (step S313, "Yes"), the process proceeds to step S314.
 ステップS314で、仮想視点テクスチャ生成部1214は、各撮像カメラ601~60nのうち、ステップS311で用いるとされた撮像カメラ情報における色情報を、ステップS312で求めたブレンド係数に従いブレンドする。これにより、画素qに対する色情報が決定される。 In step S314, the virtual viewpoint texture generation unit 1214 blends the color information in the imaging camera information used in step S311 among the imaging cameras 60 1 to 60 n according to the blending coefficient obtained in step S312. Thus, color information for pixel q is determined.
 次のステップS315で、仮想視点テクスチャ生成部1214は、仮想カメラ70の仮想視点における全画素について処理が終了したか否かを判定する。仮想視点テクスチャ生成部1214は、全画素について処理が終了していないと判定した場合(ステップS315、「No」)、ループ変数qをq=q+1として処理をステップS305に戻す。 In the next step S315, the virtual viewpoint texture generation unit 1214 determines whether or not the processing for all pixels at the virtual viewpoint of the virtual camera 70 has been completed. If the virtual viewpoint texture generation unit 1214 determines that processing has not been completed for all pixels (“No” at step S315), the loop variable q is set to q=q+1, and the process returns to step S305.
 一方、仮想視点テクスチャ生成部1214は、ステップS315で全画素について処理が終了したと判定した場合、この図21のフローチャートによる一連の処理を終了させる。 On the other hand, if the virtual viewpoint texture generation unit 1214 determines in step S315 that processing has been completed for all pixels, it terminates the series of processing according to the flowchart of FIG.
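The occlusion test and the blending of steps S307 to S314 can be illustrated for a single virtual-viewpoint pixel as follows. This is only a sketch: the nearest-pixel sampling, the dictionary-based camera data layout, and the normalized weighted average are assumptions; the depth-difference threshold test and the importance-weighted blend themselves follow the description above.

```python
import numpy as np

def shade_virtual_pixel(surface_point, selected_cameras, depth_threshold):
    """surface_point: (3,) world coordinates of the mesh point seen through one
    virtual-viewpoint pixel (step S305).
    selected_cameras: list of dicts with 'extrinsic' (3x4), 'intrinsic' (3x3),
    'image' (HxWx3), 'depth' (HxW), and a blend 'weight' reflecting importance.
    Returns the blended color for the pixel, or None if no camera sees it."""
    colors, weights = [], []
    for cam in selected_cameras:
        p_cam = cam['extrinsic'] @ np.append(surface_point, 1.0)
        if p_cam[2] <= 0:
            continue                                  # point is behind this camera
        u, v = (cam['intrinsic'] @ p_cam)[:2] / p_cam[2]
        ui, vi = int(round(u)), int(round(v))
        h, w = cam['depth'].shape
        if not (0 <= ui < w and 0 <= vi < h):
            continue                                  # outside this camera's image
        # Occlusion test (steps S308-S311): if the depth stored for this camera
        # pixel differs from the point's depth by the threshold or more,
        # something else is in front, so this camera's color is not used.
        if abs(cam['depth'][vi, ui] - p_cam[2]) >= depth_threshold:
            continue
        colors.append(cam['image'][vi, ui].astype(float))
        weights.append(cam['weight'])
    if not colors:
        return None
    # Weighted blend of the usable cameras' colors (step S314).
    return np.average(np.stack(colors), axis=0, weights=np.asarray(weights, float))
```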
(3-3-3-4.ポストエフェクト処理について)
 次に、実施形態に係るポストエフェクト処理について説明する。図22は、実施形態に係るポストエフェクト処理について説明するための模式図である。図22のセクション(a)は、実施形態に係るレンダリング処理の例、セクション(b)は、既存技術によるレンダリング処理の例をそれぞれ示している。図22のセクション(a)および(b)それぞれにおいて、仮想カメラ70の画角α内に2つの被写体86および87が存在し、撮像カメラ601~604がこれら被写体86および87を取り囲むように配置されているものとする。
(3-3-3-4. Post effect processing)
Next, post-effect processing according to the embodiment will be described. FIG. 22 is a schematic diagram for explaining post-effect processing according to the embodiment. Section (a) of FIG. 22 shows an example of rendering processing according to the embodiment, and section (b) shows an example of rendering processing by an existing technology. In each of sections (a) and (b) of FIG. 22, it is assumed that two subjects 86 and 87 exist within the angle of view α of the virtual camera 70 and that the imaging cameras 60 1 to 60 4 are arranged so as to surround these subjects 86 and 87.
 ここで、ポストエフェクト処理により、これら被写体86および87のうち、仮想カメラ70により撮像された画像に対して被写体87を非表示とする場合について考える。仮想視点テクスチャ生成部1214は、仮想カメラ70により取得される画像の画素72(仮想カメラ70の出力画素)の位置から仮想的な光路95、および、光路961~964に示すように光線追跡し、複数の撮像カメラ601~604それぞれの、画素72に対応する画素(入力画素)の位置を求める(図21、ステップS305~ステップS307)。仮想視点テクスチャ生成部1214は、求めた複数の撮像カメラ601~604それぞれの画素における色情報を、ブレンド係数に応じてブレンドすることで、仮想カメラ70における画素72の色情報を求める(図21、ステップS312)。 Here, a case will be considered in which, of the subjects 86 and 87, the subject 87 is hidden by post-effect processing in the image captured by the virtual camera 70. The virtual viewpoint texture generation unit 1214 performs ray tracing, as indicated by the virtual optical path 95 and the optical paths 96 1 to 96 4, from the position of the pixel 72 of the image acquired by the virtual camera 70 (the output pixel of the virtual camera 70), and obtains, for each of the plurality of imaging cameras 60 1 to 60 4, the position of the pixel (input pixel) corresponding to the pixel 72 (FIG. 21, steps S305 to S307). The virtual viewpoint texture generation unit 1214 obtains the color information of the pixel 72 of the virtual camera 70 by blending the obtained color information of the respective pixels of the plurality of imaging cameras 60 1 to 60 4 according to the blend coefficients (FIG. 21, step S312).
 被写体87は、仮想カメラ70からは、被写体86よりも手前側にあり、また、撮像カメラ604については、被写体86に対する仮想的な光路964上にあり、撮像カメラ604に写り込んでいる。この場合、撮像カメラ604の撮像カメラ情報に含まれるデプス情報に基づき、被写体87の撮像画像を被写体86のテクスチャ画像として用いないようにできる。これについては、図22のセクション(a)に示す実施形態に係るレンダリング処理と、セクション(b)に示す既存技術によるレンダリング処理とで同様である。 As seen from the virtual camera 70, the subject 87 is on the near side of the subject 86, and, with respect to the imaging camera 60 4, the subject 87 is on the virtual optical path 96 4 toward the subject 86 and therefore appears in the image captured by the imaging camera 60 4. In this case, based on the depth information included in the imaging camera information of the imaging camera 60 4, the captured image of the subject 87 can be kept from being used as the texture image of the subject 86. This is the same for the rendering processing according to the embodiment shown in section (a) of FIG. 22 and the rendering processing by the existing technology shown in section (b).
 実施形態では、図21のステップS301で説明したように、仮想視点テクスチャ生成部1214は、被写体位置情報に基づき、メッシュ情報から仮想カメラ70による仮想視点に投影する頂点を選択する。そのため、仮想視点テクスチャ生成部1214は、複数の被写体のそれぞれに対して選択的に、表示/非表示を設定することができる。具体的には、図22のセクション(a)に点線で示される被写体87のように、被写体87の位置を示す被写体位置情報に基づき、当該被写体87を非表示とすることができる。 In the embodiment, as described in step S301 of FIG. 21, the virtual viewpoint texture generation unit 1214 selects vertices to be projected onto the virtual viewpoint by the virtual camera 70 from mesh information based on subject position information. Therefore, the virtual viewpoint texture generation unit 1214 can selectively set display/non-display for each of a plurality of subjects. Specifically, like the subject 87 indicated by the dotted line in section (a) of FIG. 22, the subject 87 can be hidden based on the subject position information indicating the position of the subject 87 .
 なお、セクション(a)の場合においても、実空間にある撮像カメラ604は、被写体87を撮像している。そのため、仮想視点テクスチャ生成部1214は、撮像カメラ604により撮像された撮像画像は、被写体86のテクスチャ画像としては用いない。また、仮想カメラ70から被写体87を通過(矢印97で示す)した位置にある撮像カメラ(図示しない)の側の被写体87の面は、仮想カメラ70からは見えない。そのため、仮想視点テクスチャ生成部1214は、当該撮像カメラの撮像カメラ情報を取得しない。これにより、仮想視点テクスチャ生成部1214による処理の負荷を軽減させることができる。 Also in the case of section (a), the imaging camera 60 4 in real space images the subject 87 . Therefore, the virtual viewpoint texture generation unit 1214 does not use the captured image captured by the imaging camera 604 as the texture image of the subject 86 . Also, the plane of the subject 87 on the side of the imaging camera (not shown) located at a position where the subject 87 has passed from the virtual camera 70 (indicated by an arrow 97 ) cannot be seen from the virtual camera 70 . Therefore, the virtual viewpoint texture generation unit 1214 does not acquire the imaging camera information of the imaging camera. As a result, the processing load of the virtual viewpoint texture generation unit 1214 can be reduced.
 なお、上述では、ポストエフェクト処理として、仮想カメラ70の画角αに含まれる被写体のうち、特定の被写体の表示/非表示を切り替える処理を例にとって説明したが、これはこの例に限定されず、他のポストエフェクト処理にも適用可能である。 In the above description, processing for switching display/non-display of a specific subject among the subjects included in the angle of view α of the virtual camera 70 has been described as an example of the post-effect processing, but the post-effect processing is not limited to this example, and other post-effect processing is also applicable.
 図23は、実施形態に係るポストエフェクト処理について、より具体的に示す模式図である。図23のセクション(a)は、仮想カメラ70の画角αに3次元モデル51cおよび51dが含まれて仮想カメラ70から出力された出力画像300aの例を示している。3次元モデル51cおよび51dは、それぞれバウンディングボックス200cおよび200dが関連付けられている。なお、図23のセクション(a)および(b)において、各バウンディングボックス200cおよび200dなどの枠線は、説明のために示されるもので、実際の画像には表示されない。 FIG. 23 is a schematic diagram showing more specifically the post-effect processing according to the embodiment. Section (a) of FIG. 23 shows an example of an output image 300a output from the virtual camera 70 in which the three- dimensional models 51c and 51d are included in the angle of view α of the virtual camera 70. FIG. Three- dimensional models 51c and 51d are associated with bounding boxes 200c and 200d, respectively. Note that in sections (a) and (b) of FIG. 23, the frame lines of the respective bounding boxes 200c and 200d are shown for explanation and are not displayed in the actual image.
 図23のセクション(b)は、セクション(a)から仮想カメラ70の位置を移動させると共に、3次元モデル51cが移動している場合の出力画像300bの例を示している。このセクション(b)において、バウンディングボックス200dに関連付けられた3次元モデル51dが、当該3次元モデル51dの位置を示す被写体位置情報に基づき指定され、ポストエフェクト処理により非表示とされている。但し、3次元モデル51dそのものは、仮想カメラ70の画角α内にあり、関連するバウンディングボックス200dは存在している。 Section (b) of FIG. 23 shows an example of an output image 300b when the position of the virtual camera 70 is moved from section (a) and the three-dimensional model 51c is moved. In this section (b), the three-dimensional model 51d associated with the bounding box 200d is specified based on subject position information indicating the position of the three-dimensional model 51d, and is hidden by post-effect processing. However, the three-dimensional model 51d itself is within the angle of view α of the virtual camera 70, and the related bounding box 200d exists.
 このように、実施形態によれば、3次元モデル51cおよび51dに対して、それぞれの被写体位置情報に基づき個別にポストエフェクト処理のオン/オフを切り替えることが可能である。 Thus, according to the embodiment, it is possible to switch on/off of post-effect processing individually for the three- dimensional models 51c and 51d based on their subject position information.
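As a final illustration, the per-subject show/hide switch can be thought of as filtering the mesh vertices by subject before rasterization (step S301). The sketch below is an assumption about one possible data layout, not the publication's implementation; it only shows the idea that visibility is decided per subject using the subject position information, so that hidden subjects never reach the virtual camera image even though the real imaging cameras still capture them.

```python
def vertices_to_project(mesh_vertices, vertex_subject_ids, hidden_subject_ids):
    """mesh_vertices: list of 3D vertices; vertex_subject_ids: for each vertex,
    the id of the subject (3D model) it belongs to, derived from the subject
    position information; hidden_subject_ids: subjects given the 'hide' effect.
    Only the returned vertices are rasterized and projected onto the virtual
    viewpoint, so hidden subjects are simply absent from the output image."""
    return [v for v, sid in zip(mesh_vertices, vertex_subject_ids)
            if sid not in hidden_subject_ids]
```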
[4.本開示の実施形態の応用例]
 本開示に係る技術は、様々な製品やサービスへ応用することができる。次に、本開示の実施形態の応用例について説明する。
[4. Application example of the embodiment of the present disclosure]
The technology according to the present disclosure can be applied to various products and services. Next, application examples of the embodiment of the present disclosure will be described.
(コンテンツの制作)
 例えば、実施形態に係る情報処理システム100で生成された被写体の3次元モデルと他の装置(サーバなど)で管理されている3次元データとを合成して、新たな映像コンテンツを制作してもよい。また、例えば、LiDAR(Light Detection and Ranging / Laser Imaging Detection and Ranging)などの撮像装置で取得した背景データが存在している場合、実施形態に係る情報処理システム100で生成された被写体の3次元モデルと背景データとを組み合わせることで、被写体が背景データで示す場所に恰も存在しているかのような映像コンテンツを制作することもできる。これらの場合、制作される映像コンテンツは、3次元情報を持つ映像コンテンツであってもよいし、3次元情報が2次元情報に変換された映像コンテンツでもよい。なお、実施形態に係る情報処理システム100で生成された被写体の3次元モデルは、例えば、3Dモデル生成部111で生成された3次元モデルや、レンダリング部121で再構築した3次元モデルなどがある。
(Content production)
For example, a 3D model of a subject generated by the information processing system 100 according to the embodiment may be combined with 3D data managed by another device (such as a server) to produce new video content. Further, for example, when there is background data acquired by an imaging device such as LiDAR (Light Detection and Ranging / Laser Imaging Detection and Ranging), combining the 3D model of the subject generated by the information processing system 100 according to the embodiment with the background data makes it possible to produce video content in which the subject appears as if it existed at the location indicated by the background data. In these cases, the video content to be produced may be video content having three-dimensional information, or video content obtained by converting three-dimensional information into two-dimensional information. Note that the 3D model of the subject generated by the information processing system 100 according to the embodiment includes, for example, the 3D model generated by the 3D model generation unit 111 and the 3D model reconstructed by the rendering unit 121.
(仮想空間での体験)
 例えば、ユーザがアバタとなってコミュニケーションする場である仮想空間に対して、実施形態に係る情報処理システム100で生成された被写体(例えば、演者)を配置することができる。この場合、ユーザは、アバタとなって仮想空間で実写の被写体を観察することが可能となる。
(Experience in virtual space)
For example, a subject (for example, a performer) generated by the information processing system 100 according to the embodiment can be placed in a virtual space where the user communicates as an avatar. In this case, the user becomes an avatar and can observe the photographed subject in the virtual space.
(遠隔地とのコミュニケーションへの応用)
 例えば、3Dモデル生成部111で生成された被写体の3次元モデルを送信部113から遠隔地に送信することにより、遠隔地にある再生装置を通じて遠隔地のユーザが被写体の3次元モデルを観察することができる。例えば、この被写体の3次元モデルをリアルタイムに伝送することにより、被写体と遠隔地のユーザとのリアルタイムなコミュニケーションを実現可能である。この場合の適用例として、被写体が先生であり、ユーザが生徒である場合や、被写体が医者であり、ユーザが患者である場合が想定できる。
(Application to communication with remote locations)
For example, by transmitting the 3D model of the subject generated by the 3D model generation unit 111 from the transmission unit 113 to a remote location, a user at the remote location can observe the 3D model of the subject through a playback device at the remote location. For example, real-time communication between the subject and a remote user can be realized by transmitting the three-dimensional model of the subject in real time. As an application example of this case, a case where the subject is a teacher and the user is a student, or a case where the subject is a doctor and the user is a patient can be assumed.
(その他の応用例)
 例えば、実施形態に係る情報処理システム100により生成された複数の被写体の3次元モデルに基づいてスポーツなどの自由視点映像を生成することができる。また、個人が実施形態に係る情報処理システム100により生成された自分自身の3次元モデルを配信プラットフォームに配信することもできる。このように、本開示の実施形態に係る技術は、種々の技術やサービスに応用することができる。
(Other application examples)
For example, it is possible to generate a free-viewpoint video of a sport or the like based on three-dimensional models of a plurality of subjects generated by the information processing system 100 according to the embodiment. Also, an individual can distribute his/her own three-dimensional model generated by the information processing system 100 according to the embodiment to the distribution platform. In this way, the technology according to the embodiments of the present disclosure can be applied to various technologies and services.
[5.他の実施形態]
 例えば、上述した、実施形態に係る情報処理プログラムは、CPU、ROM、RAMなどを有し、情報処理装置としての機能を有する他の装置において実行されるようにしてもよい。その場合、その装置が、必要な機能ブロックを有し、必要な情報を得ることができるようにすればよい。
[5. Other embodiments]
For example, the information processing program according to the embodiment described above may be executed in another device having a CPU, a ROM, a RAM, etc. and having functions as an information processing device. In that case, the device should have the necessary functional blocks and be able to obtain the necessary information.
 また、例えば、上述した各フローチャートにおいて、1つのフローチャートの各ステップを、1つの装置が実行するようにしてもよいし、複数の装置が分担して実行するようにしてもよい。さらに、フローチャートの1つのステップに複数の処理が含まれる場合、その複数の処理を、1つの装置が実行するようにしてもよいし、複数の装置が分担して実行するようにしてもよい。換言するに、フローチャートの1つのステップに含まれる複数の処理を、複数のステップの処理として実行することもできる。逆に、フローチャートにおいて複数のステップとして説明した処理を1つのステップとして纏めて実行することもできる。 Also, for example, in each of the flowcharts described above, each step of one flowchart may be executed by one device, or may be shared by a plurality of devices. Furthermore, when one step of the flowchart includes a plurality of processes, the plurality of processes may be executed by one device, or may be shared by a plurality of devices. In other words, a plurality of processes included in one step of the flowchart can also be executed as a process of a plurality of steps. Conversely, the processing described as a plurality of steps in the flowchart can also be collectively executed as one step.
 さらに、例えば、情報処理システム100が実行する情報処理プログラムは、当該情報処理プログラムを記述するステップの処理が、上述した各フローチャートに示す順序に従い時系列に沿って実行されるようにしてもよいし、並列に、あるいは呼び出しが行われたとき等の必要なタイミングで個別に実行されるようにしてもよい。つまり、矛盾が生じない限り、各ステップの処理が上述した順序と異なる順序で実行されるようにしてもよい。さらに、実施形態に係る情報処理プログラムを記述するステップの処理が、他のプログラムの処理と並列に実行されるようにしてもよいし、他のプログラムの処理と組み合わせて実行されるようにしてもよい。 Furthermore, for example, in the information processing program executed by the information processing system 100, the processing of the steps describing the information processing program may be executed in chronological order according to the order shown in each flowchart described above, may be executed in parallel, or may be executed individually at necessary timing such as when a call is made. That is, as long as there is no contradiction, the processing of each step may be executed in an order different from the order described above. Furthermore, the processing of the steps describing the information processing program according to the embodiment may be executed in parallel with the processing of other programs, or may be executed in combination with the processing of other programs.
 さらにまた、例えば、本開示に関する複数の技術は、矛盾が生じない限り、それぞれ独立に単体で実施することができるし、本開示に係る技術を複数併用して実施することもできる。また、上述した実施形態に係る技術の一部または全部を、上述していない他の技術と併用して実施することもできる。 Furthermore, for example, a plurality of technologies related to the present disclosure can be implemented independently, or a plurality of technologies related to the present disclosure can be implemented in combination, as long as there is no contradiction. Also, part or all of the techniques according to the above-described embodiments can be implemented in combination with other techniques not described above.
 なお、本明細書に記載された効果はあくまで例示であって限定されるものでは無く、また他の効果があってもよい。 It should be noted that the effects described in this specification are only examples and are not limited, and other effects may also occur.
 なお、本技術は以下のような構成も取ることができる。
(1)
 3次元データに含まれる3次元モデルに対してテクスチャ画像を適用した画像を生成する生成部と、
 仮想空間の画像を取得する仮想カメラの第1の位置と、前記3次元モデルの第2の位置と、実空間の被写体を撮像する1以上の撮像カメラの第3の位置と、に基づき、前記1以上の撮像カメラから、前記テクスチャ画像として用いる前記被写体の撮像画像を取得する撮像カメラを選択する選択部と、
を備える、
情報処理装置。
(2)
 前記生成部は、
 前記テクスチャ画像を、前記1以上の撮像カメラから前記選択部に選択された撮像カメラにより取得された撮像画像に基づき、前記仮想カメラからの視点に応じて生成する、
前記(1)に記載の情報処理装置。
(3)
 前記選択部は、
 前記第1の位置、前記第2の位置および前記第3の位置に基づき求めた前記1以上の撮像カメラそれぞれの重要度に応じて、前記被写体の撮像画像を取得する撮像カメラを選択する、
前記(1)または(2)に記載の情報処理装置。
(4)
 前記選択部は、
 前記第2の位置を頂点として前記第1の位置および前記第3の位置により成す角度に基づき前記重要度を求める、
前記(3)に記載の情報処理装置。
(5)
 前記生成部は、
 前記1以上の撮像カメラで撮像された撮像画像を前記重要度に応じてブレンドして前記テクスチャ画像を生成する、
前記(3)または(4)に記載の情報処理装置。
(6)
 前記生成部は、
 前記3次元モデルに外接する直方体の頂点座標それぞれのうち少なくとも1つの頂点座標が前記仮想カメラの画角内にある場合に、前記テクスチャ画像を前記3次元モデルに適用する、
前記(1)乃至(5)の何れかに記載の情報処理装置。
(7)
 前記生成部は、
 所定の効果を与える前記3次元モデルを、前記第2の位置に基づき指定する、
前記(1)乃至(6)の何れかに記載の情報処理装置。
(8)
 前記所定の効果は、指定された前記3次元モデルを前記仮想カメラに対して非表示とする効果である、
前記(7)に記載の情報処理装置。
(9)
 前記選択部は、
 前記1以上の撮像カメラのうち、前記3次元モデルの前記仮想カメラの画角外の方向から前記被写体を撮像する撮像カメラを非選択とする、
前記(1)乃至(8)の何れかに記載の情報処理装置。
(10)
 前記選択部は、
 前記第2の位置として、前記3次元モデルに外接する直方体の各頂点座標の平均の座標を用いる、
前記(1)乃至(9)の何れかに記載の情報処理装置。
(11)
 前記選択部は、
 前記仮想カメラの画角内にそれぞれ前記3次元データに含まれる複数の3次元モデルが含まれる場合に、前記複数の3次元モデルそれぞれの前記第2の位置の平均を、前記複数の3次元モデルに対する前記第2の位置として用いる、
前記(10)に記載の情報処理装置。
(12)
 プロセッサにより実行される、
 3次元データに含まれる3次元モデルに対してテクスチャ画像を適用した画像を生成する生成ステップと、
 仮想空間の画像を取得する仮想カメラの第1の位置と、前記3次元モデルの第2の位置と、実空間の被写体を撮像する1以上の撮像カメラの第3の位置と、に基づき、前記1以上の撮像カメラから、前記テクスチャ画像として用いる前記被写体の撮像画像を取得する撮像カメラを選択する選択ステップと、
を有する、
情報処理方法。
(13)
 1以上の撮像カメラで撮像された撮像画像に基づき3次元データを生成する生成部と、
 前記3次元データから、前記撮像画像に含まれる被写体に対応する3次元モデルを分離し、分離された前記3次元モデルの位置を示す位置情報を生成する分離部と、
を備える、
情報処理装置。
(14)
 前記分離部は、
 前記3次元データを高さ方向に投影した2次元平面上の情報に基づき前記被写体の前記2次元平面における領域を特定し、前記領域に高さ方向の情報を与えることで、前記3次元データから前記3次元モデルを分離する、
前記(13)に記載の情報処理装置。
(15)
 前記分離部は、
 前記領域に高さ方向の情報を与えて生成される、前記3次元モデルに外接する直方体の各頂点の座標を含む前記位置情報を生成する、
前記(14)に記載の情報処理装置。
(16)
 前記分離部により前記3次元データから分離された前記3次元モデルに対して前記位置情報を付加して出力する出力部、
をさらに備える、
前記(13)乃至(15)の何れかに記載の情報処理装置。
(17)
 前記出力部は、
 前記3次元モデルの情報を、前記3次元モデルに対応する前記被写体を前記1以上の撮像カメラで撮像した多視点の撮像画像と、前記多視点の撮像画像それぞれに対するデプス情報と、により出力する、
前記(16)に記載の情報処理装置。
(18)
 前記出力部は、
 前記3次元モデルの情報を、メッシュ情報として出力する、
前記(16)に記載の情報処理装置。
(19)
 プロセッサにより実行される、
 1以上の撮像カメラで撮像された撮像画像に基づき3次元データを生成する生成ステップと、
 前記3次元データから、前記撮像画像に含まれる被写体に対応する3次元モデルを分離し、分離された前記3次元モデルの位置を示す位置情報を生成する分離部ステップと、
を有する、
情報処理方法。
Note that the present technology can also take the following configuration.
(1)
a generation unit that generates an image by applying a texture image to a three-dimensional model included in three-dimensional data;
Based on a first position of a virtual camera that acquires an image of the virtual space, a second position of the three-dimensional model, and a third position of one or more imaging cameras that capture an object in the real space, a selection unit that selects, from one or more imaging cameras, an imaging camera that acquires a captured image of the subject to be used as the texture image;
comprising
Information processing equipment.
(2)
The generating unit
The texture image is generated according to the viewpoint from the virtual camera based on the captured image acquired by the imaging camera selected by the selection unit from the one or more imaging cameras.
The information processing device according to (1) above.
(3)
The selection unit
Selecting an imaging camera that acquires a captured image of the subject according to the importance of each of the one or more imaging cameras obtained based on the first position, the second position, and the third position;
The information processing apparatus according to (1) or (2).
(4)
The selection unit
Obtaining the degree of importance based on an angle formed by the first position and the third position with the second position as the vertex;
The information processing device according to (3) above.
(5)
The generating unit
generating the texture image by blending the captured images captured by the one or more imaging cameras according to the importance;
The information processing apparatus according to (3) or (4).
(6)
The generating unit
applying the texture image to the three-dimensional model when at least one vertex coordinate of each of the vertex coordinates of a rectangular parallelepiped circumscribing the three-dimensional model is within the angle of view of the virtual camera;
The information processing apparatus according to any one of (1) to (5) above.
(7)
The generating unit
Designating the three-dimensional model to give a predetermined effect based on the second position;
The information processing apparatus according to any one of (1) to (6).
(8)
the predetermined effect is an effect of hiding the specified three-dimensional model from the virtual camera;
The information processing device according to (7) above.
(9)
The selection unit
Deselecting, from among the one or more imaging cameras, an imaging camera that images the subject from a direction outside the angle of view of the virtual camera of the three-dimensional model;
The information processing apparatus according to any one of (1) to (8).
(10)
The selection unit
Using the average coordinates of the vertex coordinates of a rectangular parallelepiped circumscribing the three-dimensional model as the second position;
The information processing apparatus according to any one of (1) to (9).
(11)
The selection unit
When a plurality of three-dimensional models included in the three-dimensional data are each included within the angle of view of the virtual camera, using the average of the second positions of the plurality of three-dimensional models as the second position for the plurality of three-dimensional models;
The information processing device according to (10) above.
(12)
executed by a processor;
a generation step of generating an image by applying a texture image to a three-dimensional model included in three-dimensional data;
Based on a first position of a virtual camera that acquires an image of the virtual space, a second position of the three-dimensional model, and a third position of one or more imaging cameras that capture an object in the real space, a selection step of selecting, from one or more imaging cameras, an imaging camera that acquires a captured image of the subject to be used as the texture image;
having
Information processing methods.
(13)
a generation unit that generates three-dimensional data based on captured images captured by one or more imaging cameras;
a separation unit that separates a three-dimensional model corresponding to a subject included in the captured image from the three-dimensional data and generates position information indicating the position of the separated three-dimensional model;
comprising
Information processing equipment.
(14)
The separation unit is
By specifying a region of the subject on the two-dimensional plane based on information on the two-dimensional plane obtained by projecting the three-dimensional data in the height direction, and providing the region with information in the height direction, separating the three-dimensional model;
The information processing device according to (13) above.
(15)
The separation unit is
generating the position information including the coordinates of each vertex of a rectangular parallelepiped circumscribing the three-dimensional model, which is generated by giving information in the height direction to the region;
The information processing device according to (14) above.
(16)
an output unit that adds the position information to the three-dimensional model separated from the three-dimensional data by the separation unit and outputs the model;
further comprising
The information processing apparatus according to any one of (13) to (15).
(17)
The output unit
outputting the information of the three-dimensional model as multi-viewpoint captured images obtained by capturing the subject corresponding to the three-dimensional model with the one or more imaging cameras, and depth information for each of the multi-viewpoint captured images;
The information processing device according to (16) above.
(18)
The output unit
outputting the information of the three-dimensional model as mesh information;
The information processing device according to (16) above.
(19)
executed by a processor;
a generation step of generating three-dimensional data based on captured images captured by one or more imaging cameras;
a separation unit step of separating a three-dimensional model corresponding to a subject included in the captured image from the three-dimensional data and generating position information indicating the position of the separated three-dimensional model;
has a
Information processing methods.
50 3次元データ
511,512,513,51a,51b,51c,51d 3次元モデル
521,522,523 シルエット
531,532,533,2001,2002,2003,200a,200b,200c,200d バウンディングボックス
601,602,603,604,608,6016,60n-1,60n 撮像カメラ
70 仮想カメラ
80,821,822,86,87 被写体
81,83,84a,84b,85 基準位置
90a,90b,91a,91b ベクトル
100 情報処理システム
110 データ取得部
111 3Dモデル生成部
112 フォーマット化部
113 送信部
120 受信部
121 レンダリング部
122 表示部
1110 3Dモデル処理部
1111 3Dモデル分離部
1210 メッシュ転送部
1211 撮像カメラ選択部
1212 撮像視点デプス生成部
1213 撮像カメラ情報転送部
1214 仮想視点テクスチャ生成部
2000 情報処理装置
2100 CPU
50 Three-dimensional data
51 1, 51 2, 51 3, 51a, 51b, 51c, 51d Three-dimensional model
52 1, 52 2, 52 3 Silhouette
53 1, 53 2, 53 3, 200 1, 200 2, 200 3, 200a, 200b, 200c, 200d Bounding box
60 1, 60 2, 60 3, 60 4, 60 8, 60 16, 60 n-1, 60 n Imaging camera
70 Virtual camera
80, 82 1, 82 2, 86, 87 Subject
81, 83, 84a, 84b, 85 Reference position
90a, 90b, 91a, 91b Vector
100 Information processing system
110 Data acquisition unit
111 3D model generation unit
112 Formatting unit
113 Transmission unit
120 Reception unit
121 Rendering unit
122 Display unit
1110 3D model processing unit
1111 3D model separation unit
1210 Mesh transfer unit
1211 Imaging camera selection unit
1212 Imaging viewpoint depth generation unit
1213 Imaging camera information transfer unit
1214 Virtual viewpoint texture generation unit
2000 Information processing device
2100 CPU

Claims (19)

  1.  3次元データに含まれる3次元モデルに対してテクスチャ画像を適用した画像を生成する生成部と、
     仮想空間の画像を取得する仮想カメラの第1の位置と、前記3次元モデルの第2の位置と、実空間の被写体を撮像する1以上の撮像カメラの第3の位置と、に基づき、前記1以上の撮像カメラから、前記テクスチャ画像として用いる前記被写体の撮像画像を取得する撮像カメラを選択する選択部と、
    を備える、
    情報処理装置。
    a generation unit that generates an image by applying a texture image to a three-dimensional model included in three-dimensional data;
    Based on a first position of a virtual camera that acquires an image of the virtual space, a second position of the three-dimensional model, and a third position of one or more imaging cameras that capture an object in the real space, a selection unit that selects, from one or more imaging cameras, an imaging camera that acquires a captured image of the subject to be used as the texture image;
    comprising
    Information processing equipment.
  2.  前記生成部は、
     前記テクスチャ画像を、前記1以上の撮像カメラから前記選択部に選択された撮像カメラにより取得された撮像画像に基づき、前記仮想カメラからの視点に応じて生成する、
    請求項1に記載の情報処理装置。
    The generating unit
    The texture image is generated according to the viewpoint from the virtual camera based on the captured image acquired by the imaging camera selected by the selection unit from the one or more imaging cameras.
    The information processing device according to claim 1 .
  3.  前記選択部は、
     前記第1の位置、前記第2の位置および前記第3の位置に基づき求めた前記1以上の撮像カメラそれぞれの重要度に応じて、前記被写体の撮像画像を取得する撮像カメラを選択する、
    請求項1に記載の情報処理装置。
    The selection unit
    Selecting an imaging camera that acquires a captured image of the subject according to the importance of each of the one or more imaging cameras obtained based on the first position, the second position, and the third position;
    The information processing device according to claim 1 .
  4.  前記選択部は、
     前記第2の位置を頂点として前記第1の位置および前記第3の位置により成す角度に基づき前記重要度を求める、
    請求項3に記載の情報処理装置。
    The selection unit
    Obtaining the degree of importance based on an angle formed by the first position and the third position with the second position as the vertex;
    The information processing apparatus according to claim 3.
  5.  前記生成部は、
     前記1以上の撮像カメラで撮像された撮像画像を前記重要度に応じてブレンドして前記テクスチャ画像を生成する、
    請求項3に記載の情報処理装置。
    The generating unit
    generating the texture image by blending the captured images captured by the one or more imaging cameras according to the importance;
    The information processing apparatus according to claim 3.
  6.  前記生成部は、
     前記3次元モデルに外接する直方体の頂点座標それぞれのうち少なくとも1つの頂点座標が前記仮想カメラの画角内にある場合に、前記テクスチャ画像を前記3次元モデルに適用する、
    請求項1に記載の情報処理装置。
    The generating unit
    applying the texture image to the three-dimensional model when at least one vertex coordinate of each of the vertex coordinates of a rectangular parallelepiped circumscribing the three-dimensional model is within the angle of view of the virtual camera;
    The information processing device according to claim 1 .
  7.  前記生成部は、
     所定の効果を与える前記3次元モデルを、前記第2の位置に基づき指定する、
    請求項1に記載の情報処理装置。
    The generating unit
    Designating the three-dimensional model to give a predetermined effect based on the second position;
    The information processing device according to claim 1 .
  8.  The information processing device according to claim 7, wherein
      the predetermined effect is an effect of hiding the designated three-dimensional model from the virtual camera.
  9.  The information processing device according to claim 1, wherein
      the selection unit deselects, from among the one or more imaging cameras, an imaging camera that captures the subject from a direction of the three-dimensional model that is outside the angle of view of the virtual camera.
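
    One possible reading of claim 9, shown as a sketch only: imaging cameras that observe the model from roughly the opposite side of the virtual camera, and therefore mostly see surfaces the virtual camera cannot, are dropped. The 90-degree threshold is an arbitrary example value, not a number taken from the disclosure.

    import numpy as np

    def deselect_out_of_view_cameras(virtual_cam_pos, model_pos, imaging_cam_positions,
                                     max_angle_deg=90.0):
        """Indices of imaging cameras kept after dropping those that look at the model
        from roughly the opposite side of the virtual camera."""
        to_virtual = np.asarray(virtual_cam_pos, float) - np.asarray(model_pos, float)
        to_virtual /= np.linalg.norm(to_virtual)
        kept = []
        for i, cam_pos in enumerate(imaging_cam_positions):
            to_cam = np.asarray(cam_pos, float) - np.asarray(model_pos, float)
            to_cam /= np.linalg.norm(to_cam)
            angle = np.degrees(np.arccos(np.clip(np.dot(to_virtual, to_cam), -1.0, 1.0)))
            if angle <= max_angle_deg:      # within the tunable threshold: keep
                kept.append(i)
        return kept

    cams = [(0, 1.5, 3), (3, 1.5, 0), (0, 1.5, -3), (-3, 1.5, 0)]
    print(deselect_out_of_view_cameras((0, 1.6, 2.0), (0, 1.0, 0), cams))  # [0, 1, 3]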
  10.  The information processing device according to claim 1, wherein
       the selection unit uses, as the second position, the average of the vertex coordinates of a rectangular parallelepiped circumscribing the three-dimensional model.
  11.  The information processing device according to claim 10, wherein,
       when a plurality of three-dimensional models included in the three-dimensional data are within the angle of view of the virtual camera, the selection unit uses the average of the second positions of the plurality of three-dimensional models as the second position for the plurality of three-dimensional models.
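
    Claims 10 and 11 reduce each model, and then a group of models, to a single representative point by averaging. A direct sketch follows; the example boxes are invented for the demonstration.

    import numpy as np

    def model_position(bounding_box_vertices):
        """Second position of one model: the mean of the eight vertex coordinates
        of the cuboid circumscribing it (claim 10)."""
        return np.asarray(bounding_box_vertices, dtype=float).mean(axis=0)

    def combined_position(per_model_positions):
        """Single second position used when several models are inside the virtual
        camera's angle of view: the mean of their individual positions (claim 11)."""
        return np.asarray(per_model_positions, dtype=float).mean(axis=0)

    box_a = [[x, y, z] for x in (0, 1) for y in (0, 2) for z in (0, 1)]
    box_b = [[x, y, z] for x in (3, 4) for y in (0, 2) for z in (0, 1)]
    pos_a, pos_b = model_position(box_a), model_position(box_b)
    print(pos_a, pos_b, combined_position([pos_a, pos_b]))
    # [0.5 1.  0.5] [3.5 1.  0.5] [2.  1.  0.5]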
  12.  An information processing method executed by a processor, the method comprising:
       a generation step of generating an image by applying a texture image to a three-dimensional model included in three-dimensional data; and
       a selection step of selecting, from one or more imaging cameras that capture a subject in a real space, an imaging camera that acquires a captured image of the subject to be used as the texture image, based on a first position of a virtual camera that acquires an image of a virtual space, a second position of the three-dimensional model, and a third position of the one or more imaging cameras.
  13.  An information processing device comprising:
       a generation unit that generates three-dimensional data based on captured images captured by one or more imaging cameras; and
       a separation unit that separates, from the three-dimensional data, a three-dimensional model corresponding to a subject included in the captured images, and generates position information indicating the position of the separated three-dimensional model.
  14.  The information processing device according to claim 13, wherein
       the separation unit separates the three-dimensional model from the three-dimensional data by identifying a region of the subject on a two-dimensional plane based on information on the two-dimensional plane obtained by projecting the three-dimensional data in a height direction, and giving the region information in the height direction.
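
    A sketch of one way the top-down separation of claim 14 could work if the three-dimensional data is treated as a point cloud: project the points onto the ground plane, group occupied grid cells into connected regions, and restore the height extent per region. The point-cloud interpretation, the grid resolution, and the use of scipy.ndimage.label are all assumptions made for this illustration.

    import numpy as np
    from scipy import ndimage

    def separate_models(points, cell_size=0.05):
        """Split a point cloud (N, 3) with y up into per-subject clusters.

        The cloud is projected onto the ground (x-z) plane, occupied grid cells are
        grouped into connected regions, and each region is then given back its height
        extent to recover one three-dimensional model per subject."""
        pts = np.asarray(points, dtype=float)
        xz = np.floor(pts[:, [0, 2]] / cell_size).astype(int)
        xz -= xz.min(axis=0)                       # shift grid indices to start at 0
        grid = np.zeros(xz.max(axis=0) + 1, dtype=bool)
        grid[xz[:, 0], xz[:, 1]] = True            # occupancy on the 2-D plane
        labels, n = ndimage.label(grid)            # connected regions = subjects
        models = []
        for region in range(1, n + 1):
            mask = labels[xz[:, 0], xz[:, 1]] == region
            cluster = pts[mask]
            bbox = (cluster.min(axis=0), cluster.max(axis=0))  # includes height range
            models.append({"points": cluster, "bounding_box": bbox})
        return models

    # Two well-separated dummy subjects.
    a = np.random.rand(500, 3) * [0.5, 1.8, 0.5]
    b = np.random.rand(500, 3) * [0.5, 1.6, 0.5] + [3.0, 0.0, 0.0]
    print(len(separate_models(np.vstack([a, b]), cell_size=0.25)))  # 2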
  15.  The information processing device according to claim 14, wherein
       the separation unit generates the position information including the coordinates of each vertex of a rectangular parallelepiped circumscribing the three-dimensional model, the rectangular parallelepiped being generated by giving the region the information in the height direction.
  16.  The information processing device according to claim 13, further comprising
       an output unit that adds the position information to the three-dimensional model separated from the three-dimensional data by the separation unit and outputs the result.
  17.  The information processing device according to claim 16, wherein
       the output unit outputs the information of the three-dimensional model as multi-viewpoint captured images obtained by capturing the subject corresponding to the three-dimensional model with the one or more imaging cameras, together with depth information for each of the multi-viewpoint captured images.
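
    A possible container for the output format described in claims 16 and 17 (multi-view color plus per-view depth, tagged with the bounding-box position information added by the output unit); the class and field names are invented for the sketch and are not taken from the disclosure.

    from dataclasses import dataclass, field
    from typing import List, Tuple
    import numpy as np

    @dataclass
    class ViewSample:
        """One imaging viewpoint of a separated model: its captured image plus the
        per-pixel depth for that viewpoint."""
        camera_id: int
        color: np.ndarray            # (H, W, 3) captured image
        depth: np.ndarray            # (H, W) depth in meters from that camera

    @dataclass
    class SeparatedModelOutput:
        """Output for a single subject: multi-view color and depth (claim 17) tagged
        with the bounding-box position information added by the output unit (claim 16)."""
        model_id: int
        bounding_box: Tuple[np.ndarray, np.ndarray]   # (min corner, max corner)
        views: List[ViewSample] = field(default_factory=list)

    out = SeparatedModelOutput(
        model_id=0,
        bounding_box=(np.zeros(3), np.array([0.5, 1.8, 0.5])),
        views=[ViewSample(0, np.zeros((720, 1280, 3), np.uint8),
                          np.full((720, 1280), 2.0, np.float32))],
    )
    print(out.model_id, len(out.views), out.views[0].depth.dtype)  # 0 1 float32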
  18.  The information processing device according to claim 16, wherein
       the output unit outputs the information of the three-dimensional model as mesh information.
  19.  An information processing method executed by a processor, the method comprising:
       a generation step of generating three-dimensional data based on captured images captured by one or more imaging cameras; and
       a separation step of separating, from the three-dimensional data, a three-dimensional model corresponding to a subject included in the captured images, and generating position information indicating the position of the separated three-dimensional model.
PCT/JP2022/008967 2021-03-12 2022-03-02 Information processing device and information processing method WO2022191010A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021-040346 2021-03-12
JP2021040346 2021-03-12

Publications (1)

Publication Number Publication Date
WO2022191010A1

Family

ID=83227186

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/008967 WO2022191010A1 (en) 2021-03-12 2022-03-02 Information processing device and information processing method

Country Status (1)

Country Link
WO (1) WO2022191010A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009539155A (en) * 2006-06-02 2009-11-12 イジュノシッヒ テクニッヒ ホッフシューラ チューリッヒ Method and system for generating a 3D representation of a dynamically changing 3D scene
WO2019039282A1 (en) * 2017-08-22 2019-02-28 ソニー株式会社 Image processing device and image processing method
WO2019082958A1 (en) * 2017-10-27 2019-05-02 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ Three-dimensional model encoding device, three-dimensional model decoding device, three-dimensional model encoding method, and three-dimensional model decoding method
JP2020014159A (en) * 2018-07-19 2020-01-23 キヤノン株式会社 File generation device and video generation device on the basis of file
US20200143557A1 (en) * 2018-11-01 2020-05-07 Samsung Electronics Co., Ltd. Method and apparatus for detecting 3d object from 2d image
JP2020126393A (en) * 2019-02-04 2020-08-20 キヤノン株式会社 Image processing apparatus, image processing method, and program
JP2021022032A (en) * 2019-07-25 2021-02-18 Kddi株式会社 Synthesizer, method and program

Similar Documents

Publication Publication Date Title
US10535181B2 (en) Virtual viewpoint for a participant in an online communication
EP3712856B1 (en) Method and system for generating an image
JP7386888B2 (en) Two-shot composition of the speaker on the screen
JP2014505917A (en) Hybrid reality for 3D human machine interface
TWI813098B (en) Neural blending for novel view synthesis
EP3396635A2 (en) A method and technical equipment for encoding media content
WO2020184174A1 (en) Image processing device and image processing method
Kurillo et al. A framework for collaborative real-time 3D teleimmersion in a geographically distributed environment
Farbiz et al. Live three-dimensional content for augmented reality
GB2565301A (en) Three-dimensional video processing
WO2022191010A1 (en) Information processing device and information processing method
JP6091850B2 (en) Telecommunications apparatus and telecommunications method
US20230252722A1 (en) Information processing apparatus, information processing method, and program
Andersen et al. An AR-guided system for fast image-based modeling of indoor scenes
EP3564905A1 (en) Conversion of a volumetric object in a 3d scene into a simpler representation model
US11769299B1 (en) Systems and methods for capturing, transporting, and reproducing three-dimensional simulations as interactive volumetric displays
CN116528065B (en) Efficient virtual scene content light field acquisition and generation method
Scheer et al. A client-server architecture for real-time view-dependent streaming of free-viewpoint video
WO2023276261A1 (en) Information processing device, information processing method, and program
US20240185526A1 (en) Systems and methods for capturing, transporting, and reproducing three-dimensional simulations as interactive volumetric displays
Thatte et al. Real-World Virtual Reality With Head-Motion Parallax
WO2022224964A1 (en) Information processing device and information processing method
Pitkänen Open Access Dynamic Human Point Cloud Datasets
CN117424997A (en) Video processing method, device, equipment and readable storage medium
JP2023026148A (en) Viewpoint calculation apparatus and program of the same

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22766969

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22766969

Country of ref document: EP

Kind code of ref document: A1