WO2022019149A1 - Information processing device, 3d model generation method, information processing method, and program - Google Patents

Information processing device, 3D model generation method, information processing method, and program

Info

Publication number
WO2022019149A1
Authority
WO
WIPO (PCT)
Prior art keywords
mask area
information processing
unit
model
rendering
Application number
PCT/JP2021/025929
Other languages
French (fr)
Japanese (ja)
Inventor
宜之 高尾
剛也 小林
Original Assignee
Sony Group Corporation (ソニーグループ株式会社)
Application filed by Sony Group Corporation
Publication of WO2022019149A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/10Geometric effects
    • G06T15/20Perspective computation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects

Definitions

  • This disclosure relates to an information processing device, a 3D model generation method, an information processing method and a program.
  • Patent Document 1 discloses a technique for drawing an object as a three-dimensional model.
  • One of the purposes of the present disclosure is to provide an information processing device that can automatically set a mask area, a 3D model generation method, an information processing method, and a program.
  • The present disclosure is, for example, an information processing apparatus having a mask area setting unit that sets a mask area for an obstruction existing between an actual camera and a target, and a 3D model generation unit that generates a 3D model based on a plurality of pieces of image data including image data in which the mask area is set.
  • The present disclosure is, for example, a 3D model generation method in which a mask area setting unit sets a mask area for an obstruction existing between an actual camera and a target, and a 3D model generation unit generates a 3D model based on a plurality of pieces of image data including image data in which the mask area is set.
  • The present disclosure is, for example, a program that causes a computer to execute a 3D model generation method in which a mask area setting unit sets a mask area for an obstruction existing between an actual camera and a target, and a 3D model generation unit generates a 3D model based on a plurality of pieces of image data including image data in which the mask area is set.
  • The present disclosure is, for example, an information processing method in which an acquisition unit acquires mask area information indicating a mask area set for an obstruction existing between an actual camera and a target, and a rendering unit performs rendering excluding the mask area.
  • The present disclosure is, for example, a program that causes a computer to execute an information processing method in which an acquisition unit acquires mask area information indicating a mask area set for an obstruction existing between an actual camera and a target, and a rendering unit performs rendering excluding the mask area.
  • FIG. 1A to 1C are views which are referred to when the outline of the present disclosure is explained.
  • 2A to 2C are diagrams referred to when the outline of the present disclosure is explained.
  • 3A to 3C are views which are referred to when the outline of the present disclosure is explained.
  • FIG. 4 is a diagram referred to when the outline of the present disclosure is explained.
  • 5A-5C are views which will be referred to when the outline of the present disclosure is explained.
  • FIG. 6 is a diagram referred to when the outline of the present disclosure is explained.
  • FIG. 7 is a block diagram showing a configuration example of the information processing system according to the embodiment.
  • FIG. 8 is a diagram for explaining an example of one process performed in an information processing system.
  • FIG. 9A to 9C are diagrams referred to when the processing performed by the mask area setting unit according to the embodiment is described.
  • 10A to 10C are diagrams referred to when the processing performed by the mask area setting unit according to the embodiment is described.
  • 11A to 11C are diagrams that are referred to when the processing performed by the mask area setting unit according to the embodiment is described.
  • FIG. 12 is a diagram referred to when an example of using the mask area at the time of rendering is explained.
  • 13A to 13C are diagrams referred to when an example of using the mask area at the time of rendering is described.
  • FIG. 14 is a diagram referred to when an example of using the mask area at the time of rendering is described.
  • FIG. 15 is a flowchart for explaining an operation example of the information processing apparatus according to the embodiment.
  • FIG. 16 is a flowchart for explaining an operation example of the mask area setting unit according to the embodiment.
  • FIG. 17 is a diagram showing a configuration example in which the processing performed by the information processing system according to the embodiment is configured in hardware.
  • As one method of generating a three-dimensional model (hereinafter referred to as a 3D model, as appropriate), a method is known in which a mesh is created by modeling with Visual Hull and a 3D model is generated by performing texture mapping on the mesh.
  • In generating a 3D model, the subject and the background are separated for each of a plurality of pieces of two-dimensional image data. For example, a binary image called a silhouette image, in which the silhouette of the subject is represented in white and the other areas in black, is used.
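  • As a minimal illustrative sketch (the file names and the threshold value are assumptions, not taken from this disclosure), such a silhouette image can be obtained by simple background subtraction, for example with OpenCV:

```python
import cv2

# Load a camera frame and the corresponding background image (illustrative file names).
frame = cv2.imread("camera1_frame.png")
background = cv2.imread("camera1_background.png")

# Absolute difference between frame and background, converted to grayscale.
diff = cv2.absdiff(frame, background)
gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)

# Threshold: pixels that differ enough from the background become white foreground
# (the subject's silhouette); everything else stays black background.
_, silhouette = cv2.threshold(gray, 30, 255, cv2.THRESH_BINARY)

cv2.imwrite("silhouette_SI1.png", silhouette)
```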
  • Here, if a shield exists between the camera and the subject, an inappropriate 3D model may be generated. This point is explained with reference to FIGS. 1 to 3. FIG. 1A shows two-dimensional image data IM1A including a target (a person TA in this example) for which a 3D model is to be generated.
  • FIGS. 2A and 3A show two-dimensional image data IM2A and IM3A captured by cameras different from the camera that captured the two-dimensional image data shown in FIG. 1A.
  • As shown in FIG. 3A, a bar BA, which is an example of a shield, exists between the camera that captured the two-dimensional image data IM3A and the person TA.
  • By performing processing that separates the background image data IM1B shown in FIG. 1B from the two-dimensional image data IM1A shown in FIG. 1A, a silhouette image SI1 (see FIG. 1C) in which the person TA and the background are separated is obtained. Similarly, by separating the background image data IM2B shown in FIG. 2B from the two-dimensional image data IM2A shown in FIG. 2A, a silhouette image SI2 (see FIG. 2C) in which the person TA and the background are separated is obtained.
  • Likewise, by separating the background image data IM3B shown in FIG. 3B from the two-dimensional image data IM3A shown in FIG. 3A, a silhouette image SI3 (see FIG. 3C) in which the person TA and the background are separated is obtained.
  • Here, when the background image data IM3B is subtracted from the two-dimensional image data IM3A, the bar BA disappears, so the area of the bar BA is regarded as background. That is, in the silhouette image SI3 shown in FIG. 3C, the bar BA is represented as background, that is, in black.
  • FIG. 4 shows an example of a 3D model generated using the silhouette images SI1 to SI3.
  • Although more silhouette images are actually used to generate a 3D model, the 3D model here is generated using only the silhouette images SI1 to SI3 in order to simplify the explanation. If a shield such as the bar BA exists between the camera and the person TA, the silhouette cannot be acquired correctly, as in the silhouette image SI3, so the resulting 3D model becomes unnatural. For example, as shown in FIG. 4, the body becomes a 3D model divided into upper and lower parts.
  • Therefore, in the present embodiment, the shield portion is set as a mask area.
  • Specifically, as shown in FIG. 5B, the portion of the bar BA is set as the mask area MA. Since the portion set as the mask area MA is excluded from the background subtraction processing, the silhouette there can be extracted as foreground.
  • FIG. 5C shows the silhouette image SI3' obtained when the mask area MA is set.
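  • The effect of the mask area on the silhouette can be sketched as follows, assuming the mask is given as a binary image of the same size as the silhouette (the file names are illustrative): pixels inside the mask area are excluded from background subtraction and simply kept as foreground, yielding an image like SI3'.

```python
import cv2
import numpy as np

silhouette = cv2.imread("silhouette_SI3.png", cv2.IMREAD_GRAYSCALE)  # silhouette with the bar missing
mask = cv2.imread("mask_MA.png", cv2.IMREAD_GRAYSCALE)               # white where the shield (bar BA) is

# Pixels inside the mask area are not subject to background subtraction,
# so they are kept as foreground in the resulting silhouette SI3'.
silhouette_masked = np.where(mask > 0, 255, silhouette).astype(np.uint8)

cv2.imwrite("silhouette_SI3_prime.png", silhouette_masked)
```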
  • a 3D model is generated using the silhouette images SI1, SI2, SI3'.
  • FIG. 6 shows an example of a 3D model generated using the silhouette images SI1, SI2, and SI3'.
  • In the silhouette image SI3', the silhouettes (white portions) of the person TA and the bar BA overlap. However, in the Visual Hull carving process that uses the silhouette images SI1 and SI2, which correspond to two-dimensional image data from other cameras, that is, other viewpoints in which the shield does not appear, the portion of the bar BA is carved away. As a result, an appropriate 3D model corresponding to the person TA is obtained.
  • It is preferable that the above-mentioned mask area can be set automatically, because automatically setting the mask area allows the 3D model to be generated efficiently.
  • When a mask area is set, it also needs to be taken into account in the rendering processing performed in the course of generating the 3D model. That is, since no texture exists for the mask area, it is desirable that the texture for the mask area can be rendered appropriately.
  • the details of the rendering process when there is a mask area will be described later. Based on the above, an embodiment of the present disclosure will be described in detail.
  • FIG. 7 shows an outline of an information processing system to which the present technology is applied.
  • the data acquisition unit 1 acquires image data for generating a 3D model of the subject.
  • a plurality of viewpoint images captured by a plurality of image pickup devices 8B arranged so as to surround the subject 8A are acquired as image data.
  • the plurality of viewpoint images are preferably images captured by a plurality of cameras in synchronization.
  • the data acquisition unit 1 may acquire, for example, a plurality of viewpoint images obtained by capturing the subject 8A from a plurality of viewpoints with one camera as image data.
  • the data acquisition unit 1 may perform calibration based on the image data and acquire the internal parameters and the external parameters of each image pickup apparatus 8B. Further, the data acquisition unit 1 may acquire a plurality of depth information indicating a distance from a plurality of viewpoints to the subject 8A, for example.
  • the 3D model generation unit 2 generates a model having 3D information of the subject 8A based on the image data for generating the 3D model of the subject 8A.
  • The 3D model generation unit 2 generates a 3D model of the subject 8A by, for example, carving the three-dimensional shape of the subject 8A with so-called Visual Hull, using images from a plurality of viewpoints (for example, silhouette images from a plurality of viewpoints).
  • In this case, the 3D model generation unit 2 can further deform the 3D model generated with Visual Hull with high accuracy by using a plurality of pieces of depth information indicating the distances from a plurality of viewpoints to the subject 8A.
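  • The following is a simplified voxel-carving sketch of Visual Hull under assumed inputs (each camera provides a binary silhouette and a 3x4 projection matrix); a voxel is kept only if it projects into the foreground of every silhouette, which also illustrates why a silhouette with a missing region, such as one cut by a shield, carves away part of the subject unless the mask area is handled as described above.

```python
import numpy as np

def visual_hull(silhouettes, projections, grid):
    """silhouettes: list of HxW binary arrays (255 = foreground).
    projections: list of 3x4 camera projection matrices.
    grid: (N, 3) array of voxel center coordinates in world space."""
    keep = np.ones(len(grid), dtype=bool)
    homog = np.hstack([grid, np.ones((len(grid), 1))])  # (N, 4) homogeneous voxel centers
    for sil, P in zip(silhouettes, projections):
        uvw = homog @ P.T                               # project voxel centers into this camera image
        u = (uvw[:, 0] / uvw[:, 2]).round().astype(int)
        v = (uvw[:, 1] / uvw[:, 2]).round().astype(int)
        h, w = sil.shape
        inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        fg = np.zeros(len(grid), dtype=bool)
        fg[inside] = sil[v[inside], u[inside]] > 0
        keep &= fg                                      # carve away voxels outside any silhouette
    return grid[keep]                                   # surviving voxels approximate the subject's shape
```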
  • Since the 3D model generated by the 3D model generation unit 2 is generated in units of time-series frames, it can also be regarded as a moving image of the 3D model. Further, since the 3D model is generated using images captured by the image pickup apparatuses 8B, it can also be called a live-action 3D model.
  • The 3D model can express shape information representing the surface shape of the subject 8A in the form of mesh data called a polygon mesh, which is expressed by vertices (Vertex) and the connections between them.
  • The method of expressing the 3D model is not limited to these, and the 3D model may be described by a so-called point cloud representation expressed by the position information of points.
  • Color information data is also generated as a texture in a form linked to these 3D shape data. For example, there are View Independent textures, whose color is constant when viewed from any direction, and View Dependent textures, whose color changes depending on the viewing direction.
  • the 3D model generation unit 2 has a mask area setting unit 2A as a functional block.
  • the mask area setting unit 2A sets the mask area for the shield existing between the actual camera and the target.
  • the 3D model generation unit 2 generates a 3D model based on a plurality of image data including image data in which a mask area is set.
  • The formatting unit 3 (encoding unit) converts the 3D model data generated by the 3D model generation unit 2 into a format suitable for transmission and storage.
  • the 3D model generated by the 3D model generation unit 2 may be converted into a plurality of two-dimensional images by perspectively projecting them from a plurality of directions.
  • the 3D model may be used to generate depth information which is a two-dimensional depth image from a plurality of viewpoints.
  • The depth information and the color information in this two-dimensional image form are compressed and output to the transmission unit 4.
  • The depth information and the color information may be arranged side by side and transmitted as one image, or may be transmitted as two separate images. In this case, since the data takes the form of two-dimensional image data, it can also be compressed using a two-dimensional compression technique such as AVC (Advanced Video Coding).
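  • A sketch of the side-by-side packing mentioned above, assuming the depth image has already been quantized to 8 bits (the file names are illustrative); the packed frame can then be compressed with an ordinary 2D codec such as AVC.

```python
import numpy as np
import cv2

color = cv2.imread("view0_color.png")                        # H x W x 3 color image
depth = cv2.imread("view0_depth.png", cv2.IMREAD_GRAYSCALE)  # H x W depth image (8-bit, assumption)

# Give the depth image three channels so it can sit next to the color image,
# then pack both into one frame for a 2D video codec.
depth_3ch = cv2.cvtColor(depth, cv2.COLOR_GRAY2BGR)
packed = np.hstack([color, depth_3ch])
cv2.imwrite("view0_packed.png", packed)
```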
  • the formatting unit 3 converts the mask area information indicating the mask area set by the mask area setting unit 2A into a predetermined format.
  • the mask area information is, for example, information indicating a mask area in the background two-dimensional image data, but is not limited to this.
  • Alternatively, for example, the 3D data may be converted into a point cloud format and output to the transmission unit 4 as three-dimensional data.
  • the Geometry-based-Approach 3D compression technique discussed in MPEG can be used.
  • the transmission unit 4 transmits the transmission data (including the mask area information) formed by the formatting unit 3 to the reception unit 5.
  • the transmission unit 4 transmits the transmission data to the reception unit 5 after performing a series of processes of the data acquisition unit 1, the 3D model generation unit 2 and the formatting unit 3 offline. Further, the transmission unit 4 may transmit the transmission data generated from the series of processes described above to the reception unit 5 in real time.
  • the receiving unit 5 receives the transmission data transmitted from the transmitting unit 4. As described above, the transmission data includes mask setting information. In this way, the receiving unit 5 functions as an acquisition unit for acquiring the mask setting information transmitted from the transmitting unit 4.
  • The rendering unit 6 performs rendering using the transmission data received by the receiving unit 5. For example, texture mapping is performed by projecting the mesh of the 3D model from the viewpoint of the camera that draws it and pasting a texture representing a color or pattern. The drawing viewpoint at this time can be set arbitrarily, regardless of the camera positions at the time of shooting, so that the model can be viewed from a free viewpoint. Further, although the details will be described later, the rendering unit 6 performs rendering excluding the mask area.
  • the rendering unit 6 performs texture mapping to paste a texture representing the color, pattern or texture of the mesh according to the position of the mesh of the 3D model, for example.
  • Texture mapping includes a so-called View Dependent method that considers the user's viewing viewpoint and a View Independent method that does not consider the user's viewing viewpoint.
  • the View Dependent method has the advantage of being able to achieve higher quality rendering than the View Independent method because the texture to be pasted on the 3D model changes according to the position of the viewing viewpoint.
  • the View Independent method has an advantage that the amount of processing is smaller than that of the View Dependent method because the position of the viewing viewpoint is not considered.
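  • One common way to realize View Dependent texturing, shown here only as a sketch and not as the method prescribed by this disclosure, is to weight each real camera's texture contribution by how closely its viewing direction of the surface point matches that of the virtual viewpoint (the cosine-based weighting is an assumption):

```python
import numpy as np

def view_dependent_weights(point, virtual_cam_pos, real_cam_positions):
    """Blend weights for the textures of the real cameras at one surface point."""
    view_dir = point - virtual_cam_pos
    view_dir /= np.linalg.norm(view_dir)
    weights = []
    for cam_pos in real_cam_positions:
        cam_dir = point - cam_pos
        cam_dir /= np.linalg.norm(cam_dir)
        # Cameras that see the point from nearly the same direction as the
        # virtual viewpoint get larger weights; opposite-facing cameras get zero.
        weights.append(max(np.dot(view_dir, cam_dir), 0.0))
    weights = np.array(weights)
    total = weights.sum()
    return weights / total if total > 0 else weights
```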
  • The viewing viewpoint data is input from the display device to the rendering unit 6 after the display device detects the user's viewing point (region of interest).
  • the rendering unit 6 may adopt, for example, billboard rendering that renders the object so that the object maintains a vertical posture with respect to the viewing viewpoint. For example, when rendering multiple objects, objects that are of less interest to the viewer may be rendered on the billboard, and other objects may be rendered using other rendering methods.
  • the display unit 7 displays the result rendered by the rendering unit 6 on the display unit 7 of the display device.
  • the display device may be a 2D monitor or a 3D monitor such as a head-mounted display, a spatial display, a mobile phone, a television, or a PC.
  • The information processing system of FIG. 7 shows a series of flows from the data acquisition unit 1, which acquires a captured image that serves as the material for generating content, to the display control unit, which controls the display device viewed by the user.
  • this does not mean that all functional blocks are required for the implementation of the present technology, and the present technology can be implemented for each functional block or a combination of a plurality of functional blocks.
  • For example, in FIG. 7, the transmission unit 4 and the reception unit 5 are provided to show a series of flows from the side that creates the content to the side that views the content through the distribution of the content data. However, when everything from content production to viewing is carried out on the same information processing device (for example, a personal computer), it is not necessary to include the encoding unit, the transmission unit 4, the decoding unit, or the reception unit 5.
  • the same implementer may implement everything, or different implementers may implement each functional block.
  • For example, it is conceivable that business operator A generates 3D content through the data acquisition unit 1, the 3D model generation unit 2, and the formatting unit 3, the 3D content is then distributed through the transmission unit 4 (platform) of business operator B, and the display device of business operator C performs reception, rendering, and display control of the 3D content.
  • each functional block can be implemented on the cloud.
  • The rendering by the rendering unit 6 may be performed in the display device or in a server. In that case, information is exchanged between the display device and the server.
  • FIG. 7 describes the data acquisition unit 1, the 3D model generation unit 2, the formatting unit 3, the transmission unit 4, the reception unit 5, the rendering unit 6, and the display unit 7 as an information processing system.
  • In the present specification, a system in which two or more functional blocks are involved is referred to as an information processing system.
  • For example, the data acquisition unit 1, the 3D model generation unit 2, the encoding unit, the transmission unit 4, the reception unit 5, the decoding unit, and the rendering unit 6, not including the display unit 7, can be collectively referred to as an information processing system.
  • the present disclosure can be configured as an information processing apparatus including any configuration among the configurations of the information processing system shown in FIG. 7.
  • For example, the present disclosure can be configured as an information processing device having all of the configurations shown in FIG. 7, as an information processing device having only the 3D model generation unit 2, or as an information processing device having the reception unit 5 and the rendering unit 6.
  • In FIG. 9A, for example, it is assumed that the actual camera RC1 captures a person TA as a target. A shield 41 exists between the actual camera RC1 and the person TA.
  • As shown in FIG. 9B, the image captured by the actual camera RC1 is an image in which the shield appears.
  • As shown in FIG. 9C, an image viewed from a virtual camera VC1 (virtual viewpoint), in which no obstruction appears, is created.
  • In FIG. 10A, there are actual cameras RC1 and RC2, and a shield 41 exists between the actual cameras RC1 and RC2 and the person TA.
  • FIG. 10B shows an image taken by the actual camera RC1
  • FIG. 10C shows an image taken by the actual camera RC2.
  • The position and range of the person TA to be modeled or rendered are set manually by the user, for example.
  • Alternatively, the position and range of the person TA may be set automatically.
  • The position and orientation of each actual camera can be determined by camera calibration performed in advance.
  • As a calibration method, Zhang's method using a chessboard is known.
  • A method other than Zhang's method can also be applied.
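  • A sketch of chessboard-based calibration in the spirit of Zhang's method using OpenCV (the board dimensions, square size, and directory name are assumptions); the estimated intrinsics, together with the per-view rotations and translations, give the position and orientation of the actual camera.

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)      # inner corners of the chessboard (assumption)
square = 0.025        # square size in meters (assumption)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

obj_points, img_points = [], []
for path in glob.glob("calib_cam1/*.png"):            # chessboard shots from one camera
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Intrinsic parameters plus rotation/translation (extrinsics) for each board view.
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
```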
  • The three-dimensional position of each point in the input images is estimated using the position information of each camera and the input image from each camera.
  • The three-dimensional position is estimated in units of pixels, for example.
  • When the estimated three-dimensional position of a pixel lies between the camera and the target, the pixel is regarded as belonging to a shield and is set as part of the mask area.
  • The above processing is performed in units of pixels.
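  • The per-pixel decision rule can be sketched as follows, assuming a per-pixel distance estimate is available for a camera image along with the distance from that camera to the target (the margin parameter is an assumption added to absorb estimation noise):

```python
import numpy as np

def build_mask(depth_map, target_distance, margin=0.1):
    """depth_map: per-pixel distance from the camera (e.g. from multi-view estimation).
    target_distance: distance from this camera to the target area.
    Pixels whose estimated 3D position lies between the camera and the target
    are regarded as a shield and become part of the mask area."""
    mask = depth_map < (target_distance - margin)   # margin guards against depth noise (assumption)
    return mask.astype(np.uint8) * 255              # binary mask image, 255 = mask area
```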
  • FIG. 11A is the same diagram as FIG. 10A.
  • FIG. 11B is a diagram schematically showing the mask region MA4 set in the image obtained by the actual camera RC1.
  • FIG. 11C is a diagram schematically showing the mask region MA5 set in the image obtained by the actual camera RC2.
  • the mask area can be automatically set.
  • the rendering unit 6 excludes the set mask area and renders an area other than the mask area.
  • Here, rendering with the mask area excluded means that the texture obtained from the image captured by that actual camera is not used for the mask area. In other words, the mask area may be rendered with a texture obtained from an image other than the image captured by that actual camera.
  • the person TA is photographed by the actual cameras RC1 to RC3. Further, the virtual camera VC1 is arranged in the virtual space at a position corresponding to the virtual viewpoint. There is a shield 41 between the actual camera RC3 and the person TA.
  • FIG. 13A shows the photographed image IM4A obtained by the actual camera RC1
  • FIG. 13B shows the photographed image IM4B obtained by the actual camera RC2
  • FIG. 13C shows the photographed image IM4C obtained by the actual camera RC3.
  • the mask area MA6 is set at the place of the shield 41 in the captured image IM4C.
  • For the mask region, the rendering unit 6 renders a texture estimated on the basis of pixels in other regions. For example, the texture is estimated on the basis of how the mask area appears in the image of a camera located close to the virtual viewpoint.
  • Specifically, the texture is estimated using the pixels of the regions corresponding to the mask area in the images captured by the actual cameras RC1 and RC2, which are close to the virtual camera VC1, together with the pixels in the region around the mask area MA6 in the image captured by the actual camera RC3.
  • The rendering unit 6 then performs rendering using the estimated texture.
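  • As one simple, illustrative stand-in for this texture estimation, the mask region of a single image can be filled from its surrounding pixels by inpainting (the file names are assumptions); estimating from the images of cameras close to the virtual viewpoint, as described above, would instead reproject the corresponding regions from those images.

```python
import cv2

image_rc3 = cv2.imread("camera_RC3.png")                       # image containing the shield
mask_ma6 = cv2.imread("mask_MA6.png", cv2.IMREAD_GRAYSCALE)    # 255 where the mask area MA6 is

# Estimate the texture of the mask area from the surrounding pixels of the same image.
filled = cv2.inpaint(image_rc3, mask_ma6, inpaintRadius=3, flags=cv2.INPAINT_TELEA)
cv2.imwrite("camera_RC3_filled.png", filled)
```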
  • FIG. 14 shows a video of two athletes engaged in martial arts inside a polygonal wire-mesh enclosure and spectators watching the match from outside the enclosure.
  • This example illustrates a case in which the processing applied to a mask area differs depending on whether the mask area is in front of or behind the two target athletes.
  • The mask area MA7 is set on the front side of the two athletes, and the mask area MA8 is set on the rear side. In the area corresponding to the mask area MA7, there is no corresponding 3D model, and no texture is pasted there. In the mask area MA8, on the other hand, a 3D model exists, and a 3D model created from other images and the texture associated with that 3D model are pasted. This makes it possible to create a more accurate free-viewpoint video.
  • In step S101, the data acquisition unit 1 acquires image data for generating a 3D model of the target subject.
  • In step S102, the mask area setting unit 2A sets the mask area, and the 3D model generation unit 2 generates a model having the three-dimensional information of the subject based on the image data for generating the 3D model of the subject. That is, modeling using the mask area is performed.
  • In step S103, the formatting unit 3 encodes the shape and texture data of the 3D model generated by the 3D model generation unit 2, together with the mask area information, into a format suitable for transmission and storage.
  • In step S104, the transmission unit 4 transmits the encoded data.
  • In step S105, the reception unit 5 receives the transmitted data.
  • In step S106, a decoding unit (not shown) performs decoding processing and converts the data into the shape and texture data necessary for display, and the rendering unit 6 performs rendering using the shape and texture data and the mask area information.
  • In step S107, the display unit 7 displays the rendered result.
  • In step S201, camera calibration regarding the position and orientation of each actual camera is performed.
  • The camera calibration process is usually performed when creating a free-viewpoint video. For example, the positions and orientations of the cameras relative to one another are estimated using captured images of a calibration board. Then, the process proceeds to step S202.
  • In step S202, the target area is set.
  • The target area is, for example, the area for which modeling and rendering are desired.
  • The area may be set manually by the user or may be set automatically. When it is set automatically, for example, if a camera path of the free-viewpoint video exists in advance, the area near the focal point of the virtual camera is automatically set as the target area. Then, the process proceeds to step S203.
  • In step S203, the three-dimensional position of each pixel of each camera image is estimated using the position information of the plurality of cameras obtained in step S201. Then, the process proceeds to step S204.
  • In step S204, when the position information of a pixel obtained in step S203 lies between the position of the camera and the position of the target, the corresponding area is regarded as a shield and set as a mask area. After that, the normal free-viewpoint video modeling and rendering processing operates. As described above, the mask area is not used in modeling and rendering.
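  • A sketch tying steps S201 to S204 together; the calibration, target-area, and depth-estimation helpers are assumed stand-ins and not functions defined by this disclosure.

```python
import numpy as np

def set_mask_areas(images, calibrate, set_target_area, estimate_depth):
    """Illustrative pipeline for steps S201-S204. The three callables are assumed helpers:
    calibrate(images) -> list of camera poses (each with a .position) (S201),
    set_target_area(images, poses) -> target region (with a .center) (S202),
    estimate_depth(image, poses) -> per-pixel distance map for that camera (S203)."""
    poses = calibrate(images)                            # S201: camera calibration
    target = set_target_area(images, poses)              # S202: set the target area (manual or automatic)
    masks = []
    for img, pose in zip(images, poses):
        depth = estimate_depth(img, poses)               # S203: 3D position (distance) of each pixel
        dist_to_target = np.linalg.norm(target.center - pose.position)
        mask = depth < dist_to_target                    # S204: pixel lies between camera and target
        masks.append(mask.astype(np.uint8) * 255)        # such pixels form the mask area (shield)
    return masks
```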
  • FIG. 17 is a block diagram showing an example of hardware configuration of a computer that executes the above-mentioned series of processes programmatically.
  • In the computer, a CPU (Central Processing Unit) 11, a ROM (Read Only Memory), and a RAM (Random Access Memory) 13 are interconnected via a bus 14.
  • the input / output interface 15 is also connected to the bus 14.
  • An input unit 16, an output unit 17, a storage unit 18, a communication unit 19, and a drive 20 are connected to the input / output interface 15.
  • the input unit 16 includes, for example, a keyboard, a mouse, a microphone, a touch panel, an input terminal, and the like.
  • the output unit 17 includes, for example, a display, a speaker, an output terminal, and the like.
  • the storage unit 18 is composed of, for example, a hard disk, a RAM disk, a non-volatile memory, or the like.
  • the communication unit 19 is composed of, for example, a network interface.
  • the drive 20 drives a removable medium such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
  • In the computer configured as described above, the CPU 11 loads the program stored in the storage unit 18 into the RAM 13 via the input/output interface 15 and the bus 14 and executes it, whereby the above-described series of processes is performed.
  • the RAM 13 also appropriately stores data and the like necessary for the CPU 11 to execute various processes.
  • The program executed by the computer can be provided by, for example, being recorded on removable media such as package media.
  • the program can be installed in the storage unit 18 via the input / output interface 15 by mounting the removable media in the drive 20.
  • the program can also be provided via wired or wireless transmission media such as local area networks, the Internet, and digital satellite broadcasts. In that case, the program can be received by the communication unit 19 and installed in the storage unit 18.
  • a part of the configuration and functions of the information processing device according to the above-described embodiment may exist in a device different from the information processing device (for example, a server device on a network).
  • the program that realizes the above-mentioned function may be executed in any device.
  • the device may have the necessary functional blocks so that the necessary information can be obtained.
  • each step of one flowchart may be executed by one device, or may be shared and executed by a plurality of devices.
  • one device may execute the plurality of processes, or the plurality of devices may share and execute the plurality of processes.
  • a plurality of processes included in one step can be executed as processes of a plurality of steps.
  • the processes described as a plurality of steps can be collectively executed as one step.
  • The processing of the steps describing the program may be executed in chronological order in the order described in the present specification, may be executed in parallel, or may be executed individually at a required timing, such as when a call is made. That is, as long as no contradiction arises, the processes of the steps may be executed in an order different from the order described above. Further, the processing of the steps describing this program may be executed in parallel with the processing of another program, or may be executed in combination with the processing of another program. Further, for example, a plurality of techniques related to the present technology can each be implemented independently as long as no contradiction arises. Of course, any plurality of the present technologies can also be used in combination.
  • the present disclosure may also adopt the following configuration.
  • An information processing device including: a mask area setting unit that sets a mask area for an obstruction existing between an actual camera and a target; and a 3D model generation unit that generates a 3D model based on a plurality of pieces of image data including image data in which the mask area is set.
  • The information processing device, wherein the 3D model generation unit generates the 3D model based on a silhouette image in which the mask area is extracted as a foreground and other silhouette images in which the foreground and the background are separated.
  • (3) The information processing apparatus according to (2), wherein the silhouette of the mask area extracted as the foreground is removed based on the silhouettes of the other silhouette images.
  • The information processing apparatus according to any one of (1) to (3), wherein the mask area setting unit obtains three-dimensional position information of the image captured by the actual camera pixel by pixel, based on the position and orientation information of the actual camera estimated by camera calibration, and determines a pixel to be a shield when the three-dimensional position information of the pixel lies between the actual camera and the target.
  • The information processing apparatus according to any one of (1) to (5), which has a rendering unit that performs rendering excluding the mask area.
  • The information processing apparatus, wherein the rendering unit renders, for the mask area, a 3D model generated in advance and a texture associated with that 3D model.
  • A 3D model generation method in which a mask area setting unit sets a mask area for an obstruction existing between an actual camera and a target, and a 3D model generation unit generates a 3D model based on a plurality of pieces of image data including image data in which the mask area is set.
  • A program that causes a computer to execute a 3D model generation method in which a mask area setting unit sets a mask area for an obstruction existing between an actual camera and a target, and a 3D model generation unit generates a 3D model based on a plurality of pieces of image data including image data in which the mask area is set.
  • An information processing device having: an acquisition unit that acquires mask area information indicating a mask area set for a shield existing between an actual camera and a target; and a rendering unit that performs rendering excluding the mask area.
  • The information processing apparatus according to (11), wherein the rendering unit renders, for the mask region, a texture estimated based on pixels in other regions.
  • the information processing apparatus according to (12), wherein the other region is a region around the mask region.
  • The information processing apparatus according to (12), wherein the other region is a region corresponding to the mask region in an image obtained by a real camera close to a virtual camera.
  • the rendering unit renders a 3D model generated in advance in the mask area and a texture associated with the 3D model.
  • The information processing apparatus according to any one of (11) to (15), wherein a first mask area is set on the front side of the target as seen from a predetermined virtual viewpoint, a second mask area is set on the rear side of the target, and different rendering processes are performed on the first mask area and the second mask area.
  • An information processing method in which an acquisition unit acquires mask area information indicating a mask area set for an obstruction existing between an actual camera and a target, and a rendering unit performs rendering excluding the mask area.
  • A program that causes a computer to execute an information processing method in which an acquisition unit acquires mask area information indicating a mask area set for an obstruction existing between an actual camera and a target, and a rendering unit performs rendering excluding the mask area.
  • For example, new video content may be created by combining the 3D model of the subject generated in the above-described embodiment with 3D data managed by another server. Further, for example, when background data acquired by an image pickup device such as LiDAR exists, content in which the subject appears as if it were in the place indicated by the background data can be created by combining the 3D model of the subject generated in the above-described embodiment with the background data.
  • the video content may be a three-dimensional video content or a two-dimensional video content converted into two dimensions.
  • the 3D model of the subject generated in the above-described embodiment includes, for example, a 3D model generated by the 3D model generation unit 2 and a 3D model reconstructed by the rendering unit 6.
  • For example, a subject (for example, a performer) generated in the present embodiment can be placed in a virtual space in which the user acts as an avatar and communicates.
  • the user can act as an avatar and view the live-action subject in the virtual space.
  • A user in a remote location can view the 3D model of the subject through a playback device at that remote location.
  • the subject and a user in a remote place can communicate in real time.
  • For example, it can be assumed that the subject is a teacher and the user is a student, or that the subject is a doctor and the user is a patient.
  • A free-viewpoint video of sports or the like can be distributed based on the 3D models of a plurality of subjects generated in the above-described embodiment, and an individual can also distribute himself or herself, as a 3D model generated in the above-described embodiment, to a distribution platform.
  • the contents of the embodiments described in the present specification can be applied to various techniques and services.

Abstract

The present invention properly generates 3D models, for example. An information processing device according to the present invention has a mask region setting unit that sets a mask region with respect to an obstruction that is present between a real camera and a target, and a 3D model generation unit that generates a 3D model on the basis of multiple pieces of image data, including image data in which a mask region is set.

Description

Information processing device, 3D model generation method, information processing method, and program
 This disclosure relates to an information processing device, a 3D model generation method, an information processing method, and a program.
 Patent Document 1 discloses a technique for drawing an object as a three-dimensional model.
Japanese Patent No. 5483761
 When generating a 3D model, there is a risk that an unnatural 3D model will be generated if an appropriate mask area is not set. Conventionally, since it was necessary to manually set this mask area, a method capable of efficiently generating a three-dimensional model has been desired.
 One of the purposes of the present disclosure is to provide an information processing device that can automatically set a mask area, a 3D model generation method, an information processing method, and a program.
 The present disclosure is, for example, an information processing apparatus having a mask area setting unit that sets a mask area for an obstruction existing between an actual camera and a target, and a 3D model generation unit that generates a 3D model based on a plurality of pieces of image data including image data in which the mask area is set.
 The present disclosure is, for example, a 3D model generation method in which a mask area setting unit sets a mask area for an obstruction existing between an actual camera and a target, and a 3D model generation unit generates a 3D model based on a plurality of pieces of image data including image data in which the mask area is set.
 The present disclosure is, for example, a program that causes a computer to execute a 3D model generation method in which a mask area setting unit sets a mask area for an obstruction existing between an actual camera and a target, and a 3D model generation unit generates a 3D model based on a plurality of pieces of image data including image data in which the mask area is set.
 The present disclosure is, for example, an information processing method in which an acquisition unit acquires mask area information indicating a mask area set for an obstruction existing between an actual camera and a target, and a rendering unit performs rendering excluding the mask area.
 The present disclosure is, for example, a program that causes a computer to execute an information processing method in which an acquisition unit acquires mask area information indicating a mask area set for an obstruction existing between an actual camera and a target, and a rendering unit performs rendering excluding the mask area.
FIGS. 1A to 1C are views which are referred to when the outline of the present disclosure is explained. FIGS. 2A to 2C are diagrams referred to when the outline of the present disclosure is explained. FIGS. 3A to 3C are views which are referred to when the outline of the present disclosure is explained. FIG. 4 is a diagram referred to when the outline of the present disclosure is explained. FIGS. 5A to 5C are views which will be referred to when the outline of the present disclosure is explained. FIG. 6 is a diagram referred to when the outline of the present disclosure is explained. FIG. 7 is a block diagram showing a configuration example of the information processing system according to the embodiment. FIG. 8 is a diagram for explaining an example of one process performed in an information processing system. FIGS. 9A to 9C are diagrams referred to when the processing performed by the mask area setting unit according to the embodiment is described. FIGS. 10A to 10C are diagrams referred to when the processing performed by the mask area setting unit according to the embodiment is described. FIGS. 11A to 11C are diagrams that are referred to when the processing performed by the mask area setting unit according to the embodiment is described. FIG. 12 is a diagram referred to when an example of using the mask area at the time of rendering is explained. FIGS. 13A to 13C are diagrams referred to when an example of using the mask area at the time of rendering is described. FIG. 14 is a diagram referred to when an example of using the mask area at the time of rendering is described. FIG. 15 is a flowchart for explaining an operation example of the information processing apparatus according to the embodiment. FIG. 16 is a flowchart for explaining an operation example of the mask area setting unit according to the embodiment. FIG. 17 is a diagram showing a configuration example when the processing performed by the information processing system according to the embodiment is configured in terms of hardware.
Hereinafter, embodiments and the like of the present disclosure will be described with reference to the drawings. The explanation will be given in the following order.
<Summary of this disclosure>
<One Embodiment>
<Modification example>
<Application example>
The embodiments and the like described below are suitable specific examples of the present disclosure, and the contents of the present disclosure are not limited to these embodiments and the like.
<Summary of this disclosure>
 First, the outline of the present disclosure will be described while touching on the issues to be considered in the present disclosure. As one method of generating a three-dimensional model (hereinafter referred to as a 3D model, as appropriate), a method is known in which a mesh is created by modeling with Visual Hull and a 3D model is generated by performing texture mapping on the mesh. In generating a 3D model, the subject and the background are separated for each of a plurality of pieces of 2D image data; for example, a binary image called a silhouette image, in which the silhouette of the subject is represented in white and the other areas in black, is used.
 Here, if there is a shield between the camera and the subject, there is a risk that an inappropriate 3D model will be generated. This point will be specifically described with reference to FIGS. 1 to 3. FIG. 1A shows two-dimensional image data IM1A including a target (a person TA in this example) for which a 3D model is to be generated. FIGS. 2A and 3A show two-dimensional image data IM2A and IM3A captured by cameras different from the camera that captured the two-dimensional image data shown in FIG. 1A. As shown in FIG. 3A, a bar BA, which is an example of a shield, exists between the camera that captured the two-dimensional image data IM3A and the person TA.
 By performing processing that separates the background image data IM1B shown in FIG. 1B from the two-dimensional image data IM1A shown in FIG. 1A, a silhouette image SI1 (see FIG. 1C) in which the person TA and the background are separated is obtained. Similarly, by separating the background image data IM2B shown in FIG. 2B from the two-dimensional image data IM2A shown in FIG. 2A, a silhouette image SI2 (see FIG. 2C) in which the person TA and the background are separated is obtained.
 Likewise, by separating the background image data IM3B shown in FIG. 3B from the two-dimensional image data IM3A shown in FIG. 3A, a silhouette image SI3 (see FIG. 3C) in which the person TA and the background are separated is obtained. Here, when the background image data IM3B is subtracted from the two-dimensional image data IM3A, the bar BA disappears, so the area of the bar BA is regarded as background. That is, in the silhouette image SI3 shown in FIG. 3C, the bar BA is represented as background, that is, in black.
 FIG. 4 shows an example of a 3D model generated using the silhouette images SI1 to SI3. Although more silhouette images are actually used to generate a 3D model, the 3D model here is generated using only the silhouette images SI1 to SI3 in order to simplify the explanation. If a shield such as the bar BA exists between the camera and the person TA, the silhouette cannot be acquired correctly, as in the silhouette image SI3, so the resulting 3D model becomes unnatural. For example, as shown in FIG. 4, the body becomes a 3D model divided into upper and lower parts.
 Therefore, in the present embodiment, the shield portion is set as a mask area. Specifically, as shown in FIG. 5B, the portion of the bar BA is set as the mask area MA. Since the portion set as the mask area MA is excluded from the background subtraction processing, the silhouette there can be extracted as foreground. FIG. 5C shows the silhouette image SI3' obtained when the mask area MA is set.
 A 3D model is generated using the silhouette images SI1, SI2, and SI3'. FIG. 6 shows an example of a 3D model generated using the silhouette images SI1, SI2, and SI3'. In the silhouette image SI3', the silhouettes (white portions) of the person TA and the bar BA overlap. However, in the Visual Hull carving process that uses the silhouette images SI1 and SI2, which correspond to two-dimensional image data from other cameras, that is, other viewpoints in which the shield does not appear, the portion of the bar BA is carved away. As a result, an appropriate 3D model corresponding to the person TA is obtained.
 It is preferable that the above-mentioned mask area can be set automatically, because automatically setting the mask area allows the 3D model to be generated efficiently.
 When a mask area is set, it also needs to be taken into account in the rendering processing performed in the course of generating the 3D model. That is, since no texture exists for the mask area, it is desirable that the texture for the mask area can be rendered appropriately. The details of the rendering processing when a mask area exists will be described later. Based on the above, an embodiment of the present disclosure will be described in detail.
<One Embodiment>
[Overview of information processing system]
 FIG. 7 shows an outline of an information processing system to which the present technology is applied. The data acquisition unit 1 acquires image data for generating a 3D model of the subject. For example, as shown in FIG. 8, a plurality of viewpoint images captured by a plurality of image pickup devices 8B arranged so as to surround the subject 8A are acquired as image data. In this case, the plurality of viewpoint images are preferably images captured by a plurality of cameras in synchronization. Further, the data acquisition unit 1 may acquire, for example, a plurality of viewpoint images obtained by capturing the subject 8A from a plurality of viewpoints with one camera as image data.
 The data acquisition unit 1 may perform calibration based on the image data and acquire the internal parameters and the external parameters of each image pickup apparatus 8B. Further, the data acquisition unit 1 may acquire, for example, a plurality of pieces of depth information indicating the distances from a plurality of viewpoints to the subject 8A.
 The 3D model generation unit 2 generates a model having the three-dimensional information of the subject 8A based on the image data for generating the 3D model of the subject 8A. The 3D model generation unit 2 generates the 3D model of the subject 8A by, for example, carving the three-dimensional shape of the subject 8A with so-called Visual Hull, using images from a plurality of viewpoints (for example, silhouette images from a plurality of viewpoints). In this case, the 3D model generation unit 2 can further deform the 3D model generated with Visual Hull with high accuracy by using a plurality of pieces of depth information indicating the distances from a plurality of viewpoints to the subject 8A. Since the 3D model generated by the 3D model generation unit 2 is generated in units of time-series frames, it can also be regarded as a moving image of the 3D model. Further, since the 3D model is generated using images captured by the image pickup apparatuses 8B, it can also be called a live-action 3D model. The 3D model can express shape information representing the surface shape of the subject 8A in the form of mesh data called a polygon mesh, which is expressed by vertices (Vertex) and the connections between them. The method of expressing the 3D model is not limited to these, and the 3D model may be described by a so-called point cloud representation expressed by the position information of points.
 Color information data is also generated as a texture in a form linked to these 3D shape data. For example, there are View Independent textures, whose color is constant when viewed from any direction, and View Dependent textures, whose color changes depending on the viewing direction.
 The 3D model generation unit 2 has a mask area setting unit 2A as a functional block. The mask area setting unit 2A sets the mask area for the shield existing between the actual camera and the target. The 3D model generation unit 2 generates a 3D model based on a plurality of image data including image data in which a mask area is set.
 The formatting unit 3 (encoding unit) converts the 3D model data generated by the 3D model generation unit 2 into a format suitable for transmission and storage. For example, the 3D model generated by the 3D model generation unit 2 may be converted into a plurality of two-dimensional images by perspective projection from a plurality of directions. In this case, the 3D model may be used to generate depth information, which is a two-dimensional depth image from a plurality of viewpoints. The depth information and the color information in this two-dimensional image form are compressed and output to the transmission unit 4. The depth information and the color information may be arranged side by side and transmitted as one image, or may be transmitted as two separate images. In this case, since the data takes the form of two-dimensional image data, it can also be compressed using a two-dimensional compression technique such as AVC (Advanced Video Coding). Further, in the present embodiment, the formatting unit 3 converts the mask area information indicating the mask area set by the mask area setting unit 2A into a predetermined format. The mask area information is, for example, information indicating the mask area in the background two-dimensional image data, but is not limited to this.
 Alternatively, for example, the 3D data may be converted into a point cloud format and output to the transmission unit 4 as three-dimensional data. In this case, for example, the Geometry-based Approach 3D compression technique discussed in MPEG can be used.
 送信部4は、フォーマット化部3で形成された伝送データ(マスク領域情報を含む)を受信部5に送信する。送信部4は、データ取得部1、3Dモデル生成部2とフォーマット化部3の一連の処理をオフラインで行った後に、伝送データを受信部5に伝送する。また、送信部4は、上述した一連の処理から生成された伝送データをリアルタイムに受信部5に伝送してもよい。 The transmission unit 4 transmits the transmission data (including the mask area information) formed by the formatting unit 3 to the reception unit 5. The transmission unit 4 transmits the transmission data to the reception unit 5 after performing a series of processes of the data acquisition unit 1, the 3D model generation unit 2 and the formatting unit 3 offline. Further, the transmission unit 4 may transmit the transmission data generated from the series of processes described above to the reception unit 5 in real time.
 受信部5は、送信部4から伝送された伝送データを受信する。上述したように、伝送データにはマスク設定情報が含まれる。このように、受信部5は、送信部4から送信されるマスク設定情報を取得する取得部として機能する。 The receiving unit 5 receives the transmission data transmitted from the transmitting unit 4. As described above, the transmission data includes mask setting information. In this way, the receiving unit 5 functions as an acquisition unit for acquiring the mask setting information transmitted from the transmitting unit 4.
 The rendering unit 6 performs rendering using the transmission data received by the reception unit 5. For example, the mesh of the 3D model is projected from the viewpoint of the camera that draws it, and texture mapping is performed to paste a texture representing color and pattern onto the mesh. The drawing viewpoint at this time can be set arbitrarily, regardless of the camera positions at the time of shooting, so the scene can be viewed from a free viewpoint. As will be described in detail later, the rendering unit 6 performs rendering excluding the mask area.
 The rendering unit 6 performs, for example, texture mapping in which a texture representing the color, pattern, or material appearance of the mesh is pasted according to the position of the mesh of the 3D model. Texture mapping methods include the so-called View Dependent method, which takes the user's viewing viewpoint into account, and the View Independent method, which does not. The View Dependent method changes the texture pasted on the 3D model according to the position of the viewing viewpoint, and therefore has the advantage of achieving higher-quality rendering than the View Independent method. The View Independent method, on the other hand, does not consider the position of the viewing viewpoint and therefore has the advantage of a smaller processing load than the View Dependent method. The viewing-viewpoint data is input to the rendering unit 6 from the display device, which detects the user's viewing location (region of interest). The rendering unit 6 may also adopt, for example, billboard rendering, in which an object is rendered so that it keeps an upright posture with respect to the viewing viewpoint. For example, when rendering a plurality of objects, objects of low interest to the viewer may be rendered as billboards while the other objects are rendered with other rendering methods.
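 The difference between the two texture-mapping methods can be made concrete with a small weighting function of the kind often used for View Dependent texturing: a View Independent renderer would use fixed weights, while a View Dependent one recomputes them for each viewing viewpoint. The sketch below is illustrative only; the cosine weighting and the sharpness exponent are assumptions, not a method prescribed by this disclosure.

```python
import numpy as np

def view_dependent_weights(view_dir: np.ndarray,
                           cam_dirs: np.ndarray,
                           sharpness: float = 8.0) -> np.ndarray:
    """Blending weights for View Dependent texturing: cameras whose viewing
    direction is close to the current virtual viewing direction contribute
    more to the texture of a surface point."""
    v = view_dir / np.linalg.norm(view_dir)
    c = cam_dirs / np.linalg.norm(cam_dirs, axis=1, keepdims=True)
    cos_sim = np.clip(c @ v, 0.0, None)          # cameras facing away get zero
    w = cos_sim ** sharpness
    total = w.sum()
    return w / total if total > 0 else np.full(len(w), 1.0 / len(w))
```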
 The display unit 7 displays the result rendered by the rendering unit 6 on the display device. The display device may be a 2D monitor or a 3D monitor, for example a head-mounted display, a spatial display, a mobile phone, a television, or a PC.
 The information processing system of FIG. 7 shows the series of steps from the data acquisition unit 1, which acquires the captured images that are the material for generating content, to the display control unit, which controls the display device viewed by the user. However, this does not mean that every functional block is required to implement the present technology; the present technology can be implemented with individual functional blocks or combinations of functional blocks. For example, FIG. 7 includes the transmission unit 4 and the reception unit 5 in order to show the flow from the side that creates the content to the side that views it through distribution of the content data, but when content production and viewing are carried out on the same information processing device (for example, a personal computer), the encoding unit, the transmission unit 4, the decoding unit, and the reception unit 5 need not be provided.
 When this information processing system is implemented, all of it may be carried out by the same implementer, or each functional block may be carried out by a different implementer. As one example, a business operator A generates 3D content through the data acquisition unit 1, the 3D model generation unit 2, and the formatting unit 3; the 3D content is then distributed through the transmission unit 4 (platform) of a business operator B; and the display device of a business operator C performs reception, rendering, and display control of the 3D content.
 Each functional block can also be implemented on the cloud. For example, the rendering unit 6 may be implemented in the display device or on a server. In that case, information is exchanged between the display device and the server.
 In FIG. 7, the data acquisition unit 1, the 3D model generation unit 2, the formatting unit 3, the transmission unit 4, the reception unit 5, the rendering unit 6, and the display unit 7 have been described collectively as an information processing system. In this specification, however, any arrangement involving two or more functional blocks is referred to as an information processing system; for example, the data acquisition unit 1, the 3D model generation unit 2, the encoding unit, the transmission unit 4, the reception unit 5, the decoding unit, and the rendering unit 6, without the display unit 7, can also collectively be referred to as an information processing system. The present disclosure can also be configured as an information processing device including any subset of the configuration of the information processing system shown in FIG. 7. For example, the present disclosure can be configured as an information processing device having the entire configuration shown in FIG. 7, an information processing device having only the 3D model generation unit 2, or an information processing device having the reception unit 5 and the rendering unit 6.
[Operation example of the mask area setting unit]
 Next, an operation example of the mask area setting unit 2A will be described. As shown in FIG. 9A, assume, for example, that the actual camera RC1 captures a person TA as the target. An obstruction 41 exists between the actual camera RC1 and the person TA. As shown in FIG. 9B, the image captured by the actual camera RC1 therefore contains the obstruction. In this example, as shown in FIG. 9C, an image viewed from a virtual camera VC1 (virtual viewpoint) that is free of the obstruction is created.
 A more specific description follows. As shown in FIG. 10A, there are actual cameras RC1 and RC2. An obstruction 41 exists between the actual cameras RC1 and RC2 and the person TA. FIG. 10B shows the image captured by the actual camera RC1, and FIG. 10C shows the image captured by the actual camera RC2.
 The position and range of the person TA to be modeled and rendered are set manually by the user, for example. The person TA may also be set automatically.
 The position and orientation of each actual camera (hereinafter collectively referred to as position and orientation) can be determined by camera calibration performed in advance. As a camera calibration technique, Zhang's method using a chessboard is well known. Of course, techniques other than Zhang's method are also applicable, for example: obtaining the parameters by imaging a three-dimensional object; obtaining the parameters by imaging two light rays directed straight at the camera; projecting feature points with a projector and obtaining the parameters from the projected image; or waving an LED (Light Emitting Diode) light and imaging the point light source to obtain the parameters.
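 For reference, a common way to carry out the Zhang-style chessboard calibration mentioned here is sketched below using OpenCV. This is a generic sketch, not the calibration procedure of the disclosed system; the board size, the square size, and the choice of OpenCV are assumptions introduced for illustration.

```python
import cv2
import numpy as np

def calibrate_with_chessboard(images, pattern_size=(9, 6), square_size=0.025):
    """Zhang-style calibration: detect chessboard corners in several views and
    estimate the intrinsics, distortion, and per-view pose of the camera."""
    objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
    objp *= square_size                                   # board geometry in metres
    obj_points, img_points = [], []
    image_size = None
    for img in images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        image_size = gray.shape[::-1]
        found, corners = cv2.findChessboardCorners(gray, pattern_size)
        if found:
            obj_points.append(objp)
            img_points.append(corners)
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, image_size, None, None)
    return K, dist, rvecs, tvecs
```

 Running this per camera yields the intrinsics and, together with a shared calibration target, the relative positions and orientations needed later for the mask area determination.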
 Using the position information of each camera and the input image from each camera, the three-dimensional position within the input image is estimated. The three-dimensional position is estimated, for example, on a per-pixel basis. When the three-dimensional position lies between the actual camera and the person TA, the pixel is regarded as an obstruction and is set as part of the mask area. The above processing is performed pixel by pixel.
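 The per-pixel decision described here can be reduced to a simple depth comparison, sketched below under the simplifying assumptions that the target is approximated by a single reference point and that a per-pixel depth estimate along each camera ray is already available; the function name and the margin parameter are illustrative.

```python
import numpy as np

def obstruction_mask(pixel_depth: np.ndarray,
                     cam_pos: np.ndarray,
                     target_pos: np.ndarray,
                     margin: float = 0.1) -> np.ndarray:
    """Flag pixels whose estimated 3D point lies between the camera and the
    target: these are treated as an obstruction and become the mask area."""
    target_dist = np.linalg.norm(np.asarray(target_pos, float) -
                                 np.asarray(cam_pos, float))
    return pixel_depth < (target_dist - margin)   # boolean mask image
```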
 FIG. 11A is the same drawing as FIG. 10A. FIG. 11B schematically shows the mask area MA4 set in the image obtained by the actual camera RC1, and FIG. 11C schematically shows the mask area MA5 set in the image obtained by the actual camera RC2. In this way, in the present embodiment, the mask areas can be set automatically.
(Example of using the mask area during modeling)
 An example of using the mask area set as described above will now be given. The area set as the mask area is excluded from the modeling process. Specifically, as described above, the mask area is extracted as foreground. The obstruction is then carved away by silhouette images based on image data captured from viewpoints that do not include the obstruction, which makes it possible to generate a 3D model that is not occluded by the obstruction.
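 A minimal voxel-based visual hull with this behaviour might look like the sketch below, assuming each camera is described by a 3x4 projection matrix and each silhouette is a binary image in which the mask area has already been filled in as foreground; the function name and the voxel-grid representation are assumptions for illustration.

```python
import numpy as np

def carve_visual_hull(voxels_xyz: np.ndarray, proj_mats, silhouettes) -> np.ndarray:
    """Keep only voxels whose projection lands on foreground in every silhouette.
    A camera whose view is partly occluded has its mask area filled in as
    foreground, so it cannot wrongly carve away the target behind the obstruction."""
    homog = np.hstack([voxels_xyz, np.ones((len(voxels_xyz), 1))])    # N x 4
    keep = np.ones(len(voxels_xyz), dtype=bool)
    for P, sil in zip(proj_mats, silhouettes):                        # P: 3x4, sil: HxW (0/1)
        uvw = homog @ P.T
        with np.errstate(divide="ignore", invalid="ignore"):
            u = uvw[:, 0] / uvw[:, 2]
            v = uvw[:, 1] / uvw[:, 2]
        h, w = sil.shape
        inside = (uvw[:, 2] > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        fg = np.zeros(len(voxels_xyz), dtype=bool)
        fg[inside] = sil[v[inside].astype(int), u[inside].astype(int)] > 0
        keep &= fg | ~inside          # voxels projecting outside an image are not carved
    return voxels_xyz[keep]
```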
(Example of using the mask area during rendering)
 An example of using the mask area at rendering time will be described with reference to FIGS. 12 to 14. The rendering unit 6 excludes the set mask area and renders the areas other than the mask area. Here, rendering with the mask area excluded means that the texture obtained from the image captured by that actual camera is not used for the mask area; in other words, the mask area may instead be rendered with a texture obtained from something other than that captured image.
 As shown in FIG. 12, the person TA is photographed by the actual cameras RC1 to RC3, and a virtual camera VC1 is placed in the virtual space at a position corresponding to the virtual viewpoint. An obstruction 41 exists between the actual camera RC3 and the person TA.
 FIG. 13A shows the captured image IM4A obtained by the actual camera RC1, FIG. 13B shows the captured image IM4B obtained by the actual camera RC2, and FIG. 13C shows the captured image IM4C obtained by the actual camera RC3. Through the processing by the mask area setting unit 2A described above, the mask area MA6 is set at the location of the obstruction 41 in the captured image IM4C. No usable texture exists in this mask area. The rendering unit 6 therefore renders, for the mask area, a texture estimated from pixels of other areas. For example, the texture is estimated from the way the mask area appears in the images of cameras positioned close to the virtual viewpoint. Specifically, the texture can be estimated from the pixels of the areas corresponding to the mask area captured by the actual cameras RC1 and RC2, which are positionally close to the virtual camera VC1, or from the pixels in the area surrounding the mask area MA6 in the image of the actual camera RC3. The rendering unit 6 performs rendering using the estimated texture.
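 One way to realize the texture estimation described here is sketched below: masked pixels are first taken from neighbouring camera images that have been warped into the occluded camera's image plane, and any remaining hole is filled by inpainting from the surrounding pixels. The warping step is assumed to have been done elsewhere, and the function name, the use of OpenCV inpainting, and the fallback order are illustrative assumptions rather than the disclosed method.

```python
import cv2
import numpy as np

def fill_mask_texture(image_bgr: np.ndarray,
                      mask_u8: np.ndarray,
                      warped_neighbors=()) -> np.ndarray:
    """Estimate texture for the mask area: prefer pixels re-projected from
    neighbouring cameras (already warped into this camera's image plane) and
    fall back to inpainting from the pixels surrounding the mask."""
    filled = image_bgr.copy()
    hole = mask_u8 > 0
    for warped in warped_neighbors:
        usable = hole & np.any(warped > 0, axis=2)   # a warped pixel is available here
        filled[usable] = warped[usable]
        hole &= ~usable
    if not hole.any():
        return filled
    return cv2.inpaint(filled, hole.astype(np.uint8) * 255, 3, cv2.INPAINT_TELEA)
```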
 Another example will be described. FIG. 14 is a video showing two fighters engaged in martial arts inside a polygonal cage of wire mesh, together with spectators watching from outside the cage. This example shows how the processing applied to a mask area differs depending on whether the mask area lies in front of or behind the two target fighters.
 As shown in FIG. 14, a mask area MA7 is set on the front side of the two fighters, and a mask area MA8 is set on the rear side. There is no corresponding 3D model for the area corresponding to the mask area MA7, and no texture is pasted onto that area. For the mask area MA8, although a 3D model does exist there, a 3D model created from other footage and the texture associated with that 3D model are deliberately pasted instead. This makes it possible to create a more accurate free-viewpoint video.
[Processing flow]
(Overall processing flow)
 Next, the flow of the processing performed by the information processing system according to the present embodiment will be described. First, the overall processing flow is described with reference to the flowchart of FIG. 15. When the processing starts, in step S101 the data acquisition unit 1 acquires image data for generating a 3D model of the target subject. In step S102, the mask area setting unit 2A sets the mask areas, and the 3D model generation unit 2 generates a model having three-dimensional information of the subject based on the image data for generating the 3D model of the subject; that is, modeling using the mask areas is performed. In step S103, the formatting unit 3 encodes the shape and texture data of the 3D model generated by the 3D model generation unit 2, together with the mask area information, into a format suitable for transmission and storage. In step S104, the transmission unit 4 transmits the encoded data, and in step S105 the reception unit 5 receives the transmitted data. In step S106, a decoding unit (not shown) performs decoding and converts the data into the shape and texture data necessary for display, and the rendering unit 6 performs rendering using the shape data, the texture data, and the mask area information. In step S107, the display unit 7 displays the rendered result. When the processing of step S107 ends, the processing of the information processing system ends.
(Flow of the automatic mask area setting process)
 Next, the flow of the automatic mask area setting process performed by the mask area setting unit 2A will be described with reference to the flowchart of FIG. 16. The process described below is performed by the mask area setting unit 2A as part of the 3D model generation process in step S102 of the flowchart of FIG. 15.
 In step S201, camera calibration for the position and orientation of each actual camera is performed. Camera calibration is a process that is normally carried out when creating free-viewpoint video. For example, the relative positions and orientations of the cameras are estimated from images captured of a calibration board. The processing then proceeds to step S202.
 In step S202, the target area is set. The target area is, for example, the area to be modeled and rendered. This area may be set manually by the user or set automatically. In the automatic case, for example, if a camera path for the free-viewpoint video is available in advance, the area near the focal point of the virtual camera is automatically set as the target area. The processing then proceeds to step S203.
 In step S203, the three-dimensional position of each pixel of each camera image is estimated using the camera position information for the plurality of cameras obtained in step S201. The processing then proceeds to step S204.
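 A per-pixel 3D estimate of this kind is often obtained by matching the pixel across cameras and triangulating; a two-view version using OpenCV is sketched below. The stereo matching itself is assumed to have been done elsewhere, and the two-view simplification and the function name are assumptions for illustration.

```python
import cv2
import numpy as np

def triangulate_pixel(P1: np.ndarray, P2: np.ndarray, uv1, uv2) -> np.ndarray:
    """Estimate the 3D position of a pixel matched between two calibrated
    cameras (3x4 projection matrices P1 and P2) by linear triangulation."""
    pts1 = np.asarray(uv1, dtype=float).reshape(2, 1)
    pts2 = np.asarray(uv2, dtype=float).reshape(2, 1)
    X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)   # 4x1 homogeneous point
    return (X_h[:3] / X_h[3]).ravel()
```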
 In step S204, when the position of a pixel obtained in step S203 lies between the camera position and the target position, that area is regarded as an obstruction and set as a mask area. After that, the normal modeling and rendering processes for free-viewpoint video operate. As described above, the mask areas are not used in modeling or rendering.
[Hardware configuration example]
 FIG. 17 is a block diagram showing a hardware configuration example of a computer that executes the above-described series of processes by means of a program. In the computer shown in FIG. 17, a CPU (Central Processing Unit) 11, a ROM (Read Only Memory) 12, and a RAM (Random Access Memory) 13 are connected to one another via a bus 14. An input/output interface 15 is also connected to the bus 14. An input unit 16, an output unit 17, a storage unit 18, a communication unit 19, and a drive 20 are connected to the input/output interface 15. The input unit 16 includes, for example, a keyboard, a mouse, a microphone, a touch panel, and an input terminal. The output unit 17 includes, for example, a display, a speaker, and an output terminal. The storage unit 18 includes, for example, a hard disk, a RAM disk, and a non-volatile memory. The communication unit 19 includes, for example, a network interface. The drive 20 drives a removable medium such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory.
 In the computer configured as described above, the CPU 11 performs the above-described series of processes by, for example, loading a program stored in the storage unit 18 into the RAM 13 via the input/output interface 15 and the bus 14 and executing it. The RAM 13 also stores, as appropriate, data and the like necessary for the CPU 11 to execute the various processes.
 The program executed by the computer can be provided, for example, by being recorded on a removable medium such as packaged media. In that case, the program can be installed in the storage unit 18 via the input/output interface 15 by mounting the removable medium in the drive 20. The program can also be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting; in that case, the program can be received by the communication unit 19 and installed in the storage unit 18.
<Modification examples>
 Although one embodiment of the present disclosure has been specifically described above, the content of the present disclosure is not limited to the above-described embodiment, and various modifications based on the technical idea of the present disclosure are possible.
 Part of the configuration and functions of the information processing device according to the above-described embodiment may reside in a device different from the information processing device (for example, a server device on a network).
 Further, for example, the program that realizes the above-described functions may be executed in any device, as long as that device has the necessary functional blocks and can obtain the necessary information. Each step of a single flowchart may be executed by one device or shared and executed by a plurality of devices. Furthermore, when one step includes a plurality of processes, the plurality of processes may be executed by one device or shared and executed by a plurality of devices. In other words, a plurality of processes included in one step can also be executed as processes of a plurality of steps, and conversely, processes described as a plurality of steps can be executed collectively as one step.
 Further, for example, in the program executed by the computer, the processes of the steps describing the program may be executed chronologically in the order described in this specification, or in parallel, or individually at the necessary timing, such as when a call is made. That is, as long as no contradiction arises, the processes of the steps may be executed in an order different from the order described above. Furthermore, the processes of the steps describing this program may be executed in parallel with the processes of another program, or in combination with the processes of another program. In addition, for example, the plurality of techniques relating to the present technology can each be implemented independently as a single technique, as long as no contradiction arises. Of course, any plurality of the present techniques can also be implemented in combination. For example, part or all of the present technology described in any of the embodiments can be implemented in combination with part or all of the present technology described in other embodiments, and part or all of any of the techniques described above can be implemented in combination with other techniques not described above.
 Note that the content of the present disclosure is not to be construed as being limited by the effects exemplified in this specification.
 The present disclosure can also adopt the following configurations.
(1) An information processing apparatus including: a mask area setting unit that sets a mask area for an obstruction existing between an actual camera and a target; and a 3D model generation unit that generates a 3D model based on a plurality of pieces of image data including image data in which the mask area is set.
(2) The information processing apparatus according to (1), wherein the 3D model generation unit generates the 3D model based on a silhouette image in which the mask area is extracted as a foreground and other silhouette images in which the foreground and the background are separated.
(3) The information processing apparatus according to (2), wherein the silhouette of the mask area extracted as the foreground is carved away based on the silhouettes of the other silhouette images.
(4) The information processing apparatus according to any one of (1) to (3), wherein the mask area setting unit obtains, pixel by pixel, three-dimensional position information of a captured image taken by the actual camera based on position and orientation information of the actual camera estimated by camera calibration, and determines a pixel to be an obstruction when the three-dimensional position information of that pixel is between the actual camera and the target.
(5) The information processing apparatus according to any one of (1) to (4), wherein the target is set automatically or manually.
(6) The information processing apparatus according to any one of (1) to (5), including a rendering unit that performs rendering excluding the mask area.
(7) The information processing apparatus according to (6), wherein the rendering unit renders, for the mask area, a texture estimated based on pixels of other areas.
(8) The information processing apparatus according to (6), wherein the rendering unit renders, for the mask area, a 3D model generated in advance and a texture associated with that 3D model.
(9) A 3D model generation method in which a mask area setting unit sets a mask area for an obstruction existing between an actual camera and a target, and a 3D model generation unit generates a 3D model based on a plurality of pieces of image data including image data in which the mask area is set.
(10) A program that causes a computer to execute a 3D model generation method in which a mask area setting unit sets a mask area for an obstruction existing between an actual camera and a target, and a 3D model generation unit generates a 3D model based on a plurality of pieces of image data including image data in which the mask area is set.
(11) An information processing apparatus including: an acquisition unit that acquires mask area information indicating a mask area set for an obstruction existing between an actual camera and a target; and a rendering unit that performs rendering excluding the mask area.
(12) The information processing apparatus according to (11), wherein the rendering unit renders, for the mask area, a texture estimated based on pixels of another area.
(13) The information processing apparatus according to (12), wherein the other area is an area around the mask area.
(14) The information processing apparatus according to (12), wherein the other area is an area, corresponding to the mask area, obtained by an actual camera close to a virtual camera.
(15) The information processing apparatus according to (11), wherein the rendering unit renders, in the mask area, a 3D model generated in advance and a texture associated with that 3D model.
(16) The information processing apparatus according to any one of (11) to (15), wherein a first mask area is set on a front side of the target as seen from a predetermined virtual viewpoint, a second mask area is set on a rear side of the target, and different rendering processes are performed on the first mask area and the second mask area.
(17) An information processing method in which an acquisition unit acquires mask area information indicating a mask area set for an obstruction existing between an actual camera and a target, and a rendering unit performs rendering excluding the mask area.
(18) A program that causes a computer to execute an information processing method in which an acquisition unit acquires mask area information indicating a mask area set for an obstruction existing between an actual camera and a target, and a rendering unit performs rendering excluding the mask area.
<Application examples>
 The technology according to the present disclosure can be applied to a variety of products and services.
(Content production)
 For example, new video content may be produced by combining the 3D model of the subject generated in the above-described embodiment with 3D data managed by another server. Further, for example, when background data acquired with an image pickup device such as Lidar exists, content can be created that makes the subject appear as if it were at the place indicated by the background data, by combining that background data with the 3D model of the subject generated in the above-described embodiment. The video content may be three-dimensional video content or two-dimensional video content converted into two dimensions. The 3D model of the subject generated in the above-described embodiment includes, for example, the 3D model generated by the 3D model generation unit 2 and the 3D model reconstructed by the rendering unit 6.
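 As a toy illustration of combining the subject's 3D model with background data, the sketch below simply merges two point clouds after translating the subject. Real content production would also handle scale, orientation, lighting, and format conversion; the point-cloud representation and the function name are assumptions made only for this example.

```python
import numpy as np

def place_subject_in_background(subject_points: np.ndarray,
                                background_points: np.ndarray,
                                location) -> np.ndarray:
    """Combine a reconstructed subject with separately captured background
    data by translating the subject's point cloud to the desired location."""
    moved = subject_points + np.asarray(location, dtype=float)
    return np.vstack([background_points, moved])
```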
(Experience in a virtual space)
 For example, a subject (for example, a performer) generated in the present embodiment can be placed in a virtual space in which users communicate as avatars. In this case, a user can, as an avatar, view the live-action subject in the virtual space.
(Application to communication with remote locations)
 For example, by transmitting the 3D model of the subject generated by the 3D model generation unit 2 from the transmission unit 4 to a remote location, a user at the remote location can view the 3D model of the subject through a playback device at that location. Further, by transmitting the 3D model of the subject in real time, the subject and the user at the remote location can communicate in real time. For example, the subject may be a teacher and the user a student, or the subject may be a doctor and the user a patient.
(Others)
 For example, free-viewpoint video of sports or the like can be generated based on the 3D models of a plurality of subjects generated in the above-described embodiment, and an individual can distribute himself or herself, as a 3D model generated in the above-described embodiment, to a distribution platform. In this way, the contents of the embodiments described in this specification can be applied to a variety of technologies and services.
2 ... 3D model generation unit
2A ... Mask area setting unit
4 ... Transmission unit
5 ... Reception unit
6 ... Rendering unit

Claims (18)

1. An information processing apparatus comprising: a mask area setting unit that sets a mask area for an obstruction existing between an actual camera and a target; and a 3D model generation unit that generates a 3D model based on a plurality of pieces of image data including image data in which the mask area is set.
2. The information processing apparatus according to claim 1, wherein the 3D model generation unit generates the 3D model based on a silhouette image in which the mask area is extracted as a foreground and other silhouette images in which the foreground and the background are separated.
3. The information processing apparatus according to claim 2, wherein the silhouette of the mask area extracted as the foreground is carved away based on the silhouettes of the other silhouette images.
4. The information processing apparatus according to claim 1, wherein the mask area setting unit obtains, pixel by pixel, three-dimensional position information of a captured image taken by the actual camera based on position and orientation information of the actual camera estimated by camera calibration, and determines a pixel to be an obstruction when the three-dimensional position information of that pixel is between the actual camera and the target.
5. The information processing apparatus according to claim 1, wherein the target is set automatically or manually.
6. The information processing apparatus according to claim 1, further comprising a rendering unit that performs rendering excluding the mask area.
7. The information processing apparatus according to claim 6, wherein the rendering unit renders, for the mask area, a texture estimated based on pixels of other areas.
8. The information processing apparatus according to claim 6, wherein the rendering unit renders, for the mask area, a 3D model generated in advance and a texture associated with that 3D model.
9. A 3D model generation method in which a mask area setting unit sets a mask area for an obstruction existing between an actual camera and a target, and a 3D model generation unit generates a 3D model based on a plurality of pieces of image data including image data in which the mask area is set.
10. A program that causes a computer to execute a 3D model generation method in which a mask area setting unit sets a mask area for an obstruction existing between an actual camera and a target, and a 3D model generation unit generates a 3D model based on a plurality of pieces of image data including image data in which the mask area is set.
11. An information processing apparatus comprising: an acquisition unit that acquires mask area information indicating a mask area set for an obstruction existing between an actual camera and a target; and a rendering unit that performs rendering excluding the mask area.
12. The information processing apparatus according to claim 11, wherein the rendering unit renders, for the mask area, a texture estimated based on pixels of another area.
13. The information processing apparatus according to claim 12, wherein the other area is an area around the mask area.
14. The information processing apparatus according to claim 12, wherein the other area is an area, corresponding to the mask area, obtained by an actual camera close to a virtual camera.
15. The information processing apparatus according to claim 11, wherein the rendering unit renders, in the mask area, a 3D model generated in advance and a texture associated with that 3D model.
16. The information processing apparatus according to claim 11, wherein a first mask area is set on a front side of the target as seen from a predetermined virtual viewpoint, a second mask area is set on a rear side of the target, and different rendering processes are performed on the first mask area and the second mask area.
17. An information processing method in which an acquisition unit acquires mask area information indicating a mask area set for an obstruction existing between an actual camera and a target, and a rendering unit performs rendering excluding the mask area.
18. A program that causes a computer to execute an information processing method in which an acquisition unit acquires mask area information indicating a mask area set for an obstruction existing between an actual camera and a target, and a rendering unit performs rendering excluding the mask area.
PCT/JP2021/025929 2020-07-21 2021-07-09 Information processing device, 3d model generation method, information processing method, and program WO2022019149A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2020-124656 2020-07-21
JP2020124656 2020-07-21

Publications (1)

Publication Number Publication Date
WO2022019149A1 true WO2022019149A1 (en) 2022-01-27

Family

ID=79728721

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/025929 WO2022019149A1 (en) 2020-07-21 2021-07-09 Information processing device, 3d model generation method, information processing method, and program

Country Status (1)

Country Link
WO (1) WO2022019149A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016009864A1 (en) * 2014-07-18 2016-01-21 ソニー株式会社 Information processing device, display device, information processing method, program, and information processing system
JP2019083402A (en) * 2017-10-30 2019-05-30 キヤノン株式会社 Image processing apparatus, image processing system, image processing method, and program
WO2019116942A1 (en) * 2017-12-14 2019-06-20 キヤノン株式会社 Generation device, generation method and program for three-dimensional model
JP2019197523A (en) * 2018-05-07 2019-11-14 キヤノン株式会社 Information processing apparatus, control method of the same and program
JP2020013216A (en) * 2018-07-13 2020-01-23 キヤノン株式会社 Device, control method, and program
JP2020135525A (en) * 2019-02-21 2020-08-31 Kddi株式会社 Image processing device and program


Similar Documents

Publication Publication Date Title
JP7277372B2 (en) 3D model encoding device, 3D model decoding device, 3D model encoding method, and 3D model decoding method
WO2018123801A1 (en) Three-dimensional model distribution method, three-dimensional model receiving method, three-dimensional model distribution device, and three-dimensional model receiving device
JP7003994B2 (en) Image processing equipment and methods
US10650590B1 (en) Method and system for fully immersive virtual reality
JP5654138B2 (en) Hybrid reality for 3D human machine interface
JP2022002418A (en) Reception method and terminal
US11652970B2 (en) Apparatus and method for representing a spatial image of an object in a virtual environment
US9380263B2 (en) Systems and methods for real-time view-synthesis in a multi-camera setup
US11967014B2 (en) 3D conversations in an artificial reality environment
CN113382275B (en) Live broadcast data generation method and device, storage medium and electronic equipment
US20210166485A1 (en) Method and apparatus for generating augmented reality images
EP3631767A1 (en) Methods and systems for generating a virtualized projection of a customized view of a real-world scene for inclusion within virtual reality media content
US11557087B2 (en) Image processing apparatus and image processing method for generating a strobe image using a three-dimensional model of an object
US20220114784A1 (en) Device and method for generating a model of an object with superposition image data in a virtual environment
WO2022019149A1 (en) Information processing device, 3d model generation method, information processing method, and program
WO2022004234A1 (en) Information processing device, information processing method, and program
EP2525573A1 (en) Method and system for conducting a video conference
JP6091850B2 (en) Telecommunications apparatus and telecommunications method
WO2022024780A1 (en) Information processing device, information processing method, video distribution method, and information processing system
Price et al. Real-time production and delivery of 3D media
WO2023218979A1 (en) Image processing device, image processing method, and program
WO2022004233A1 (en) Information processing device, information processing method, and program
Scheer et al. A client-server architecture for real-time view-dependent streaming of free-viewpoint video
WO2024014197A1 (en) Image processing device, image processing method, and program
US20230288622A1 (en) Imaging processing system and 3d model generation method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21845346; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21845346; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)