WO2020158392A1 - Image generation device, display processing device, image generation method, control program, and recording medium - Google Patents

Image generation device, display processing device, image generation method, control program, and recording medium Download PDF

Info

Publication number
WO2020158392A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
depth information
reproduction
depth
camera
Prior art date
Application number
PCT/JP2020/001072
Other languages
French (fr)
Japanese (ja)
Inventor
Kyohei Ikeda (恭平 池田)
Tomoyuki Yamamoto (山本 智幸)
Original Assignee
Sharp Corporation (シャープ株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sharp Corporation (シャープ株式会社)
Publication of WO2020158392A1 publication Critical patent/WO2020158392A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/10Constructive solid geometry [CSG] using solid primitives, e.g. cylinders, cubes

Definitions

  • One embodiment of the present invention relates to an image generation device, a display processing device, an image generation method, a control program, and a recording medium.
  • the present application claims priority based on Japanese Patent Application No. 2019-14233 filed in Japan on January 30, 2019, the content of which is incorporated herein by reference.
  • AR (Augmented Reality)
  • VR (Virtual Reality)
  • As a conventional technique, a technique called Holoportation is known.
  • With Holoportation, it is possible to reproduce a 3D model in a remote AR space by delivering a colorless model and a plurality of texture images and integrating them on the receiving side.
  • In addition, a technology based on KinectFusion, which builds a 3D model by integrating depth maps, is being considered.
  • With KinectFusion, it becomes possible to construct a precise 3D model from low-resolution depth maps in real time.
  • However, the above-described conventional technology has a problem that the traffic volume in data transmission tends to increase, because the 3D model to be displayed is transmitted directly.
  • When a 3D model is reproduced by delivering depth maps, there is a problem that defect areas called holes are likely to occur in the 3D model due to the influence of occlusion and depth integration.
  • With technologies that fill holes based on the 3D model itself, real-time performance is lost because the processing takes time.
  • One aspect of the present invention has been made in view of the above problems. Its purpose is to generate transmission data from which a 3D model with suppressed hole generation can be constructed in real time, while suppressing an increase in traffic volume during transmission.
  • a depth information generation device and a 3D model reproduction device include the following means.
  • (First means) Depth information generating means comprising: depth information generation means for generating depth information from a reference model based on camera information; 3D model reproduction means for reproducing a reproduction model by integrating the depth information; hole detection means for estimating, with reference to the reference model, the hole areas existing in the reproduction model and extracting them as a hole extraction model; and camera information setting means for setting the camera information based on the hole extraction model.
  • The depth information generating means according to the first means, further comprising auxiliary model generation means for extracting fine areas and hole areas in the reference model as an auxiliary model, the auxiliary model being added to the reproduction model when the reproduction model is reproduced.
  • The depth information generating means according to the first means, further comprising auxiliary TSDF generating means for generating an auxiliary TSDF, the auxiliary TSDF being added after the depth information is integrated at the time of reproducing the reproduction model.
  • The depth information generating means according to the first, second, or third means, wherein, when the depth information is integrated in the 3D model reproducing means, depth values near the contour of an object appearing in the depth information are not integrated.
  • The depth information generating means according to the first, second, or third means, wherein a filter for interpolating TSDF values is applied after the depth information is integrated in the 3D model reproducing means.
  • The depth information generating means according to the first, second, or third means, wherein the hole detecting means uses the number of meshes adjacent to the nearest vertex of the reference model to detect the hole area of the reproduction model.
  • Depth information generating means wherein depth images are added to the depth information one by one based on the camera information generated by the camera information setting means, the accuracy of the reproduction model is evaluated each time an image is added, and the added depth image is removed from the depth information when the required accuracy level is not satisfied.
  • The depth information generating means according to the first, second, or third means, wherein the depth information generating means generates the depth information based on a sub-model instead of the reference model.
  • The depth information generating means according to the first, second, or third means, further comprising subdivision means 15 for converting the reference model into a 3D model having a uniform vertex distribution, wherein the accuracy of the reproduction model is evaluated for each grid cell by the hole detecting means, and camera parameters corresponding to grid cells with poor evaluation are preferentially added to the camera information.
  • 3D model reproduction means for integrating the depth information generated by the depth information generation means and reproducing the reproduction model.
  • The 3D model reproducing means according to the tenth means, further comprising auxiliary model integrating means for adding an auxiliary model to the reproduction model to form a new reproduction model, thereby restoring fine areas of the reproduction model.
  • The 3D model reproducing means according to the tenth means, further comprising 3D model generating means for adding the auxiliary TSDF after integrating the depth information, thereby restoring fine areas of the reproduction model.
  • According to one aspect of the present invention, it is possible to provide a depth information generation device that generates depth information from which a 3D model with suppressed hole generation can be constructed while suppressing the traffic volume.
  • FIG. 3 is a functional block diagram of the depth information generation device according to the first embodiment.
  • FIG. 3 is a functional block diagram of the 3D model reproduction device according to the first embodiment.
  • FIG. 6 is a flowchart showing a flow of processing according to the first embodiment.
  • FIG. 7 is a functional block diagram of a depth information generation device according to a second modification of the first embodiment.
  • FIG. 9 is a functional block diagram of a depth information generation device according to a modified example 3 of the first embodiment.
  • FIG. 7 is a functional block diagram of a depth information generation device according to the second embodiment.
  • FIG. 7 is a functional block diagram of a 3D model reproduction device according to a second embodiment.
  • FIG. 9 is a flowchart showing a flow of processing according to the second embodiment.
  • FIG. 9 is a functional block diagram of a depth information generation device according to a third embodiment.
  • FIG. 9 is a functional block diagram of a 3D model reproduction device according to a third embodiment.
  • FIG. 9 is a flowchart showing a flow of processing according to the third embodiment.
  • FIG. 13 is a functional block diagram of a depth information generation device according to a modified example of the third embodiment.
  • The depth information generation device 1 is a device that generates depth information based on an input reference model.
  • The reference model is a 3D model that is the source for generating the depth information in the depth information generation device 1 and the target to be reproduced based on that depth information in the 3D model reproduction device 2 described later.
  • the reference model is, for example, a 3D model including vertices or meshes.
  • The reference model may be a 3D model that has been subjected to subdivision processing in advance and has a uniform vertex distribution. This makes it possible to evaluate the reference model uniformly in the accuracy evaluation of the model described later.
  • the depth information described above is a set of depth images when the reference model is viewed from multiple or single cameras placed at specific positions.
  • the depth image is information in which the depth information when the subject (reference model) is viewed from the camera is recorded in an image format, and is, for example, a depth map in which the information is represented by the brightness of a monochrome image.
  • the camera in the present invention is assumed to be a virtual camera that is virtually reproduced in software, but it may be an existing camera. Further, the virtual camera may be a virtual camera of a format other than the above.
  • the depth information does not necessarily have to include a plurality of depth images, and may be composed of a single depth image.
  • the depth information may be, for example, an image format in which depth images are arranged so as not to overlap each other.
  • the depth information may be an image generated by tiling the depth image.
  • the depth information may be information obtained by grouping depth images.
  • Camera information is attached to the depth information.
  • the camera information is, for example, a set of camera parameters of the camera that records the depth image included in the depth information. The camera parameters and camera information will be described later.
  • the above-mentioned camera parameter is information indicating the parameter of the camera that records the depth image.
  • the camera parameters include at least information on the position of the camera when the corresponding depth image was captured, the direction in which the camera is facing, the focal length, and the resolution of the depth image to be recorded.
  • the above-mentioned camera information is information attached to the depth information, and is a set of camera parameters corresponding to the depth image included in the depth information.
  • the camera information includes information indicating which depth image in the depth information is the depth image corresponding to the camera parameter included in the camera information.
  • When the depth information is an image format in which a plurality of depth images are arranged, the information indicating the corresponding depth image indicates which area in the depth information is the depth image corresponding to the camera parameter: for example, information indicating the coordinates of the origin of the corresponding depth image and information indicating the number of pixels of the corresponding depth image.
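As a rough illustration of the structure described above, the following sketch shows camera parameters carrying a tile origin alongside the required fields, and a lookup that crops the corresponding depth image out of a tiled depth-information image. The field names and the nested-list image representation are hypothetical; the patent itself prescribes no particular data layout.

```python
from dataclasses import dataclass

@dataclass
class CameraParam:
    # Hypothetical field names; the text only requires that position,
    # orientation, focal length, and resolution be recorded.
    position: tuple      # camera position when the depth image was captured
    direction: tuple     # direction the camera is facing
    focal_length: float
    resolution: tuple    # (width, height) of the recorded depth image
    tile_origin: tuple   # (u, v) origin of this image inside the tiled depth information

def extract_depth_image(depth_info, cam):
    """Crop one camera's depth image out of a tiled depth-information image
    (stored here as a nested list of pixel rows)."""
    u0, v0 = cam.tile_origin
    w, h = cam.resolution
    return [row[u0:u0 + w] for row in depth_info[v0:v0 + h]]
```

With this layout, the origin coordinates and pixel counts mentioned above are exactly what is needed to recover each depth image from the combined image.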
  • FIG. 1 is a functional block diagram of the depth information generation device 1 according to this embodiment.
  • the depth information generation device 1 includes a depth information generation unit 11, a 3D model reproduction unit 12, a hole detection unit 13, and a camera information setting unit 14.
  • the depth information generation unit 11 generates depth information based on the input reference model and camera information.
  • the 3D model reproduction unit 12 generates a reproduction model based on the input depth information.
  • the reproduction model described above is a model generated based on the depth information, and is, for example, a 3D model including vertices and triangular meshes.
  • the hole detection unit 13 detects a hole area based on the input reference model and reproduction model, and outputs a vertex or a mesh corresponding to the hole area of the reference model as a hole extraction model.
  • the above-mentioned hole area is a specific area in the 3D model.
  • Specifically, the hole area is an area where no hole exists in the reference model but a hole exists in the corresponding area of the reproduction model.
  • In other words, a hole that should not exist according to the reference model has been generated in the corresponding part of the reproduction model.
  • An object of the present invention is to suppress the hole area in the reproduction model described above. In the following description, the process of suppressing or removing the hole region will be referred to as filling the hole or compensating the hole.
  • the above-mentioned hole extraction model is, for example, a 3D model in which the vertices or meshes in the reference model corresponding to the hole area are extracted.
  • the camera information setting unit 14 clusters the input hole extraction models, generates camera parameters for each cluster, and collects the camera parameters to generate camera information.
  • the generated camera information is input to the depth information generation unit 11.
  • A voxel is a unit obtained by dividing 3D space into a grid, and each voxel holds a TSDF value and a weight value.
  • a set of voxels existing in the 3D space is called a voxel space.
  • Initially, the TSDF (Truncated Signed Distance Function) value and the weight value of each voxel are both 0.
  • The TSDF value represents the distance from the voxel to the surface of the 3D model; it is a signed value whose magnitude is smaller the closer the voxel is to the surface.
  • For example, a positive TSDF value means the voxel is closer to the camera than the surface, and a negative TSDF value means the voxel is located deeper than the surface.
  • the weight value is a numerical value indicating the reliability of the corresponding TSDF value, and the minimum value is 0.
  • The TSDF value and weight value of a voxel are calculated for the voxels on the rays that originate at the camera placed at the position and orientation included in the camera parameters and pass through each pixel of the corresponding depth image.
  • The TSDF value of a voxel is the distance along the ray from the voxel position to the surface of the 3D model (given by the depth value of the corresponding pixel).
  • The weight value is, for example, the inner product of the normal at the corresponding pixel of the depth image and the ray. Here, only weight values of 0 and positive values are considered.
  • When a voxel already holds a value, the weighted average of the existing TSDF value and the new TSDF value is calculated using the corresponding weights, and the voxel's TSDF value is overwritten with this average as the new TSDF value.
  • The voxel's weight value is overwritten with the sum of the existing weight value and the new weight value.
  • the voxel space in which the TSDF value is recorded is converted into a 3D model with a mesh structure by the Marching Cubes method.
  • the calculation time may be shortened by skipping the calculation of the voxel whose recorded weight is 0.
  • a 3D model is generated from the depth image.
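The weighted-average update described above can be sketched as follows. This is a minimal scalar version for a single voxel; the truncation, the per-ray voxel traversal, and the Marching Cubes extraction are omitted.

```python
def update_voxel(tsdf_old, w_old, tsdf_new, w_new):
    """Overwrite a voxel's TSDF with the weighted average of the existing and
    new TSDF values, and overwrite its weight with the sum of the two weights."""
    w_sum = w_old + w_new
    if w_sum == 0:
        return tsdf_old, 0.0  # nothing observed yet; voxel stays empty
    tsdf = (tsdf_old * w_old + tsdf_new * w_new) / w_sum
    return tsdf, w_sum
```

Because the weight accumulates across observations, repeated updates converge toward the mean of all observed TSDF values, which is what makes the integration robust to noise in individual depth images.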
  • The configuration of the 3D model reproduction device 2 according to the present embodiment will be described based on FIG.
  • the 3D model reproduction device 2 is a device that generates a reproduction model based on the input depth information.
  • the above-mentioned reproduction model is a model generated based on the depth information in the 3D model reproduction device 2.
  • the reproduction model is, for example, a 3D model including vertices or meshes.
  • FIG. 2 is a functional block diagram of the 3D model reproduction device 2 according to the present embodiment. As shown in FIG. 2, the 3D model reproduction device 2 includes a 3D model reproduction unit 12.
  • the function of the 3D model reproduction unit 12 is the same as that of the 3D model reproduction unit 12 included in the depth information generation device 1.
  • FIG. 3 is a flowchart showing the flow of processing according to this embodiment. Note that steps S101 to S104 are processes in the depth information generation device 1, and steps S105 to S106 are processes in the 3D model reproduction device 2.
  • In step S101, the depth information generation unit 11 generates depth information based on the input reference model. Specifically, depth images of the reference model as viewed from the cameras arranged based on the camera information are acquired, and the depth information is generated from those depth images. At this time, the depth information generation unit 11 retains the camera information used to generate the depth information.
  • the process of the depth information generation unit 11 differs depending on whether or not camera information is input to the depth information generation unit 11 from the camera information setting unit 14.
  • When camera information is not input, the camera parameters included in the camera information are set arbitrarily.
  • For example, the camera parameters have positions and orientations that place the cameras so as to surround the center of gravity of the reference model, and positions and focal lengths that allow the entire reference model to be seen.
  • the structure may be such that an important area in the reference model is shot by a large number of cameras or high resolution cameras.
  • The above-mentioned important area is, for example, the head or face of a person when the reference model includes a person, or, when the reference model includes numerical values, the area in which the numerical values are drawn. The method of detecting the important area does not matter, and the important area may also be set arbitrarily. With this configuration, the effect of accurately reproducing the important area in the 3D model reproduction device 2 can be obtained.
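One arbitrary initial layout consistent with the description above, cameras surrounding the center of gravity of the reference model and facing it, could be sketched like this. The ring placement, the radius parameter, and the dictionary fields are illustrative assumptions, not the patent's prescribed method.

```python
import math

def initial_camera_ring(centroid, radius, n_cameras):
    """Place n_cameras on a horizontal circle around the reference model's
    center of gravity, each oriented toward the centroid."""
    cx, cy, cz = centroid
    cams = []
    for i in range(n_cameras):
        a = 2.0 * math.pi * i / n_cameras
        pos = (cx + radius * math.cos(a), cy, cz + radius * math.sin(a))
        # unit vector pointing from the camera toward the centroid
        direction = tuple((c - p) / radius for p, c in zip(pos, centroid))
        cams.append({"position": pos, "direction": direction})
    return cams
```

A real implementation would also choose focal lengths so that the whole model fits in view, and could add extra or higher-resolution cameras aimed at important areas.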
  • the depth image is divided into a foreground part that is a pixel in which the reference model is recorded and a background part that is a pixel in which the reference model is not imaged.
  • the foreground part described above contributes to the reproduction of the reproduction model.
  • In the second and subsequent executions of step S101, the resolution of the depth images added to the depth information may be lower than the resolution of the depth images added in the first execution.
  • The depth images added in the first execution need a resolution sufficient to maintain the detail of the reproduction model, but the depth images added in the second and subsequent executions serve to compensate for hole areas; they do not need to maintain that level of detail, so there is no problem even if their resolution is lower.
  • the depth information may be generated based on the priority. Specifically, the depth image is generated based on the camera parameter in order from the camera parameter with the highest priority of the camera, and the depth image is added to the depth information.
  • When the depth information is an image format in which a plurality of depth images are arranged, the image resolution of the depth information may be limited. That is, it is not always possible to add depth images for all camera parameters. Therefore, by attaching priority information to each camera parameter and adding depth images to the depth information in order of camera priority, the depth images that should take precedence are added first.
  • When a depth image does not fit in the depth information, camera parameters having a priority equal to or lower than that of the camera parameter recording that depth image may be ignored. In other words, when depth images are added in order of priority and one no longer fits, camera parameters with lower priority need not be added.
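The priority-based addition can be sketched as follows. A simple scalar pixel budget stands in for the real tiled-image packing, and the dictionary field names are hypothetical.

```python
def pack_by_priority(images, capacity):
    """Add depth images in descending camera priority; once an image no
    longer fits within the pixel budget, stop and ignore all remaining
    lower-priority images (as the text above allows)."""
    packed, used = [], 0
    for img in sorted(images, key=lambda d: -d["priority"]):
        if used + img["area"] > capacity:
            break  # everything at or below this priority is skipped
        packed.append(img["id"])
        used += img["area"]
    return packed
```

Note that the early `break` implements the simplification described above: once one image fails to fit, no lower-priority image is attempted, even if it would individually fit.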
  • the initial camera information in S101 may be another example.
  • The camera information included in the depth information output by the depth information generation device 1 for the previous frame may be set as the camera information when the processing of the depth information generation unit 11 is performed for the first time after the processing is started.
  • the camera information calculated in the previous frame may be set as the initial camera information in the current frame, and the process of step S101 may be performed.
  • the camera information may be initialized. That is, the camera information of the previous frame may be discarded and the process may be restarted from step S101.
  • In step S102, the 3D model reproduction unit 12 generates a reproduction model based on the input depth information.
  • the depth information input first is divided into depth images based on the camera information.
  • Next, a filter may be applied to each depth image.
  • The filter is, for example, a smoothing filter such as a bilateral filter.
  • This filtering step is not essential, and the filter need not always be applied to the depth image.
  • the depth image is integrated based on the camera parameters corresponding to the depth image to generate a 3D model, and the 3D model is input to the hole detection unit as a reproduction model.
  • the procedure for generating a 3D model by integrating the depth images may be in accordance with the above-described 3D model generation method.
  • In the 3D model generation method described above, the TSDF value and the weight value are calculated for the voxels on the rays of the camera; in step S102, however, both values may be calculated differently.
  • For example, the TSDF value and the weight value may be calculated for voxels located on the normal of a pixel, at an arbitrary distance from the pixel.
  • In the 3D model generation method described above, the TSDF value and the weight value are calculated for all the pixels included in the depth image; in step S102, however, pixels corresponding to the background portion of the depth image may be excluded from the calculation. With this configuration, calculations that do not contribute to the generation of the reproduction model are skipped, so the calculation speed can be improved.
  • pixels near the contour of the object shown in the depth image may be excluded from the calculation.
  • The contour of the object refers to, for example, a pixel at the boundary between the foreground part and the background part of the depth image, or a pixel whose difference from the depth value of an adjacent pixel is larger than an arbitrary value.
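A minimal sketch of the second definition, flagging pixels whose depth jumps relative to a 4-neighbour, might look like this. The threshold and the nested-list image representation are illustrative assumptions.

```python
def contour_pixels(depth, threshold):
    """Return (y, x) coordinates of pixels whose depth differs from any
    4-neighbour by more than `threshold` (one simple contour definition)."""
    h, w = len(depth), len(depth[0])
    contour = set()
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and \
                        abs(depth[y][x] - depth[ny][nx]) > threshold:
                    contour.add((y, x))
                    break
    return contour
```

Pixels flagged this way would then be excluded from the TSDF integration, since depth values near contours are unreliable and tend to smear the surface.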
  • A process of filtering the voxel space may be added before the process of generating the 3D model by the Marching Cubes method.
  • The filter described above is, for example, a filter that interpolates TSDF values. Specifically, it gives a non-zero negative TSDF value and a weight value to a voxel that holds a TSDF value and weight value of zero and is adjacent to a voxel holding a negative TSDF value and a non-zero weight value.
  • the TSDF value given to the voxel may be, for example, an average value of the TSDF values of adjacent voxels holding a negative TSDF value and a non-zero weight.
  • The weight value is set to the lowest value other than 0, that is, the lowest non-zero weight value that can be given such that the calculation is not skipped in the Marching Cubes method.
  • the hole area generated in the reproduction model can be filled, and the effect of improving the accuracy of the reproduction model can be obtained.
  • Further, a filter applied after the above-described filter gives a TSDF value and a weight to a voxel that is adjacent to a voxel to which a TSDF value and weight were given by the above-described filter and that is also adjacent to a voxel holding a positive TSDF value and a non-zero weight.
  • the TSDF value given to the voxel may be, for example, an average value of the TSDF values of adjacent voxels holding a positive TSDF value and a non-zero weight.
  • The TSDF value given to the voxel may also be the TSDF value of the voxel assigned by the above-mentioned filter with its sign flipped.
  • the weight value is set to the lowest value other than 0.
  • the voxel space calculated by integrating the depth information can be interpolated in the 3D model reproduction unit 12.
  • a negative TSDF value can be given to a voxel corresponding to the hole area, which is adjacent to a voxel having a positive TSDF and has a weight of 0. That is, the hole area generated in the reproduction model can be further filled, and the effect of improving the accuracy of the reproduction model can be obtained.
  • the above two types of filters may be filters in which the signs of TSDF values are exchanged.
  • A filter that resets the TSDF value and the weight value to 0 may also be used.
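The first interpolation filter described above might be sketched as follows, reduced to a 1-D row of voxels for brevity. The neighbourhood (two neighbours per voxel), the epsilon weight, and the single-pass structure are illustrative assumptions.

```python
def interpolate_tsdf(tsdf, weight, eps_weight=1e-6):
    """One pass of the hole-interpolation filter: an empty voxel (weight 0)
    adjacent to a voxel holding a negative TSDF and a non-zero weight receives
    the mean of those neighbours' TSDF values and the smallest non-zero
    weight, so Marching Cubes no longer skips it."""
    new_t, new_w = list(tsdf), list(weight)
    for i in range(len(tsdf)):
        if weight[i] != 0:
            continue  # only empty voxels are interpolated
        neigh = [tsdf[j] for j in (i - 1, i + 1)
                 if 0 <= j < len(tsdf) and weight[j] != 0 and tsdf[j] < 0]
        if neigh:
            new_t[i] = sum(neigh) / len(neigh)
            new_w[i] = eps_weight
    return new_t, new_w
```

Giving the interpolated voxel only a tiny weight means a later genuine depth observation will dominate it in the weighted-average update, so the filter fills holes without corrupting observed geometry.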
  • The depth integration calculation may be performed using a different maximum weight value for each depth image included in the depth information.
  • Specifically, the depth images added to the depth information in the first process of step S101 are distinguished from those added in the second and subsequent processes, and the latter are integrated using a lower weight value than the former.
  • For example, the former depth images are given a weight of 1 and the latter a weight of 1/10 in the depth integration calculation.
  • Alternatively, the lowest weight other than 0 is used for the latter.
  • Since the depth images added in the second and subsequent processes have a weakened effect on the depth integration, their influence on areas other than the hole area is suppressed. As a result, it is possible to prevent the accuracy of the reproduction model from decreasing.
  • the weight value of the low resolution depth image may be smaller than the weight value of the high resolution depth image.
  • For example, a depth image with a resolution of 1280×960 is given a weight of 1, and a depth image with a resolution of 640×480 is given a weight of 1/4 in the depth integration calculation.
  • the depth integrated calculation may be performed for voxels in different ranges for each depth image included in the depth information. Specifically, when the depth image added to the depth information in the first process in step S101 and the depth image added to the depth information in the second and subsequent processes are identified and the former depth image is integrated, The TSDF value and the weight value are calculated for a wide range of voxels, and when the latter depth images are integrated, the TSDF value and the weight value are calculated for a narrow range of voxels.
  • For example, for the former depth images, the TSDF value and weight value are calculated for a range of 3 voxels from the corresponding surface; for the latter depth images, they are calculated for a range of 1 voxel.
  • In step S103, the hole detection unit 13 generates a hole extraction model based on the input reference model and reproduction model. Specifically, first, the hole detection unit 13 estimates the hole area by comparing the input reference model and the reproduction model. Next, based on the estimated hole area, a hole extraction model is extracted from the reference model and input to the camera information setting unit 14.
  • Any method of estimating the hole area may be used. For example, the distance between a vertex in the reference model and the nearest vertex in the reproduction model is calculated; if the distance is a certain value or more, it can be judged that the vertex of the reference model corresponds to the hole area. In this case, not only areas where holes actually exist but also areas where the difference in shape between the reference model and the reproduction model is large are determined to be hole areas. With this configuration, such areas with large shape differences can be corrected by adding the camera parameters described later.
  • Alternatively, a method may be used that determines whether a vertex of the reference model corresponds to the hole area based on the number of meshes to which the nearest vertex in the reproduction model belongs.
  • In KinectFusion, the reproduction model is created by the Marching Cubes method from the voxel space in which the TSDF values and weight values are recorded. In this case, when the reproduction model is composed of triangular meshes, the number of meshes to which a vertex not adjacent to a hole area belongs is usually within the range of 4 to 8.
  • Therefore, when the number of meshes to which a vertex in the reproduction model belongs is 3 or less, it can be determined that the vertex is adjacent to a hole area, and the corresponding vertex in the reference model may be determined to correspond to the hole area.
  • the hole extraction model is generated by extracting the vertices or meshes of the reference model determined to be the hole area from the reference model.
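The mesh-adjacency heuristic can be sketched as follows, assuming a triangular mesh given as vertex-index triples; the threshold of 3 follows the text above.

```python
def hole_vertices(vertex_count, faces, max_adjacent=3):
    """Flag vertices belonging to `max_adjacent` or fewer triangles; per the
    heuristic above, such vertices are likely adjacent to a hole."""
    counts = [0] * vertex_count
    for tri in faces:
        for v in tri:
            counts[v] += 1
    return {v for v, c in enumerate(counts) if c <= max_adjacent}
```

For example, in a closed triangle fan the central vertex belongs to many triangles and is not flagged, while the rim vertices each belong to only two triangles and are flagged as bordering the open edge.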
  • In step S104, the camera information setting unit 14 generates camera information based on the input hole extraction model. Specifically, first, the input hole extraction model is clustered and decomposed into a plurality of clusters. Hereinafter, the hole extraction model decomposed into clusters is called a sub-model. Next, for each sub-model, the optimum camera parameters for recording the sub-model are estimated, and these camera parameters are put together as camera information and input to the depth information generation unit 11.
  • the method of clustering the hole extraction model does not matter.
  • For example, taking an appropriate vertex as a reference, a method may be used in which vertices that are close to that vertex and whose normals are close to its normal are assigned to the same cluster.
  • In this way, vertices that have similar normal directions and are gathered at close positions can be grouped as a sub-model.
  • Optimal camera parameters for recording a sub-model are, for example, camera parameters having a position, orientation, and focal length such that the inner product between the normals of the vertices included in the sub-model and the rays of the camera is large.
•   Further, for example, they are camera parameters having a position and orientation that capture the center of gravity of the sub-model on the optical axis of the camera.
•   Further, for example, they are camera parameters having a position, orientation, and focal length such that the entire sub-model fits within the screen while filling as much of the screen as possible.
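One simple way to satisfy the two conditions above (center of gravity on the optical axis, viewing along the mean normal) can be sketched as follows; the standoff distance is an illustrative assumption, and focal-length fitting is omitted:

```python
# Hedged sketch of camera-parameter estimation: the camera is placed along
# the average vertex normal of the sub-model and oriented so that the
# sub-model's center of gravity lies on the optical axis.
import numpy as np

def estimate_camera(vertices, normals, standoff=2.0):
    centroid = vertices.mean(axis=0)
    mean_n = normals.mean(axis=0)
    mean_n /= np.linalg.norm(mean_n)
    position = centroid + standoff * mean_n   # back off along the mean normal
    direction = centroid - position           # optical axis points at centroid
    direction /= np.linalg.norm(direction)
    return position, direction

# A flat triangle with +Z normals: the camera sits above it and looks
# straight down (-Z).
verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
norms = np.array([[0.0, 0.0, 1.0]] * 3)
pos, d = estimate_camera(verts, norms)
```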
  • the depth images included in the depth information are added by looping the processing of steps S101 to S104.
•   The camera information generated by the processing of steps S103 and S104 consists of camera parameters for viewing hole areas in the reproduction model generated by integrating the depth information, and is therefore used to add depth images from the second loop iteration onward.
•   Each such depth image fills the hole area described above. Therefore, the depth images added from the second loop iteration onward fill the hole areas existing in the reproduction model generated by integrating the depth images added in the first loop. By repeating the above loop, the 3D model reproduction unit 12 can generate a reproduction model in which the hole areas are filled; in other words, a reproduction model closer to the reference model can be generated.
•   In step S104, it is not always necessary to generate camera parameters for all sub-models; camera parameters need not be generated for sub-models satisfying specific conditions.
•   The specific condition is, for example, that the total area of the meshes included in the sub-model falls below a given value.
•   In other words, camera parameters need not be generated for a sub-model whose corresponding hole region has an area below a given value.
  • the camera parameters generated in step S104 may be accompanied by camera priority information.
•   The camera priority described above is information set for each sub-model (cluster) and is used by the depth information generation unit 11 to determine the order in which depth images are added to the depth information.
•   The camera priority may be set, for example, by calculating the total mesh area of each sub-model and assigning priorities in descending order of that total.
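The area-based priority rule above can be sketched as follows; the data layout (a list of vertex/face pairs per sub-model) is an assumption for illustration:

```python
# Sketch of camera-priority assignment: total the triangle areas of each
# sub-model and order the sub-models (and hence their cameras) by
# descending total area.
import numpy as np

def triangle_area(a, b, c):
    return 0.5 * np.linalg.norm(np.cross(np.subtract(b, a), np.subtract(c, a)))

def prioritize(submodels):
    """submodels: list of (vertices, faces); returns indices, largest first."""
    totals = [sum(triangle_area(v[i], v[j], v[k]) for i, j, k in f)
              for v, f in submodels]
    return sorted(range(len(submodels)), key=lambda i: -totals[i])

small = ([(0, 0, 0), (1, 0, 0), (0, 1, 0)], [(0, 1, 2)])   # area 0.5
big = ([(0, 0, 0), (2, 0, 0), (0, 2, 0)], [(0, 1, 2)])     # area 2.0
order = prioritize([small, big])
```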
•   After the processing of step S104 is completed, the next processing branches based on the loop end condition.
•   If the loop end condition is satisfied, in step S105 the depth information is output from the depth information generation device 1 and the process ends. If the loop end condition is not satisfied, the camera information generated by the camera information setting unit 14 is input to the depth information generation unit 11 and the process returns to step S101.
•   The loop end condition described above is a condition evaluated during the processing of steps S101 to S104. For example, it may be determined that the loop end condition is satisfied when the processing of steps S101 to S104 has been repeated a given number of times; in other words, when the loop has run an arbitrary number of iterations.
•   Further, in step S101, when no depth image can be added to the depth information, in other words, when the depth information is full, it may be determined that the loop end condition is satisfied.
•   Further, in step S102, when the accuracy of the generated reproduction model exceeds a given value, it may be determined that the loop end condition is satisfied.
  • the accuracy of the model will be described later.
•   Further, in step S103, if the area of the detected hole region is less than a given value, it may be determined that the loop end condition is satisfied.
•   Further, in step S102, if the accuracy of the generated reproduction model is lower than the accuracy of the reproduction model generated in the previous loop, it may be determined that the loop end condition is satisfied.
  • the depth information output from the depth information generation device 1 may be the depth information of the previous loop.
•   When the loop end condition is satisfied during steps S101 to S104, the remaining processing up to the end of S104 may be skipped. For example, if the loop end condition is satisfied in step S102, the processes of steps S103 and S104 may be skipped and the depth information generation device 1 may output the depth information.
  • the accuracy of the reproduction model described above is an index indicating how close the reproduction model generated in the 3D model reproduction unit 12 is to the reference model.
•   The accuracy of the reproduction model is calculated, for example, by averaging, over the vertices of the reference model, the distance from each reference vertex to its nearest reproduction-model vertex. Further, for example, it is the RMSE value of the reproduction model viewed from the reference model. Also, for example, an image accuracy measure such as PSNR may be calculated between the depth image obtained by viewing the reference model with a camera having arbitrary camera parameters and the depth image obtained by viewing the reproduction model with the same camera, and this may be treated as the accuracy of the reproduction model.
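The first two accuracy measures above (mean nearest-vertex distance and its RMSE variant) can be sketched as follows; the brute-force nearest-neighbor search is an illustrative simplification:

```python
# Sketch of reproduction-model accuracy: for each reference vertex, find
# the nearest reproduction vertex, then take the mean distance and the
# RMSE over all reference vertices.
import numpy as np

def model_accuracy(ref_vertices, rep_vertices):
    # pairwise |R| x |P| distance matrix between reference and reproduction
    diff = ref_vertices[:, None, :] - rep_vertices[None, :, :]
    nearest = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
    return nearest.mean(), np.sqrt((nearest ** 2).mean())  # (mean, RMSE)

ref = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
rep = np.array([[0.0, 0.0, 0.1], [1.0, 0.0, 0.3]])
mean_d, rmse = model_accuracy(ref, rep)
```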
•   In step S105, the 3D model reproduction unit 12 generates a reproduction model based on the input depth information, as in step S102, and outputs the reproduction model from the 3D model reproduction device 2.
•   If the depth image is filtered in the process of step S102, it is desirable that the same filter be applied in the process of step S105; likewise, if no filter is applied in step S102, none should be applied in step S105.
•   As described above, the 3D model reproduction device 2 can generate a reproduction model based on the depth information by the same method as the 3D model reproduction unit 12 of the depth information generation device 1.
•   Since the depth information includes the depth images added to fill hole areas, the 3D model reproduction device 2 can also reproduce a reproduction model in which the hole areas are filled. That is, it is possible to obtain the effect that a reproduction model close to the reference model can be generated.
•   Moreover, the data amount of the depth information is smaller than that of the 3D model. Therefore, by generating the reproduction model from the transmitted depth information, it is possible to suppress the traffic volume while generating a reproduction model close to the reference model, compared with transmitting the reference model directly.
•   In this modification, when the input camera information includes a plurality of camera parameters, the depth information generation unit 11 adds the camera parameters to the camera information one at a time.
•   In the first embodiment, all of the input camera parameters are added to the existing camera information at once, but in the present modification only one camera parameter is added at a time.
•   In this modification, the 3D model reproduction unit 12 first generates a reproduction model in the process corresponding to step S102 of the first embodiment. Next, the accuracy of the generated reproduction model is calculated and compared with the accuracy of the reproduction model generated using the camera information before the camera parameter was added. At this time, if the accuracy does not improve by more than a given value, it is determined that the camera parameter does not contribute to improving the accuracy of the reproduction model, and the camera parameter is removed from the camera information.
•   With the configuration described above, among the camera parameters input to the depth information generation unit 11, only those that contribute to improving the accuracy of the reproduction model are added to the existing camera information. As a result, the number of depth images included in the depth information can be suppressed, so that the effect of suppressing the traffic amount can be obtained.
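The gated one-at-a-time addition of this modification can be sketched as follows; `evaluate` is a hypothetical callback standing in for "rebuild the reproduction model and measure its accuracy", and the toy evaluator exists only to make the example runnable:

```python
# Sketch of gated camera-parameter addition: each candidate parameter is
# kept only if rebuilding with it improves accuracy by more than min_gain.
def add_cameras_gated(existing, candidates, evaluate, min_gain=0.0):
    best = evaluate(existing)
    for cam in candidates:
        trial = existing + [cam]
        acc = evaluate(trial)
        if acc > best + min_gain:      # keep only contributing parameters
            existing, best = trial, acc
    return existing

# Toy evaluator: accuracy saturates at two distinct views, so redundant
# cameras are rejected.
def toy_eval(cams):
    return min(len(set(cams)), 2)

result = add_cameras_gated([], ["front", "back", "back", "left"], toy_eval)
```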
•   The purpose of this modification is to record, in the depth information generation unit 11a, depth images that are not affected by the self-occlusion of the reference model. It is desirable that each depth image generated by the depth information generation unit 11 of the first embodiment show the entire corresponding sub-model. However, depending on the clustering method in the camera information setting unit 14, camera parameters may be generated such that part or all of the sub-model is occluded by a region of the reference model outside the sub-model. In this case, the hole area corresponding to the occluded sub-model in the reproduction model is not filled even when the depth image is added.
  • the camera information setting unit 14a inputs the sub-model to the depth information generating unit 11a in addition to the camera information in the process corresponding to step S104 of the first embodiment. At this time, the camera parameter included in the camera information and the sub model are associated with each other.
•   In this modification, the depth information generation unit 11a performs the process corresponding to step S101 of the first embodiment as follows.
•   In the first embodiment, a depth image is generated based on the input reference model and added to the depth information.
•   In this modification, by contrast, the depth image is generated based on the sub-model corresponding to each camera parameter included in the input camera information and added to the depth information.
•   Since the depth image generated by the depth information generation unit 11a does not include regions of the reference model outside the sub-model, depth images unaffected by the self-occlusion of the reference model can be recorded. As a result, the hole region described above can be filled.
  • the second modification may be used in combination with the first embodiment.
•   For example, the process of the second modification may be applied to camera parameters for which it is determined that the above-described self-occlusion occurs, and the process of the first embodiment may be applied otherwise.
  • the depth information generation device 1b includes a subdivision unit 15b, a depth information generation unit 11, a 3D model reproduction unit 12, and a hole detection unit 13b.
  • the purpose of this modification is to detect a hole area by a method different from that of the first embodiment in the processing of the hole detection unit 13b.
•   In the processing performed before step S101 of the first embodiment, the subdivision unit 15b according to the present modification applies subdivision processing to the input reference model to generate a 3D model having uniformly distributed vertices, and inputs it to the depth information generation unit 11 and the hole detection unit 13b as a new reference model. Note that the process applied to the reference model need not be subdivision, as long as it corrects the model into a 3D model having uniformly distributed vertices.
•   The hole detection unit 13b estimates hole areas in the reproduction model based on the input reproduction model and reference model, generates camera information corresponding to the hole areas, and inputs the camera information to the depth information generation unit 11.
  • the 3D space in which the input reference model exists is divided into grids of arbitrary width.
  • the evaluation value of each grid is calculated based on the input reference model and reproduction model. Specifically, first, with respect to the vertices of the reference model included in the grid, the distances between the vertices and the vertices of the nearest reproduction model are calculated. The above calculation is performed for all the vertices of the reference model included in the grid, and the total value of those distances is set as the evaluation value of the grid. That is, it can be said that the higher the evaluation value, the lower the accuracy of the region of the reproduction model corresponding to the grid. The same calculation is performed for all grids.
•   Next, camera parameters are generated for grids whose evaluation value is higher than a given value. That is, an optimum camera parameter is generated for each grid in which the accuracy of the reproduction model is out of the allowable range, and input to the depth information generation unit 11.
•   The optimum camera parameter is, for example, a camera parameter having a position, orientation, and focal length of the camera such that the magnitude of the inner product between the normals of the reference-model vertices in the grid and the camera rays is large. Further, for example, it is a camera parameter having a position and orientation that captures the center of gravity of the grid on the optical axis of the camera.
  • the camera parameters generated are collected into camera information and input to the depth information generation unit 11.
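The grid-evaluation step above can be sketched as follows; the grid width and the brute-force nearest-neighbor search are illustrative assumptions:

```python
# Sketch of grid-based evaluation: bin each reference vertex into a grid
# cell and sum, per cell, its distance to the nearest reproduction vertex.
# Higher totals mean lower local accuracy of the reproduction model.
import numpy as np
from collections import defaultdict

def grid_evaluation(ref_vertices, rep_vertices, width=1.0):
    scores = defaultdict(float)
    for v in ref_vertices:
        cell = tuple(np.floor(v / width).astype(int))
        scores[cell] += np.min(np.linalg.norm(rep_vertices - v, axis=1))
    return dict(scores)

ref = np.array([[0.1, 0.1, 0.1], [0.2, 0.1, 0.1], [3.5, 0.0, 0.0]])
rep = np.array([[0.1, 0.1, 0.1], [3.0, 0.0, 0.0]])
scores = grid_evaluation(ref, rep)
```

Cells whose score exceeds a threshold would then receive camera parameters as described above.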
•   This modification may be combined with modification 1. That is, only the camera parameter for the grid with the worst evaluation may be input to the depth information generation unit 11, and the depth image corresponding to that camera parameter may be added to the depth information only when the accuracy of the reproduction model improves.
  • the depth information generation device 3 that generates depth information performs a process of generating an auxiliary model in addition to the process of the first embodiment.
  • members having the same functions as the members described in the above embodiment will be designated by the same reference numerals, and the description thereof will be omitted.
  • the depth information generation device 3 is a device that generates depth information and an auxiliary model based on an input reference model. The purpose of this embodiment is to extract a part of the vertices or meshes included in the reference model and send them to the 3D model reproduction device 4.
  • the above-mentioned auxiliary model is information obtained by extracting a part of the input reference model, and is, for example, a 3D model composed of vertices or meshes.
  • the auxiliary model is used for the purpose of compensating the hole area of the reproduction model by adding it to the reproduction model generated by the 3D model reproduction device 4.
  • FIG. 6 is a functional block diagram of the depth information generation device 3 according to this embodiment.
  • the depth information generation device 3 includes an auxiliary model generation unit 31, a depth information generation unit 11, a 3D model reproduction unit 12, a hole detection unit 13, and a camera information setting unit 32.
  • the functions of the depth information generation unit 11, the 3D model reproduction unit 12, and the hole detection unit 13 in the present embodiment are the same as those of the same name block included in the depth information generation device 1 in the first embodiment.
  • the auxiliary model generation unit 31 generates an auxiliary model and a reference model based on the input reference model.
  • the reference model generated by the auxiliary model generation unit 31 is a model obtained by removing the auxiliary model from the input reference model.
  • the generated reference model is input to the depth information generation unit 11 and the hole detection unit 13.
  • the camera information setting unit 32 in the present embodiment clusters the input hole extraction models, and generates camera information and auxiliary models in which camera parameters related to each cluster are summarized.
  • the generated camera information is input to the depth information generation unit 11. Further, the generated auxiliary model is input to the 3D model reproduction device 4.
  • the configuration of the 3D model playback device 4 according to the present embodiment will be described based on FIG. 7.
  • the 3D model reproduction device 4 is a device that generates a reproduction model based on the input depth information and the auxiliary model.
  • FIG. 7 is a functional block diagram of the 3D model playback device 4 according to the present embodiment.
  • the 3D model reproduction device 4 includes a 3D model reproduction unit 12 and an auxiliary model integration unit 41.
•   The function of the 3D model reproduction unit 12 in this embodiment is the same as that of the block of the same name included in the 3D model reproduction device 2 in the first embodiment.
•   The auxiliary model integration unit 41 generates a new 3D model by adding the auxiliary model to the reproduction model based on the input reproduction model and auxiliary model, and outputs the 3D model from the 3D model reproduction device 4 as a new reproduction model.
•   FIG. 8 is a flowchart showing the flow of processing according to this embodiment. Note that steps S201 to S204 are processes in the depth information generation device 3, and steps S105 and S206 are processes in the 3D model reproduction device 4.
  • step S201 the auxiliary model generation unit 31 generates a new reference model and a new auxiliary model based on the input reference model. Specifically, first, an auxiliary model is generated by extracting vertices or meshes that satisfy specific conditions from the input reference model. Next, the vertices or meshes satisfying the above conditions are removed from the reference model, a new reference model is generated, and the new reference model is input to the depth information generation unit 11 and the hole detection unit 13. In other words, a new reference model is generated by removing the auxiliary model from the reference model.
  • the vertices or meshes that satisfy the above-described specific conditions are, for example, vertices or meshes that correspond to regions that are difficult to reproduce in the processing of the 3D model reproduction unit 12.
  • the auxiliary model is generated by extracting the vertices or meshes in the reference model, which have details smaller than the voxel size described above.
  • steps S101 to S103 the same processing as that of the first embodiment is performed.
  • step S204 the camera information setting unit 32 generates additional camera information and an auxiliary model based on the input hole extraction model. That is, in step S204, an auxiliary model is generated in addition to the same processing as step S104 of the first embodiment.
  • the auxiliary model is generated by extracting the vertices or meshes satisfying a specific condition from the sub model while performing the same processing as step S104 of the first embodiment.
•   The vertices or meshes satisfying the above-described specific conditions are, for example, vertices or meshes included in a sub-model in which the total area of the contained meshes is below a given value, in other words, a sub-model whose corresponding hole region has an area below a given value. For a small hole area, the hole may be filled with vertices or meshes that carry less information than the depth image that would otherwise be added to fill it.
  • the depth information generation device 3 can generate an auxiliary model in addition to the depth information generated by the depth information generation device 1 of the first embodiment.
•   The above-described auxiliary model corresponds to a 3D model obtained by extracting from the reference model the portions corresponding to fine regions of the reference model and to small hole areas of the reproduction model generated in the 3D model reproduction unit 12.
  • the auxiliary model is selected and generated so that the data amount of the auxiliary model corresponding to the above-described small hole area is smaller than the data amount of the depth image having an effect of filling the hole area. Since the information for filling the small hole area can be transmitted in the form of the auxiliary model, which has less information amount than the depth image, the effect of suppressing the traffic amount can be obtained.
  • vertices or meshes included in the auxiliary model may be removed from the sub model or reference model.
  • the 3D model obtained by removing the auxiliary model from the sub model or reference model may be used as the new sub model or reference model.
•   The auxiliary models generated in step S201 and step S204 are held inside the depth information generation device 3 and are output from the depth information generation device 3 at the same time as the depth information.
  • the auxiliary models generated by the auxiliary model generation unit 31 and the camera information setting unit 32 may be integrated into one auxiliary model.
  • step S201 and step S204 in the present embodiment only one of the steps may be processed. That is, the auxiliary model may be generated by either one of the functional blocks of the auxiliary model generation unit 31 and the camera information setting unit 32. When the auxiliary model is not generated in step S204, the process of step S204 is the same as step S104 of the first embodiment.
•   The auxiliary model does not necessarily have to be the 3D model extracted from the reference model or the sub-model as-is, and may be a model in which the number of vertices or meshes is reduced after extraction.
•   In this case, an auxiliary model with a smaller amount of data can be used compared with using the extracted sub-model unchanged, so that an effect of suppressing the traffic amount can be obtained.
  • step S105 the same processing as that of the first embodiment is performed.
  • step S206 the auxiliary model integration unit 41 generates a new reproduction model based on the input reproduction model and auxiliary model. Specifically, by integrating the reproduction model and the auxiliary model, a new reproduction model is generated and output from the 3D model reproduction device 4.
•   The above-mentioned integration of the reproduction model and the auxiliary model refers to a process of treating the 3D model obtained by combining the vertices or meshes of both as the new reproduction model.
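The combination of the two vertex/mesh sets can be sketched as follows; the indexed-mesh data layout is an assumption for illustration:

```python
# Sketch of reproduction-model/auxiliary-model integration: the auxiliary
# vertices are appended, and the auxiliary face indices are shifted by the
# original vertex count so they keep pointing at the right vertices.
def integrate_models(rep_vertices, rep_faces, aux_vertices, aux_faces):
    offset = len(rep_vertices)
    vertices = rep_vertices + aux_vertices
    faces = rep_faces + [tuple(i + offset for i in f) for f in aux_faces]
    return vertices, faces

rep_v = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
aux_v = [(2, 0, 0), (3, 0, 0), (2, 1, 0)]
merged_v, merged_f = integrate_models(rep_v, [(0, 1, 2)], aux_v, [(0, 1, 2)])
```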
•   As described above, the 3D model reproduction device 4 can output a 3D model in which the auxiliary model is added to the reproduction model generated by integrating the depth information in the 3D model reproduction unit 12. By adding the auxiliary model generated by the camera information setting unit 32 of the depth information generation device 3 to the above-described reproduction model, a reproduction model in which the hole areas are filled can be generated. In addition, by adding the auxiliary model generated by the auxiliary model generation unit 31 of the depth information generation device 3, a reproduction model can be generated with details that cannot be reproduced by depth integration, in which the detail of the 3D model is limited by the voxel size. By the above-described processing, it is possible to obtain the effect of generating a reproduction model close to the reference model.
•   When the auxiliary model is rendered, each vertex may be displayed at a size that fills the hole area.
  • the depth information generation device 5 that generates depth information further performs a process of generating auxiliary TSDF information.
  • members having the same functions as the members described in the above embodiment will be designated by the same reference numerals, and the description thereof will be omitted.
•   The depth information generation device 5 is a device that generates depth information and an auxiliary TSDF based on an input reference model.
•   The purpose of this embodiment is to generate an auxiliary TSDF by comparing the reproduction voxel space generated from the depth information with the reference voxel space generated based on the reference model, and to transmit the auxiliary TSDF to the 3D model reproduction device 6.
  • the playback voxel space described above is a voxel space in which the TSDF value and the weight value are recorded, which are generated by the 3D model playback unit 51 based on the depth information.
  • the reference voxel space described above is a voxel space in which the TSDF value and the weight value are recorded, which are generated by the reference voxel space generation unit 52 based on the reference model. Note that the reproduction voxel space and the reference voxel space have the same voxel resolution. In other words, both voxels have the same number, and there is a one-to-one corresponding voxel.
•   The above-mentioned auxiliary TSDF is information generated by the auxiliary TSDF generation unit 53 based on the reproduction voxel space and the reference voxel space, and records voxel coordinates together with the TSDF values of those voxels.
•   The auxiliary TSDF has at least the coordinates of each recorded voxel and the TSDF value of that voxel.
•   The auxiliary TSDF is used in the 3D model reproduction unit 61 of the 3D model reproduction device 6; by being added to the voxel space generated from the depth information, it improves the accuracy of the reproduction model generated in the 3D model reproduction unit 61.
  • FIG. 9 is a functional block diagram of the depth information generation device 5 according to this embodiment.
  • the depth information generation device 5 includes a depth information generation unit 11, a 3D model reproduction unit 51, a reference voxel space generation unit 52, and an auxiliary TSDF generation unit 53.
  • the function of the depth information generation unit 11 in this embodiment is the same as that of the same name block included in the depth information generation device 1 in the first embodiment.
  • the 3D model playback unit 51 performs a depth integration process based on the input depth information to generate a playback voxel space.
  • the reference voxel space generation unit 52 generates a reference voxel space based on the input reference model.
•   The auxiliary TSDF generation unit 53 generates the auxiliary TSDF by comparing the input reproduction voxel space and reference voxel space.
  • the configuration of the 3D model playback device 6 according to the present embodiment will be described based on FIG.
  • the 3D model reproduction device 6 is a device that generates a reproduction model based on the input depth information and the auxiliary TSDF.
  • FIG. 10 is a functional block diagram of the 3D model playback device 6 according to the present embodiment. As shown in FIG. 10, the 3D model playback device 6 includes a 3D model playback unit 61.
  • the 3D model reproduction unit 61 generates a reproduction model based on the input depth information and auxiliary TSDF.
•   FIG. 11 is a flowchart showing the flow of processing according to this embodiment. Note that steps S101 to S303 are processes in the depth information generation device 5, and steps S304 and S106 are processes in the 3D model reproduction device 6.
  • step S101 the same processing as that of the first embodiment is performed.
  • step S301 the 3D model reproduction unit 51 generates a reproduction voxel space based on the input depth information.
•   Specifically, the depth information is integrated and the processing up to the calculation of the TSDF values and weight values of the voxels is performed; the TSDF values of the voxels are then extracted to generate the reproduction voxel space, which is input to the auxiliary TSDF generation unit 53.
•   Here, the TSDF value and the weight value are calculated, for each pixel in the depth image, for the voxels lying in the normal direction of that pixel.
•   Next, the reference voxel space generation unit 52 generates a reference voxel space based on the input reference model. Specifically, for all vertices included in the reference model, the TSDF values and weights are calculated for the voxels in the normal direction of each vertex to generate the reference voxel space, which is input to the auxiliary TSDF generation unit 53.
  • the TSDF value here represents the distance from the voxel to the vertex of the reference model, and is a signed numerical value meaning that the smaller the TSDF value, the closer to the vertex.
•   A positive TSDF value means a voxel on the positive side of the normal when viewed from the vertex, and a negative TSDF value means a voxel on the negative side of the normal when viewed from the vertex.
•   Note that the reference voxel space generation unit 52 calculates the TSDF value for voxels in a narrower range than the TSDF calculation in the 3D model reproduction unit 51. For example, while the 3D model reproduction unit 51 calculates the TSDF value within a range of 3 voxels from the corresponding surface, the reference voxel space generation unit 52 calculates the TSDF value within a range of 1 voxel from the corresponding vertex.
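The per-vertex signed-distance computation described above can be sketched as follows; the grid size, truncation range, and the simplified 1-D sampling along each normal are illustrative assumptions, not the embodiment's implementation:

```python
# Sketch of the reference voxel space: for each vertex, the voxels within
# `trunc` voxels along the vertex normal get a signed distance (positive on
# the normal side, negative behind) and a weight of 1.
import numpy as np

def reference_voxel_space(vertices, normals, shape=(8, 8, 8), trunc=1):
    tsdf = np.zeros(shape)
    weight = np.zeros(shape)
    for v, n in zip(vertices, normals):
        for t in np.linspace(-trunc, trunc, 2 * trunc * 4 + 1):
            p = np.round(v + t * n).astype(int)
            if (p < 0).any() or (p >= np.array(shape)).any():
                continue  # skip samples outside the grid
            tsdf[tuple(p)] = np.dot(p - v, n)  # signed distance along normal
            weight[tuple(p)] = 1.0
    return tsdf, weight

# One vertex with a +Z normal: the voxel above it gets +1, below it -1.
verts = np.array([[4.0, 4.0, 4.0]])
norms = np.array([[0.0, 0.0, 1.0]])
ref_tsdf, ref_w = reference_voxel_space(verts, norms)
```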
•   In step S303, the auxiliary TSDF generation unit 53 generates the auxiliary TSDF based on the input reproduction voxel space and reference voxel space. Specifically, the input reproduction voxel space and reference voxel space are compared, an auxiliary TSDF is generated, and the auxiliary TSDF is output from the depth information generation device 5.
•   The comparison between the reproduction voxel space and the reference voxel space described above is, for example, processing that determines, for every voxel in the reproduction voxel space and the corresponding voxel in the reference voxel space, whether or not the weight value is 0.
•   Here, a voxel with a weight value of 0 is a voxel for which a TSDF value has never been calculated.
•   Specifically, the voxels of the reference voxel space for which the weight value of the corresponding voxel in the reproduction voxel space is 0 while the weight value in the reference voxel space is not 0 are extracted. That is, voxels for which a TSDF value has been calculated in the reference voxel space but not in the reproduction voxel space are extracted from the reference voxel space.
  • the voxels satisfying the above conditions correspond to the hole area in the reproduction model.
•   Further, for example, it is processing that determines, for all voxels in the reproduction voxel space whose weight value is not 0 and the corresponding voxels in the reference voxel space whose weight value is not 0, whether the TSDF values differ significantly. Specifically, the difference between the TSDF value of the voxel in the reproduction voxel space and that of the corresponding voxel in the reference voxel space is calculated, and the voxels of the reference voxel space for which the difference is at least a certain value are extracted.
  • the voxels satisfying the above conditions correspond to regions in the reproduction model that deviate from the reference model.
•   Further, for example, the sign of the TSDF value of the voxel in the reproduction voxel space is compared with the sign of the TSDF value of the corresponding voxel in the reference voxel space, and if the signs do not match, the voxel in the reference voxel space is extracted.
  • the voxels satisfying the above conditions correspond to regions in the reproduction model that deviate from the reference model.
  • the auxiliary TSDF is generated by putting together the coordinates of the voxel and the TSDF values of the voxels at the coordinates in the reference model.
•   A specific example of the auxiliary TSDF is information having, for each voxel, the tuple (voxel X coordinate, voxel Y coordinate, voxel Z coordinate, TSDF value).
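The three comparisons above (hole voxels, large TSDF differences, sign mismatches) can be sketched with NumPy boolean masks as follows; the threshold value and array shapes are illustrative assumptions:

```python
# Sketch of auxiliary TSDF generation: collect, as (x, y, z, tsdf) records,
# the reference-space voxels that are missing from the reproduction space,
# differ strongly in TSDF value, or disagree in TSDF sign.
import numpy as np

def build_auxiliary_tsdf(rep_tsdf, rep_w, ref_tsdf, ref_w, diff_thresh=0.5):
    hole = (rep_w == 0) & (ref_w != 0)                   # only in reference
    both = (rep_w != 0) & (ref_w != 0)
    deviated = both & (np.abs(rep_tsdf - ref_tsdf) >= diff_thresh)
    sign_flip = both & (np.sign(rep_tsdf) != np.sign(ref_tsdf))
    selected = hole | deviated | sign_flip
    return [(x, y, z, ref_tsdf[x, y, z]) for x, y, z in np.argwhere(selected)]

rep_t = np.zeros((2, 2, 2)); rep_w = np.zeros((2, 2, 2))
ref_t = np.zeros((2, 2, 2)); ref_w = np.zeros((2, 2, 2))
rep_w[0, 0, 0] = 1; ref_w[0, 0, 0] = 1      # recorded in both, equal: skipped
ref_w[1, 1, 1] = 1; ref_t[1, 1, 1] = 0.25   # hole: recorded only in reference
aux = build_auxiliary_tsdf(rep_t, rep_w, ref_t, ref_w)
```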
•   As described above, the reproduction voxel space generated by integrating the depth information generated by the depth information generation unit 11 is compared with the reference voxel space generated based on the reference model, and an auxiliary TSDF can be generated and output. The auxiliary TSDF holds TSDF values that are present in the reference voxel space but not in the reproduction voxel space based on the depth information. In other words, regions of the reference model that cannot be reproduced from the depth information output by the depth information generation unit 11 alone can be reproduced using the auxiliary TSDF.
  • the 3D model generation device 6 it is possible to obtain the effect of generating a reproduction model that is closer to the reference model. Further, by transmitting the information for filling the small hole area of the reproduction model in the form of the auxiliary TSDF, which has a smaller information amount than the depth image, an effect of suppressing the traffic amount can be obtained.
  • In step S304, the 3D model reproduction unit 61 generates a reproduction model based on the input depth information and auxiliary TSDF. Specifically, first, as in the process of step S301, the depth information is integrated and the TSDF values and weight values of the voxel space are calculated. Next, the auxiliary TSDF is added to the voxel space calculated by the 3D model reproduction unit 61. A reproduction model is then generated and output based on the voxel space to which the auxiliary TSDF has been added. The method of generating the reproduction model from the voxel space is the same as the processing performed by the 3D model reproduction unit according to the first embodiment.
  • The auxiliary TSDF is added to the voxel space by, for example, overwriting the TSDF value of the voxel located at each coordinate recorded in the auxiliary TSDF with the TSDF value recorded in the auxiliary TSDF.
  • At the same time, the weight value of each such voxel is replaced with the lowest non-zero value, regardless of whether or not the voxel already holds a weight value.
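The overwrite described above can be sketched as follows; the function name is hypothetical, and `min_weight` stands in for "the lowest non-zero value", whose concrete magnitude the text does not fix.

```python
import numpy as np

def apply_auxiliary_tsdf(tsdf, weight, auxiliary_tsdf, min_weight=1e-3):
    """Add the auxiliary TSDF to a voxel space: overwrite the TSDF value at
    each recorded coordinate and replace the weight value unconditionally."""
    for x, y, z, value in auxiliary_tsdf:
        tsdf[x, y, z] = value         # overwrite with the recorded TSDF value
        weight[x, y, z] = min_weight  # regardless of any existing weight
    return tsdf, weight
```

After this step, the reproduction model is generated from the voxel space in the same way as in the first embodiment.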
  • In this way, the TSDF values of the auxiliary TSDF are added to the voxel space calculated by integrating the depth information in the 3D model reproduction unit 61.
  • By using the auxiliary TSDF, the holes remaining in a reproduction model generated by integrating the depth information alone can be filled.
  • Furthermore, regions that are difficult to reproduce by integrating depth information, such as sharp regions in the reference model, can be reproduced by adding the auxiliary TSDF.
  • The purpose of this modification is to use the first embodiment and the third embodiment together in the depth information generation device 5a.
  • The 3D model reproduction unit 51a according to the present modification generates a reproduction model as well as the reproduction voxel space, as in step S102 of the first embodiment.
  • First, steps S101 to S104 of the first embodiment are looped.
  • Within the loop, the processing of the 3D model reproduction unit 12 is carried out by the 3D model reproduction unit 51a.
  • The depth information is then generated and output, and the loop is ended.
  • Next, steps S301 to S303 of the third embodiment are performed to generate the auxiliary TSDF, which is output from the depth information generation device 5a.
  • The depth information input to the 3D model reproduction unit 51a is the depth information generated by the above-described processing.
  • In this modification, the comparison between the reference voxel space and the reproduction voxel space generated based on the depth information generated in the first embodiment can thus be performed.
  • As a result, large hole areas can be filled with the depth images added in the first embodiment, and small hole areas can be filled with the auxiliary TSDF generated in the third embodiment.
  • Accordingly, the effect of suppressing the traffic volume is obtained.
  • In addition, the effect that a reproduction model close to the reference model can be generated is obtained.
  • The control blocks of the depth information generation devices 1, 1a, 3, 5, and 5a and the 3D model reproduction devices 2, 4, and 6 may be realized by a logic circuit (hardware) formed in an integrated circuit (IC chip) or the like, or may be realized by software.
  • In the latter case, the depth information generation devices 1, 1a, 3, 5, and 5a and the 3D model reproduction devices 2, 4, and 6 are each provided with a computer that executes the instructions of a program, which is software realizing each function.
  • The computer includes, for example, at least one processor (control device) and at least one computer-readable recording medium storing the program. In the computer, the processor reads the program from the recording medium and executes it, thereby achieving the object of the present invention.
  • As the processor, for example, a CPU (Central Processing Unit) can be used.
  • As the recording medium, a "non-transitory tangible medium" such as a ROM (Read Only Memory), a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit can be used.
  • A RAM (Random Access Memory) or the like for loading the program may be further provided.
  • The program may be supplied to the computer via any transmission medium (communication network, broadcast wave, etc.) capable of transmitting the program.
  • One aspect of the present invention can also be realized in the form of a data signal embedded in a carrier wave, in which the program is embodied by electronic transmission.
  • A depth information generation device according to aspect 1 of the present invention includes: a depth information generation unit that generates depth information based on camera information from a reference model; a 3D model reproduction unit that reproduces a reproduction model by integrating the depth information; a hole detection unit that estimates a hole area existing in the reproduction model by referring to the reference model and extracts the hole area as a hole extraction model; and a camera information setting unit that sets the camera information based on the hole extraction model.
  • The depth information generation device according to aspect 2 is the depth information generation device according to aspect 1 above, and may further include: an auxiliary model generation unit that extracts fine areas and hole areas in the reference model as an auxiliary model; and a camera information setting unit that generates an auxiliary model by extracting fine areas and hole areas in the hole extraction model.
  • A depth information generation device according to aspect 3 includes: a depth information generation unit that generates depth information based on camera information from a reference model; a 3D model reproduction unit that integrates the depth information to generate a reproduction voxel space; a reference voxel space generation unit that generates a reference voxel space based on the reference model; and an auxiliary TSDF generation unit that generates an auxiliary TSDF by comparing the reproduction voxel space with the reference voxel space.
  • The depth information generation device according to aspect 4 is the depth information generation device according to any one of aspects 1 to 3 above, wherein, when the depth information is integrated in the 3D model reproduction unit, depth values near the contour of an object appearing in the depth information may not be integrated.
  • The depth information generation device according to aspect 5 may be configured such that, in any one of aspects 1 to 3 above, a filter that interpolates TSDF values is applied after the depth information is integrated in the 3D model reproduction unit.
  • In the depth information generation device according to aspect 6, in any one of aspects 1 to 3 above, the hole detection unit may use the number of meshes adjacent to the nearest vertex of the reference model in order to detect the hole area of the reproduction model.
  • In the depth information generation device according to aspect 7, in any one of aspects 1 to 3 above, depth images may be added one by one to the depth information based on the camera information generated by the camera information setting unit, the accuracy of the reproduction model may be measured each time a depth image is added, and the added depth image may be removed from the depth information when the accuracy does not satisfy a predetermined level.
  • The depth information generation device according to aspect 8 may be configured such that, in any one of aspects 1 to 3 above, the depth information generation unit generates the depth information based on a sub-model instead of the reference model.
  • The depth information generation device according to aspect 9 is the depth information generation device according to any one of aspects 1 to 3 above, further including a subdivision unit that performs subdivision processing on the reference model to obtain a 3D model having a uniform distribution, wherein the accuracy of the reproduction model may be evaluated for each divided grid in the hole detection unit, and camera parameters corresponding to poorly evaluated grids may be preferentially added to the camera information.
  • A 3D model reproduction device according to aspect 10 includes the 3D model reproduction unit configured to integrate the depth information generated by the depth information generation unit and reproduce the reproduction model.
  • The 3D model reproduction device according to aspect 11 is the device according to aspect 10 above, and may further include an auxiliary model integration unit that adds an auxiliary model to the reproduction model to create a new reproduction model, thereby restoring fine areas of the reproduction model.
  • The 3D model reproduction device according to aspect 12 is the device according to aspect 10 above, and may include a 3D model generation unit that adds an auxiliary TSDF on top of the integration of the depth information, thereby restoring fine areas of the reproduction model.
  • The depth information generation devices 1, 1a, 1b, 3, 5, and 5a and the 3D model reproduction devices 2, 4, and 6 may be realized by a computer.
  • In this case, a control program for the depth information generation devices 1, 1a, 1b, 3, 5, and 5a and the 3D model reproduction devices 2, 4, and 6 that realizes these devices by a computer by causing the computer to operate as each unit (software element) included in the devices, and a computer-readable recording medium on which the program is recorded, also fall within the scope of the present invention.

Abstract

The present invention realizes a depth information generation device that reduces the traffic volume and improves the quality of a reproduced 3D model. The depth information generation device comprises: a depth information generation unit that generates depth information from a 3D model; a 3D model reproduction unit that reproduces a reproduction model from the depth information; a hole detection unit that detects hole areas of the reproduction model; and a camera information setting unit that generates camera information on the basis of the hole areas.

Description

Image generation device, display processing device, image generation method, control program, and recording medium
One embodiment of the present invention relates to an image generation device, a display processing device, an image generation method, a control program, and a recording medium.
The present application claims priority based on Japanese Patent Application No. 2019-14233 filed in Japan on January 30, 2019, the content of which is incorporated herein by reference.
In recent years, AR (Augmented Reality) and VR (Virtual Reality) technologies have attracted attention. With the development of these technologies, interest in real-time distribution of AR and VR content is also increasing.
As a conventional technique, a technique called Holoportation is known. In Holoportation, a colorless model and a plurality of texture images are delivered and integrated on the receiving side, making it possible to reproduce a 3D model in a remote AR space.
In the field of 3DCG, a technique based on KinectFusion that builds a 3D model by integrating depth maps is also being studied. By using KinectFusion, it becomes possible to construct a precise 3D model from low-resolution depth maps in real time. Using this approach, a 3D model can be reproduced in the same manner as above by distributing depth maps.
However, the conventional techniques described above directly transmit the 3D model to be displayed, so the traffic volume of the data transmission tends to increase. In addition, when a 3D model is reproduced by delivering depth maps, defect areas called holes are likely to occur in the 3D model due to the influence of occlusion and depth integration. Techniques exist for filling holes based on a 3D model, but they take time to process and therefore lose real-time performance.
One aspect of the present invention has been made in view of the above problems, and an object thereof is to generate, from a 3D model, transmission data from which a 3D model with suppressed hole generation can be constructed in real time while suppressing an increase in the traffic volume of transmission.
In order to solve the above problems, a depth information generation device and a 3D model reproduction device according to an aspect of the present invention include the following means.
(First means)
A depth information generation means including: depth information generation means for generating depth information based on camera information from a reference model; 3D model reproduction means for reproducing a reproduction model by integrating the depth information; hole detection means for estimating a hole area existing in the reproduction model with reference to the reference model and extracting it as a hole extraction model; and camera information setting means for setting the camera information based on the hole extraction model.
(Second means)
The depth information generation means according to the first means, further including: auxiliary model generation means for extracting fine areas and hole areas in the reference model as an auxiliary model; and camera information setting means for generating an auxiliary model by extracting fine areas and hole areas in the hole extraction model, wherein the auxiliary model is added to the reproduction model when the reproduction model is reproduced.
(Third means)
The depth information generation means according to the first means, further including: 3D model reproduction means for generating a reproduction voxel space by integrating the depth information; reference voxel space generation means for generating a reference voxel space based on the reference model; and auxiliary TSDF generation means for generating an auxiliary TSDF by comparing the reproduction voxel space with the reference voxel space, wherein, when the reproduction model is reproduced, the auxiliary TSDF is added after the depth information is integrated.
(Fourth means)
The depth information generation means according to the first, second, or third means, wherein, when the depth information is integrated in the 3D model reproduction means, depth values near the contour of an object appearing in the depth information are not integrated.
(Fifth means)
The depth information generation means according to the first, second, or third means, wherein a filter that interpolates TSDF values is applied after the depth information is integrated in the 3D model reproduction means.
(Sixth means)
The depth information generation means according to the first, second, or third means, wherein the number of meshes adjacent to the nearest vertex of the reference model is used by the hole detection means to detect the hole area of the reproduction model.
(Seventh means)
The depth information generation means according to the first, second, or third means, wherein depth images are added one by one to the depth information based on the camera information generated by the camera information setting means, the accuracy of the reproduction model is measured each time a depth image is added, and the added depth image is removed from the depth information when the accuracy does not satisfy a predetermined level.
(Eighth means)
The depth information generation means according to the first, second, or third means, wherein the depth information generation means generates the depth information based on a sub-model instead of the reference model.
(Ninth means)
The depth information generation means according to the first, second, or third means, further including subdivision means 15 for performing subdivision processing on the reference model to obtain a 3D model having a uniform distribution, wherein the accuracy of the reproduction model is evaluated for each grid divided in the hole detection means, and camera parameters corresponding to poorly evaluated grids are preferentially added to the camera information.
(Tenth means)
A 3D model reproduction means including the 3D model reproduction means for integrating the depth information generated by the depth information generation means and reproducing the reproduction model.
(Eleventh means)
The 3D model reproduction means according to the tenth means, further including auxiliary model integration means for adding an auxiliary model to the reproduction model to create a new reproduction model, thereby restoring fine areas of the reproduction model.
(Twelfth means)
The 3D model reproduction means according to the tenth means, including 3D model generation means for adding the auxiliary TSDF after integrating the depth information, thereby restoring fine areas of the reproduction model.
According to one aspect of the present invention, it is possible to realize a depth information generation device that generates depth information from which a 3D model with suppressed hole generation can be constructed while suppressing the traffic volume.
FIG. 1 is a functional block diagram of the depth information generation device according to Embodiment 1.
FIG. 2 is a functional block diagram of the 3D model reproduction device according to Embodiment 1.
FIG. 3 is a flowchart showing the flow of processing according to Embodiment 1.
FIG. 4 is a functional block diagram of the depth information generation device according to Modification 2 of Embodiment 1.
FIG. 5 is a functional block diagram of the depth information generation device according to Modification 3 of Embodiment 1.
FIG. 6 is a functional block diagram of the depth information generation device according to Embodiment 2.
FIG. 7 is a functional block diagram of the 3D model reproduction device according to Embodiment 2.
FIG. 8 is a flowchart showing the flow of processing according to Embodiment 2.
FIG. 9 is a functional block diagram of the depth information generation device according to Embodiment 3.
FIG. 10 is a functional block diagram of the 3D model reproduction device according to Embodiment 3.
FIG. 11 is a flowchart showing the flow of processing according to Embodiment 3.
FIG. 12 is a functional block diagram of the depth information generation device according to a modification of Embodiment 3.
Embodiments of the present invention will be described below with reference to FIGS. 1 to 3.
[Embodiment 1]
An embodiment of the present invention will be described below with reference to FIGS. 1 to 3.
In the present embodiment, a configuration will be described in which the depth information generation device 1, which generates depth information, performs a process of generating depth information based on an input reference model.
[1. Configuration of depth information generation device 1]
The configuration of the depth information generation device 1 according to the present embodiment will be described with reference to FIG. 1.
The depth information generation device 1 is a device that generates depth information based on an input reference model.
The reference model described above is a 3D model that serves as the source for generating depth information and is the target to be reproduced based on the depth information in the 3D model reproduction device 2 described later. In other words, the reference model is the 3D model that is converted into depth information by the depth information generation device 1 and reproduced by the 3D model reproduction device 2. The reference model is, for example, a 3D model composed of vertices or meshes.
Note that the reference model may be a 3D model that has been subjected to subdivision processing in advance so as to have a uniform vertex distribution. This makes it possible to evaluate the accuracy of the model, described later, against a uniform standard.
The depth information described above is a set of depth images obtained when the reference model is viewed from a plurality of cameras, or a single camera, placed at specific positions. A depth image is information in which the depth seen when the subject (reference model) is viewed from a camera is recorded in an image format; for example, it is a depth map in which this information is represented by the brightness of a monochrome image. Note that the cameras in the present invention are assumed to be virtual cameras reproduced virtually in software, but they may be real cameras. The virtual cameras may also be virtual cameras of a format other than the above.
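As a concrete illustration of such a depth image, the following sketch records the depths of a point set seen from a virtual pinhole camera, keeping the nearest depth per pixel. The camera model and the function name are assumptions made for illustration, not definitions from this document.

```python
import numpy as np

def render_depth_image(points, focal_length, width, height):
    """Record a depth image of points given in camera coordinates (camera at
    the origin, looking along +Z): project each point with a pinhole model
    and keep the nearest depth per pixel (a minimal z-buffer)."""
    depth = np.full((height, width), np.inf)  # inf = no surface recorded
    for x, y, z in points:
        if z <= 0:
            continue  # point is behind the camera
        u = int(round(focal_length * x / z + width / 2))
        v = int(round(focal_length * y / z + height / 2))
        if 0 <= u < width and 0 <= v < height:
            depth[v, u] = min(depth[v, u], z)  # nearest surface wins
    return depth
```

A real renderer would rasterize meshes rather than points, but the recorded content, one depth value per pixel, is the same.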
The depth information does not necessarily have to include a plurality of depth images and may be composed of a single depth image. The depth information may be, for example, in an image format in which depth images are arranged so as not to overlap each other. In other words, the depth information may be an image generated by tiling depth images. The depth information may also be information obtained by grouping depth images. Camera information is attached to the depth information. The camera information is, for example, a set of camera parameters of the cameras that recorded the depth images included in the depth information. The camera parameters and camera information will be described later.
The camera parameters described above are information indicating the parameters of a camera that records a depth image. The camera parameters include at least the position of the camera when the corresponding depth image was captured, the direction in which it was facing, the focal length, and the resolution of the recorded depth image.
The camera information described above is information attached to the depth information, and is the set of camera parameters corresponding to the depth images included in the depth information. The camera information includes information indicating which depth image in the depth information corresponds to each camera parameter. When the depth information is in an image format in which a plurality of depth images are arranged, the information indicating the corresponding depth image indicates which region in the depth information is the depth image corresponding to the camera parameter; it is, for example, information indicating the coordinates of the origin of the corresponding depth image and information indicating its number of pixels. When the depth information is grouped depth images, it is information indicating which depth image corresponds to the camera parameter.
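As an illustration only, the camera parameters and camera information described above can be pictured as plain records. The field names below are assumptions for the sketch, not terms defined in this document.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class CameraParameter:
    """Parameters of one depth-image camera (illustrative field names)."""
    position: Tuple[float, float, float]   # camera position at capture time
    direction: Tuple[float, float, float]  # direction the camera faces
    focal_length: float                    # focal length
    resolution: Tuple[int, int]            # (width, height) of the depth image
    tile_origin: Tuple[int, int] = (0, 0)  # origin of this depth image inside
                                           # a tiled depth-information image

@dataclass
class CameraInformation:
    """The set of camera parameters attached to the depth information,
    one entry per contained depth image."""
    parameters: List[CameraParameter] = field(default_factory=list)
```

The `tile_origin` field corresponds to the case where the depth information is a tiled image; for grouped depth images, an index into the group would play the same role.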
FIG. 1 is a functional block diagram of the depth information generation device 1 according to the present embodiment. As shown in FIG. 1, the depth information generation device 1 includes a depth information generation unit 11, a 3D model reproduction unit 12, a hole detection unit 13, and a camera information setting unit 14.
The depth information generation unit 11 generates depth information based on the input reference model and camera information.
The 3D model reproduction unit 12 generates a reproduction model based on the input depth information. The reproduction model described above is a model generated based on the depth information, for example, a 3D model composed of vertices and triangular meshes.
The hole detection unit 13 detects hole areas based on the input reference model and reproduction model, and outputs the vertices or meshes corresponding to the hole areas of the reference model as a hole extraction model.
The hole area described above is a specific area in a 3D model. Taking the reference model as an example, it is an area in which no hole exists in the reference model but a hole exists in the corresponding area of the reproduction model. In other words, it is an area where a hole that should not exist, given the reference model, has occurred at the corresponding location in the reproduction model. An object of the present invention is to suppress such hole areas in the reproduction model. In the following description, the process of suppressing or removing hole areas is referred to as filling holes or compensating for holes.
The hole extraction model described above is, for example, a 3D model obtained by extracting the vertices or meshes in the reference model that correspond to the hole areas.
The camera information setting unit 14 clusters the input hole extraction model, generates camera parameters for each cluster, and generates camera information by collecting those camera parameters. The generated camera information is input to the depth information generation unit 11.
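The clustering and camera-parameter generation of the camera information setting unit might be sketched as below. The document does not specify a clustering algorithm, so a minimal k-means and a fixed offset along +Z for the camera position are assumptions made purely for illustration.

```python
import numpy as np

def set_camera_information(hole_vertices, n_clusters=2, distance=2.0, seed=0):
    """Cluster the vertices of a hole extraction model (minimal k-means) and
    generate one camera parameter per cluster, aimed at its centroid."""
    rng = np.random.default_rng(seed)
    pts = np.asarray(hole_vertices, dtype=float)
    centers = pts[rng.choice(len(pts), n_clusters, replace=False)]
    for _ in range(10):  # a few fixed k-means iterations
        labels = np.argmin(((pts[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = pts[labels == k].mean(axis=0)
    cameras = []
    for c in centers:
        position = c + np.array([0.0, 0.0, distance])  # back off along +Z
        direction = (c - position) / np.linalg.norm(c - position)
        cameras.append({"position": position, "direction": direction})
    return cameras
```

In practice the camera would be placed along a direction from which the hole area is visible (for example, along an average surface normal) rather than a fixed axis.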
(3D model generation method)
Hereinafter, an example of the process of generating a 3D model by integrating depth images, as typified by KinectFusion, will be briefly described.
(1) The content of voxels is calculated based on the depth image and camera parameters.
Voxels are the cells obtained by dividing a 3D space into a grid, and each voxel holds a TSDF value and a weight value. Here, the set of voxels existing in the 3D space is called a voxel space. In the initial state, the TSDF (Truncated Signed Distance Function) value and the weight value of every voxel are both 0. The TSDF value represents the distance from the voxel to a surface of the 3D model, and is a signed value; the smaller the TSDF value, the closer the voxel is to the surface. For example, a positive TSDF value indicates a voxel on the camera side of the surface, and a negative TSDF value indicates a voxel behind the surface. The weight value is a numerical value representing the reliability of the corresponding TSDF value, and its minimum value is 0.
The TSDF value and weight value of each voxel described above are calculated based on a depth image and the camera parameters corresponding to that depth image. Specifically, with the camera placed at the position and orientation included in the camera parameters, the TSDF value and weight value are calculated for the voxels lying on the ray passing through each pixel of the corresponding depth image. However, it is not necessary to calculate both values for all voxels on the ray; it suffices to calculate them for the voxels existing between the camera and the surface of the 3D model on the ray (the depth value of the corresponding pixel), and for an arbitrary number of voxels beyond that surface.
The TSDF value of a voxel is the distance from the position of the voxel to the surface of the 3D model on the ray (the depth value of the corresponding pixel). The weight value is, for example, the inner product of the normal of the depth-image pixel on the ray and the ray direction; here, only weight values of 0 or positive values are considered. When a voxel already holds a nonzero TSDF value and weight value, a weighted average of the existing TSDF value and the new TSDF value is calculated using the corresponding weights, and this average overwrites the TSDF value of the voxel as its new TSDF value. The weight value of the voxel is overwritten with the sum of the existing weight value and the new weight value. The above calculation is performed in order for all pixels of all depth images; in the present invention, it is performed for all pixels of all depth images included in the depth information. In the following description, this calculation is also referred to as depth-image integration, or simply depth integration.
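The per-voxel update described above can be sketched as follows. This is a minimal illustration only, not the patent's implementation; the class name `Voxel` and the method name `integrate` are assumptions made for the sketch.

```python
# Hypothetical sketch of the per-voxel TSDF update described above.
# A voxel holds a TSDF value and a weight; a new (tsdf, weight)
# observation is merged by a weighted average, and weights accumulate.

class Voxel:
    def __init__(self):
        self.tsdf = 0.0    # signed distance to the nearest surface (initially 0)
        self.weight = 0.0  # reliability of the TSDF value (initially 0)

    def integrate(self, new_tsdf, new_weight):
        """Merge one observation into this voxel."""
        total = self.weight + new_weight
        if total > 0.0:
            # Weighted average of the existing and new TSDF values,
            # using the corresponding weights, as described in the text.
            self.tsdf = (self.tsdf * self.weight + new_tsdf * new_weight) / total
        # The weight values are summed.
        self.weight = total


v = Voxel()
v.integrate(0.5, 1.0)  # first observation: tsdf = 0.5, weight = 1.0
v.integrate(0.1, 1.0)  # second: weighted average -> tsdf = 0.3, weight = 2.0
```

Because the weights accumulate, later observations of an already well-observed voxel change its TSDF value only slightly, which matches the weight's role as a reliability measure.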
(2) The voxel space in which the TSDF values are recorded is converted into a 3D model with a mesh structure by the Marching Cubes method. In this conversion, the calculation time may be shortened by skipping the calculation for voxels whose recorded weight is 0. Through the above processing, a 3D model is generated from the depth images.
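The ray-based TSDF computation for a single pixel can be sketched as follows. This is an illustrative sketch under the sign convention stated above (positive on the camera side of the surface, negative behind it); the function name and the truncation parameter are assumptions, and voxels are represented simply by their distance along the ray.

```python
# Hypothetical one-pixel depth-integration sketch: for voxels sampled
# along the camera ray of a pixel, the TSDF value is the signed
# distance from the voxel to the observed surface (the pixel's depth).
# Only voxels within the truncation band around the surface are kept,
# matching the text's note that not all voxels on the ray need values.

def tsdf_along_ray(pixel_depth, voxel_distances, truncation):
    """Return (distance_along_ray, tsdf) pairs for voxels near the surface."""
    out = []
    for d in voxel_distances:
        sd = pixel_depth - d       # camera side -> positive, behind -> negative
        if abs(sd) <= truncation:  # keep only voxels near the surface
            out.append((d, sd))
    return out


# Surface at depth 2.0; voxels every 0.5 along the ray; truncate at 1.0.
print(tsdf_along_ray(2.0, [0.5, 1.0, 1.5, 2.0, 2.5, 3.0], 1.0))
# → [(1.0, 1.0), (1.5, 0.5), (2.0, 0.0), (2.5, -0.5), (3.0, -1.0)]
```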
[2. Configuration of 3D Model Reproduction Device 2]
The configuration of the 3D model reproduction device 2 according to the present embodiment will be described based on FIG. 2. The 3D model reproduction device 2 is a device that generates a reproduction model based on the input depth information.
The reproduction model mentioned above is a model generated in the 3D model reproduction device 2 based on the depth information. The reproduction model is, for example, a 3D model composed of vertices or meshes.
FIG. 2 is a functional block diagram of the 3D model reproduction device 2 according to the present embodiment. As shown in FIG. 2, the 3D model reproduction device 2 includes a 3D model reproduction unit 12.
The function of the 3D model reproduction unit 12 is the same as that of the 3D model reproduction unit 12 included in the depth information generation device 1.
[3. Processing Flow]
An example of the processing according to the present embodiment will be described in detail, step by step, based on FIGS. 1 to 3.
FIG. 3 is a flowchart showing the flow of the processing according to the present embodiment. Steps S101 to S104 are processes performed in the depth information generation device 1, and steps S105 to S106 are processes performed in the 3D model reproduction device 2.
(S101)
In step S101, the depth information generation unit 11 generates depth information based on the input reference model. Specifically, depth images of the reference model as seen from cameras arranged according to the camera information are acquired, and the depth information is generated based on those depth images. At this time, the depth information generation unit 11 retains the camera information used to generate the depth information.
The processing of the depth information generation unit 11 differs depending on whether or not camera information has been input to it from the camera information setting unit 14.
When no camera information is input from the camera information setting unit 14, in other words, when the depth information generation unit 11 runs for the first time after the depth information generation device 1 starts processing, the camera parameters included in the camera information are set arbitrarily. For example, the camera parameters specify camera positions and orientations that surround the center of gravity of the reference model, and positions and focal lengths such that the entire reference model is visible.
When setting the camera parameters, the configuration may be such that important regions of the reference model are captured by a larger number of cameras or by higher-resolution cameras. An important region is, for example, the head or face of a human when the reference model includes a human, or, when the reference model includes numerals, the region in which the numerals are drawn. In either case, any method may be used to detect the important region; alternatively, the important region may be set arbitrarily. With this configuration, the effect is obtained that the 3D model reproduction device 2 can reproduce the important region with high accuracy.
When camera information is input to the depth information generation unit 11 from the camera information setting unit 14, in other words, in the second and subsequent runs of the depth information generation unit 11, first, the existing camera information is replaced with new camera information obtained by adding all of the input camera parameters to it. At this stage, the information indicating which depth image corresponds to each newly added camera parameter need not yet be included in the camera information. Next, depth information is generated based on the new camera information and the reference model. The depth images newly added to the depth information serve to fill the hole regions of the reproduction model in the 3D model reproduction unit 12. Finally, the information indicating the depth image corresponding to each newly added camera parameter is added to the camera information.
A depth image is divided into a foreground part, consisting of pixels in which the reference model is recorded, and a background part, consisting of pixels in which the reference model does not appear. In the depth integration in step S102, described later, the foreground part contributes to the reproduction of the reproduction model.
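The foreground/background division above can be sketched as a simple mask. This assumes, purely for illustration, that background pixels carry a sentinel depth of 0; the patent does not fix a particular background encoding, and the function name is hypothetical.

```python
# Hypothetical foreground/background split of a depth image.
# Assumption: background pixels are encoded with a sentinel depth of 0.

def foreground_mask(depth_image, background_value=0.0):
    """True where the reference model is recorded, False for background."""
    return [[d != background_value for d in row] for row in depth_image]


depth = [[0.0, 1.2],
         [1.5, 0.0]]
print(foreground_mask(depth))  # → [[False, True], [True, False]]
```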
In the second and subsequent runs of step S101, the depth images added to the depth information may have a lower resolution than the depth images added in the first run of step S101. The depth images added in the first run must have a resolution sufficient to preserve the detail of the reproduction model, whereas the depth images added in the second and subsequent runs serve to compensate for hole regions, so they need not preserve fine detail and may have a lower resolution without causing problems. With this configuration, reducing the resolution of the depth images added in the second and subsequent runs reduces the data volume of the depth information as a whole, yielding the effect of suppressing the amount of traffic to be transmitted.
When the camera parameters included in the camera information input from the camera information setting unit 14 are accompanied by camera priority information, described later, the depth information may be generated based on that priority. Specifically, depth images are generated from the camera parameters in descending order of camera priority, and each depth image is added to the depth information. When the depth information is an image format in which a plurality of depth images are arranged side by side, the image resolution of the depth information may be limited; that is, a depth image cannot necessarily be added for every camera parameter. For this reason, by attaching priority information to the camera parameters and adding depth images to the depth information according to that priority, the depth images that should be added preferentially are added first. When a depth image no longer fits in the depth information, the camera parameters whose priority is equal to or lower than that of the camera parameter of that depth image may be ignored. In other words, depth images are added to the depth information in descending order of priority, and once a depth image no longer fits, camera parameters with lower priority need not be added.
When the reference model input to the depth information generation device 1 is one frame of a temporally continuous sequence, and temporally continuous reference models are input to the depth information generation device 1, the initial camera information in step S101 may be set differently. Specifically, in step S101, the camera information included in the depth information output by the depth information generation device 1 for the previous frame may be used as the camera information for the first run of the depth information generation unit 11 in the current frame. In other words, the camera information calculated for the previous frame may be set as the initial camera information for the current frame before performing the processing of step S101. With this configuration, when the depth information generated for each frame is encoded as video, the position of the camera viewing each depth image can be fixed, which improves encoding efficiency and yields the effect of suppressing the amount of traffic.
At this time, if the accuracy of the reproduction model generated by the 3D model reproduction unit 12 from the above camera information falls below a certain level, the camera information may be initialized; that is, the camera information of the previous frame may be discarded and the processing redone from step S101. With this processing, even when the reference model changes greatly over time and the locations where holes occur change, it is not necessary to add unnecessary depth images.
(S102)
Subsequently, in step S102, the 3D model reproduction unit 12 generates a reproduction model based on the input depth information. Specifically, the input depth information is first divided into depth images based on the camera information. Next, a filter is applied to each depth image; the filter is, for example, a smoothing filter such as a bilateral filter. This filtering step is not essential, and the filter need not necessarily be applied. Next, the depth images are integrated based on their corresponding camera parameters to generate a 3D model, and this 3D model is input to the hole detection unit as the reproduction model. The procedure for generating a 3D model by integrating depth images may follow the 3D model generation method described above.
In the processing of step S102, although the 3D model generation method described above calculates the TSDF values and weight values for the voxels on the camera rays, the voxels for which both values are calculated may be selected in another way. For example, for each pixel of a depth image, both values may be calculated for the voxels lying in the normal direction of that pixel; specifically, the TSDF value and the weight value are calculated for voxels on the normal of the pixel within an arbitrary distance from it. With this configuration, even when only a small number of depth images are integrated, the effect of suppressing the occurrence of hole regions in the reproduction model is obtained.
Although the 3D model generation method described above calculates the TSDF value and the weight value for every pixel included in a depth image, in step S102 the pixels corresponding to the background part of a depth image may be excluded from the calculation. With this configuration, skipping calculations that do not contribute to the generation of the reproduction model yields the effect of improving the calculation speed.
Pixels near the contour of an object shown in a depth image may also be excluded from the calculation. The contour of an object refers, for example, to pixels at the boundary between the foreground part and the background part of the depth image, or to pixels whose depth value differs from that of an adjacent pixel by more than an arbitrary value. When the depth information generated by the depth information generation device 1 is transmitted, the depth images included in the depth information are usually encoded. The depth values near the contours of objects are strongly distorted by this encoding, so when the depth images are integrated in step S102, they may adversely affect the reproduction model; for example, the reproduction model may be distorted and its accuracy reduced. It is therefore preferable to exclude such regions from the integration.
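The second contour criterion above (a depth difference to an adjacent pixel exceeding an arbitrary value) can be sketched as follows. This is an illustrative sketch only; the use of 4-neighbours and the function name are assumptions.

```python
# Hypothetical sketch of contour-pixel detection: a pixel is treated as
# a contour pixel when its depth differs from any 4-neighbour by more
# than a threshold. Such pixels would then be skipped during integration.

def contour_pixels(depth, threshold):
    """Return the set of (row, col) positions near a depth discontinuity."""
    h, w = len(depth), len(depth[0])
    out = set()
    for y in range(h):
        for x in range(w):
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    if abs(depth[y][x] - depth[ny][nx]) > threshold:
                        out.add((y, x))
                        break
    return out


depth = [[1.0, 1.0, 5.0],
         [1.0, 1.0, 5.0]]
print(sorted(contour_pixels(depth, 0.5)))
# → [(0, 1), (0, 2), (1, 1), (1, 2)]  (both sides of the discontinuity)
```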
With the above configuration, when the 3D model reproduction unit 12 integrates the depth information, strongly distorted regions of the depth images, which could adversely affect the reproduction model, can be excluded from the integration. This removes the influence of depth-image distortion from the reproduction model, yielding the effect of improving the accuracy of the reproduction model.
In the processing of step S102, a process of applying a filter to the voxel space may be added before the process of generating the 3D model by the Marching Cubes method.
The above-mentioned filter is, for example, a filter that interpolates TSDF values. Specifically, it is a filter that gives a nonzero negative TSDF value and a nonzero weight value to a voxel whose TSDF value and weight value are both 0 and which is adjacent to a voxel holding a negative TSDF value and a nonzero weight. The TSDF value given to such a voxel may be, for example, the average of the TSDF values of the adjacent voxels holding negative TSDF values and nonzero weights. The weight value is set to the lowest nonzero value; in other words, the lowest nonzero weight value that can be given such that the calculation is not skipped in the Marching Cubes method. With this configuration, hole regions occurring in the reproduction model can be filled, yielding the effect of improving the accuracy of the reproduction model.
Another example is a filter applied after the above filter: it gives a nonzero positive TSDF value and a nonzero weight value to a voxel whose TSDF value and weight value are both 0 and which is adjacent both to a voxel given a TSDF value and weight by the above filter and to a voxel holding a positive TSDF value and a nonzero weight. The TSDF value given to such a voxel may be, for example, the average of the TSDF values of the adjacent voxels holding positive TSDF values and nonzero weights, or the sign-inverted TSDF value of the adjacent voxel given its TSDF value and weight by the above filter. The weight value is likewise set to the lowest nonzero value.
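The first interpolation filter can be sketched in one dimension as follows. This is a simplified illustration under stated assumptions: voxels are a 1D row, adjacency means the two immediate neighbours, and `EPS_W` stands in for the "lowest nonzero weight"; none of these names come from the patent.

```python
# Hypothetical 1D sketch of the first interpolation filter: an empty
# voxel (tsdf = 0, weight = 0) adjacent to a voxel with a negative TSDF
# and nonzero weight receives the average negative TSDF of such
# neighbours and the smallest nonzero weight (EPS_W, an assumption).

EPS_W = 1e-3  # assumed "lowest nonzero weight"

def fill_negative(tsdf, weight):
    """Return new (tsdf, weight) lists; the input lists are not modified."""
    n = len(tsdf)
    new_t, new_w = list(tsdf), list(weight)
    for i in range(n):
        if tsdf[i] == 0.0 and weight[i] == 0.0:
            neg = [tsdf[j] for j in (i - 1, i + 1)
                   if 0 <= j < n and tsdf[j] < 0.0 and weight[j] > 0.0]
            if neg:
                new_t[i] = sum(neg) / len(neg)
                new_w[i] = EPS_W
    return new_t, new_w


t, w = fill_negative([0.2, -0.1, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0])
print(t, w)  # the third voxel receives tsdf -0.1 and weight 1e-3
```

The second filter described above would run as a subsequent pass over the output of this one, filling positive values next to the newly filled negative voxels.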
With the above configuration, the 3D model reproduction unit 12 can interpolate the voxel space calculated by integrating the depth information. As a result, a negative TSDF value can be given to voxels corresponding to a hole region that are adjacent to voxels with positive TSDF values and have a weight of 0. That is, hole regions occurring in the reproduction model can be filled further, yielding the effect of improving the accuracy of the reproduction model.
The two types of filters described above may also be applied with the signs of the TSDF values inverted.
Another example is a filter that replaces the TSDF value and the weight value with 0 for voxels whose weight value is smaller than an arbitrary value. With this configuration, removing TSDF values of low reliability suppresses noise occurring in the reproduction model, yielding the effect of improving the accuracy of the reproduction model.
In the processing of step S102, the depth integration calculation may be performed using a different maximum weight value for each depth image included in the depth information. Specifically, the depth images added to the depth information in the first run of step S101 are distinguished from those added in the second and subsequent runs, and the latter are integrated using lower weight values than the former. For example, the former depth images are given a weight factor of 1 and the latter a weight factor of 1/10 in the depth integration calculation; alternatively, the lowest nonzero weight is used for integrating the latter. With this configuration, the depth images added in the second and subsequent runs have a weaker influence on the depth integration than those added in the first run, so their influence on regions other than the hole regions is suppressed. This yields the effect of preventing a decrease in the accuracy of the reproduction model.
Similarly, the weight values of lower-resolution depth images may be made smaller than those of higher-resolution depth images. For example, a depth image with a resolution of 1280x960 is given a weight factor of 1, and a depth image with a resolution of 640x480 is given a weight factor of 1/4 in the depth integration calculation. With this configuration, the influence of high-resolution depth images, whose depth accuracy is reliable, can be strengthened in the depth integration, yielding the effect of improving the accuracy of the reproduction model.
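The two weighting examples above (down-weighting later-pass images, and down-weighting lower-resolution images) could be combined in one helper, sketched below. This is an assumption made for illustration: the patent presents the two rules as separate examples, and the area-ratio rule here is simply one way to reproduce the stated 1x vs 1/4 figures for 1280x960 vs 640x480.

```python
# Hypothetical per-image weight factor for depth integration:
#  - images added in the first pass keep full weight, later
#    (hole-filling) images get 1/10, as in the text's example;
#  - lower-resolution images are scaled by their pixel-area ratio
#    relative to full resolution (640x480 vs 1280x960 -> 1/4).

def image_weight(first_pass, width, height, full_res=(1280, 960)):
    w = 1.0 if first_pass else 0.1  # later passes: 1/10 weight
    area_ratio = (width * height) / (full_res[0] * full_res[1])
    return w * min(1.0, area_ratio)


print(image_weight(True, 1280, 960))   # → 1.0
print(image_weight(True, 640, 480))    # → 0.25
print(image_weight(False, 1280, 960))  # → 0.1
```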
In the processing of step S102, the depth integration calculation may be performed over a different range of voxels for each depth image included in the depth information. Specifically, the depth images added to the depth information in the first run of step S101 are distinguished from those added in the second and subsequent runs; when integrating the former, the TSDF values and weight values are calculated over a wide range of voxels, and when integrating the latter, over a narrow range. For example, for the former depth images, the TSDF values and weight values are calculated over a range of 3 voxels from the corresponding surface, and for the latter, over a range of 1 voxel. With this configuration, when the latter depth images are integrated, their influence on the surroundings of the hole regions is suppressed, yielding the effect of suppressing the noise that their integration would otherwise introduce into the reproduction model.
(S103)
Subsequently, in step S103, the hole detection unit 13 generates a hole extraction model based on the input reference model and reproduction model. Specifically, the hole detection unit 13 first estimates the hole regions by comparing the input reference model with the reproduction model. Next, based on the estimated hole regions, it extracts a hole extraction model from the reference model and inputs it to the camera information setting unit 14.
Any method may be used to estimate the hole regions. For example, the distance between a vertex of the reference model and its nearest vertex in the reproduction model may be calculated, and if the distance is equal to or greater than a certain value, the vertex of the reference model may be judged to correspond to a hole region. In this case, not only regions where holes actually exist but also regions where the difference in shape between the reference model and the reproduction model is large are judged to be hole regions. With this configuration, regions with a large difference in shape can also be corrected by the addition of camera parameters described later.
As another example, for a vertex of the reference model and its nearest vertex in the reproduction model, the vertex of the reference model may be judged to correspond to a hole region if the number of meshes to which the vertex of the reproduction model belongs is equal to or less than a certain value. In KinectFusion, the reproduction model is created by the Marching Cubes method from the voxel space in which the TSDF values and weight values are recorded. In this case, when the reproduction model is composed of triangular meshes, the number of meshes to which a vertex not adjacent to a hole region belongs usually falls within the range of 4 to 8. Therefore, for example, if the number of meshes to which a vertex of the reproduction model belongs is 3 or less, that vertex can be judged to be adjacent to a hole region, and the corresponding vertex of the reference model may be judged to correspond to a hole region. With this configuration, even hole regions that cannot be detected from the distance between nearest vertices as described above can be detected. Moreover, since hole detection by this method does not require complicated calculations, real-time hole detection becomes possible.
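The first hole-detection criterion (nearest-vertex distance) can be sketched as follows. This is an illustrative sketch only; the brute-force nearest-neighbour search and the function name are assumptions, and a real implementation would use a spatial index.

```python
# Hypothetical sketch of distance-based hole detection: a reference
# vertex whose nearest reproduction vertex is at least `threshold`
# away is judged to lie in a hole region.

import math

def hole_vertices(ref_vertices, rep_vertices, threshold):
    """Return the reference vertices judged to correspond to hole regions."""
    holes = []
    for p in ref_vertices:
        nearest = min(math.dist(p, q) for q in rep_vertices)
        if nearest >= threshold:
            holes.append(p)
    return holes


ref = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]  # reference-model vertices
rep = [(0.1, 0.0, 0.0)]                   # reproduction-model vertices
print(hole_vertices(ref, rep, 1.0))       # → [(5.0, 0.0, 0.0)]
```

The mesh-count criterion described above would replace the distance test with a lookup of how many triangles the nearest reproduction vertex belongs to.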
Through the above processing, the hole extraction model is generated by extracting from the reference model the vertices or meshes judged to correspond to hole regions.
(S104)
Subsequently, in step S104, the camera information setting unit 14 generates camera information based on the input hole extraction model. Specifically, the input hole extraction model is first clustered and decomposed into a plurality of clusters; hereinafter, each cluster of the decomposed hole extraction model is called a submodel. Next, for each submodel, the camera parameters optimal for recording that submodel are estimated, and these camera parameters are collected as camera information and input to the depth information generation unit 11.
Any method may be used to cluster the hole extraction model. For example, taking an appropriate vertex as a reference, vertices that are close to that vertex and whose normals are close to its normal may be assigned to the same cluster. In this way, vertices that are gathered at nearby positions and have similar normal directions are grouped into a submodel.
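The clustering rule above can be sketched as a greedy pass over the vertices. This is one possible reading under stated assumptions: each cluster is represented by its first (seed) vertex, closeness is a Euclidean tolerance, and normal similarity is a dot-product threshold; the names are hypothetical.

```python
# Hypothetical greedy clustering of hole-model vertices: a vertex joins
# an existing cluster when it is close to that cluster's seed vertex
# and its normal is aligned with the seed's normal (large dot product).

def cluster(vertices, normals, pos_tol, dot_min):
    clusters = []  # each cluster: list of vertex indices
    seeds = []     # (position, normal) of each cluster's seed vertex
    for i, (v, n) in enumerate(zip(vertices, normals)):
        for c, (sv, sn) in zip(clusters, seeds):
            close = sum((a - b) ** 2 for a, b in zip(v, sv)) <= pos_tol ** 2
            aligned = sum(a * b for a, b in zip(n, sn)) >= dot_min
            if close and aligned:
                c.append(i)
                break
        else:
            clusters.append([i])
            seeds.append((v, n))
    return clusters


verts = [(0, 0, 0), (0.1, 0, 0), (5, 5, 5)]
norms = [(0, 0, 1), (0, 0, 1), (1, 0, 0)]
print(cluster(verts, norms, 1.0, 0.9))  # → [[0, 1], [2]]
```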
The camera parameters optimal for recording a submodel are, for example, camera parameters whose camera position, orientation, and focal length make the inner product between the normals of the vertices included in the submodel and the camera's rays large.
 また例えば、サブモデルの重心をカメラの光軸に捉えるような位置及び向きを持つカメラパラメータである。 Also, for example, a camera parameter that has a position and orientation that captures the center of gravity of the submodel on the optical axis of the camera.
 また例えば、サブモデルの全体を画面内に収めつつ、画面いっぱいにサブモデルを写すような位置、向き及び焦点距離を持つカメラパラメータである。 Also, for example, it is a camera parameter that has a position, orientation, and focal length that fits the entire sub-model within the screen, but that fills the screen with the sub-model.
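One of the heuristics above (placing the sub-model's center of gravity on the optical axis, viewed from the direction the hole surface faces) can be sketched as follows. The stand-off distance, the dictionary representation of a camera parameter, and the function name are illustrative assumptions.

```python
import math

def camera_for_submodel(vertices, normals, distance=5.0):
    """Place a camera so its optical axis passes through the sub-model's
    centroid, looking back along the average vertex normal so that the
    hole surface faces the camera. `distance` is an assumed stand-off."""
    n = len(vertices)
    centroid = tuple(sum(v[i] for v in vertices) / n for i in range(3))
    avg_n = tuple(sum(nv[i] for nv in normals) / n for i in range(3))
    norm = math.sqrt(sum(c * c for c in avg_n)) or 1.0   # guard against zero normal
    avg_n = tuple(c / norm for c in avg_n)
    # Camera sits along the average normal, oriented back toward the centroid.
    position = tuple(centroid[i] + distance * avg_n[i] for i in range(3))
    direction = tuple(-c for c in avg_n)
    return {"position": position, "direction": direction, "look_at": centroid}
```

For a flat sub-model lying in the z = 0 plane with normals along +z, the sketch places the camera directly above the centroid, looking straight down.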
With the configuration of steps S101 to S104 described above, looping through the processing of steps S101 to S104 successively adds depth images to the depth information. Because the camera information generated in steps S103 and S104 consists of camera parameters that view the hole areas present in the reproduction model generated by integrating the depth information, the depth images added in the second and subsequent loops are images that fill those hole areas. That is, a depth image added in the second or a later loop has the effect of filling a hole area present in the reproduction model generated by integrating the depth images added in the first loop. Therefore, by repeating this loop, the 3D model reproduction unit 12 can generate a reproduction model in which the hole areas are filled; in other words, a reproduction model closer to the reference model can be generated.
 Note that in step S104 it is not necessary to generate camera parameters for every sub-model; camera parameters need not be generated for sub-models that satisfy a specific condition. The specific condition is, for example, that the total area of the meshes contained in the sub-model falls below a given threshold. In other words, camera parameters need not be generated for sub-models whose corresponding hole area is smaller than a given threshold. With this configuration, no depth image is added merely to fill a small hole area, so the data volume of the depth information as a whole is reduced and the amount of traffic to be transmitted can be suppressed.
 Note also that the camera parameters generated in step S104 may be accompanied by camera priority information. The camera priority is information set for each sub-model and is used by the depth information generation unit 11 to determine the order in which depth images are added to the depth information.
 One way to set the camera priority is, for example, to compute the total mesh area of each sub-model and assign priorities in descending order of that total.
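The priority rule just mentioned (rank sub-models by total triangle area, largest first) can be sketched as below. The triangle representation (three 3D points per triangle) and the function names are illustrative assumptions.

```python
def prioritize_submodels(submodels):
    """Rank sub-models by total triangle area, largest hole first, as one
    possible way to set the camera priority described above.
    Each sub-model is a list of triangles; each triangle is three 3D points."""
    def tri_area(a, b, c):
        # Triangle area = 0.5 * |AB x AC|
        u = [b[i] - a[i] for i in range(3)]
        v = [c[i] - a[i] for i in range(3)]
        cx = u[1] * v[2] - u[2] * v[1]
        cy = u[2] * v[0] - u[0] * v[2]
        cz = u[0] * v[1] - u[1] * v[0]
        return 0.5 * (cx * cx + cy * cy + cz * cz) ** 0.5

    areas = [sum(tri_area(*t) for t in sm) for sm in submodels]
    # Indices of sub-models, highest total area (highest priority) first.
    return sorted(range(len(submodels)), key=lambda i: -areas[i])
```

The returned index order would then control the order in which the depth information generation unit 11 adds the corresponding depth images.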
(Loop end judgment)
After the processing of step S104 is completed, the next processing branches depending on the loop end condition.
 If the loop end condition is satisfied, the process proceeds to step S105; that is, the depth information is output from the depth information generation device 1 and the process ends. If the loop end condition is not satisfied, the camera information generated by the camera information setting unit 14 is input to the depth information generation unit 11, and the process returns to step S101.
 The loop end condition is a condition evaluated during the processing of steps S101 to S104. For example, it may be determined that the loop end condition is satisfied when the processing of step S101 has been repeated a given number of times; in other words, when a given number of loop iterations has completed.
 As another example, it may be determined in step S101 that the loop end condition is satisfied when no further depth image can be added to the depth information, in other words, when the depth information is full.
 As another example, it may be determined in step S102 that the loop end condition is satisfied when the accuracy of the generated reproduction model exceeds a given threshold. The accuracy of a model is described later.
 As another example, it may be determined in step S103 that the loop end condition is satisfied when the area of the detected hole region falls below a given threshold.
 As another example, it may be determined in step S102 that the loop end condition is satisfied when the accuracy of the generated reproduction model falls below the accuracy of the reproduction model generated in the previous loop. In this case, the depth information output from the depth information generation device 1 may be the depth information of the previous loop.
 Note that if the loop end condition is satisfied during the processing of steps S101 to S104, the remaining processing up to the end of step S104 may be skipped. For example, if the loop end condition is satisfied in step S102, the processing of steps S103 and S104 may be skipped and the depth information generation device 1 may output the depth information.
 The accuracy of the reproduction model mentioned above is an index of how close the reproduction model generated by the 3D model reproduction unit 12 is to the reference model. The accuracy of the reproduction model is calculated, for example, as the average distance between each vertex of the reference model and its nearest vertex in the reproduction model. As another example, it is the RMSE value of the reproduction model measured against the reference model. As yet another example, an image metric such as PSNR may be computed between the depth image obtained by viewing the reference model and the depth image obtained by viewing the reproduction model with a camera having the same arbitrary camera parameters, and that value may be treated as the accuracy of the reproduction model.
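Two of the accuracy measures mentioned above (mean nearest-neighbour distance and RMSE over those distances) can be sketched as follows. The brute-force nearest-neighbour search is an illustrative simplification; a spatial index such as a k-d tree would normally be used.

```python
import math

def model_accuracy(ref_vertices, rep_vertices):
    """For every reference-model vertex, find the nearest reproduction-model
    vertex, then report the mean distance and the RMSE over those
    nearest-neighbour distances. Lower values indicate a reproduction
    model closer to the reference model."""
    dists = []
    for p in ref_vertices:
        nearest = min(math.dist(p, q) for q in rep_vertices)  # brute force
        dists.append(nearest)
    mean = sum(dists) / len(dists)
    rmse = math.sqrt(sum(d * d for d in dists) / len(dists))
    return mean, rmse
```

A reproduction model identical to the reference model yields 0 for both measures; any displacement increases them.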
(S105)
In step S105, the 3D model reproduction unit 12 generates a reproduction model based on the input depth information, as in step S102, and outputs it from the 3D model reproduction device. If a filter is applied to the depth images in the processing of step S102, it is desirable that the same filter be applied in the processing of step S105; likewise, if no filter is applied in step S102, none should be applied in step S105.
 With the above configuration, the 3D model reproduction device 2 can generate a reproduction model from the depth information by the same method as the 3D model reproduction unit 12 of the depth information generation device 1. As described above, the depth information generated by the depth information generation device 1 has been augmented with depth images that allow the 3D model reproduction unit 12 to generate a reproduction model whose hole areas are filled; therefore, the 3D model reproduction device 2 can likewise reproduce a model whose hole areas are filled. That is, a reproduction model close to the reference model can be generated. Furthermore, the data volume of depth information is generally smaller than that of a 3D model. Generating the reproduction model from transmitted depth information therefore yields a reproduction model close to the reference model while suppressing the amount of traffic, compared with transmitting the reference model directly.
[Modification 1]
A first modification of Embodiment 1 will now be described with reference to FIGS. 1 to 3. This modification describes a configuration in which, when camera information containing a plurality of camera parameters is input to the depth information generation unit 11, the depth information generation unit 11 adds those camera parameters to the existing camera information one at a time rather than all at once. The purpose of this modification is to identify, among the camera parameters input to the depth information generation unit 11, those that contribute to improving the accuracy of the reproduction model, and to add only those parameters to the camera information. For convenience, members having the same functions as the members described in the above embodiment are given the same reference numerals, and their description is omitted. This modification also uses the configuration shown in FIG. 1; however, the depth information generation unit 11 and the 3D model reproduction unit 12 additionally have the functions described below.
In this modification, the processing of steps S101 and S102 is repeated once for each camera parameter to be added.
 In the processing corresponding to step S101 of Embodiment 1, when the input camera information contains a plurality of camera parameters, the depth information generation unit 11 according to this modification adds only one of them to the existing camera information. In other words, whereas step S101 of Embodiment 1 adds all of the new camera parameters to the existing camera information, this modification adds only one at a time.
 In the processing corresponding to step S102 of Embodiment 1, the 3D model reproduction unit 12 according to this modification first generates a reproduction model as in Embodiment 1. Next, it calculates the accuracy of the generated reproduction model and compares it with the accuracy of the reproduction model generated using the camera information before the camera parameter was added. If the accuracy does not improve by at least a given amount, the camera parameter is judged not to contribute to improving the accuracy of the reproduction model and is removed from the camera information.
 In this modification, after the above step S102 ends, the process returns to step S101. This loop is repeated once for each camera parameter to be added; that is, the above processing is performed for every camera parameter contained in all of the input camera information.
 With the above configuration, only those camera parameters input to the depth information generation unit 11 that contribute to improving the accuracy of the reproduction model are added to the existing camera information. This keeps down the number of depth images included in the depth information, thereby suppressing the amount of traffic.
 〔変形例2〕
 以下、実施形態1の変形例2について、図1~図4に基づいて説明する。本変形例においては、カメラ情報設定部14aからデプス情報生成部11aにカメラ情報が入力される場合、デプス情報生成部11aにおいて、参照モデルに代わりサブモデルに基づいてデプス情報を生成する構成について説明する。なお便宜上、上記実施形態にて説明した部材と同じ機能を有する部材については、同じ符号を付記し、説明を省略する。本変形例においては図4に示す構成を用いる。
[Modification 2]
Hereinafter, a second modification of the first embodiment will be described with reference to FIGS. 1 to 4. In the present modification, when camera information is input from the camera information setting unit 14a to the depth information generation unit 11a, the depth information generation unit 11a will generate depth information based on a sub model instead of a reference model. To do. For the sake of convenience, members having the same functions as the members described in the above embodiment will be designated by the same reference numerals, and the description thereof will be omitted. In this modification, the configuration shown in FIG. 4 is used.
 本変形例の目的は、デプス情報生成部11aにおいて、参照モデルのセルフオクルージョンの影響を受けないデプス画像を記録することにある。実施形態1のデプス情報生成部11で生成される各々のデプス画像には、対応するサブモデルの全体が映されていることが望ましい。しかし、カメラ情報設定部14におけるクラスタリングの手法によっては、サブモデルの一部若しくは全体が、参照モデルのサブモデルでない領域で遮蔽されてしまうようなカメラパラメータが生成されることがある。この場合、再生モデル中の、遮蔽されたサブモデルに対応するホール領域が、該デプス画像の追加では埋まらない問題が発生する。 The purpose of this modified example is to record a depth image that is not affected by the self-occlusion of the reference model in the depth information generation unit 11a. It is desirable that each depth image generated by the depth information generation unit 11 of the first embodiment shows the entire corresponding sub-model. However, depending on the clustering method in the camera information setting unit 14, camera parameters may be generated such that a part or the whole of the sub model is shielded by a region that is not the sub model of the reference model. In this case, there arises a problem that the hole area corresponding to the occluded sub model in the reproduction model is not filled with the addition of the depth image.
 本変形例に係るカメラ情報設定部14aは、実施形態1のステップS104に相当する処理において、カメラ情報に加え、サブモデルをデプス情報生成部11aに入力する。この時、カメラ情報に含まれるカメラパラメータと、サブモデルは、対応付けられる。 The camera information setting unit 14a according to the present modification inputs the sub-model to the depth information generating unit 11a in addition to the camera information in the process corresponding to step S104 of the first embodiment. At this time, the camera parameter included in the camera information and the sub model are associated with each other.
 本変形例に係るデプス情報生成部11aは、実施形態1のステップS101に相当する処理に行う。1度目の処理では、入力される参照モデルに基づきデプス画像を生成し、デプス情報に追加する。2度目以降の処理においては、入力されるカメラ情報に含まれるカメラパラメータと対応するサブモデルに基づきデプス画像を生成し、デプス情報に追加する。 The depth information generation unit 11a according to this modification performs the process corresponding to step S101 of the first embodiment. In the first process, a depth image is generated based on the input reference model and added to the depth information. In the second and subsequent processes, the depth image is generated based on the sub-model corresponding to the camera parameter included in the input camera information and added to the depth information.
 上述の構成により、デプス情報生成部11aで生成されるデプス画像に、参照モデルのサブモデルでない領域が映り込むことがなくなるため、参照モデルのセルフオクルージョンの影響を受けないデプス画像を記録することができる。これにより、上述のホール領域を埋めることが可能になる効果を得られる。 With the above-described configuration, the depth image generated by the depth information generation unit 11a does not include a region that is not a submodel of the reference model, so that the depth image that is not affected by the self-occlusion of the reference model can be recorded. it can. As a result, it is possible to obtain the effect that the hole region described above can be filled.
 なお、変形例2は、実施形態1と併用されても良い。例えば、ステップS104において、上述のセルフオクルージョンが発生すると判断されたカメラパラメータは変形例2の処理を行い、そうでない場合は実施形態1の処理を行う構成であっても良い。 The second modification may be used in combination with the first embodiment. For example, in step S104, the camera parameter for which it is determined that the above-described self-occlusion occurs may be subjected to the process of the second modification, and if not so, the process of the first embodiment may be performed.
 〔変形例3〕
 以下、実施形態1の変形例3について、図1~図3、及び図5に基づいて説明する。本変形例においては、ホール検出部13bにおいて、実施形態1と異なる方法でホールを検出すると共に、ホール検出部13bにおいてカメラ情報を生成する構成について説明する。なお便宜上、上記実施形態にて説明した部材と同じ機能を有する部材については、同じ符号を付記し、説明を省略する。
[Modification 3]
Hereinafter, a third modified example of the first embodiment will be described with reference to FIGS. 1 to 3 and 5. In the present modification, a configuration will be described in which the hole detection unit 13b detects a hole by a method different from that of the first embodiment, and the hole detection unit 13b generates camera information. For the sake of convenience, members having the same functions as the members described in the above embodiment will be designated by the same reference numerals, and the description thereof will be omitted.
 本変形例においては図5に示す構成を用いる。図5に示す通り、デプス情報生成装置1bは、サブディビジョン部15b、デプス情報生成部11、3Dモデル再生部12及びホール検出部13bを備えている。 In this modification, the configuration shown in FIG. 5 is used. As shown in FIG. 5, the depth information generation device 1b includes a subdivision unit 15b, a depth information generation unit 11, a 3D model reproduction unit 12, and a hole detection unit 13b.
 本変形例の目的は、ホール検出部13bの処理において、実施形態1と異なる方法でホール領域を検出することを目的とする。 The purpose of this modification is to detect a hole area by a method different from that of the first embodiment in the processing of the hole detection unit 13b.
 本変形例に係るサブディビジョン部15bは、実施形態1のステップS101よりも前に行われる処理において、入力される参照モデルに対し、サブディビジョン処理を加えることにより生成される、一様な分布の頂点を持つ3Dモデルを、新たな参照モデルを生成として、デプス情報生成部11及びホール検出部13bに入力する。なお、参照モデルに加える処理は、一様な分布の頂点を持つ3Dモデルに修正できる処理であれば、サブディビジョン処理でなくても構わない。 The subdivision unit 15b according to the present modification has a uniform distribution generated by adding subdivision processing to the input reference model in the processing performed before step S101 of the first embodiment. A 3D model having vertices is input to the depth information generation unit 11 and the hole detection unit 13b as a new reference model is generated. Note that the process added to the reference model does not have to be the subdivision process as long as the process can be corrected to a 3D model having vertices with a uniform distribution.
 本変形例に係るホール検出部13bは、入力される再生モデル及び参照モデルに基づき、再生モデル中のホール領域を推定し、ホール領域に対応するカメラ情報を生成し、デプス情報生成部11に入力する。 The hole detection unit 13b according to the present modification estimates a hole area in the reproduction model based on the input reproduction model and reference model, generates camera information corresponding to the hole area, and inputs the camera information to the depth information generation unit 11. To do.
 本変形例に係るホール検出部13bにおける、ホール検出の工程の一例を解説する。
(1)入力される参照モデルが存在する3D空間を、任意の幅のグリッドに分割する。
(2)入力される参照モデル及び再生モデルに基づき、各グリッドの評価値を計算する。
 具体的には、まず、グリッドに含まれる参照モデルの頂点について、該頂点と、該頂点と最近傍の再生モデルの頂点との距離を計算する。グリッドに含まれる全ての参照モデルの頂点について、上記の計算を行い、それらの距離の合計値を、グリッドの評価値に設定する。即ち、評価値が高い程、該グリッドに相当する再生モデルの領域は、精度が悪いと言える。同様の計算を全てのグリッドに対して行う。
(3)全てのグリッドの内、評価値が任意の値よりも高いグリッドについて、カメラパラメータを生成する。即ち、グリッド内の再生モデルの精度が、許容範囲外であるグリッドについて、最適なカメラパラメータを生成し、デプス情報生成部11に入力する。最適なカメラパラメータとは、例えば、グリッド内の参照モデルが持つ頂点の法線と、カメラの光線で作られる偏角の内積が大きくなるような、カメラの位置、向き及び焦点距離を持つカメラパラメータである。また例えば、グリッドの重心をカメラの光軸に捉えるような位置及び向きを持つカメラパラメータである。
An example of the hole detection process in the hole detection unit 13b according to this modification will be described.
(1) The 3D space in which the input reference model exists is divided into grids of arbitrary width.
(2) The evaluation value of each grid is calculated based on the input reference model and reproduction model.
Specifically, first, with respect to the vertices of the reference model included in the grid, the distances between the vertices and the vertices of the nearest reproduction model are calculated. The above calculation is performed for all the vertices of the reference model included in the grid, and the total value of those distances is set as the evaluation value of the grid. That is, it can be said that the higher the evaluation value, the lower the accuracy of the region of the reproduction model corresponding to the grid. The same calculation is performed for all grids.
(3) Of all the grids, the camera parameter is generated for a grid whose evaluation value is higher than an arbitrary value. That is, the optimum camera parameter is generated for the grid whose accuracy of the reproduction model in the grid is out of the allowable range, and the optimum camera parameter is input to the depth information generation unit 11. The optimal camera parameter is, for example, a camera parameter that has a position, orientation, and focal length of the camera such that the inner product of the normal of the vertex of the reference model in the grid and the declination created by the ray of the camera is large Is. Further, for example, it is a camera parameter having a position and orientation that captures the center of gravity of the grid on the optical axis of the camera.
 以上の工程により、生成したカメラパラメータをまとめてカメラ情報とし、デプス情報生成部11に入力する。 By the above process, the camera parameters generated are collected into camera information and input to the depth information generation unit 11.
 本変形例は、変形例1と組み合わせても良い。即ち、評価の最も悪いグリッドに関するカメラパラメータのみをデプス情報生成部11に入力し、再生モデルの精度が改善される場合のみ、該カメラパラメータに対応するデプス画像を、デプス情報に追加しても良い。上述の構成により、評価の悪いグリッドから順に、該グリッドを修正する様なデプス画像を追加することできる。即ち、デプス情報生成部11において、大きなホール領域や歪に対応したデプス画像を、取りこぼすことなく、優先的に追加することができる効果を得られる。 This modification may be combined with modification 1. That is, only the camera parameter regarding the grid with the worst evaluation is input to the depth information generation unit 11, and only when the accuracy of the reproduction model is improved, the depth image corresponding to the camera parameter may be added to the depth information. .. With the above-described configuration, it is possible to add depth images that modify the grid in order from the grid with the poorest evaluation. That is, in the depth information generation unit 11, it is possible to obtain an effect that the depth image corresponding to a large hole region or distortion can be preferentially added without missing.
 〔実施形態2〕
 本発明の第2の実施形態について、図1~図3、及び図6~図8に基づいて説明する。
[Embodiment 2]
A second embodiment of the present invention will be described with reference to FIGS. 1 to 3 and 6 to 8.
 本実施形態においては、デプス情報を生成するデプス情報生成装置3において、実施形態1の処理に加え、補助モデルを生成する処理を行う構成について説明する。なお便宜上、上記実施形態にて説明した部材と同じ機能を有する部材については、同じ符号を付記し、説明を省略する。 In the present embodiment, a configuration will be described in which the depth information generation device 3 that generates depth information performs a process of generating an auxiliary model in addition to the process of the first embodiment. For the sake of convenience, members having the same functions as the members described in the above embodiment will be designated by the same reference numerals, and the description thereof will be omitted.
 〔1.デプス情報生成装置3の構成〕
 図6に基づいて本実施形態に係るデプス情報生成装置3の構成について説明する。
[1. Configuration of depth information generation device 3]
The configuration of the depth information generation device 3 according to the present embodiment will be described based on FIG.
 デプス情報生成装置3は、入力される参照モデルに基づき、デプス情報及び補助モデルを生成する装置である。本実施形態の目的は、参照モデルに含まれる頂点若しくはメッシュの一部を抜き出し、3Dモデル再生装置4に送信することである。 The depth information generation device 3 is a device that generates depth information and an auxiliary model based on an input reference model. The purpose of this embodiment is to extract a part of the vertices or meshes included in the reference model and send them to the 3D model reproduction device 4.
 上述した補助モデルとは、入力される参照モデルの一部を抜き出した情報であり、例えば、頂点若しくはメッシュからなる3Dモデルである。補助モデルは、3Dモデル再生装置4で生成される再生モデルに付加することで、再生モデルのホール領域を補償する目的で用いられる。 The above-mentioned auxiliary model is information obtained by extracting a part of the input reference model, and is, for example, a 3D model composed of vertices or meshes. The auxiliary model is used for the purpose of compensating the hole area of the reproduction model by adding it to the reproduction model generated by the 3D model reproduction device 4.
 図6は、本実施形態に係るデプス情報生成装置3の機能ブロック図である。図6に示す通り、デプス情報生成装置3は、補助モデル生成部31、デプス情報生成部11、3Dモデル再生部12、ホール検出部13、カメラ情報設定部32を備えている。 FIG. 6 is a functional block diagram of the depth information generation device 3 according to this embodiment. As shown in FIG. 6, the depth information generation device 3 includes an auxiliary model generation unit 31, a depth information generation unit 11, a 3D model reproduction unit 12, a hole detection unit 13, and a camera information setting unit 32.
 本実施形態における、デプス情報生成部11、3Dモデル再生部12及びホール検出部13の機能は、実施形態1におけるデプス情報生成装置1に含まれる同名ブロックと同様である。 The functions of the depth information generation unit 11, the 3D model reproduction unit 12, and the hole detection unit 13 in the present embodiment are the same as those of the same name block included in the depth information generation device 1 in the first embodiment.
 補助モデル生成部31は、入力される参照モデルに基づき、補助モデル及び参照モデルを生成する。補助モデル生成部31で生成される参照モデルは、入力される参照モデルから補助モデルを取り除いたモデルである。生成された参照モデルは、デプス情報生成部11及びホール検出部13に入力される。 The auxiliary model generation unit 31 generates an auxiliary model and a reference model based on the input reference model. The reference model generated by the auxiliary model generation unit 31 is a model obtained by removing the auxiliary model from the input reference model. The generated reference model is input to the depth information generation unit 11 and the hole detection unit 13.
 本実施形態におけるカメラ情報設定部32は、入力されるホール抽出モデルをクラスタリングし、個々のクラスタに係るカメラパラメータをまとめたカメラ情報及び補助モデルを生成する。生成されたカメラ情報は、デプス情報生成部11に入力される。また、生成された補助モデルは3Dモデル再生装置4に入力される。 The camera information setting unit 32 in the present embodiment clusters the input hole extraction models, and generates camera information and auxiliary models in which camera parameters related to each cluster are summarized. The generated camera information is input to the depth information generation unit 11. Further, the generated auxiliary model is input to the 3D model reproduction device 4.
 〔2.3Dモデル再生装置4の構成〕
 図7に基づいて本実施形態に係る3Dモデル再生装置4の構成について説明する。3Dモデル再生装置4は、入力されるデプス情報及び補助モデルに基づき、再生モデルを生成する装置である。
[2.3 Configuration of 3D Model Playback Device 4]
The configuration of the 3D model playback device 4 according to the present embodiment will be described based on FIG. 7. The 3D model reproduction device 4 is a device that generates a reproduction model based on the input depth information and the auxiliary model.
 図7は、本実施形態に係る3Dモデル再生装置4の機能ブロック図である。図7に示す通り、3Dモデル再生装置4は、3Dモデル再生部12及び補助モデル統合部41を備えている。 FIG. 7 is a functional block diagram of the 3D model playback device 4 according to the present embodiment. As shown in FIG. 7, the 3D model reproduction device 4 includes a 3D model reproduction unit 12 and an auxiliary model integration unit 41.
 本実施形態における、3Dモデル再生部12の機能は、実施形態1における3Dモデル再生装置2に含まれる同名モデルと同様である。 The function of the 3D model reproduction unit 12 in this embodiment is the same as that of the model with the same name included in the 3D model reproduction device 2 in the first embodiment.
 補助モデル統合部41は、入力される再生モデル及び補助モデルに基づき、再生モデルに補助モデルを加えることで、新たな3Dモデルを生成し、該3Dモデルを新たな再生モデルとして、3Dモデル再生装置4から出力する。 The auxiliary model integration unit 41 generates a new 3D model by adding the auxiliary model to the reproduction model based on the input reproduction model and the auxiliary model, and the 3D model is used as a new reproduction model. Output from 4.
 〔3.処理の流れ〕
 本実施形態における処理の流れについて図1~図3、及び図6~図8に基づいてステップごとに説明する。図8は、本実施形態に係る処理の流れを示すフローチャートである。なお、ステップS201からステップS204までは、デプス情報生成装置3における処理であって、ステップS105からステップS206までは、3Dモデル再生装置4における処理である。
[3. Processing flow]
The process flow in this embodiment will be described step by step based on FIGS. 1 to 3 and 6 to 8. FIG. 8 is a flowchart showing the flow of processing according to this embodiment. Note that steps S201 to S204 are processes in the depth information generating device 3, and steps S105 to S206 are processes in the 3D model reproducing device 4.
 (S201)
 ステップS201において、補助モデル生成部31は、入力された参照モデルに基づき、新たな参照モデル及び補助モデルを生成する。具体的には、まず入力される参照モデルから、特定の条件を満たす頂点若しくはメッシュを抜き出すことで補助モデルを生成する。次に、参照モデルから上述の条件を満たす頂点若しくはメッシュを取り除き、新たな参照モデルを生成し、デプス情報生成部11及びホール検出部13に入力する。言い替えると、参照モデルから、補助モデルを取り除くことで、新たな参照モデルを生成する。
(S201)
In step S201, the auxiliary model generation unit 31 generates a new reference model and a new auxiliary model based on the input reference model. Specifically, first, an auxiliary model is generated by extracting vertices or meshes that satisfy specific conditions from the input reference model. Next, the vertices or meshes satisfying the above conditions are removed from the reference model, a new reference model is generated, and the new reference model is input to the depth information generation unit 11 and the hole detection unit 13. In other words, a new reference model is generated by removing the auxiliary model from the reference model.
 上述の特定の条件を満たす頂点若しくはメッシュとは、例えば、3Dモデル再生部12の処理において、再現されにくい領域に対応する頂点若しくはメッシュである。3Dモデル再生部12の処理においては、ボクセル単位でデプスマップの統合を行うため、参照モデル中のボクセルサイズよりも細かなディティールは再現されない。上述した問題を鑑み、上述したボクセルサイズよりも細かなディティールを持つ、参照モデル中の頂点若しくはメッシュを抜き出すことで、補助モデルを生成する。上述の構成により、参照モデルを直せる送信する場合に比べ、トラフィック量を抑制しつつ、3Dモデル再生装置4において、ボクセルサイズよりも細かなディティールを再現できる効果を得られる。 The vertices or meshes that satisfy the above-described specific conditions are, for example, vertices or meshes that correspond to regions that are difficult to reproduce in the processing of the 3D model reproduction unit 12. In the process of the 3D model reproduction unit 12, since the depth maps are integrated in voxel units, details smaller than the voxel size in the reference model are not reproduced. In view of the above-mentioned problem, the auxiliary model is generated by extracting the vertices or meshes in the reference model, which have details smaller than the voxel size described above. With the above-described configuration, it is possible to obtain the effect of reproducing details smaller than the voxel size in the 3D model reproduction device 4 while suppressing the traffic volume, as compared with the case where the reference model can be corrected.
(S101, S102, S103)
In steps S101 to S103, the same processing as that of the first embodiment is performed.
(S204)
In step S204, the camera information setting unit 32 generates additional camera information and an auxiliary model based on the input hole extraction model. That is, in step S204, an auxiliary model is generated in addition to the same processing as step S104 of the first embodiment.
Specifically, while performing the same processing as step S104 of the first embodiment, the auxiliary model is generated by extracting vertices or meshes that satisfy a specific condition from the sub-models. The vertices or meshes satisfying this condition are, for example, those belonging to a sub-model whose total mesh area falls below a given value, in other words, a sub-model whose hole region has an area below that value. For a small hole region, the region can sometimes be filled with vertices or meshes carrying less information than the depth image that would otherwise be added to fill it.
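The sub-model selection described here might be sketched like this; the vertex/face representation and the fixed area threshold are assumptions for illustration:

```python
import math

def triangle_area(p, q, r):
    # Area of a 3D triangle via the cross product of two edge vectors.
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    c = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    return 0.5 * math.sqrt(sum(x * x for x in c))

def select_auxiliary_submodels(vertices, submodels, area_threshold):
    """Pick the sub-models whose total mesh area is below the threshold;
    such small hole regions are candidates for transmission as an
    auxiliary model instead of as an additional depth image."""
    picked = []
    for idx, faces in enumerate(submodels):
        total = sum(triangle_area(vertices[i], vertices[j], vertices[k])
                    for (i, j, k) in faces)
        if total < area_threshold:
            picked.append(idx)
    return picked
```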
With the configuration of steps S201 to S204 described above, the depth information generation device 3 can generate an auxiliary model in addition to the depth information generated by the depth information generation device 1 of the first embodiment. This auxiliary model is a 3D model extracted from the reference model, corresponding to its fine regions and to the small hole regions of the reproduction model generated by the 3D model reproduction unit 12. By selecting the auxiliary model so that its data amount for a small hole region is less than that of a depth image capable of filling the same region, the information needed to fill small hole regions can be transmitted in the form of an auxiliary model, which carries less information than a depth image, yielding the effect of suppressing the traffic volume.
Note that the vertices or meshes included in the auxiliary model may be removed from the sub-model or the reference model. In other words, the 3D model obtained by removing the auxiliary model from the sub-model or reference model may be used as the new sub-model or reference model.
(Supplementary notes on the auxiliary model)
The auxiliary models generated in steps S201 and S204 are held inside the depth information generation device 3 and are output from it at the same time as the depth information. The auxiliary models generated by the auxiliary model generation unit 31 and the camera information setting unit 32 may be integrated into a single auxiliary model.
Note that only one of steps S201 and S204 in the present embodiment may be processed. That is, the auxiliary model may be generated by only one of the auxiliary model generation unit 31 and the camera information setting unit 32. When no auxiliary model is generated in step S204, the processing of step S204 is the same as step S104 of the first embodiment.
Further, the auxiliary model does not necessarily have to be the 3D model exactly as extracted from the reference model or sub-model; it may be a model whose number of vertices or meshes has been reduced after extraction. With this configuration, an auxiliary model with a smaller data amount can be used than when the sub-model is turned into an auxiliary model unchanged, yielding the effect of suppressing the traffic volume.
(S105)
In step S105, the same processing as that of the first embodiment is performed.
(S206)
In step S206, the auxiliary model integration unit 41 generates a new reproduction model based on the input reproduction model and auxiliary model. Specifically, the reproduction model and the auxiliary model are integrated to generate a new reproduction model, which is output from the 3D model reproduction device 4. Integrating the reproduction model and the auxiliary model here refers to combining the vertices or meshes of both into a single 3D model, which replaces the reproduction model.
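A minimal sketch of this integration, assuming the models are plain vertex lists and triangle index tuples:

```python
def merge_models(rep_vertices, rep_faces, aux_vertices, aux_faces):
    """Integrate the reproduction model and the auxiliary model by adding
    together the vertices and meshes of both; auxiliary face indices are
    shifted past the reproduction model's vertices so they stay valid in
    the combined vertex list."""
    offset = len(rep_vertices)
    vertices = list(rep_vertices) + list(aux_vertices)
    faces = list(rep_faces) + [(i + offset, j + offset, k + offset)
                               for (i, j, k) in aux_faces]
    return vertices, faces
```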
With the above configuration, the 3D model reproduction device 4 can output a 3D model in which the auxiliary model is added to the reproduction model generated by integrating the depth information in the 3D model reproduction unit 12. By adding the auxiliary model generated by the camera information setting unit 32 of the depth information generation device 3 to this reproduction model, a reproduction model whose hole regions are filled can be generated. In addition, by adding the auxiliary model generated by the auxiliary model generation unit 31 of the depth information generation device 3, detail can be reproduced that could not be reproduced by depth integration, in which the detail of the 3D model is limited by the voxel size. These processes yield the effect of generating a reproduction model close to the reference model.
Note that when the reproduction model output from the 3D model reproduction device 4 is displayed after the processing of step S206 is completed, each vertex may be displayed at a size that fills the hole region. With this configuration, a hole region of the reproduction model can be filled by an auxiliary model with fewer vertices, so the data amount of the auxiliary model can be reduced and the traffic volume suppressed.
[Embodiment 3]
A third embodiment of the present invention will be described with reference to FIGS. 1 to 3 and 9 to 11.
In the present embodiment, a configuration is described in which the depth information generation device 5, which generates depth information, additionally performs a process of generating auxiliary TSDF information. For convenience, members having the same functions as those described in the above embodiments are given the same reference numerals, and their description is omitted.
[1. Configuration of Depth Information Generation Device 5]
The configuration of the depth information generation device 5 according to this embodiment will be described based on FIG. 9.
The depth information generation device 5 is a device that generates depth information and an auxiliary TSDF based on an input reference model. The purpose of this embodiment is to generate the auxiliary TSDF by comparing a reproduction voxel space generated from the depth information with a reference voxel space generated from the reference model, and to transmit the auxiliary TSDF to the 3D model reproduction device 6.
The reproduction voxel space mentioned above is a voxel space, generated by the 3D model reproduction unit 51 based on the depth information, in which TSDF values and weight values are recorded.
The reference voxel space mentioned above is a voxel space, generated by the reference voxel space generation unit 52 based on the reference model, in which TSDF values and weight values are recorded. Note that the reproduction voxel space and the reference voxel space have the same voxel resolution; in other words, both contain the same number of voxels, and the voxels correspond one-to-one.
The auxiliary TSDF mentioned above is information generated by the auxiliary TSDF generation unit 53 based on the reproduction voxel space and the reference voxel space, recording voxel coordinates and the TSDF values those voxels hold. The auxiliary TSDF holds at least the coordinates of each source voxel and that voxel's TSDF value. The auxiliary TSDF is added, in the 3D model reproduction unit 61 of the 3D model reproduction device 6, to the voxel space generated from the depth information, and is used to improve the accuracy of the reproduction model generated by the 3D model reproduction unit 61.
FIG. 9 is a functional block diagram of the depth information generation device 5 according to this embodiment. As shown in FIG. 9, the depth information generation device 5 includes a depth information generation unit 11, a 3D model reproduction unit 51, a reference voxel space generation unit 52, and an auxiliary TSDF generation unit 53. The function of the depth information generation unit 11 in this embodiment is the same as that of the block of the same name in the depth information generation device 1 of the first embodiment.
The 3D model reproduction unit 51 performs depth integration processing based on the input depth information to generate the reproduction voxel space.
The reference voxel space generation unit 52 generates the reference voxel space based on the input reference model.
The auxiliary TSDF generation unit 53 generates the auxiliary TSDF by comparing the input reproduction voxel space and reference voxel space.
[2. Configuration of 3D Model Reproduction Device 6]
The configuration of the 3D model reproduction device 6 according to the present embodiment will be described based on FIG. 10. The 3D model reproduction device 6 is a device that generates a reproduction model based on the input depth information and auxiliary TSDF.
FIG. 10 is a functional block diagram of the 3D model reproduction device 6 according to the present embodiment. As shown in FIG. 10, the 3D model reproduction device 6 includes a 3D model reproduction unit 61.
The 3D model reproduction unit 61 generates a reproduction model based on the input depth information and auxiliary TSDF.
[3. Processing flow]
The flow of processing in this embodiment will be described step by step based on FIGS. 1 to 3 and 9 to 11. FIG. 11 is a flowchart showing the flow of processing according to this embodiment. Note that steps S101 to S303 are processes in the depth information generation device 5, and steps S304 to S106 are processes in the 3D model reproduction device 6.
(S101)
In step S101, the same processing as that of the first embodiment is performed.
(S301)
Subsequently, in step S301, the 3D model reproduction unit 51 generates the reproduction voxel space based on the input depth information. Specifically, within the processing of step S102 of the first embodiment, the depth information is integrated and processing is carried out up to the calculation of the TSDF values and weight values of the voxels; the TSDF values of the voxels are then extracted to generate the reproduction voxel space, which is input to the auxiliary TSDF generation unit 53.
Note that, in the processing of step S301, the TSDF values and weight values are calculated, for each pixel of the depth image, over the voxels lying in that pixel's normal direction.
(S302)
Subsequently, in step S302, the reference voxel space generation unit 52 generates the reference voxel space based on the input reference model. Specifically, for every vertex included in the reference model, the TSDF values and weights are calculated over the voxels lying in that vertex's normal direction; the resulting reference voxel space is input to the auxiliary TSDF generation unit 53. The TSDF value here represents the distance from a voxel to a vertex of the reference model, a signed value whose smaller magnitude means the voxel is closer to the vertex. For example, a positive TSDF value denotes a voxel on the positive side of the vertex's normal, and a negative TSDF value denotes a voxel on the negative side. When multiple TSDF values fall into one voxel, their average is taken. The weight values are set to a uniform non-zero value.
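A sketch of this per-vertex TSDF computation, under illustrative assumptions: a sparse dict keyed by integer voxel indices, the signed distance measured along the unit normal, a uniform weight of 1.0, and a truncation range given in voxel steps:

```python
import math

def build_reference_voxel_space(vertices, normals, voxel_size, trunc_voxels=1):
    """For each vertex, walk trunc_voxels steps along its unit normal in
    both directions and record a signed TSDF value equal to the signed
    distance from the sampled voxel to the vertex (positive on the
    normal's positive side). Voxels hit several times store the mean of
    their TSDF values; weights are a uniform non-zero value."""
    sums, hits, weight = {}, {}, {}
    for p, n in zip(vertices, normals):
        norm = math.sqrt(sum(c * c for c in n))
        n = tuple(c / norm for c in n)
        for step in range(-trunc_voxels, trunc_voxels + 1):
            d = step * voxel_size                               # signed distance
            q = tuple(p[i] + d * n[i] for i in range(3))        # sample point
            idx = tuple(math.floor(q[i] / voxel_size) for i in range(3))
            sums[idx] = sums.get(idx, 0.0) + d
            hits[idx] = hits.get(idx, 0) + 1
            weight[idx] = 1.0                                   # uniform non-zero weight
    tsdf = {idx: sums[idx] / hits[idx] for idx in sums}
    return tsdf, weight
```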
Note that the TSDF value calculation in the reference voxel space generation unit 52 covers a narrower range of voxels than that in the 3D model reproduction unit 51. For example, the 3D model reproduction unit 51 may calculate TSDF values over a range of 3 voxels from the corresponding surface, while the reference voxel space generation unit 52 calculates them over a range of 1 voxel from the corresponding vertex.
(S303)
Subsequently, in step S303, the auxiliary TSDF generation unit 53 generates the auxiliary TSDF based on the input reproduction voxel space and reference voxel space. Specifically, the input reproduction voxel space and reference voxel space are compared, and the resulting auxiliary TSDF is output from the depth information generation device 5. The comparison between the reproduction voxel space and the reference voxel space is, for example, a process of judging, for every voxel in the reproduction voxel space and the corresponding voxel in the reference voxel space, whether the weight value is 0. Here, a voxel whose weight value is 0 is one for which a TSDF value has never been calculated. Specifically, the voxels of the reference voxel space are extracted for which the corresponding reproduction voxel's weight value is 0 while the reference voxel's weight value is not 0. That is, voxels for which a TSDF value has been calculated in the reference voxel space but not in the reproduction voxel space are taken from the reference voxel space. Voxels satisfying this condition correspond to hole regions in the reproduction model.
As another example, the comparison may judge, for every voxel in the reproduction voxel space whose weight value is not 0 and the corresponding reference voxel whose weight value is not 0, whether the TSDF values differ greatly. Specifically, the difference between the TSDF value of the reproduction voxel and that of the reference voxel is calculated, and the reference voxels for which this difference is at least a certain value are extracted. Voxels satisfying this condition correspond to regions of the reproduction model that deviate from the reference model.
As yet another example, the comparison may judge, for every voxel in the reproduction voxel space whose weight value is not 0 and the corresponding reference voxel whose weight value is not 0, whether the signs of the TSDF values match. Specifically, the sign of the reproduction voxel's TSDF value is compared with that of the reference voxel's TSDF value, and if they do not match, the reference voxel is extracted. Voxels satisfying this condition likewise correspond to regions of the reproduction model that deviate from the reference model.
For all of the voxels thus extracted, the auxiliary TSDF is generated by collecting each voxel's coordinates together with the TSDF value held at those coordinates in the reference voxel space. As a concrete example, the auxiliary TSDF holds, per voxel, the tuple (voxel X coordinate, voxel Y coordinate, voxel Z coordinate, TSDF value).
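The three comparison criteria above can be sketched together as follows; the sparse dict representation of the voxel spaces and the numeric difference threshold are illustrative assumptions:

```python
def build_auxiliary_tsdf(rep_tsdf, rep_weight, ref_tsdf, ref_weight,
                         diff_threshold=0.5):
    """Walk the reference voxel space and collect (x, y, z, tsdf) records
    for voxels the depth-based reproduction misses or gets wrong:
      1. hole voxels      : reproduction weight 0 (or absent), reference
                            weight non-zero
      2. diverging voxels : both weights non-zero and the TSDF difference
                            reaches diff_threshold
      3. sign mismatches  : both weights non-zero and the TSDF signs differ
    """
    records = []
    for idx, ref_val in ref_tsdf.items():
        if ref_weight.get(idx, 0) == 0:
            continue
        rep_w = rep_weight.get(idx, 0)
        rep_val = rep_tsdf.get(idx, 0.0)
        is_hole = rep_w == 0
        diverged = rep_w != 0 and abs(rep_val - ref_val) >= diff_threshold
        flipped = rep_w != 0 and (rep_val >= 0) != (ref_val >= 0)
        if is_hole or diverged or flipped:
            records.append((*idx, ref_val))   # (x, y, z, TSDF value)
    return records
```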
With the configuration of steps S101 through S303 described above, the auxiliary TSDF can be generated and output by comparing the reproduction voxel space, obtained by integrating the depth information generated in the depth information generation unit 11, with the reference voxel space generated from the reference model. The auxiliary TSDF can be said to hold TSDF values present in the reference voxel space but absent from the depth-based reproduction voxel space. In other words, regions of the reference model that cannot be reproduced from the depth information output by the depth information generation unit 11 alone can be reproduced via the auxiliary TSDF. This yields the effect that the 3D model reproduction device 6 can generate a reproduction model closer to the reference model. Furthermore, by transmitting the information for filling small hole regions of the reproduction model in the form of the auxiliary TSDF, which carries less information than a depth image, the traffic volume can be suppressed.
(S304)
In step S304, the 3D model reproduction unit 61 generates a reproduction model based on the input depth information and auxiliary TSDF. Specifically, first, as in the processing of step S301, the depth information is integrated and processing is carried out up to the calculation of the TSDF values and weight values of the voxel space. Next, the auxiliary TSDF is added to the voxel space calculated by the 3D model reproduction unit 61. A reproduction model is then generated from the voxel space to which the auxiliary TSDF has been added, and output. The method of generating the reproduction model from the voxel space is the same as the processing performed by the 3D model reproduction unit of the first embodiment.
The auxiliary TSDF is added to the voxel space, for example, by overwriting the TSDF value of the voxel at each coordinate recorded in the auxiliary TSDF with the TSDF value recorded there. When a TSDF value is overwritten, the voxel's weight value is replaced with the lowest non-zero value, regardless of whether the voxel already holds a weight value.
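A minimal sketch of this overwrite, assuming the voxel space is a sparse dict and using a hypothetical constant for the "lowest non-zero weight":

```python
MIN_WEIGHT = 1e-3   # hypothetical stand-in for the lowest non-zero weight

def apply_auxiliary_tsdf(tsdf, weight, auxiliary_tsdf):
    """Each (x, y, z, value) record of the auxiliary TSDF overwrites the
    voxel's TSDF value, and the voxel's weight is forced to the lowest
    non-zero value whether or not it already held a weight."""
    for x, y, z, value in auxiliary_tsdf:
        tsdf[(x, y, z)] = value
        weight[(x, y, z)] = MIN_WEIGHT
    return tsdf, weight
```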
With the above configuration, the TSDF values held by the auxiliary TSDF are added to the voxel space calculated by integrating the depth information in the 3D model reproduction unit 61. This yields the effect that holes present in a reproduction model generated from the depth information alone can be filled by the auxiliary TSDF. In addition, regions that are difficult to reproduce by integrating depth information, such as sharp regions of the reference model, can be reproduced by adding the auxiliary TSDF.
[Modification of Embodiment 3]
Hereinafter, a modified example of the third embodiment will be described with reference to FIGS. 1 to 3 and 9 to 12. In this modification, a configuration in which the first and third embodiments are used together will be described. For the sake of convenience, members having the same functions as the members described in the above embodiment are designated by the same reference numerals and the description thereof will be omitted. In this modification, the configuration shown in FIG. 12 is used.
This modification aims to use the first and third embodiments together in the depth information generation device 5a.
The 3D model reproduction unit 51a according to this modification generates, in addition to the reproduction voxel space, a reproduction model in the same way as step S102 of the first embodiment.
In the processing of this modification, the processing from step S101 to step S104 of the first embodiment is first looped, with the 3D model reproduction unit 51a taking on the role of the 3D model reproduction unit 12. When the loop termination condition described in the first embodiment is satisfied, the depth information is generated and output, and the loop ends. Next, the processing from step S301 to step S303 of the third embodiment is performed to generate the auxiliary TSDF, which is output from the depth information generation device 5a. Note that in step S301, the depth information input to the 3D model reproduction unit 51a is the depth information generated by the above loop.
With the above configuration, the voxel space comparison of this modification can be performed between the reference voxel space and the reproduction voxel space generated from the depth information produced in the first embodiment. Large hole regions can thereby be filled by the depth images added in the first embodiment, while small hole regions are filled by the auxiliary TSDF generated in the third embodiment; this reduces the overall data amount and yields the effect of suppressing the traffic volume. The effect of generating a reproduction model close to the reference model is also obtained.
[Example of software implementation]
The control blocks of the depth information generation devices 1, 1a, 3, 5, and 5a and of the 3D model reproduction devices 2, 4, and 6 may be realized by a logic circuit (hardware) formed in an integrated circuit (IC chip) or the like, or may be realized by software.
In the latter case, the depth information generation devices 1, 1a, 3, 5, and 5a and the 3D model reproduction devices 2, 4, and 6 include a computer that executes the instructions of a program, which is software realizing each function. This computer includes, for example, at least one processor (control device) and at least one computer-readable recording medium storing the program. In the computer, the object of the present invention is achieved by the processor reading the program from the recording medium and executing it. As the processor, a CPU (Central Processing Unit) can be used, for example. As the recording medium, a "non-transitory tangible medium" such as a ROM (Read Only Memory), tape, disk, card, semiconductor memory, or programmable logic circuit can be used. A RAM (Random Access Memory) into which the program is loaded may further be provided. The program may be supplied to the computer via any transmission medium (a communication network, broadcast wave, or the like) capable of transmitting it. Note that one aspect of the present invention can also be realized in the form of a data signal embedded in a carrier wave, in which the program is embodied by electronic transmission.
[Summary]
A depth information generation device according to aspect 1 of the present invention includes: a depth information generation unit that generates depth information based on camera information from a reference model; a 3D model reproduction unit that integrates the depth information to reproduce a reproduction model; a hole detection unit that, referring to the reference model, estimates hole regions present in the reproduction model and extracts them as a hole extraction model; and a camera information setting unit that sets the camera information based on the hole extraction model.
According to the above configuration, it is possible to realize a depth information generation device that generates depth information which suppresses the traffic volume while improving the quality of the reproduction model in the 3D model reproduction device.
A depth information generation device according to aspect 2 of the present invention may, in aspect 1 above, further include an auxiliary model generation unit that extracts fine regions and hole regions from the reference model as an auxiliary model, and a camera information setting unit that generates an auxiliary model by extracting fine regions and hole regions from the hole extraction model.
 本発明の態様3に係るデプス情報生成装置は、参照モデルを元にカメラ情報に基づいたデプス情報を生成するデプス情報生成部と、前記デプス情報を統合して再生ボクセル空間を生成する3Dモデル再生部と、前記参照モデルに基づき参照ボクセル空間を生成する参照ボクセル空間生成部と、前記再生ボクセル空間と上記参照ボクセル空間を比較することで補助TSDFを生成する補助TSDF生成部と、を備えることを特徴とする構成である。 A depth information generation device according to aspect 3 of the present invention includes: a depth information generation unit that generates depth information based on camera information from a reference model; a 3D model reproduction unit that integrates the depth information to generate a reproduction voxel space; a reference voxel space generation unit that generates a reference voxel space based on the reference model; and an auxiliary TSDF generation unit that generates an auxiliary TSDF by comparing the reproduction voxel space with the reference voxel space.
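A minimal sketch of the comparison in aspect 3 might represent both voxel spaces as mappings from voxel indices to TSDF values; the auxiliary TSDF then carries only the voxels where the reproduction deviates from the reference. The dictionary representation and the tolerance are assumptions of this sketch:

```python
def auxiliary_tsdf(reference_vox, reproduction_vox, tol=1e-3):
    """Collect voxels whose reproduced TSDF value is missing or deviates
    from the reference; only these need to be sent as side information."""
    aux = {}
    for idx, ref_val in reference_vox.items():
        rep_val = reproduction_vox.get(idx)
        if rep_val is None or abs(rep_val - ref_val) > tol:
            aux[idx] = ref_val
    return aux

reference = {(0, 0, 0): 0.2, (1, 0, 0): -0.1, (2, 0, 0): 0.05}
reproduction = {(0, 0, 0): 0.2, (1, 0, 0): 0.4}  # one wrong value, one missing voxel
print(auxiliary_tsdf(reference, reproduction))   # → {(1, 0, 0): -0.1, (2, 0, 0): 0.05}
```

At reproduction time the receiver overwrites the corresponding voxels with the auxiliary values after integrating the depth information.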
 本発明の態様4に係るデプス情報生成装置は、上記の態様1から3のいずれかにおいて、前記3Dモデル再生部において前記デプス情報を統合する際に上記デプス情報に映された物体の輪郭近辺のデプス値を統合しない構成としても良い。 The depth information generation device according to aspect 4 of the present invention may, in any of aspects 1 to 3 above, be configured such that, when the 3D model reproduction unit integrates the depth information, depth values near the contour of an object captured in the depth information are not integrated.
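The contour exclusion of aspect 4 could, for example, mark pixels whose depth jumps sharply relative to a neighbour and skip them during integration. The 4-neighbour test and the jump threshold below are illustrative assumptions, not the patent's method:

```python
def contour_mask(depth, jump=0.5):
    """Flag pixels whose depth differs sharply from any 4-neighbour.
    Such pixels straddle an object contour and would be skipped when
    the depth image is integrated into the 3D model."""
    h, w = len(depth), len(depth[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and abs(depth[y][x] - depth[ny][nx]) > jump:
                    mask[y][x] = True
                    break
    return mask

depth = [[1.0, 1.0, 5.0],
         [1.0, 1.0, 5.0]]   # a depth discontinuity between columns 1 and 2
mask = contour_mask(depth)  # columns 1 and 2 are flagged, column 0 is not
```

Skipping the flagged pixels avoids smearing foreground and background depths together across the silhouette.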
 本発明の態様5に係るデプス情報生成装置は、上記の態様1から3のいずれかにおいて、前記3Dモデル再生部において前記デプス情報を統合した後にTSDF値を補間するフィルタを加える構成としても良い。 The depth information generation device according to aspect 5 of the present invention may, in any of aspects 1 to 3 above, be configured such that a filter that interpolates TSDF values is applied after the 3D model reproduction unit integrates the depth information.
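One hedged reading of the interpolation filter in aspect 5: after integration, voxels that remained unobserved are filled with the mean TSDF of their observed 6-neighbours, provided enough neighbours exist. The sparse-dictionary representation and the support threshold of 4 are assumptions of this sketch:

```python
def interpolate_tsdf(vox, min_support=4):
    """Fill voxels left unobserved after integration with the mean TSDF of
    their observed 6-neighbours, closing small pinholes in the surface."""
    offsets = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
    support = {}
    for (x, y, z), val in vox.items():
        for dx, dy, dz in offsets:
            n = (x + dx, y + dy, z + dz)
            if n not in vox:
                support.setdefault(n, []).append(val)
    filled = dict(vox)
    for n, vals in support.items():
        if len(vals) >= min_support:          # enough observed neighbours to interpolate
            filled[n] = sum(vals) / len(vals)
    return filled

# Six observed voxels around an unobserved centre voxel at the origin.
vox = {(1, 0, 0): 0.5, (-1, 0, 0): 0.5, (0, 1, 0): 0.5,
       (0, -1, 0): 0.5, (0, 0, 1): 0.5, (0, 0, -1): 0.5}
filled = interpolate_tsdf(vox)                # filled[(0, 0, 0)] == 0.5
```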
 本発明の態様6に係るデプス情報生成装置は、上記の態様1から3のいずれかにおいて、前記ホール検出部において前記再生モデルのホール領域を検出するために前記参照モデルの最近傍頂点が隣接するメッシュの数を利用する構成としても良い。 In the depth information generation device according to aspect 6 of the present invention, in any of aspects 1 to 3 above, the hole detection unit may use the number of meshes adjacent to the nearest vertex of the reference model in order to detect the hole area of the reproduction model.
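A common way to realise the adjacency test of aspect 6 is to count, for each edge, the number of meshes (faces) sharing it: in a closed surface every edge borders exactly two faces, so a vertex incident to a single-face edge lies on a hole boundary. The sketch below assumes triangles given as vertex-index triples and is an illustration, not the claimed procedure:

```python
from collections import Counter

def boundary_vertices(faces):
    """Return vertices incident to an edge shared by only one face; in a
    watertight mesh every edge borders two faces, so these vertices mark
    hole boundaries."""
    edge_count = Counter()
    for a, b, c in faces:
        for e in ((a, b), (b, c), (c, a)):
            edge_count[tuple(sorted(e))] += 1
    boundary = set()
    for edge, n in edge_count.items():
        if n == 1:
            boundary.update(edge)
    return boundary

open_quad = [(0, 1, 2), (0, 2, 3)]                           # flat patch: whole rim is boundary
closed_tetra = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]  # watertight: no boundary
```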
 本発明の態様7に係るデプス情報生成装置は、上記の態様1から3のいずれかにおいて、前記カメラ情報設定部が生成した前記カメラ情報に基づきデプス画像を一つずつ前記デプス情報に加え、加える度に前記再生モデルの精度を測り水準を満たさない場合、追加したデプス画像をデプス情報から取り除く構成としても良い。 In the depth information generation device according to aspect 7 of the present invention, in any of aspects 1 to 3 above, depth images generated based on the camera information set by the camera information setting unit may be added to the depth information one at a time; each time a depth image is added, the accuracy of the reproduction model is measured, and if the accuracy does not meet a required level, the added depth image is removed from the depth information.
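The add-then-verify procedure of aspect 7 can be sketched as a greedy loop. The `evaluate` callback and the improvement criterion used here are assumptions standing in for the patent's accuracy measurement:

```python
def select_depth_images(candidates, evaluate, required_accuracy):
    """Add candidate depth images one at a time; measure reproduction
    accuracy after each addition and drop images that do not improve it."""
    selected = []
    accuracy = evaluate(selected)
    for image in candidates:
        selected.append(image)
        new_accuracy = evaluate(selected)
        if new_accuracy <= accuracy:
            selected.pop()                     # the image did not help: remove it again
        else:
            accuracy = new_accuracy
        if accuracy >= required_accuracy:
            break
    return selected

# Toy accuracy: fraction of the target views a selection covers.
targets = {"a", "b", "c"}
coverage = lambda sel: len(set(sel) & targets) / len(targets)
print(select_depth_images(["a", "b", "a", "c"], coverage, 1.0))  # → ['a', 'b', 'c']
```

The redundant second `"a"` is rejected because it leaves the measured accuracy unchanged, which keeps the transmitted depth information small.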
 本発明の態様8に係るデプス情報生成装置は、上記の態様1から3のいずれかにおいて、デプス情報生成部において参照モデルに代わりサブモデルを元に前記デプス情報を生成する構成としても良い。 The depth information generation device according to aspect 8 of the present invention may, in any of aspects 1 to 3 above, be configured such that the depth information generation unit generates the depth information from a sub-model instead of the reference model.
 本発明の態様9に係るデプス情報生成装置は、上記の態様1から3のいずれかにおいて、前記参照モデルにサブディビジョン処理を施し一様な分布を持つ3Dモデルとするサブディビジョン部を更に備え、ホール検出部において分割されたグリッド毎に前記再生モデルの精度を評価し、評価の悪いグリッドに対応するカメラパラメータを優先的にカメラ情報に加える構成としても良い。 The depth information generation device according to aspect 9 of the present invention may, in any of aspects 1 to 3 above, further include a subdivision unit that applies subdivision processing to the reference model to obtain a 3D model with a uniform distribution; the hole detection unit evaluates the accuracy of the reproduction model for each divided grid, and camera parameters corresponding to poorly evaluated grids are preferentially added to the camera information.
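For aspect 9, the per-grid evaluation could be sketched by bucketing sampled surface points into grid cells and ranking the cells by mean error, worst first; cameras covering the worst cells would then be added to the camera information with priority. The cell size and the point/error representation are assumptions of this illustration:

```python
from collections import defaultdict

def prioritize_grids(points, errors, cell=1.0):
    """Bucket sampled surface points into grid cells and rank the cells
    by mean reproduction error, worst first."""
    acc = defaultdict(lambda: [0.0, 0])
    for (x, y, z), err in zip(points, errors):
        key = (int(x // cell), int(y // cell), int(z // cell))
        acc[key][0] += err
        acc[key][1] += 1
    return sorted(acc, key=lambda k: acc[k][0] / acc[k][1], reverse=True)

points = [(0.1, 0.0, 0.0), (0.2, 0.0, 0.0), (1.5, 0.0, 0.0)]
errors = [0.9, 0.7, 0.1]
print(prioritize_grids(points, errors))  # → [(0, 0, 0), (1, 0, 0)]
```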
 本発明の態様10に係る3Dモデル再生装置は、前記デプス情報生成部で生成された前記デプス情報を統合し前記再生モデルを再生する上記3Dモデル再生部と、を備えることを特徴とする構成である。 A 3D model reproduction device according to aspect 10 of the present invention includes the 3D model reproduction unit that integrates the depth information generated by the depth information generation unit and reproduces the reproduction model.
 本発明の態様11に係る3Dモデル再生装置は、上記の態様10において、前記再生モデルに補助モデルを加え新たな再生モデルとする補助モデル生成部と、をさらに備え、前記再生モデルの細かい領域を復元する構成としても良い。 A 3D model reproduction device according to aspect 11 of the present invention may, in aspect 10 above, further include an auxiliary model generation unit that adds an auxiliary model to the reproduction model to form a new reproduction model, thereby restoring fine regions of the reproduction model.
 本発明の態様11に係る3Dモデル再生装置は、上記の態様10において、前記デプス情報を統合することで上に補助TSDFを加える3Dモデル生成部と、を備え、前記再生モデルの細かい領域を復元する構成としても良い。 A 3D model reproduction device according to aspect 11 of the present invention may, in aspect 10 above, include a 3D model generation unit that integrates the depth information and then adds an auxiliary TSDF on top of it, thereby restoring fine regions of the reproduction model.
 本発明の各態様に係るデプス情報生成装置1、1a、1b、3、5、5a及び3Dモデル生成装置2、4、6は、コンピュータによって実現してもよく、この場合には、コンピュータをデプス情報生成装置1、1a、1b、3、5、5a及び3Dモデル生成装置2、4、6が備える各部(ソフトウェア要素)として動作させることによりデプス情報生成装置1、1a、1b、3、5、5a及び3Dモデル生成装置2、4、6をコンピュータにて実現させるデプス情報生成装置1、1a、1b、3、5、5a及び3Dモデル生成装置2、4、6の制御プログラム、およびそれを記録したコンピュータ読み取り可能な記録媒体も、本発明の範疇に入る。 The depth information generation devices 1, 1a, 1b, 3, 5, and 5a and the 3D model generation devices 2, 4, and 6 according to the respective aspects of the present invention may be realized by a computer. In this case, a control program of the depth information generation devices 1, 1a, 1b, 3, 5, and 5a and the 3D model generation devices 2, 4, and 6 that realizes them on a computer by causing the computer to operate as each unit (software element) included in these devices, and a computer-readable recording medium on which the program is recorded, also fall within the scope of the present invention.
 本発明は上述した各実施形態に限定されるものではなく、請求項に示した範囲で種々の変更が可能であり、異なる実施形態にそれぞれ開示された技術的手段を適宜組み合わせて得られる実施形態についても本発明の技術的範囲に含まれる。さらに、各実施形態にそれぞれ開示された技術的手段を組み合わせることにより、新しい技術的特徴を形成することができる。

 
The present invention is not limited to the above-described embodiments; various modifications are possible within the scope of the claims, and embodiments obtained by appropriately combining the technical means disclosed in different embodiments are also included in the technical scope of the present invention. Furthermore, new technical features can be formed by combining the technical means disclosed in each embodiment.

Claims (6)

  1.  参照モデルを元にカメラ情報に基づいたデプス情報を生成するデプス情報生成部と、
     上記デプス情報を統合して再生モデルを再生する3Dモデル再生部と、
     上記参照モデルを参照して、上記再生モデルに存在するホール領域を推定し、ホール抽出モデルとして抽出するホール検出部と、
     上記ホール抽出モデルに基づき、上記カメラ情報を設定するカメラ情報設定部と、
     を備えることを特徴とするデプス情報生成装置。
    A depth information generation device comprising:
    a depth information generation unit that generates depth information based on camera information from a reference model;
    a 3D model reproduction unit that reproduces a reproduction model by integrating the depth information;
    a hole detection unit that estimates, with reference to the reference model, a hole area existing in the reproduction model and extracts it as a hole extraction model; and
    a camera information setting unit that sets the camera information based on the hole extraction model.
  2.  上記参照モデル中の細かい領域及びホール領域を抜き出し補助モデルとする補助モデル生成部と、
     上記ホール抽出モデル中の細かい領域及びホール領域を抜き出すことで補助モデルを生成するカメラ情報設定部と、
     をさらに備え、上記再生モデルの再生時に上記補助モデルを上記再生モデルに加える
     ことを特徴とする請求項1に記載のデプス情報生成装置。
    The depth information generation device according to claim 1, further comprising:
    an auxiliary model generation unit that extracts fine areas and hole areas in the reference model as an auxiliary model; and
    a camera information setting unit that generates an auxiliary model by extracting fine areas and hole areas in the hole extraction model,
    wherein the auxiliary model is added to the reproduction model when the reproduction model is reproduced.
  3.  上記デプス情報を統合して再生ボクセル空間を生成する3Dモデル再生部と、
     上記参照モデルに基づき参照ボクセル空間を生成する参照ボクセル空間生成部と、
     上記再生ボクセル空間と上記参照ボクセル空間を比較することで補助TSDFを生成する補助TSDF生成部と、
     をさらに備え、上記再生モデルの再生時に上記デプス情報を統合した上に補助TSDFを加える
     ことを特徴とする請求項1に記載のデプス情報生成装置。
    The depth information generation device according to claim 1, further comprising:
    a 3D model reproduction unit that integrates the depth information to generate a reproduction voxel space;
    a reference voxel space generation unit that generates a reference voxel space based on the reference model; and
    an auxiliary TSDF generation unit that generates an auxiliary TSDF by comparing the reproduction voxel space with the reference voxel space,
    wherein, when the reproduction model is reproduced, the depth information is integrated and the auxiliary TSDF is then added.
  4.  上記3Dモデル再生部において上記デプス情報を統合する際に上記デプス情報に映された物体の輪郭近辺のデプス値を統合しない
     ことを特徴とする請求項1又は2又は3に記載のデプス情報生成装置。
    The depth information generation device according to claim 1, 2, or 3, wherein, when the 3D model reproduction unit integrates the depth information, depth values near the contour of an object captured in the depth information are not integrated.
  5.  上記3Dモデル再生部において上記デプス情報を統合した後にTSDF値を補間するフィルタを加える
     ことを特徴とする請求項1又は2又は3に記載のデプス情報生成装置。
    The depth information generation device according to claim 1, 2, or 3, wherein a filter that interpolates TSDF values is applied after the 3D model reproduction unit integrates the depth information.
  6.  上記ホール検出部において上記再生モデルのホール領域を検出するために上記参照モデルの最近傍頂点が隣接するメッシュの数を利用する
     ことを特徴とする請求項1又は2又は3に記載のデプス情報生成装置。
     
    The depth information generation device according to claim 1, 2, or 3, wherein the hole detection unit uses the number of meshes adjacent to the nearest vertex of the reference model to detect the hole area of the reproduction model.
PCT/JP2020/001072 2019-01-30 2020-01-15 Image generation device, display processing device, image generation method, control program, and recording medium WO2020158392A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019014233A JP2022049716A (en) 2019-01-30 2019-01-30 Image generating device, display processing device, image generating method, control program, and recording medium
JP2019-014233 2019-01-30

Publications (1)

Publication Number Publication Date
WO2020158392A1 true WO2020158392A1 (en) 2020-08-06

Family

ID=71842115

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/001072 WO2020158392A1 (en) 2019-01-30 2020-01-15 Image generation device, display processing device, image generation method, control program, and recording medium

Country Status (2)

Country Link
JP (1) JP2022049716A (en)
WO (1) WO2020158392A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009016808A1 (en) * 2007-07-27 2009-02-05 Techno Dream 21 Co., Ltd. Image processing device, image processing method, and program
WO2018070266A1 (en) * 2016-10-13 2018-04-19 ソニー株式会社 Image processing device and image processing method

Also Published As

Publication number Publication date
JP2022049716A (en) 2022-03-30

Similar Documents

Publication Publication Date Title
US11348285B2 (en) Mesh compression via point cloud representation
US20200250798A1 (en) Three-dimensional model encoding device, three-dimensional model decoding device, three-dimensional model encoding method, and three-dimensional model decoding method
US9014462B2 (en) Depth information generating device, depth information generating method, and stereo image converter
JP6283108B2 (en) Image processing method and apparatus
US9135744B2 (en) Method for filling hole-region and three-dimensional video system using the same
US11902577B2 (en) Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
CN103828359B (en) For producing the method for the view of scene, coding system and solving code system
EP0930585B1 (en) Image processing apparatus
KR101634562B1 (en) Method for producing high definition video from low definition video
US11557081B2 (en) Image processing apparatus, control method for an image processing apparatus, and medium
JPH06326987A (en) Method and equipment for representing picture accompanied by data compression
TW202037169A (en) Method and apparatus of patch segmentation for video-based point cloud coding
CN114208200A (en) Processing point clouds
EP0903695A1 (en) Image processing apparatus
CN113038123A (en) No-reference panoramic video quality evaluation method, system, terminal and medium
Bleyer et al. Temporally consistent disparity maps from uncalibrated stereo videos
JP7344988B2 (en) Methods, apparatus, and computer program products for volumetric video encoding and decoding
US11127141B2 (en) Image processing apparatus, image processing method, and a non-transitory computer readable storage medium
WO2020195767A1 (en) 3d model transmitting device and 3d model receiving device
CN113453012B (en) Encoding and decoding method and device and electronic equipment
KR20060111528A (en) Detection of local visual space-time details in a video signal
WO2020158392A1 (en) Image generation device, display processing device, image generation method, control program, and recording medium
JP2021157237A (en) Free viewpoint video generation method, device and program
CN115861145A (en) Image processing method based on machine vision
WO2022141222A1 (en) Virtual viewport generation method and apparatus, rendering and decoding methods and apparatuses, device and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20748163

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20748163

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP