WO2021206284A1 - Depth estimation method and system using cycle gan and segmentation - Google Patents

Depth estimation method and system using cycle GAN and segmentation

Info

Publication number
WO2021206284A1
Authority
WO
WIPO (PCT)
Prior art keywords
segmentation
information
image
depth
loss
Prior art date
Application number
PCT/KR2021/001803
Other languages
French (fr)
Korean (ko)
Inventor
이승호
곽동훈
Original Assignee
한밭대학교 산학협력단 (Hanbat National University Industry-Academic Cooperation Foundation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 한밭대학교 산학협력단 (Hanbat National University Industry-Academic Cooperation Foundation)
Publication of WO2021206284A1 publication Critical patent/WO2021206284A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/50 Depth or shape recovery
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06T 5/00 Image enhancement or restoration
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning

Definitions

  • the present invention relates to a method and a system for estimating depth using a cycle GAN and segmentation, and more particularly, to a depth estimation method and system that estimates the depth information of an image using only a single image, through a cycle GAN and segmentation, without using special equipment or cameras.
  • 3D information refers to information that includes spatial cues such as depth and scale beyond the visual information of an image.
  • 3D information is indispensable in fields such as VR, AR, and autonomous driving, which demand technologies that can acquire and compute it more accurately and quickly.
  • in augmented reality, a virtual environment is overlaid on the real environment to provide the user with reinforced, additional information.
  • the virtual environment created by computer graphics naturally overlaps with the real environment, so that a more immersive service can be provided to users.
  • These technologies can build a natural virtual environment only when 3D information is combined with the visual information coming through the camera.
  • the present invention is intended to solve the disadvantages of the prior art, and it is an object of the present invention to extract 3D images inexpensively using only a single camera, without special equipment such as cameras, radar, ultrasonic devices, or other sensors.
  • another object of the present invention is to make it easy to obtain the data needed to generate 3D image information.
  • a further object is to solve the data imbalance problem that occurs in the process of estimating depth information.
  • the depth estimation method using a cycle GAN and segmentation includes: a step (S10) of generating depth information and segmentation image information for an input RGB image X of a standard database using a depth generator and a segmentation generator; a step (S20) of reconstructing an RGB image using the generated depth information and segmentation image information; and a step (S30) of discriminating and comparing the generated depth information, the segmentation image information, and the reconstructed RGB image against the standard database, and calculating a loss and a discrimination probability for each.
  • the method further includes a step (S60) of estimating depth information for an input RGB image using the depth generator trained through steps (S10) to (S50), and a step (S70) of estimating segmentation image information for that input RGB image using the trained segmentation generator.
  • the depth estimation system using cycle GAN and segmentation includes an image information learning unit, a calculation unit, a determination unit, a database, an image input unit, and an image information estimation unit.
  • the database includes a standard database.
  • the image information learning unit receives an RGB image from the standard database, generates depth information using a depth generator and segmentation image information using a segmentation generator, reconstructs an RGB image using the generated depth information and segmentation image information, and performs learning through the objective function of the cycle GAN.
  • the calculation unit compares the depth information generated by the image information learning unit, the segmentation image information, and the reconstructed RGB image with a standard database, respectively, and calculates a loss and a discrimination probability for each.
  • the determination unit determines whether each loss and discrimination probability value of a discriminator satisfy a preset reference convergence value based on the result calculated by the operation unit.
  • the image input unit receives an RGB image.
  • the image information estimating unit uses the depth generator and the segmentation generator whose learning has been completed in the image information learning unit to estimate depth information for the RGB image received from the image input unit.
  • the depth estimation method and system using a cycle GAN and segmentation according to the present invention can extract 3D images inexpensively, because 3D information is generated from only a single image without special equipment, cameras, radar, ultrasound, or other sensors.
  • being highly scalable, it can generate 3D information even when other inputs such as stereo images, optical flow, or point clouds are unavailable, and it is advantageous for miniaturizing the equipment needed to extract 3D information.
  • since the depth information of an image can be estimated from a single image, data for generating 3D image information can be obtained easily.
  • FIG. 1 is a diagram illustrating a problem occurring in a conventional depth estimation process.
  • FIG. 2 is a block diagram illustrating a depth estimation system using a cycle GAN and segmentation according to an embodiment of the present invention.
  • FIG. 3 is a conceptual diagram illustrating a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
  • FIG. 4 is a flowchart illustrating a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
  • FIG. 5 is a flowchart illustrating a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
  • FIG. 6 is a diagram illustrating an operation sequence of a method for estimating depth information of a single image according to an embodiment of the present invention.
  • FIG. 7 is a diagram illustrating a segmentation estimation process according to an embodiment of the present invention.
  • FIG. 8 is a diagram illustrating a depth estimation process according to an embodiment of the present invention.
  • FIG. 9 is a diagram illustrating a generation distribution of a generator and a discrimination probability of a discriminator.
  • FIG. 10 is a diagram illustrating a cycle-consistency loss according to an embodiment of the present invention.
  • FIG. 11 is a diagram illustrating a depth information estimation step of an execution step according to an embodiment of the present invention.
  • FIG. 12 is a diagram illustrating a segmentation information estimation step of an execution step according to an embodiment of the present invention.
  • FIGS. 13A and 13B are diagrams showing a comparison before and after the segmentation process is used in the depth information estimation process.
  • FIG. 14 is a diagram illustrating an evaluation procedure of a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
  • FIG. 1 is a diagram illustrating a problem occurring in a conventional depth estimation process, and FIG. 2 is a configuration diagram illustrating a depth estimation system 10 using a cycle GAN and segmentation according to an embodiment of the present invention. That is, FIG. 1 illustrates the problem that depth information comes out ambiguous in a conventional depth estimation process for an image.
  • a GAN (Generative Adversarial Network) is an unsupervised generative model in which two neural networks with opposing roles compete with each other, producing a synergistic effect.
  • the GAN includes a generator that generates data instances, and a discriminator that determines whether data is authentic.
  • the generator receives random noise z sampled from a zero-mean Gaussian and generates fake data resembling the actual data distribution.
  • the discriminator distinguishes whether the data generated by the generator is fake data or data of a training dataset, and indicates a probability for each. Therefore, the discriminator operates to decrease the probability of making a mistake, and the generator operates to increase the probability that the discriminator makes a mistake. This is called the Minimax Problem.
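  • For reference, this minimax game has the well-known standard form (a general formulation, not reproduced in the source text itself):

    $$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$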
  • the depth estimation method and system 10 using a cycle GAN and segmentation improves on conventional depth estimation approaches that require special equipment or multiple images: depth information is estimated using a cycle GAN (Cycle Generative Adversarial Network) and segmentation.
  • in general, when depth information is estimated from RGB images through learning, data imbalance in the training data causes problems: depth information for relatively under-learned features comes out ambiguous, or small features fade and are buried in larger ones.
  • the depth estimation method and system 10 using a cycle GAN and segmentation therefore introduces segmentation to visualize the data imbalance problem that occurs when estimating depth information conventionally, and to highlight the small features that would otherwise be buried and lost in relatively large ones.
  • the depth estimation system 10 using a cycle GAN and segmentation includes an image information learning unit 100, a calculation unit 200, a determination unit 300, a database 400, an image input unit 500, and an image information estimation unit 600.
  • the database 400 includes a standard database 410 .
  • the image information learning unit 100 receives an RGB image from the standard database 410 and estimates depth information using a depth generator and segmentation image information using a segmentation generator.
  • to convert RGB image information into segmentation information, the image information learning unit 100 obtains segmentation information for the input RGB image X through the segmentation generator, and uses that information for depth estimation.
  • likewise, the image information learning unit 100 obtains depth information for the input RGB image X through the depth generator and uses that information for estimating segmentation information. Also, the image information learning unit 100 restores the RGB image using the generated depth information and segmentation image information. In addition, the image information learning unit 100 trains both generators through the objective function of the cycle GAN.
  • the calculation unit 200 discriminates and compares the generated depth information, the segmentation image information, and the reconstructed RGB image against the standard database 410, and calculates a loss and a discrimination probability for each. In doing so, the calculation unit 200 computes the numerical loss and discrimination probability values through the objective function of the cycle GAN.
  • the determination unit 300 determines whether each loss and the discrimination probability value of the discriminator satisfy a preset reference convergence value based on the result calculated by the operation unit 200 .
  • based on the determination result, when the loss and discrimination probability values do not satisfy the preset reference convergence value, the determination unit 300 adjusts the learning so that each loss and the discriminator's discrimination probability converge to the preset reference convergence value, and feeds back to the image information learning unit 100 to induce re-learning or re-estimation of the depth information and segmentation image information. That is, the determination unit 300 feeds the adjusted result back to the image information learning unit 100 so that re-learning is performed there.
  • the database 400 includes a standard database 410 for performing learning in the image information learning unit 100 . That is, the image information learning unit 100 receives the standard database 410 from the database 400 and estimates depth information and segmentation image information.
  • the standard database 410 includes RGB image information, depth information, and segmentation information.
  • NYU Depth Dataset V2 may be used as the standard database 410.
  • the database 400 stores a reference convergence value serving as a determination criterion of the determination unit 300 .
  • the database 400 also stores the depth generator and segmentation generator data used by the image information learning unit 100 to estimate depth information and segmentation information.
  • the image input unit 500 receives an RGB image.
  • the image information estimator 600 uses the depth generator and the segmentation generator whose learning process has been completed to estimate depth information or segmentation information for an RGB image received from the image input unit 500.
  • FIG. 3 is a conceptual diagram illustrating a depth estimation method using cycle GAN and segmentation according to an embodiment of the present invention
  • FIG. 4 is a flowchart illustrating a depth estimation method using cycle GAN and segmentation according to an embodiment of the present invention
  • FIG. 5 is a flowchart illustrating a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
  • the depth estimation method using cycle GAN and segmentation includes a learning step and an execution step as shown in FIG. 4 .
  • in the learning step, segmentation and a depth estimation method using that segmentation are learned.
  • in the learning step, the learning is adjusted by calculating the objective function and the discrimination probability.
  • in the execution step, depth information is estimated using only RGB image information, based on the learning result of the learning step.
  • the execution step estimates depth information using the generator that was used while learning depth information in the learning step.
  • the learning step of the depth estimation method using a cycle GAN and segmentation may include: a step (S10) of generating depth information and segmentation image information for the input RGB image X of the standard database 410 using the depth generator and the segmentation generator; a step (S20) of reconstructing an RGB image using the generated depth information and segmentation image information; and a step (S30) of discriminating and comparing the generated depth information, the segmentation image information, and the reconstructed RGB image against the standard database 410, and calculating a loss and a discrimination probability for each, as sketched in the code below.
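  • The following is a minimal, hypothetical sketch of this S10 to S50 loop in PyTorch. The patent does not disclose architectures or hyperparameters, so every name and value below (toy_net, g_depth, g_seg, f_rec, the weight lam, the convergence threshold) is an illustrative assumption; discriminator updates are omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins for the generators/discriminators (architectures are not
# disclosed in the text; these hypothetical 2-layer CNNs are placeholders).
def toy_net(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, out_ch, 3, padding=1))

g_depth = toy_net(3, 1)   # S10: RGB -> depth
g_seg   = toy_net(3, 1)   # S10: RGB -> segmentation
f_rec   = toy_net(2, 3)   # S20: (depth, seg) -> reconstructed RGB
d_depth = nn.Sequential(toy_net(1, 1), nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Sigmoid())
d_seg   = nn.Sequential(toy_net(1, 1), nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Sigmoid())

opt = torch.optim.Adam([p for m in (g_depth, g_seg, f_rec) for p in m.parameters()], lr=2e-4)
lam = 10.0                            # cycle-consistency weight (assumed)
x = torch.rand(4, 3, 64, 64)          # stand-in batch of RGB images

for step in range(100):               # S50: repeat until convergence
    depth_fake = g_depth(x)           # S10: generate depth
    seg_fake   = g_seg(x)             # S10: generate segmentation
    x_rec = f_rec(torch.cat([depth_fake, seg_fake], dim=1))  # S20: restore RGB

    # S30: generator-side adversarial losses plus L1 cycle-consistency loss
    adv = F.binary_cross_entropy(d_depth(depth_fake), torch.ones(4, 1)) \
        + F.binary_cross_entropy(d_seg(seg_fake), torch.ones(4, 1))
    cyc = (x_rec - x).abs().mean()
    loss = adv + lam * cyc

    opt.zero_grad(); loss.backward(); opt.step()
    if loss.item() < 0.5:             # S40: preset reference convergence value (assumed)
        break
```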
  • FIG. 6 is a diagram illustrating the operation sequence of a method for estimating the depth information of a single image using a cycle GAN and segmentation according to an embodiment of the present invention, FIG. 7 is a diagram illustrating a segmentation estimation process according to an embodiment of the present invention, and FIG. 8 is a diagram illustrating a depth estimation process according to an embodiment of the present invention.
  • in the step (S10) of generating the depth information and the segmentation image information, depth information and segmentation information are estimated from the input RGB image of the standard database 410, and a learning rate is calculated through the objective function. In addition, the effect that the combination of cycle-consistency losses for each domain has on the performance of depth estimation can be evaluated.
  • the segmentation network structure for estimating the segmentation information and the depth network structure for estimating the depth information in step (S10) have the same structure; only the roles of the generator and discriminator in each are exchanged.
  • a hint about depth information may be provided through segmentation information of the standard database 410 .
  • segmentation information for the input RGB image X may be obtained by the segmentation generator, and that information may be used for depth estimation.
  • likewise, depth information for the input RGB image X may be obtained by the depth generator.
  • as shown in FIG. 4, the generators of the two networks receive feedback from the loss and discrimination probability calculated in step (S30), and learn to estimate depth information and segmentation information from the RGB image.
  • the objective function may be composed of an adversarial loss function and a cycle-consistency loss function of a cyclic generative adversarial network (GAN).
  • the adversarial loss function learns according to the minimax results of the generator and the discriminator.
  • a generator imitates a standard distribution of data, and a discriminator calculates a discrimination probability accordingly.
  • the adversarial loss function calculates only the adversarial loss among the objective functions calculated in the step S10 of generating the depth information and the segmentation image information. In this step, since there is no intersection between depth and segmentation, depth information and segmentation information are estimated independently of each other.
  • the step (S30) of calculating the loss and the discrimination probability is performed on the depth image estimated in the step (S10) of generating the depth information and the segmentation image information, the segmentation image, and the RGB image reconstructed in the step (S20).
  • the depth image, the segmentation image, and the reconstructed RGB image may each be discriminated through a discriminator, and probabilities for each may be calculated.
  • in [Equation 1], D represents the discriminator's discrimination probability for each input, G represents the generator's mapping of the input into data space, and λ represents the hyperparameter used for weighting.
  • Equation 1 is composed of an adversarial loss function and a cycle-consistency loss function of a cyclic GAN (Generative Adversarial Network).
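  • [Equation 1] itself appears only as an image in the source. Following the cycle GAN formulation the text describes, with depth and segmentation domains and the weighting hyperparameter λ (the generator and discriminator symbols here are reconstructions, not the patent's own notation), a plausible form is:

    $$\mathcal{L}(G_{dep}, G_{seg}, D_{dep}, D_{seg}) = \mathcal{L}_{\mathrm{GAN}}(G_{dep}, D_{dep}) + \mathcal{L}_{\mathrm{GAN}}(G_{seg}, D_{seg}) + \lambda\,\mathcal{L}_{\mathrm{cyc}}$$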
  • the adversarial loss function can be calculated in the process of estimating segmentation information and depth information from RGB image information, and can be expressed as [Equation 2] and [Equation 3] below.
  • in [Equation 2] and [Equation 3], E represents the expected value over the distribution, and P_i represents the probability distribution for i.
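  • [Equation 2] and [Equation 3] are likewise images in the source. Assuming the standard adversarial loss for each of the two mappings (RGB X to depth, and RGB X to segmentation; symbols again reconstructed), they plausibly take the form:

    $$\mathcal{L}_{\mathrm{GAN}}(G_{dep}, D_{dep}) = \mathbb{E}_{d \sim P_{depth}}[\log D_{dep}(d)] + \mathbb{E}_{x \sim P_{X}}[\log(1 - D_{dep}(G_{dep}(x)))]$$

    $$\mathcal{L}_{\mathrm{GAN}}(G_{seg}, D_{seg}) = \mathbb{E}_{s \sim P_{seg}}[\log D_{seg}(s)] + \mathbb{E}_{x \sim P_{X}}[\log(1 - D_{seg}(G_{seg}(x)))]$$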
  • learning proceeds while the depth discriminator judges the fake data generated by the depth generator, and the segmentation discriminator judges the fake data generated by the segmentation generator.
  • the core of the adversarial loss function is to map the distribution generated through the GAN to the actual distribution. Accordingly, the Adversarial Loss Function is trained according to the minimax results of the generator and the discriminator, and the generator can generate a distribution perfectly similar to the actual distribution. In addition, the discrimination probability of the discriminator converges to 50%.
  • FIG. 9 is a diagram illustrating a generation distribution of a generator and a discrimination probability of a discriminator. That is, FIG. 9 is a graph showing a process in which the generation distribution of the generator and the discrimination probability of the discriminator are changed as the learning of the GAN proceeds in the learning step.
  • the black dotted line represents the actual data distribution
  • the green solid line represents the generative distribution of the generator
  • the blue dotted line represents the discriminator distribution.
  • the small distance between the two distributions means that the distributions are very similar, indicating that the discriminator cannot easily discriminate.
  • the loss used in the conventional CNN-based learning method is combined to induce the generation distribution of the generator to learn the standard distribution of the target.
  • the reconstruction loss may be expressed as in [Equation 4] below.
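  • [Equation 4] is an image in the source; a typical reconstruction loss of the kind described, comparing the generator output G(x) against the target y under an L1 penalty, would be:

    $$\mathcal{L}_{\mathrm{rec}} = \mathbb{E}_{x, y}\big[\lVert y - G(x) \rVert_{1}\big]$$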
  • the method may further include a cycle-consistency loss calculation step (S31) that compares the reconstructed RGB image with the original RGB image X, inducing the generators to produce depth information and a segmentation image while maintaining the shape of the original.
  • the cycle-consistency loss calculation step (S31) evaluates the similarity obtained when the two types of image information, estimated separately under the two objective functions, are restored back to the original image information.
  • the cycle-consistency loss calculation step S31 serves to induce the generator to attempt conversion to a domain while maintaining the shape of each domain.
  • a cycle GAN and a depth estimation method using segmentation may use a cycle GAN composed of two domains: segmentation and depth. Accordingly, the cycle-consistency loss can be expressed as [Equation 5] below, which is composed of the sum of two losses.
  • an error value generated through restoration is set as a cycle-consistency loss. Therefore, if the restoration is performed well, the loss function is lowered.
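  • [Equation 5] is an image in the source. Given the two domains and the restoring generators (written here as F_dep and F_seg, reconstructed names), a plausible form of this two-term cycle-consistency loss is:

    $$\mathcal{L}_{\mathrm{cyc}} = \mathbb{E}_{x \sim P_X}\big[\lVert F_{dep}(G_{dep}(x)) - x \rVert_1\big] + \mathbb{E}_{x \sim P_X}\big[\lVert F_{seg}(G_{seg}(x)) - x \rVert_1\big]$$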
  • FIG. 10 is a diagram illustrating the cycle-consistency loss of the cycle GAN model according to an embodiment of the present invention. As FIG. 10 shows, the cycle-consistency loss means that, when restoring to the RGB image, the generator must consider not only restoration from the depth information but also restoration from the segmentation information.
  • the L1 loss is relatively robust compared to the L2 loss, and is robust against the unstable-solution problem.
  • the L1 loss may be expressed as in Equation 6 below.
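  • [Equation 6] is an image in the source; the standard L1 loss over n values, with predictions ŷ_i and targets y_i, is:

    $$\mathcal{L}_{1} = \frac{1}{n}\sum_{i=1}^{n}\lvert y_i - \hat{y}_i \rvert$$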
  • the depth estimation method using a cycle GAN and segmentation calculates an adversarial loss and a cycle-consistency loss through the learning step, and on this basis can finally estimate fake depth information once training is completed.
  • FIG. 11 is a diagram illustrating a depth information estimation step of an execution step according to an embodiment of the present invention
  • FIG. 12 is a diagram illustrating a segmentation information estimation step of an execution step according to an embodiment of the present invention.
  • the execution step of the depth estimation method using the cycle GAN and segmentation includes a step (S60) of estimating depth information for an input RGB image of RGB data using the depth generator produced through steps (S10) to (S50), and a step (S70) of estimating segmentation image information for that input RGB image using the segmentation generator produced through those steps.
  • as shown in FIG. 11, the step (S60) of estimating the depth information uses the depth generator produced in the learning step; that is, it generates a depth image of the RGB image that is similar to an actual depth image.
  • as shown in FIG. 12, the step (S70) of estimating the segmentation image information uses the segmentation generator produced in the learning step; that is, it converts the RGB image into segmentation information. A sketch of this inference pass follows below.
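  • As a minimal, hypothetical sketch of the execution step, the trained depth generator is simply applied to a new RGB image. The file name and serialization method are illustrative assumptions, not taken from the patent.

```python
import torch

# Hypothetical: load the depth generator saved after the learning step (S50).
g_depth = torch.load("g_depth.pt", weights_only=False)  # file name is illustrative
g_depth.eval()

rgb = torch.rand(1, 3, 480, 640)    # stand-in for a single input RGB image
with torch.no_grad():
    depth = g_depth(rgb)            # S60: RGB -> estimated depth map
# The segmentation generator would be applied the same way for step S70.
```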
  • FIGS. 13A and 13B are diagrams comparing the results before and after the segmentation process is used in the depth information estimation process. That is, FIG. 13A illustrates the uncertainty of an image estimated without the segmentation process, and FIG. 13B illustrates the result of estimating depth information for the input image with the segmentation process added.
  • the reason for adding the segmentation process to the conventional depth estimation process is to solve the uncertainty problem of depth information in the estimation result for the input image, as the comparison in FIGS. 13A and 13B shows.
  • the depth estimation method and system 10 using a cycle GAN and segmentation can thereby improve depth estimation results through a multi-task learning technique, which improves performance by learning several related variables together.
  • FIG. 14 is a diagram illustrating the evaluation procedure of the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention. As shown in FIG. 14, the evaluation uses NYU Depth Dataset V2, an open standard database, to assess the reliability of the method.
  • the NYU Depth Dataset V2 database provides video sequence data focusing on various indoor scenes shot using Microsoft's Kinect v1 model.
  • the NYU Depth Dataset V2 database provides depth information and segmentation information for RGB images through the Labeled Dataset.
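  • For orientation, the labeled subset of NYU Depth Dataset V2 is distributed as a single HDF5 .mat file; the sketch below shows one way to read it with h5py. The field names follow the dataset's published layout but should be verified against your copy.

```python
import h5py
import numpy as np

# Read the NYU Depth Dataset V2 labeled subset (an HDF5 container).
with h5py.File("nyu_depth_v2_labeled.mat", "r") as f:
    images = np.array(f["images"])  # RGB images
    depths = np.array(f["depths"])  # per-pixel depth in meters
    labels = np.array(f["labels"])  # per-pixel segmentation class indices

print(images.shape, depths.shape, labels.shape)
```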
  • the segmentation and depth estimation step (S110) of FIG. 14 corresponds to the step (S10) of generating depth information and segmentation image information for the input RGB image X of the standard database 410 using the depth generator and the segmentation generator.
  • in step (S110), depth information and segmentation information are estimated, and a learning rate is calculated through the objective function.
  • the effect of the combination of cycle-consistency losses according to each domain of the present invention on performance is evaluated.
  • the generated depth information, the segmentation image information, and the reconstructed RGB image are each discriminated and compared against the standard database 410, which corresponds to the step (S30) of calculating the loss and discrimination probability for each.
  • in the adversarial loss calculation step (S130), only the adversarial loss is calculated among the objective functions computed in the segmentation and depth estimation step (S110).
  • depth and segmentation estimation are performed independently of each other.
  • the method may further include the step (S120) of reconstructing the RGB image using the depth information and the segmentation image information generated after the segmentation and depth estimation step (S110).
  • in the cycle-consistency loss calculation step (S131), the reconstructed RGB image is compared with the original RGB image to calculate the cycle-consistency loss, and a penalty is applied to the depth generator and the segmentation generator. Through this process, each generator performs depth and segmentation estimation while taking the restoration of depth information and segmentation information into account.
  • the depth and segmentation evaluation step (S140) of FIG. 14 corresponds to the step (S40) of determining whether each loss and the discriminator's discrimination probability satisfy a preset reference convergence value, based on the calculated result values.
  • the generated depth information is evaluated by measuring the RMSLE, a variation of the root mean square error (RMSE) that measures the numerical error of the generated result.
  • the predicted values p_i and the actual values a_i needed to calculate the RMSLE are normalized to values between 0 and 1 before being input.
  • the RMSLE cost function is mainly used to give a penalty to an underestimated item rather than an overestimated item. It is a numerical value indicating the error for the correct answer, and the larger the value, the greater the error.
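  • The RMSLE equation is an image in the source; its standard definition over n values, with predictions p_i and ground-truth values a_i as above, is:

    $$\mathrm{RMSLE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\big(\log(p_i + 1) - \log(a_i + 1)\big)^{2}}$$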
  • Table 1 shows the results of comparison and evaluation of the depth estimation method using cycle GAN and segmentation according to an embodiment of the present invention with other techniques based on the NYU Depth Dataset V2 database.
  • the RMSLE value of the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention is 0.220, lower than the other techniques. Since a lower RMSLE indicates a better depth estimation method, this confirms that the method exhibits a higher degree of similarity to the ground truth than the other techniques.
  • compared with conventional methods that must use special equipment or sensors to obtain 3D information, the depth estimation method and system 10 using a cycle GAN and segmentation generates 3D information using only a single camera, and is therefore cheaper, highly scalable, and advantageous for miniaturization.
  • the depth estimation method and system 10 using cycle GAN and segmentation can estimate depth information and improve the precision of depth information.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a depth estimation method and system using cycle GAN and segmentation, the method and the system estimating depth information about an image by using only a single image through cycle GAN and segmentation without using special equipment or a camera. A depth estimation method using cycle GAN and segmentation, according to an embodiment of the present invention, comprises the steps of: (S10) generating depth information and segmentation image information about an input RGB image of a standard database by using a generator; (S20) reconstructing an RGB image by using the generated depth information and segmentation image information; and (S30) calculating a loss and a discrimination probability by comparing the generated depth information and segmentation image information, and the reconstructed RGB image with the standard database, and discriminating same, respectively. In addition, the depth estimation method comprises the steps of: (S40) determining whether the loss and the discrimination probability value satisfy a preset reference convergence value, on the basis of the calculated result values; (S50) adjusting training on the basis of the determination result so that the loss and the discrimination probability value of a discriminator converge on the preset reference convergence value, and repeating steps (S10) to (S40); and (S60) estimating depth information about the RGB image by using a generator generated through steps (S10) to (S50).

Description

Depth estimation method and system using cycle GAN and segmentation
The present invention relates to a method and a system for estimating depth using a cycle GAN and segmentation, and more particularly, to a depth estimation method and system that estimates the depth information of an image using only a single image, through a cycle GAN and segmentation, without using special equipment or cameras.
In the image processing field, 3D information refers to information that includes spatial cues such as depth and scale beyond the visual information of an image. Starting with the 4th industrial revolution, such 3D information has become indispensable in fields such as VR, AR, and autonomous driving, which demand technologies that can acquire and compute it more accurately and quickly.
For example, in the field of augmented reality (AR), a virtual environment is overlaid on the real environment to provide the user with reinforced, additional information. A virtual environment created by computer graphics naturally overlaps with the real environment, so that a more immersive service can be provided to users. These technologies can build a natural virtual environment only when 3D information is combined with the visual information coming through the camera.
Therefore, to obtain such 3D information, radar, ultrasonic, and laser sensors have been developed, and 3D imaging methods using special cameras or stereo cameras have been proposed.
However, obtaining 3D information conventionally requires special equipment, cameras, radar, ultrasonic devices, and sensors, so the cost of extracting 3D information is high and the data cannot be obtained easily.
[Prior art literature]
[Patent Document] Republic of Korea Patent Registration No. 10-1650702 (announced August 24, 2016)
Accordingly, the present invention is intended to solve these disadvantages of the prior art: its objects are to extract 3D images inexpensively using only a single camera without special equipment, cameras, radar, ultrasonic devices, or sensors; to make it easy to obtain the data needed to generate 3D image information; and to solve the data imbalance problem that occurs while estimating depth information.
To achieve these technical objects, a depth estimation method using a cycle GAN and segmentation according to one aspect of the present invention includes: a step (S10) of generating depth information and segmentation image information for an input RGB image X of a standard database using a depth generator and a segmentation generator; a step (S20) of reconstructing an RGB image using the generated depth information and segmentation image information; and a step (S30) of discriminating and comparing the generated depth information, the segmentation image information, and the reconstructed RGB image against the standard database, and calculating a loss and a discrimination probability for each.
The method also includes: a step (S40) of determining, based on the calculated result values, whether each loss and the discriminator's discrimination probability satisfy a preset reference convergence value; and a step (S50) of adjusting the learning, when they do not, so that each loss and the discriminator's discrimination probability converge to the preset reference convergence value, and repeating steps (S10) to (S40).
The method further includes a step (S60) of estimating depth information for an input RGB image of RGB data using the depth generator produced through steps (S10) to (S50), and a step (S70) of estimating segmentation image information for that input RGB image using the segmentation generator produced through those steps.
A depth estimation system using a cycle GAN and segmentation according to another aspect of the present invention includes an image information learning unit, a calculation unit, a determination unit, a database, an image input unit, and an image information estimation unit. The database includes a standard database.
The image information learning unit receives an RGB image from the standard database, generates depth information using a depth generator and segmentation image information using a segmentation generator, reconstructs an RGB image using the generated depth information and segmentation image information, and performs learning through the objective function of the cycle GAN.
The calculation unit discriminates and compares the depth information generated by the image information learning unit, the segmentation image information, and the reconstructed RGB image against the standard database, and calculates a loss and a discrimination probability for each. The determination unit determines, based on the results calculated by the calculation unit, whether each loss and the discriminator's discrimination probability satisfy a preset reference convergence value.
The image input unit receives an RGB image. The image information estimation unit estimates depth information for the RGB image received from the image input unit, using the depth generator and the segmentation generator whose learning has been completed in the image information learning unit.
As described above, the depth estimation method and system using a cycle GAN and segmentation according to the present invention can extract 3D images inexpensively, because 3D information is generated from only a single image without special equipment, cameras, radar, ultrasound, or other sensors. Being highly scalable, it can also generate 3D information even when other inputs such as stereo images, optical flow, or point clouds are unavailable, and it is advantageous for miniaturizing the equipment needed to extract 3D information.
In addition, since the depth information of an image can be estimated from a single image, data for generating 3D image information can be obtained easily. Moreover, the data imbalance problem that occurs while estimating depth information can be solved by visualizing it on the basis of segmentation and by highlighting the small features that would otherwise be lost by being buried in relatively large ones.
FIG. 1 is a diagram illustrating a problem occurring in a conventional depth estimation process.
FIG. 2 is a configuration diagram illustrating a depth estimation system using a cycle GAN and segmentation according to an embodiment of the present invention.
FIG. 3 is a conceptual diagram illustrating a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
FIG. 4 is a flowchart illustrating a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating the operation sequence of a method for estimating the depth information of a single image according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating a segmentation estimation process according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating a depth estimation process according to an embodiment of the present invention.
FIG. 9 is a diagram illustrating the generation distribution of a generator and the discrimination probability of a discriminator.
FIG. 10 is a diagram illustrating a cycle-consistency loss according to an embodiment of the present invention.
FIG. 11 is a diagram illustrating the depth information estimation step of the execution step according to an embodiment of the present invention.
FIG. 12 is a diagram illustrating the segmentation information estimation step of the execution step according to an embodiment of the present invention.
FIGS. 13A and 13B are diagrams comparing the results before and after the segmentation process is used in the depth information estimation process.
FIG. 14 is a diagram illustrating the evaluation procedure of a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings, so that those of ordinary skill in the art to which the present invention pertains can easily implement them. However, the present invention may be embodied in various different forms and is not limited to the embodiments described herein. In the drawings, parts irrelevant to the description are omitted in order to describe the present invention clearly, and similar reference numerals are attached to similar parts throughout the specification.
Throughout the specification, when a part is said to "include" a certain element, this means that other elements may be further included rather than excluded, unless specifically stated otherwise. Terms such as "...unit" and "...module" described in the specification mean a unit that processes at least one function or operation, which may be implemented in hardware, software, or a combination of hardware and software.
Hereinafter, the present invention will be described in detail through preferred embodiments with reference to the accompanying drawings.
Like reference numerals in each figure indicate like elements.
FIG. 1 is a diagram illustrating a problem occurring in a conventional depth estimation process, and FIG. 2 is a configuration diagram illustrating a depth estimation system 10 using a cycle GAN and segmentation according to an embodiment of the present invention. That is, FIG. 1 illustrates the problem that depth information comes out ambiguous in a conventional depth estimation process for an image.
A GAN (Generative Adversarial Network) is an unsupervised generative model in which two neural networks with opposing roles compete with each other, producing a synergistic effect.
The GAN includes a generator that generates data instances and a discriminator that determines whether data is authentic. The generator receives random noise z sampled from a zero-mean Gaussian and generates fake data resembling the actual data distribution.
In contrast, the discriminator distinguishes whether the data generated by the generator is fake data or data from the training dataset, and indicates a probability for each. The discriminator therefore operates to decrease its probability of making a mistake, while the generator operates to increase the probability that the discriminator makes a mistake; this is called the minimax problem.
The depth estimation method and system 10 using a cycle GAN and segmentation according to an embodiment of the present invention improves on conventional depth estimation approaches that require special equipment or multiple images: depth information is estimated using a cycle GAN (Cycle Generative Adversarial Network) and segmentation.
In general, when depth information is estimated from RGB images through learning, data imbalance in the training data causes problems, as shown in FIG. 1: depth information for relatively under-learned features comes out ambiguous, or small features fade and are buried in larger ones.
Therefore, the depth estimation method and system 10 using a cycle GAN and segmentation according to an embodiment of the present invention introduces segmentation in order to visualize the data imbalance problem that occurs when estimating depth information conventionally, and to highlight the small features that would otherwise be buried and lost in relatively large ones.
The depth estimation system 10 using a cycle GAN and segmentation according to an embodiment of the present invention may include an image information learning unit 100, a calculation unit 200, a determination unit 300, a database 400, an image input unit 500, and an image information estimation unit 600. The database 400 includes a standard database 410.
The image information learning unit 100 receives an RGB image from the standard database 410 and estimates depth information using a depth generator and segmentation image information using a segmentation generator.
To convert RGB image information into segmentation information, the image information learning unit 100 obtains segmentation information for the input RGB image X through the segmentation generator and uses that information for depth estimation.
Likewise, the image information learning unit 100 obtains depth information for the input RGB image X through the depth generator and uses that information for estimating segmentation information. The image information learning unit 100 also restores the RGB image using the generated depth information and segmentation image information, and trains both generators through the objective function of the cycle GAN.
The calculation unit 200 discriminates and compares the generated depth information, the segmentation image information, and the reconstructed RGB image against the standard database 410, and calculates a loss and a discrimination probability for each. In doing so, the calculation unit 200 computes the numerical loss and discrimination probability values through the objective function of the cycle GAN.
The determination unit 300 determines, based on the results calculated by the calculation unit 200, whether each loss and the discriminator's discrimination probability satisfy a preset reference convergence value.
When they do not, the determination unit 300 adjusts the learning so that each loss and the discriminator's discrimination probability converge to the preset reference convergence value, and feeds back to the image information learning unit 100 to induce re-learning or re-estimation of the depth information and segmentation image information. That is, the determination unit 300 feeds the adjusted result back to the image information learning unit 100 so that re-learning is performed there.
The database 400 includes the standard database 410 used for learning in the image information learning unit 100; that is, the image information learning unit 100 receives the standard database 410 from the database 400 and estimates depth information and segmentation image information.
The standard database 410 includes RGB image information, depth information, and segmentation information. NYU Depth Dataset V2 may be used as the standard database 410.
The database 400 also stores the reference convergence value that serves as the determination criterion of the determination unit 300, as well as the depth generator and segmentation generator data used by the image information learning unit 100 to estimate depth information and segmentation information.
The image input unit 500 receives an RGB image. The image information estimation unit 600 estimates depth information or segmentation information for the RGB image received from the image input unit 500, using the depth generator and the segmentation generator whose learning process has been completed.
도 3은 본 발명의 실시 예에 따른 사이클 GAN과 세그맨테이션을 사용한 깊이 추정 방법을 나타내는 개념도이고, 도 4는 본 발명의 실시 예에 따른 사이클 GAN과 세그맨테이션을 사용한 깊이 추정 방법을 나타내는 흐름도이며, 도 5는 본 발명의 실시 예에 따른 사이클 GAN과 세그맨테이션을 사용한 깊이 추정 방법을 나타내는 순서도이다.3 is a conceptual diagram illustrating a depth estimation method using cycle GAN and segmentation according to an embodiment of the present invention, and FIG. 4 is a flowchart illustrating a depth estimation method using cycle GAN and segmentation according to an embodiment of the present invention. and FIG. 5 is a flowchart illustrating a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
As shown in FIG. 4, the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention includes a learning stage and an execution stage. In the learning stage, segmentation and a depth estimation method based on it are learned, and training is adjusted by computing an objective function and discrimination probabilities.
In the execution stage, depth information is estimated from RGB image information alone, based on the results of the learning stage, using the generator that was trained for depth information during the learning stage.
The learning stage of the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention may include: generating depth information and segmentation image information for an input RGB image X of the standard database 410 using the generator (PCTKR2021001803-appb-I000019) and the generator (PCTKR2021001803-appb-I000020) (S10); reconstructing an RGB image using the generated depth information and segmentation image information (S20); and discriminating and comparing the generated depth information, the generated segmentation image information, and the reconstructed RGB image with the standard database 410, respectively, and calculating a loss and a discrimination probability for each (S30).
The learning stage may further include: determining, based on the calculated results, whether each loss and each discriminator's discrimination probability value satisfies a preset reference convergence value (S40); and, when they do not, adjusting training so that each loss and each discriminator's discrimination probability value converge to the preset reference convergence value, and repeating steps (S10) to (S40) (S50).
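For illustration only, the sketch below restates the loop of steps (S10) to (S50) in PyTorch-style Python. Every name in it — the domain-translation generators G_rgb2seg, G_rgb2depth, G_seg2rgb, G_depth2rgb, the discriminators D_seg, D_depth, the optimizer, and the thresholds — is a hypothetical stand-in rather than the exact configuration disclosed here, and the discriminators are assumed to output probabilities in (0, 1).

```python
import torch
import torch.nn.functional as F

def learning_stage(loader, G_rgb2seg, G_rgb2depth, G_seg2rgb, G_depth2rgb,
                   D_seg, D_depth, opt_G, lam=10.0, eps=1e-2, max_epochs=200):
    """Minimal sketch of steps S10-S50 (generator side only; the matching
    discriminator updates on seg_real/depth_real are omitted for brevity)."""
    for epoch in range(max_epochs):
        for rgb, seg_real, depth_real in loader:      # standard database 410
            seg_fake = G_rgb2seg(rgb)                 # S10: estimate segmentation
            depth_fake = G_rgb2depth(rgb)             # S10: estimate depth
            rgb_from_seg = G_seg2rgb(seg_fake)        # S20: restore RGB
            rgb_from_depth = G_depth2rgb(depth_fake)  # S20: restore RGB
            # S30: adversarial terms (generator wants D(...) -> 1) plus the
            # cycle-consistency term weighted by the hyperparameter lambda
            p_seg, p_depth = D_seg(seg_fake), D_depth(depth_fake)
            loss_adv = F.binary_cross_entropy(p_seg, torch.ones_like(p_seg)) \
                     + F.binary_cross_entropy(p_depth, torch.ones_like(p_depth))
            loss_cyc = F.l1_loss(rgb_from_seg, rgb) + F.l1_loss(rgb_from_depth, rgb)
            loss = loss_adv + lam * loss_cyc
            opt_G.zero_grad(); loss.backward(); opt_G.step()
        # S40/S50: stop once the loss is near 0 and the discriminators'
        # decisions on generated samples are near the 50 % equilibrium
        if loss.item() < eps and abs(p_depth.mean().item() - 0.5) < eps:
            break
```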
FIG. 6 shows the sequence of operations of a method for estimating depth information of a single image using a cycle GAN and segmentation according to an embodiment of the present invention, FIG. 7 shows the segmentation estimation process, and FIG. 8 shows the depth estimation process.
In the step (S10) of generating the depth information and segmentation image information, depth information and segmentation information are estimated from the input RGB image of the standard database 410, and the learning rate is computed through the objective function. The effect that combining the cycle-consistency losses of the individual domains has on depth estimation performance can also be evaluated.
As shown in FIGS. 7 and 8, the segmentation network that estimates segmentation information in the step (S10) of generating the depth information and segmentation image information and the depth network that estimates depth information have the same structure; only the roles of each generator and discriminator change.
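As a sketch of this shared structure, the Python snippet below builds one generator/discriminator pair per modality from the same factory functions; the layer widths and activations are assumptions, not the disclosed architecture.

```python
import torch.nn as nn

def make_generator(in_ch=3, out_ch=1):
    """Encoder-decoder image-to-image mapping (illustrative layer sizes)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        nn.ConvTranspose2d(64, out_ch, 4, stride=2, padding=1), nn.Sigmoid(),
    )

def make_discriminator(in_ch=1):
    """Real/fake critic emitting probabilities via a final sigmoid."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(64, 1, 4, stride=2, padding=1), nn.Sigmoid(),
    )

# Same structure instantiated twice; only the roles differ
# (RGB -> depth versus RGB -> segmentation).
G_depth, D_depth = make_generator(3, 1), make_discriminator(1)
G_seg, D_seg = make_generator(3, 1), make_discriminator(1)  # 1-channel map assumed
```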
The two networks operate as follows. First, the segmentation information of the standard database 410 can provide a hint about the depth information. To convert RGB image information into segmentation information, the generator (PCTKR2021001803-appb-I000021) obtains segmentation information for the input RGB image X, and that information can then be used for depth estimation.
Likewise, as shown in FIG. 8, the generator (PCTKR2021001803-appb-I000022) obtains depth information for the input RGB image X, and that information can be used for segmentation estimation. The generators of the two networks receive feedback from the step (S30) of calculating the loss and discrimination probability, as in FIG. 4, and are thereby adapted to estimate depth information and segmentation information from the RGB image.
The step (S30) of calculating the loss and discrimination probability computes the numerical values of the loss and discrimination probability through the objective function of the cycle GAN. Here, the objective function may consist of the adversarial loss function and the cycle-consistency loss function of the cycle GAN (Generative Adversarial Network).
The adversarial loss function drives training according to the minimax game between the generator and the discriminator: under the adversarial loss, the generator imitates the standard distribution of the data, and the discriminator computes the corresponding discrimination probability.
That is, the adversarial loss function computes only the adversarial loss among the objective-function terms of the step (S10) of generating the depth information and segmentation image information. At this stage there is no intersection between depth and segmentation, so depth information and segmentation information are estimated independently of each other.
Accordingly, the step (S30) of calculating the loss and discrimination probability can discriminate, through the discriminators, the depth image and segmentation image estimated in step (S10) and the RGB image reconstructed in step (S20), and compute a probability for each.
The objective function of the network according to the present invention can be expressed as [Equation 1] below.
[Equation 1]
Figure PCTKR2021001803-appb-I000023
Here, D denotes the discriminator's discrimination probability for each input, G denotes the generator's mapping of the input into the data space, and λ denotes the hyperparameter used for weighting.
[Equation 1] consists of the adversarial loss function and the cycle-consistency loss function of the cycle GAN (Generative Adversarial Network).
The adversarial loss function can be computed while estimating segmentation information and depth information from RGB image information, and can be expressed as [Equation 2] and [Equation 3] below.
[Equation 2]
Figure PCTKR2021001803-appb-I000024
[Equation 3]
Figure PCTKR2021001803-appb-I000025
Here, E denotes the expected value over the corresponding distribution, and P_i denotes the probability distribution of i. As shown in FIG. 5, training proceeds while the discriminator (PCTKR2021001803-appb-I000028) judges the fake data (PCTKR2021001803-appb-I000027) produced by the generator (PCTKR2021001803-appb-I000026), and the discriminator (PCTKR2021001803-appb-I000031) judges the fake data (PCTKR2021001803-appb-I000030) produced by the generator (PCTKR2021001803-appb-I000029).
The core of the adversarial loss function is to map the distribution generated by the GAN onto the real distribution. Training therefore follows the minimax game between the generator and the discriminator: the generator learns to produce a distribution closely matching the real one, and the discriminator's discrimination probability accordingly converges to 50%.
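The minimax update can be sketched as follows, assuming a discriminator that emits a probability in (0, 1); all names are illustrative. The discriminator is pushed toward 1 on real samples and 0 on generated ones, the generator is pushed to make the discriminator output 1 on generated samples, and at equilibrium the discriminator's output on generated samples hovers around 0.5.

```python
import torch
import torch.nn.functional as F

def minimax_round(real, inp, G, D, opt_G, opt_D):
    fake = G(inp)
    # Discriminator step: maximize log D(real) + log(1 - D(fake))
    p_real, p_fake = D(real), D(fake.detach())
    d_loss = F.binary_cross_entropy(p_real, torch.ones_like(p_real)) \
           + F.binary_cross_entropy(p_fake, torch.zeros_like(p_fake))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()
    # Generator step: maximize log D(fake), i.e. fool the discriminator
    p_fake = D(fake)
    g_loss = F.binary_cross_entropy(p_fake, torch.ones_like(p_fake))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
    return p_fake.mean().item()   # drifts toward 0.5 as training converges
```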
FIG. 9 shows the generator's generative distribution and the discriminator's discrimination probability, i.e., how both change as GAN training progresses during the learning stage.
In FIG. 9, the black dotted line represents the real data distribution, the green solid line represents the generator's generative distribution, and the blue dotted line represents the discriminator's discrimination probability. A small distance between the two distributions means they are very similar, which in turn means the discriminator cannot easily tell them apart.
Training therefore drives the discriminator's discrimination probability down (min) and the similarity of the generator's generative distribution up (max), so that the generative distribution becomes very similar to the real data distribution.
In addition, a reconstruction loss is added: by combining it with the loss used in conventional CNN-based training, the generator's generative distribution is guided to learn the standard distribution of the target. The reconstruction loss can be expressed as [Equation 4] below.
[Equation 4]
Figure PCTKR2021001803-appb-I000032
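[Equation 4] is reproduced above only as an image reference. A common form of such a reconstruction loss, assumed in the sketch below, is a direct pixel-wise distance between the generated output and the target drawn from the standard database, so that the generator is also pulled toward the target's standard distribution as in conventional CNN-based training.

```python
import torch.nn.functional as F

def reconstruction_loss(generated, target):
    # Supervised penalty against the standard-database target; an L1
    # (mean absolute error) distance is assumed here for illustration.
    return F.l1_loss(generated, target)
```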
After the step (S30) of calculating the loss and discrimination probability, the method may further include a cycle-consistency loss step (S31). The two branches produce different outputs for the same input RGB image X; when each output is later restored to an RGB image (PCTKR2021001803-appb-I000033), the original RGB image X is compared with the restored RGB image (PCTKR2021001803-appb-I000034), which guides the generation of depth information and segmentation images that preserve the shape of the original RGB image X.
The cycle-consistency loss calculation step (S31) evaluates, separately from the two objective functions, the similarity obtained when the two estimated image modalities are restored back to the original image. It induces the generators to attempt the domain translation while preserving the shape of each domain.
The depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention may use a cycle GAN composed of two domains, segmentation and depth. The cycle-consistency loss is therefore the sum of two losses, as expressed in [Equation 5] below.
[Equation 5]
Figure PCTKR2021001803-appb-I000035
That is, the error introduced by the restoration is taken as the cycle-consistency loss, so the loss decreases as the restoration improves.
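Since [Equation 5] survives only as an image reference, the sketch below writes the described two-term loss out explicitly: one restoration error for the RGB → depth → RGB cycle and one for the RGB → segmentation → RGB cycle, with the generator names being hypothetical stand-ins.

```python
import torch.nn.functional as F

def cycle_consistency_loss(rgb, G_rgb2depth, G_depth2rgb, G_rgb2seg, G_seg2rgb):
    # Sum of the two per-domain restoration errors; a good restoration in
    # both domains drives this loss toward zero.
    rgb_via_depth = G_depth2rgb(G_rgb2depth(rgb))
    rgb_via_seg = G_seg2rgb(G_rgb2seg(rgb))
    return F.l1_loss(rgb_via_depth, rgb) + F.l1_loss(rgb_via_seg, rgb)
```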
FIG. 10 shows the cycle-consistency loss of the cycle GAN model according to an embodiment of the present invention. As FIG. 10 illustrates, when restoring back to the RGB image, the generator must account not only for restoration from the depth information but also for restoration from the segmentation information.
Accordingly, by adding segmentation information to the conventional constraint, which restores from depth information alone, the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention can generate depth information that is better classified by object. Likewise, when segmentation information is restored, the restoration also takes the depth information into account, yielding a synergistic effect for tasks such as background separation.
In addition, an L1 loss is applied to the cycle-consistency loss to regularize the model by penalizing the L1 norm of the model weights (the sum of the absolute values of each weight element). The L1 loss is relatively robust compared with the L2 loss and resists the unstable solution problem.
The L1 loss can be expressed as [Equation 6] below.
[Equation 6]
Figure PCTKR2021001803-appb-I000036
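The weight-norm penalty of [Equation 6] can be sketched as below, assuming a PyTorch model: the L1 norm of all weight elements is scaled by a small coefficient and added to the cycle-consistency loss.

```python
def l1_weight_penalty(model, coeff=1e-5):
    # Sum of absolute values of every weight element (the L1 norm),
    # used as a regularization term added to the cycle-consistency loss.
    return coeff * sum(p.abs().sum() for p in model.parameters())
```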
In this way, the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention computes the adversarial loss and the cycle-consistency loss during the learning stage and, once training based on them is complete, finally estimates the fake depth information (PCTKR2021001803-appb-I000037).
That is, once the learning stage has trained the model to match the real depth information to a preset degree of similarity, the cycle GAN training process ends.
FIG. 11 shows the depth information estimation step of the execution stage according to an embodiment of the present invention, and FIG. 12 shows the segmentation information estimation step of the execution stage.
The execution stage of the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention may include: estimating depth information for an input RGB image of RGB data using the generator (PCTKR2021001803-appb-I000038) produced through steps (S10) to (S50) (S60); and estimating segmentation image information for the input RGB image of the RGB data using the generator (PCTKR2021001803-appb-I000039) produced through steps (S10) to (S50) (S70).
As shown in FIG. 11, the step (S60) of estimating the depth information estimates depth information using the generator (PCTKR2021001803-appb-I000040) produced in the learning stage.
That is, step (S60) performs depth estimation with the generator (PCTKR2021001803-appb-I000041) produced in the learning stage in order to generate, from the RGB image, a depth image similar to the real depth image.
Similarly, as shown in FIG. 12, the step (S70) of estimating the segmentation image information estimates segmentation information using the generator (PCTKR2021001803-appb-I000042) produced in the learning stage. That is, step (S70) converts the RGB image into segmentation information using the generator (PCTKR2021001803-appb-I000043) produced in the learning stage.
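At the execution stage the trained generators are used on their own; the sketch below illustrates this, with G_rgb2depth and G_rgb2seg as hypothetical handles for the trained generators of steps (S60) and (S70).

```python
import torch

@torch.no_grad()
def execute(rgb, G_rgb2depth, G_rgb2seg):
    # S60: RGB -> depth map; S70: RGB -> segmentation map.
    # Only a single RGB image is needed; no depth sensor is involved.
    return G_rgb2depth(rgb), G_rgb2seg(rgb)
```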
FIGS. 13a and 13b compare depth estimation before and after the segmentation process is used. FIG. 13a shows the uncertainty of an image estimated without the segmentation process, and FIG. 13b shows the result of estimating depth information for the input image with the segmentation process added.
The reason for adding the segmentation process to the conventional depth estimation process is to resolve the uncertainty of the depth information seen in the depth estimation result for the input image, as illustrated in FIG. 13a.
Since the conventional depth estimation process cannot estimate depth information perfectly, the uncertainty of the depth information can be reduced, as in FIG. 13b, by jointly applying the various variables generated through the segmentation estimation process.
In this way, the depth estimation method and system 10 using a cycle GAN and segmentation according to an embodiment of the present invention can improve the depth estimation result through a multi-task learning technique that improves performance by exploiting multiple variables.
FIG. 14 shows the evaluation procedure of the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention. As in FIG. 14, the method is evaluated on NYU Depth Dataset V2, a publicly available standard database, to assess its reliability.
The NYU Depth Dataset V2 database provides video sequence data centered on various indoor scenes captured with Microsoft's Kinect v1. Through its labeled dataset, it also provides depth information and segmentation information for the RGB images.
The segmentation and depth estimation step (S110) of FIG. 14 corresponds to the step (S10) of generating depth information and segmentation image information for the input RGB image X of the standard database 410 using the generator (PCTKR2021001803-appb-I000044) and the generator (PCTKR2021001803-appb-I000045).
That is, in the segmentation and depth estimation step (S110), depth information and segmentation information are estimated while the learning rate is computed through the objective function. The effect on performance of combining the cycle-consistency losses of the individual domains of the present invention is also evaluated.
The adversarial loss calculation step (S130) of FIG. 14 corresponds to the step (S30) of discriminating and comparing the generated depth information, the segmentation image information, and the reconstructed RGB image with the standard database 410, respectively, and calculating a loss and a discrimination probability for each.
That is, the adversarial loss calculation step (S130) computes only the adversarial loss among the objective-function terms calculated in the segmentation and depth estimation step (S110). Since there is no intersection between the depth information and the segmentation information at this point, depth and segmentation estimation proceed independently of each other.
The procedure may further include the step (S120) of reconstructing an RGB image using the depth information and segmentation image information generated after the segmentation and depth estimation step (S110).
The cycle-consistency loss calculation step (S131) of FIG. 14 corresponds to the cycle-consistency loss step (S31): the two branches produce different outputs for the same input RGB image X, but when each is later restored to an RGB image (PCTKR2021001803-appb-I000046), the original RGB image X is compared with the restored RGB image (PCTKR2021001803-appb-I000047), guiding the generation of depth information and segmentation images that preserve the shape of the original RGB image X.
That is, in the cycle-consistency loss calculation step (S131), the reconstructed RGB image is compared with the original RGB image to compute the cycle-consistency loss, and a penalty is then applied to the generator of each of the depth information and the segmentation information. Through this process each generator carries out depth and segmentation estimation while also accounting for the restoration of the depth and segmentation information.
The depth and segmentation evaluation step (S140) of FIG. 14 corresponds to the step (S40) of determining, based on the calculated results, whether each loss and each discriminator's discrimination probability value satisfies the preset reference convergence value.
In the depth and segmentation evaluation step (S140), the generated depth information is evaluated by measuring the RMSLE, a variant of the root mean square error (RMSE) that measures the numerical error of the generated result. The RMSLE can be expressed as [Equation 7] below.
[Equation 7]
Figure PCTKR2021001803-appb-I000048
Here, P_i and a_i, which are needed to compute the RMSLE, are normalized to values between 0 and 1 before being input. The RMSLE cost function is mainly used to penalize underestimated items more heavily than overestimated ones; it expresses the error against the ground truth as a number, and a larger value means a larger error.
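[Equation 7] is available only as an image reference; the standard RMSLE, assumed here, is the root of the mean squared difference of log(1 + x) terms, which matches the behavior described above:

```python
import torch

def rmsle(pred, target):
    # pred and target are assumed normalized to [0, 1] as stated above;
    # lower values indicate generated depth closer to the ground truth.
    return torch.sqrt(torch.mean((torch.log1p(pred) - torch.log1p(target)) ** 2))
```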
[Table 1] below shows the result of comparing and evaluating the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention against other techniques on the NYU Depth Dataset V2 database.
[Table 1] Comparison on NYU Depth Dataset V2 between the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention and other techniques
Figure PCTKR2021001803-appb-I000049
As shown in [Table 1], the RMSLE of the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention is 0.220, lower than that of the other techniques. Since a lower RMSLE indicates a better depth estimation method, this confirms that the proposed method achieves higher similarity than the other techniques.
In this way, compared with conventional approaches that required special equipment or sensors to obtain 3D information, the depth estimation method and system 10 using a cycle GAN and segmentation according to an embodiment of the present invention can generate 3D information using only a single camera, making it less expensive, more scalable, and better suited to miniaturization; above all, data consisting of single images is easy to obtain.
The depth estimation method and system 10 using a cycle GAN and segmentation according to an embodiment of the present invention can also improve the precision of the depth information while estimating it. By introducing segmentation, it exposes the data imbalance problem that arises in the conventional process of estimating depth information for an input image and recovers small features that would otherwise be lost, buried under relatively large ones. In other words, it resolves the uncertainty of the depth information in the depth estimation result for the input image.
Although preferred embodiments of the present invention have been described above, the present invention is not limited to these embodiments and includes all modifications within the scope readily made by those of ordinary skill in the art to which the invention pertains and recognized as equivalent.
[Description of Reference Numerals]
10: depth estimation system    100: image information learning unit
200: operation unit    300: determination unit
400: database    410: standard database
500: image input unit    600: image information estimation unit

Claims (13)

  1. A depth estimation method for estimating depth information of an image using only a single image, the method comprising:
    generating depth information and segmentation image information for an input RGB image X of a standard database using a generator (PCTKR2021001803-appb-I000050) and a generator (PCTKR2021001803-appb-I000051) (S10);
    reconstructing an RGB image using the generated depth information and segmentation image information (S20);
    discriminating and comparing the generated depth information, the generated segmentation image information, and the reconstructed RGB image with the standard database, respectively, and calculating a loss and a discrimination probability for each (S30);
    determining, based on the calculated results, whether each loss and each discriminator's discrimination probability value satisfies a preset reference convergence value (S40);
    when the loss and discrimination probability values do not satisfy the preset reference convergence value, adjusting training so that each loss and each discriminator's discrimination probability value converge to the preset reference convergence value, and repeating steps (S10) to (S40) (S50); and
    estimating depth information for an input RGB image of RGB data using the generator (PCTKR2021001803-appb-I000052) produced through steps (S10) to (S50) (S60).
  2. The method of claim 1, further comprising estimating segmentation image information for the input RGB image of the RGB data using the generator (PCTKR2021001803-appb-I000053) produced through steps (S10) to (S50) (S70).
  3. The method of claim 1, wherein the standard database includes RGB image information, depth information, and segmentation information.
  4. The method of claim 1, wherein the standard database is NYU Depth Dataset V2.
  5. The method of claim 1, wherein, in the step (S60) of estimating depth information for the input RGB image, the RGB data includes only RGB image information.
  6. The method of claim 1, wherein the step (S30) of calculating the loss and discrimination probability calculates the numerical values of the loss and discrimination probability through an objective function of the cycle GAN.
  7. The method of claim 6, wherein the objective function is composed of an adversarial loss function and a cycle-consistency loss function of the cycle GAN (Generative Adversarial Network).
  8. The method of claim 1, wherein, in the step (S50) of repeating steps (S10) to (S40), training is adjusted so that the loss converges to 0 and the discriminator's discrimination probability value converges to 50%.
  9. The method of claim 1, further comprising, after the step (S30) of calculating the loss and discrimination probability, a cycle-consistency loss step in which the outputs for the same input RGB image X differ but, when each is later restored to an RGB image (PCTKR2021001803-appb-I000054), the original RGB image X is compared with the restored RGB image (PCTKR2021001803-appb-I000055) so as to guide the generation of depth information and a segmentation image that preserve the shape of the original RGB image X.
  10. The method of claim 9, wherein the cycle-consistency loss step estimates depth information and segmentation image information by feeding back (back-propagation) the generator (PCTKR2021001803-appb-I000056) and the generator (PCTKR2021001803-appb-I000057) produced through steps (S10) to (S30), so that training proceeds according to the minimax results of the generator and the discriminator.
  11. A depth estimation system for estimating depth information of an image using only a single image, the system comprising:
    an image information learning unit that receives an RGB image of a standard database, generates depth information using a generator (PCTKR2021001803-appb-I000058), generates segmentation image information using a generator (PCTKR2021001803-appb-I000059), reconstructs an RGB image using the generated depth information and segmentation image information, and performs training through an objective function of a cycle GAN;
    an operation unit that discriminates and compares the depth information, the segmentation image information, and the reconstructed RGB image generated by the image information learning unit with the standard database, respectively, and calculates a loss and a discrimination probability for each;
    a determination unit that determines, based on the results calculated by the operation unit, whether each loss and each discriminator's discrimination probability value satisfies a preset reference convergence value;
    an image input unit that receives an RGB image; and
    an image information estimation unit that estimates depth information for the RGB image received by the image input unit, using the generator (PCTKR2021001803-appb-I000060) and the generator (PCTKR2021001803-appb-I000061) whose training has been completed in the image information learning unit.
  12. The system of claim 11, wherein the image information learning unit obtains segmentation information for the input RGB image X by means of the generator (PCTKR2021001803-appb-I000062) in order to convert RGB image information into segmentation information and uses that information for depth information estimation, and obtains depth information for the input RGB image X by means of the generator (PCTKR2021001803-appb-I000063) and uses that information for segmentation information estimation.
  13. The system of claim 11, wherein, when the loss and discrimination probability values do not satisfy the preset reference convergence value based on the determination result, the determination unit adjusts training so that each loss and each discriminator's discrimination probability value converge to the preset reference convergence value, and feeds back to the image information learning unit so that retraining is performed.
PCT/KR2021/001803 2020-04-09 2021-02-10 Depth estimation method and system using cycle gan and segmentation WO2021206284A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020200043096A KR102127153B1 (en) 2020-04-09 2020-04-09 Depth estimation method and system using cycle GAN and segmentation
KR10-2020-0043096 2020-04-09

Publications (1)

Publication Number Publication Date
WO2021206284A1 true WO2021206284A1 (en) 2021-10-14

Family

ID=71136727

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2021/001803 WO2021206284A1 (en) 2020-04-09 2021-02-10 Depth estimation method and system using cycle gan and segmentation

Country Status (2)

Country Link
KR (1) KR102127153B1 (en)
WO (1) WO2021206284A1 (en)


Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102127153B1 (en) * 2020-04-09 2020-06-26 한밭대학교 산학협력단 Depth estimation method and system using cycle GAN and segmentation
CN112529978B (en) * 2020-12-07 2022-10-14 四川大学 Man-machine interactive abstract picture generation method
KR102617344B1 (en) * 2020-12-30 2023-12-28 한국기술교육대학교 산학협력단 Depth prediction method based on unsupervised learning and system using the same
CN112767418B (en) * 2021-01-21 2022-10-14 大连理工大学 Mirror image segmentation method based on depth perception
CN113468969B (en) * 2021-06-03 2024-05-14 江苏大学 Aliased electronic component space expression method based on improved monocular depth estimation
KR102477632B1 (en) * 2021-11-12 2022-12-13 프로메디우스 주식회사 Method and apparatus for training image using generative adversarial network
CN115292722B (en) * 2022-10-09 2022-12-27 浙江君同智能科技有限责任公司 Model safety detection method and device based on different color spaces


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102257827B (en) 2008-12-19 2014-10-01 皇家飞利浦电子股份有限公司 Creation of depth maps from images

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200074674A1 (en) * 2018-08-29 2020-03-05 Toyota Jidosha Kabushiki Kaisha Distance Estimation Using Machine Learning
KR102127153B1 (en) * 2020-04-09 2020-06-26 한밭대학교 산학협력단 Depth estimation method and system using cycle GAN and segmentation

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
GWN KIN; REDDY KISHORE; GIERING MICHAEL; BERNAL EDGAR A.: "Generative Adversarial Networks for Depth Map Estimation from RGB Video", 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), IEEE, 18 June 2018 (2018-06-18), pages 1258 - 12588, XP033475459, DOI: 10.1109/CVPRW.2018.00163 *
JAFARI OMID HOSSEINI; GROTH OLIVER; KIRILLOV ALEXANDER; YANG MICHAEL YING; ROTHER CARSTEN: "Analyzing modular CNN architectures for joint depth prediction and semantic segmentation", 2017 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), IEEE, 29 May 2017 (2017-05-29), pages 4620 - 4627, XP033127279, DOI: 10.1109/ICRA.2017.7989537 *
KWAK, DONG-HOON ET AL.: "A Technique for Generating Depth Information of RGB Images using a Learning-based Cycle GAN", INSTITUTE OF KOREAN ELECTRICAL AND ELECTRONICS ENGINEERS SUMMER CONFERENCE 2019, 8 August 2019 (2019-08-08), pages 29 - 32 *
KWAK, DONG-HOON: "A study on depth estimation method using cycle GAN and segmentation", MASTER THESIS, February 2020 (2020-02-01), Korea, pages 1 - 51, XP009531287 *
ZHAO SHANSHAN; FU HUAN; GONG MINGMING; TAO DACHENG: "Geometry-Aware Symmetric Domain Adaptation for Monocular Depth Estimation", 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE, 15 June 2019 (2019-06-15), pages 9780 - 9790, XP033686751, DOI: 10.1109/CVPR.2019.01002 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114240950A (en) * 2021-11-23 2022-03-25 电子科技大学 Brain tumor image generation and segmentation method based on deep neural network
CN114240950B (en) * 2021-11-23 2023-04-07 电子科技大学 Brain tumor image generation and segmentation method based on deep neural network
CN114359361A (en) * 2021-12-28 2022-04-15 Oppo广东移动通信有限公司 Depth estimation method, depth estimation device, electronic equipment and computer-readable storage medium
CN117830340A (en) * 2024-01-04 2024-04-05 中南大学 Ground penetrating radar target feature segmentation method, system, equipment and storage medium

Also Published As

Publication number Publication date
KR102127153B1 (en) 2020-06-26

Similar Documents

Publication Publication Date Title
WO2021206284A1 (en) Depth estimation method and system using cycle gan and segmentation
US20210065391A1 (en) Pseudo rgb-d for self-improving monocular slam and depth prediction
WO2020046066A1 (en) Method for training convolutional neural network to reconstruct an image and system for depth map generation from an image
US7715619B2 (en) Image collation system and image collation method
CN106875437B (en) RGBD three-dimensional reconstruction-oriented key frame extraction method
KR100560464B1 (en) Multi-view display system with viewpoint adaptation
CN110109535A (en) Augmented reality generation method and device
CN113012122A (en) Category-level 6D pose and size estimation method and device
CN109389156B (en) Training method and device of image positioning model and image positioning method
CN113015978B (en) Processing images to locate novel objects
US10755477B2 (en) Real-time face 3D reconstruction system and method on mobile device
CN105894443A (en) Method for splicing videos in real time based on SURF (Speeded UP Robust Features) algorithm
WO2023080266A1 (en) Face converting method and apparatus using deep learning network
CN113643342A (en) Image processing method and device, electronic equipment and storage medium
CN112016612A (en) Monocular depth estimation-based multi-sensor fusion SLAM method
KR20110043967A (en) System and method of camera tracking and live video compositing system using the same
CN113379877A (en) Face video generation method and device, electronic equipment and storage medium
US20240161254A1 (en) Information processing apparatus, information processing method, and program
CN116092178A (en) Gesture recognition and tracking method and system for mobile terminal
CN111754622A (en) Face three-dimensional image generation method and related equipment
Wofk et al. Monocular Visual-Inertial Depth Estimation
CN111582120A (en) Method and terminal device for capturing eyeball activity characteristics
WO2020085541A1 (en) Method and device for processing video
WO2022131793A1 (en) Method and apparatus for recognizing handwriting inputs in multiple-user environment
CN113657190A (en) Driving method of face picture, training method of related model and related device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21784937

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21784937

Country of ref document: EP

Kind code of ref document: A1