WO2021206284A1 - Depth estimation method and system using cycle gan and segmentation - Google Patents

Depth estimation method and system using cycle GAN and segmentation

Info

Publication number
WO2021206284A1
Authority
WO
WIPO (PCT)
Prior art keywords
segmentation
information
image
depth
loss
Prior art date
Application number
PCT/KR2021/001803
Other languages
French (fr)
Korean (ko)
Inventor
이승호
곽동훈
Original Assignee
한밭대학교 산학협력단 (Hanbat National University Industry-Academic Cooperation Foundation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 한밭대학교 산학협력단 (Hanbat National University Industry-Academic Cooperation Foundation)
Publication of WO2021206284A1 publication Critical patent/WO2021206284A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/50 Depth or shape recovery
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06T 5/00 Image enhancement or restoration
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning

Definitions

  • the present invention relates to a method and a system for estimating depth using a cycle GAN and segmentation, and more particularly, to a depth estimation method and system that estimates the depth information of an image using only a single image, through a cycle GAN and segmentation, without using special equipment or cameras.
  • 3D information refers to information that includes spatial cues such as depth and scale beyond the visual information of an image.
  • 3D information is indispensable in fields such as VR, AR, and autonomous driving, which demand technologies that can acquire and compute it more accurately and quickly.
  • in augmented reality, a virtual environment is overlaid on the real environment to provide the user with reinforced, additional information.
  • the virtual environment created by computer graphics naturally overlaps with the real environment, so that a more immersive service can be provided to users.
  • These technologies can build a natural virtual environment only when 3D information is combined with the visual information coming through the camera.
  • the present invention is intended to solve the disadvantages of the prior art, and it is an object of the present invention to extract 3D images inexpensively using only a single camera, without special equipment such as cameras, radar, ultrasonic devices, or other sensors.
  • another object of the present invention is to make it easy to obtain the data needed to generate 3D image information.
  • a further object is to solve the data imbalance problem that occurs in the process of estimating depth information.
  • the depth estimation method using a cycle GAN and segmentation includes: a step (S10) of generating depth information and segmentation image information for an input RGB image X of a standard database using a depth generator and a segmentation generator; a step (S20) of reconstructing an RGB image using the generated depth information and segmentation image information; and a step (S30) of discriminating and comparing the generated depth information, the segmentation image information, and the reconstructed RGB image against the standard database, and calculating a loss and a discrimination probability for each.
  • the method further includes a step (S60) of estimating depth information for an input RGB image using the depth generator trained through steps (S10) to (S50), and a step (S70) of estimating segmentation image information for that input RGB image using the trained segmentation generator.
  • the depth estimation system using cycle GAN and segmentation includes an image information learning unit, a calculation unit, a determination unit, a database, an image input unit, and an image information estimation unit.
  • the database includes a standard database.
  • the image information learning unit receives an RGB image from the standard database, generates depth information using a depth generator and segmentation image information using a segmentation generator, reconstructs an RGB image using the generated depth information and segmentation image information, and performs learning through the objective function of the cycle GAN.
  • the calculation unit compares the depth information generated by the image information learning unit, the segmentation image information, and the reconstructed RGB image with a standard database, respectively, and calculates a loss and a discrimination probability for each.
  • the determination unit determines whether each loss and discrimination probability value of a discriminator satisfy a preset reference convergence value based on the result calculated by the operation unit.
  • the image input unit receives an RGB image.
  • the image information estimating unit uses the depth generator and the segmentation generator whose learning has been completed in the image information learning unit to estimate depth information for the RGB image received from the image input unit.
  • the depth estimation method and system using a cycle GAN and segmentation according to the present invention can extract 3D images inexpensively, because 3D information is generated from only a single image without special equipment, cameras, radar, ultrasound, or other sensors.
  • being highly scalable, it can generate 3D information even when other inputs such as stereo images, optical flow, or point clouds are unavailable, and it is advantageous for miniaturizing the equipment needed to extract 3D information.
  • since the depth information of an image can be estimated from a single image, data for generating 3D image information can be obtained easily.
  • FIG. 1 is a diagram illustrating a problem occurring in a conventional depth estimation process.
  • FIG. 2 is a block diagram illustrating a depth estimation system using a cycle GAN and segmentation according to an embodiment of the present invention.
  • FIG. 3 is a conceptual diagram illustrating a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
  • FIG. 4 is a flowchart illustrating a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
  • FIG. 5 is a flowchart illustrating a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
  • FIG. 6 is a diagram illustrating an operation sequence of a method for estimating depth information of a single image according to an embodiment of the present invention.
  • FIG. 7 is a diagram illustrating a segmentation estimation process according to an embodiment of the present invention.
  • FIG. 8 is a diagram illustrating a depth estimation process according to an embodiment of the present invention.
  • FIG. 9 is a diagram illustrating a generation distribution of a generator and a discrimination probability of a discriminator.
  • FIG. 10 is a diagram illustrating a cycle-consistency loss according to an embodiment of the present invention.
  • FIG. 11 is a diagram illustrating a depth information estimation step of an execution step according to an embodiment of the present invention.
  • FIG. 12 is a diagram illustrating a segmentation information estimation step of an execution step according to an embodiment of the present invention.
  • FIGS. 13A and 13B are diagrams showing a comparison before and after the segmentation process is used in the depth information estimation process.
  • FIG. 14 is a diagram illustrating an evaluation procedure of a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
  • FIG. 1 is a diagram illustrating a problem occurring in a conventional depth estimation process, and FIG. 2 is a configuration diagram illustrating a depth estimation system 10 using a cycle GAN and segmentation according to an embodiment of the present invention. That is, FIG. 1 illustrates the problem that depth information comes out ambiguous in a conventional depth estimation process for an image.
  • a GAN (Generative Adversarial Network) is an unsupervised generative model in which two neural networks with opposing roles compete with each other, producing a synergistic effect.
  • the GAN includes a generator that generates data instances, and a discriminator that determines whether data is authentic.
  • the generator receives random noise z sampled from a zero-mean Gaussian and generates fake data resembling the actual data distribution.
  • the discriminator distinguishes whether the data generated by the generator is fake data or data of a training dataset, and indicates a probability for each. Therefore, the discriminator operates to decrease the probability of making a mistake, and the generator operates to increase the probability that the discriminator makes a mistake. This is called the Minimax Problem.
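  • For reference, this minimax game has the well-known standard form (a general formulation, not reproduced in the source text itself):

    $$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$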
  • the depth estimation method and system 10 using a cycle GAN and segmentation improves on conventional depth estimation approaches that require special equipment or multiple images: depth information is estimated using a cycle GAN (Cycle Generative Adversarial Network) and segmentation.
  • in general, when depth information is estimated from RGB images through learning, data imbalance in the training data causes problems: depth information for relatively under-learned features comes out ambiguous, or small features fade and are buried in larger ones.
  • the depth estimation method and system 10 using a cycle GAN and segmentation therefore introduces segmentation to visualize the data imbalance problem that occurs when estimating depth information conventionally, and to highlight the small features that would otherwise be buried and lost in relatively large ones.
  • the depth estimation system 10 using a cycle GAN and segmentation includes an image information learning unit 100, a calculation unit 200, a determination unit 300, a database 400, an image input unit 500, and an image information estimation unit 600.
  • the database 400 includes a standard database 410 .
  • the image information learning unit 100 receives an RGB image from the standard database 410 and estimates depth information using a depth generator and segmentation image information using a segmentation generator.
  • to convert RGB image information into segmentation information, the image information learning unit 100 obtains segmentation information for the input RGB image X through the segmentation generator, and uses that information for depth estimation.
  • likewise, the image information learning unit 100 obtains depth information for the input RGB image X through the depth generator and uses that information for estimating segmentation information. Also, the image information learning unit 100 restores the RGB image using the generated depth information and segmentation image information. In addition, the image information learning unit 100 trains both generators through the objective function of the cycle GAN.
  • the calculation unit 200 discriminates and compares the generated depth information, the segmentation image information, and the reconstructed RGB image against the standard database 410, and calculates a loss and a discrimination probability for each. In doing so, the calculation unit 200 computes the numerical loss and discrimination probability values through the objective function of the cycle GAN.
  • the determination unit 300 determines whether each loss and the discrimination probability value of the discriminator satisfy a preset reference convergence value based on the result calculated by the operation unit 200 .
  • based on the determination result, when the loss and discrimination probability values do not satisfy the preset reference convergence value, the determination unit 300 adjusts the learning so that each loss and the discriminator's discrimination probability converge to the preset reference convergence value, and feeds back to the image information learning unit 100 to induce re-learning or re-estimation of the depth information and segmentation image information. That is, the determination unit 300 feeds the adjusted result back to the image information learning unit 100 so that re-learning is performed there.
  • the database 400 includes a standard database 410 for performing learning in the image information learning unit 100 . That is, the image information learning unit 100 receives the standard database 410 from the database 400 and estimates depth information and segmentation image information.
  • the standard database 410 includes RGB image information, depth information, and segmentation information.
  • NYU Depth Dataset V2 may be used as the standard database 410.
  • the database 400 stores a reference convergence value serving as a determination criterion of the determination unit 300 .
  • the database 400 also stores the depth generator and segmentation generator data used by the image information learning unit 100 to estimate depth information and segmentation information.
  • the image input unit 500 receives an RGB image.
  • the image information estimator 600 uses the depth generator and the segmentation generator whose learning process has been completed to estimate depth information or segmentation information for an RGB image received from the image input unit 500.
  • FIG. 3 is a conceptual diagram illustrating a depth estimation method using cycle GAN and segmentation according to an embodiment of the present invention
  • FIG. 4 is a flowchart illustrating a depth estimation method using cycle GAN and segmentation according to an embodiment of the present invention
  • FIG. 5 is a flowchart illustrating a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
  • the depth estimation method using cycle GAN and segmentation includes a learning step and an execution step as shown in FIG. 4 .
  • in the learning step, segmentation and a depth estimation method using that segmentation are learned.
  • in the learning step, the learning is adjusted by calculating the objective function and the discrimination probability.
  • in the execution step, depth information is estimated using only RGB image information, based on the learning result of the learning step.
  • the execution step estimates depth information using the generator that was used while learning depth information in the learning step.
  • the learning step of the depth estimation method using a cycle GAN and segmentation may include: a step (S10) of generating depth information and segmentation image information for the input RGB image X of the standard database 410 using the depth generator and the segmentation generator; a step (S20) of reconstructing an RGB image using the generated depth information and segmentation image information; and a step (S30) of discriminating and comparing the generated depth information, the segmentation image information, and the reconstructed RGB image against the standard database 410, and calculating a loss and a discrimination probability for each, as sketched in the code below.
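  • The following is a minimal, hypothetical sketch of this S10 to S50 loop in PyTorch. The patent does not disclose architectures or hyperparameters, so every name and value below (toy_net, g_depth, g_seg, f_rec, the weight lam, the convergence threshold) is an illustrative assumption; discriminator updates are omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins for the generators/discriminators (architectures are not
# disclosed in the text; these hypothetical 2-layer CNNs are placeholders).
def toy_net(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, out_ch, 3, padding=1))

g_depth = toy_net(3, 1)   # S10: RGB -> depth
g_seg   = toy_net(3, 1)   # S10: RGB -> segmentation
f_rec   = toy_net(2, 3)   # S20: (depth, seg) -> reconstructed RGB
d_depth = nn.Sequential(toy_net(1, 1), nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Sigmoid())
d_seg   = nn.Sequential(toy_net(1, 1), nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Sigmoid())

opt = torch.optim.Adam([p for m in (g_depth, g_seg, f_rec) for p in m.parameters()], lr=2e-4)
lam = 10.0                            # cycle-consistency weight (assumed)
x = torch.rand(4, 3, 64, 64)          # stand-in batch of RGB images

for step in range(100):               # S50: repeat until convergence
    depth_fake = g_depth(x)           # S10: generate depth
    seg_fake   = g_seg(x)             # S10: generate segmentation
    x_rec = f_rec(torch.cat([depth_fake, seg_fake], dim=1))  # S20: restore RGB

    # S30: generator-side adversarial losses plus L1 cycle-consistency loss
    adv = F.binary_cross_entropy(d_depth(depth_fake), torch.ones(4, 1)) \
        + F.binary_cross_entropy(d_seg(seg_fake), torch.ones(4, 1))
    cyc = (x_rec - x).abs().mean()
    loss = adv + lam * cyc

    opt.zero_grad(); loss.backward(); opt.step()
    if loss.item() < 0.5:             # S40: preset reference convergence value (assumed)
        break
```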
  • FIG. 6 is a diagram illustrating the operation sequence of a method for estimating the depth information of a single image using a cycle GAN and segmentation according to an embodiment of the present invention, FIG. 7 is a diagram illustrating a segmentation estimation process according to an embodiment of the present invention, and FIG. 8 is a diagram illustrating a depth estimation process according to an embodiment of the present invention.
  • in the step (S10) of generating the depth information and the segmentation image information, depth information and segmentation information are estimated from the input RGB image of the standard database 410, and a learning rate is calculated through the objective function. In addition, the effect that the combination of cycle-consistency losses for each domain has on the performance of depth estimation can be evaluated.
  • the segmentation network structure for estimating the segmentation information and the depth network structure for estimating the depth information in step (S10) have the same structure; only the roles of the generator and discriminator in each are exchanged.
  • a hint about depth information may be provided through segmentation information of the standard database 410 .
  • segmentation information for the input RGB image X may be obtained by the segmentation generator, and that information may be used for depth estimation.
  • likewise, depth information for the input RGB image X may be obtained by the depth generator.
  • as shown in FIG. 4, the generators of the two networks receive feedback from the loss and discrimination probability calculated in step (S30), and learn to estimate depth information and segmentation information from the RGB image.
  • the objective function may be composed of an adversarial loss function and a cycle-consistency loss function of a cyclic generative adversarial network (GAN).
  • the adversarial loss function learns according to the minimax results of the generator and the discriminator.
  • a generator imitates a standard distribution of data, and a discriminator calculates a discrimination probability accordingly.
  • the adversarial loss function calculates only the adversarial loss among the objective functions calculated in the step S10 of generating the depth information and the segmentation image information. In this step, since there is no intersection between depth and segmentation, depth information and segmentation information are estimated independently of each other.
  • the step (S30) of calculating the loss and the discrimination probability is performed on the depth image estimated in the step (S10) of generating the depth information and the segmentation image information, the segmentation image, and the RGB image reconstructed in the step (S20).
  • the depth image, the segmentation image, and the reconstructed RGB image may each be discriminated through a discriminator, and probabilities for each may be calculated.
  • in [Equation 1], D represents the discriminator's discrimination probability for each input, G represents the generator's mapping of the input into data space, and λ represents the hyperparameter used for weighting.
  • Equation 1 is composed of an adversarial loss function and a cycle-consistency loss function of a cyclic GAN (Generative Adversarial Network).
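  • [Equation 1] itself appears only as an image in the source. Following the cycle GAN formulation the text describes, with depth and segmentation domains and the weighting hyperparameter λ (the generator and discriminator symbols here are reconstructions, not the patent's own notation), a plausible form is:

    $$\mathcal{L}(G_{dep}, G_{seg}, D_{dep}, D_{seg}) = \mathcal{L}_{\mathrm{GAN}}(G_{dep}, D_{dep}) + \mathcal{L}_{\mathrm{GAN}}(G_{seg}, D_{seg}) + \lambda\,\mathcal{L}_{\mathrm{cyc}}$$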
  • the adversarial loss function can be calculated in the process of estimating segmentation information and depth information from RGB image information, and can be expressed as [Equation 2] and [Equation 3] below.
  • in [Equation 2] and [Equation 3], E represents the expected value over the distribution, and P_i represents the probability distribution for i.
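  • [Equation 2] and [Equation 3] are likewise images in the source. Assuming the standard adversarial loss for each of the two mappings (RGB X to depth, and RGB X to segmentation; symbols again reconstructed), they plausibly take the form:

    $$\mathcal{L}_{\mathrm{GAN}}(G_{dep}, D_{dep}) = \mathbb{E}_{d \sim P_{depth}}[\log D_{dep}(d)] + \mathbb{E}_{x \sim P_{X}}[\log(1 - D_{dep}(G_{dep}(x)))]$$

    $$\mathcal{L}_{\mathrm{GAN}}(G_{seg}, D_{seg}) = \mathbb{E}_{s \sim P_{seg}}[\log D_{seg}(s)] + \mathbb{E}_{x \sim P_{X}}[\log(1 - D_{seg}(G_{seg}(x)))]$$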
  • learning proceeds while the depth discriminator judges the fake data generated by the depth generator, and the segmentation discriminator judges the fake data generated by the segmentation generator.
  • the core of the adversarial loss function is to map the distribution generated through the GAN to the actual distribution. Accordingly, the Adversarial Loss Function is trained according to the minimax results of the generator and the discriminator, and the generator can generate a distribution perfectly similar to the actual distribution. In addition, the discrimination probability of the discriminator converges to 50%.
  • FIG. 9 is a diagram illustrating a generation distribution of a generator and a discrimination probability of a discriminator. That is, FIG. 9 is a graph showing a process in which the generation distribution of the generator and the discrimination probability of the discriminator are changed as the learning of the GAN proceeds in the learning step.
  • the black dotted line represents the actual data distribution
  • the green solid line represents the generative distribution of the generator
  • the blue dotted line represents the discriminator distribution.
  • the small distance between the two distributions means that the distributions are very similar, indicating that the discriminator cannot easily discriminate.
  • the loss used in the conventional CNN-based learning method is combined to induce the generation distribution of the generator to learn the standard distribution of the target.
  • the reconstruction loss may be expressed as in [Equation 4] below.
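  • [Equation 4] is an image in the source; a typical reconstruction loss of the kind described, comparing the generator output G(x) against the target y under an L1 penalty, would be:

    $$\mathcal{L}_{\mathrm{rec}} = \mathbb{E}_{x, y}\big[\lVert y - G(x) \rVert_{1}\big]$$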
  • the method may further include a cycle-consistency loss calculation step (S31) that compares the reconstructed RGB image with the original RGB image X, inducing the generators to produce depth information and a segmentation image while maintaining the shape of the original.
  • the cycle-consistency loss calculation step (S31) evaluates the similarity obtained when the two types of image information, estimated separately under the two objective functions, are restored back to the original image information.
  • the cycle-consistency loss calculation step S31 serves to induce the generator to attempt conversion to a domain while maintaining the shape of each domain.
  • a cycle GAN and a depth estimation method using segmentation may use a cycle GAN composed of two domains: segmentation and depth. Accordingly, the cycle-consistency loss can be expressed as [Equation 5] below, which is composed of the sum of two losses.
  • an error value generated through restoration is set as a cycle-consistency loss. Therefore, if the restoration is performed well, the loss function is lowered.
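  • [Equation 5] is an image in the source. Given the two domains and the restoring generators (written here as F_dep and F_seg, reconstructed names), a plausible form of this two-term cycle-consistency loss is:

    $$\mathcal{L}_{\mathrm{cyc}} = \mathbb{E}_{x \sim P_X}\big[\lVert F_{dep}(G_{dep}(x)) - x \rVert_1\big] + \mathbb{E}_{x \sim P_X}\big[\lVert F_{seg}(G_{seg}(x)) - x \rVert_1\big]$$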
  • FIG. 10 is a diagram illustrating the cycle-consistency loss of the cycle GAN model according to an embodiment of the present invention. As FIG. 10 shows, the cycle-consistency loss means that, when restoring to the RGB image, the generator must consider not only restoration from the depth information but also restoration from the segmentation information.
  • the L1 loss is relatively robust compared to the L2 loss, and is robust against the unstable-solution problem.
  • the L1 loss may be expressed as in Equation 6 below.
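  • [Equation 6] is an image in the source; the standard L1 loss over n values, with predictions ŷ_i and targets y_i, is:

    $$\mathcal{L}_{1} = \frac{1}{n}\sum_{i=1}^{n}\lvert y_i - \hat{y}_i \rvert$$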
  • the depth estimation method using a cycle GAN and segmentation calculates an adversarial loss and a cycle-consistency loss through the learning step, and on this basis can finally estimate fake depth information once training is completed.
  • FIG. 11 is a diagram illustrating a depth information estimation step of an execution step according to an embodiment of the present invention
  • FIG. 12 is a diagram illustrating a segmentation information estimation step of an execution step according to an embodiment of the present invention.
  • the execution step of the depth estimation method using the cycle GAN and segmentation includes a step (S60) of estimating depth information for an input RGB image of RGB data using the depth generator produced through steps (S10) to (S50), and a step (S70) of estimating segmentation image information for that input RGB image using the segmentation generator produced through those steps.
  • as shown in FIG. 11, the step (S60) of estimating the depth information uses the depth generator produced in the learning step; that is, it generates a depth image of the RGB image that is similar to an actual depth image.
  • as shown in FIG. 12, the step (S70) of estimating the segmentation image information uses the segmentation generator produced in the learning step; that is, it converts the RGB image into segmentation information. A sketch of this inference pass follows below.
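  • As a minimal, hypothetical sketch of the execution step, the trained depth generator is simply applied to a new RGB image. The file name and serialization method are illustrative assumptions, not taken from the patent.

```python
import torch

# Hypothetical: load the depth generator saved after the learning step (S50).
g_depth = torch.load("g_depth.pt", weights_only=False)  # file name is illustrative
g_depth.eval()

rgb = torch.rand(1, 3, 480, 640)    # stand-in for a single input RGB image
with torch.no_grad():
    depth = g_depth(rgb)            # S60: RGB -> estimated depth map
# The segmentation generator would be applied the same way for step S70.
```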
  • FIGS. 13A and 13B are diagrams comparing the results before and after the segmentation process is used in the depth information estimation process. That is, FIG. 13A illustrates the uncertainty of an image estimated without the segmentation process, and FIG. 13B illustrates the result of estimating depth information for the input image with the segmentation process added.
  • the reason for adding the segmentation process to the conventional depth estimation process is to solve the uncertainty problem of depth information in the estimation result for the input image, as the comparison in FIGS. 13A and 13B shows.
  • the depth estimation method and system 10 using a cycle GAN and segmentation can thereby improve depth estimation results through a multi-task learning technique, which improves performance by learning several related variables together.
  • FIG. 14 is a diagram illustrating the evaluation procedure of the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention. As shown in FIG. 14, the evaluation uses NYU Depth Dataset V2, an open standard database, to assess the reliability of the method.
  • the NYU Depth Dataset V2 database provides video sequence data focusing on various indoor scenes shot using Microsoft's Kinect v1 model.
  • the NYU Depth Dataset V2 database provides depth information and segmentation information for RGB images through the Labeled Dataset.
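  • For orientation, the labeled subset of NYU Depth Dataset V2 is distributed as a single HDF5 .mat file; the sketch below shows one way to read it with h5py. The field names follow the dataset's published layout but should be verified against your copy.

```python
import h5py
import numpy as np

# Read the NYU Depth Dataset V2 labeled subset (an HDF5 container).
with h5py.File("nyu_depth_v2_labeled.mat", "r") as f:
    images = np.array(f["images"])  # RGB images
    depths = np.array(f["depths"])  # per-pixel depth in meters
    labels = np.array(f["labels"])  # per-pixel segmentation class indices

print(images.shape, depths.shape, labels.shape)
```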
  • the segmentation and depth estimation step (S110) of FIG. 14 corresponds to the step (S10) of generating depth information and segmentation image information for the input RGB image X of the standard database 410 using the depth generator and the segmentation generator.
  • in step (S110), depth information and segmentation information are estimated, and a learning rate is calculated through the objective function.
  • the effect of the combination of cycle-consistency losses according to each domain of the present invention on performance is evaluated.
  • the generated depth information, the segmentation image information, and the reconstructed RGB image are each discriminated and compared against the standard database 410, which corresponds to the step (S30) of calculating the loss and discrimination probability for each.
  • in the adversarial loss calculation step (S130), only the adversarial loss is calculated among the objective functions computed in the segmentation and depth estimation step (S110).
  • depth and segmentation estimation are performed independently of each other.
  • the method may further include the step (S120) of reconstructing the RGB image using the depth information and the segmentation image information generated after the segmentation and depth estimation step (S110).
  • in the cycle-consistency loss calculation step (S131), the reconstructed RGB image is compared with the original RGB image to calculate the cycle-consistency loss, and a penalty is applied to the depth generator and the segmentation generator. Through this process, each generator performs depth and segmentation estimation while taking the restoration of depth information and segmentation information into account.
  • the depth and segmentation evaluation step (S140) of FIG. 14 corresponds to the step (S40) of determining whether each loss and the discriminator's discrimination probability satisfy a preset reference convergence value, based on the calculated result values.
  • the generated depth information is evaluated by measuring the RMSLE, a variation of the root mean square error (RMSE) that measures the numerical error of the generated result.
  • the predicted values p_i and the actual values a_i needed to calculate the RMSLE are normalized to values between 0 and 1 before being input.
  • the RMSLE cost function is mainly used to give a penalty to an underestimated item rather than an overestimated item. It is a numerical value indicating the error for the correct answer, and the larger the value, the greater the error.
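  • The RMSLE equation is an image in the source; its standard definition over n values, with predictions p_i and ground-truth values a_i as above, is:

    $$\mathrm{RMSLE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\big(\log(p_i + 1) - \log(a_i + 1)\big)^{2}}$$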
  • Table 1 shows the results of comparison and evaluation of the depth estimation method using cycle GAN and segmentation according to an embodiment of the present invention with other techniques based on the NYU Depth Dataset V2 database.
  • the RMSLE value of the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention is 0.220, lower than the other techniques. Since a lower RMSLE indicates a better depth estimation method, this confirms that the method exhibits a higher degree of similarity to the ground truth than the other techniques.
  • compared with conventional methods that must use special equipment or sensors to obtain 3D information, the depth estimation method and system 10 using a cycle GAN and segmentation generates 3D information using only a single camera, and is therefore cheaper, highly scalable, and advantageous for miniaturization.
  • the depth estimation method and system 10 using cycle GAN and segmentation can estimate depth information and improve the precision of depth information.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a depth estimation method and system using cycle GAN and segmentation, the method and the system estimating depth information about an image by using only a single image through cycle GAN and segmentation without using special equipment or a camera. A depth estimation method using cycle GAN and segmentation, according to an embodiment of the present invention, comprises the steps of: (S10) generating depth information and segmentation image information about an input RGB image of a standard database by using a generator; (S20) reconstructing an RGB image by using the generated depth information and segmentation image information; and (S30) calculating a loss and a discrimination probability by comparing the generated depth information and segmentation image information, and the reconstructed RGB image with the standard database, and discriminating same, respectively. In addition, the depth estimation method comprises the steps of: (S40) determining whether the loss and the discrimination probability value satisfy a preset reference convergence value, on the basis of the calculated result values; (S50) adjusting training on the basis of the determination result so that the loss and the discrimination probability value of a discriminator converge on the preset reference convergence value, and repeating steps (S10) to (S40); and (S60) estimating depth information about the RGB image by using a generator generated through steps (S10) to (S50).

Description

Depth estimation method and system using cycle GAN and segmentation
The present invention relates to a method and a system for estimating depth using a cycle GAN and segmentation, and more particularly, to a depth estimation method and system that estimates the depth information of an image using only a single image, through a cycle GAN and segmentation, without using special equipment or cameras.
In the image processing field, 3D information refers to information that includes spatial cues such as depth and scale beyond the visual information of an image. Starting with the 4th industrial revolution, such 3D information has become indispensable in fields such as VR, AR, and autonomous driving, which demand technologies that can acquire and compute it more accurately and quickly.
For example, in the field of augmented reality (AR), a virtual environment is overlaid on the real environment to provide the user with reinforced, additional information. A virtual environment created by computer graphics naturally overlaps with the real environment, so that a more immersive service can be provided to users. These technologies can build a natural virtual environment only when 3D information is combined with the visual information coming through the camera.
Therefore, to obtain such 3D information, radar, ultrasonic, and laser sensors have been developed, and 3D imaging methods using special cameras or stereo cameras have been proposed.
However, obtaining 3D information conventionally requires special equipment, cameras, radar, ultrasonic devices, and sensors, so the cost of extracting 3D information is high and the data cannot be obtained easily.
[Prior art literature]
[Patent Document] Republic of Korea Patent Registration No. 10-1650702 (announced August 24, 2016)
Accordingly, the present invention is intended to solve these disadvantages of the prior art: its objects are to extract 3D images inexpensively using only a single camera without special equipment, cameras, radar, ultrasonic devices, or sensors; to make it easy to obtain the data needed to generate 3D image information; and to solve the data imbalance problem that occurs while estimating depth information.
To achieve these technical objects, a depth estimation method using a cycle GAN and segmentation according to one aspect of the present invention includes: a step (S10) of generating depth information and segmentation image information for an input RGB image X of a standard database using a depth generator and a segmentation generator; a step (S20) of reconstructing an RGB image using the generated depth information and segmentation image information; and a step (S30) of discriminating and comparing the generated depth information, the segmentation image information, and the reconstructed RGB image against the standard database, and calculating a loss and a discrimination probability for each.
The method also includes: a step (S40) of determining, based on the calculated result values, whether each loss and the discriminator's discrimination probability satisfy a preset reference convergence value; and a step (S50) of adjusting the learning, when they do not, so that each loss and the discriminator's discrimination probability converge to the preset reference convergence value, and repeating steps (S10) to (S40).
The method further includes a step (S60) of estimating depth information for an input RGB image of RGB data using the depth generator produced through steps (S10) to (S50), and a step (S70) of estimating segmentation image information for that input RGB image using the segmentation generator produced through those steps.
A depth estimation system using a cycle GAN and segmentation according to another aspect of the present invention includes an image information learning unit, a calculation unit, a determination unit, a database, an image input unit, and an image information estimation unit. The database includes a standard database.
The image information learning unit receives an RGB image from the standard database, generates depth information using a depth generator and segmentation image information using a segmentation generator, reconstructs an RGB image using the generated depth information and segmentation image information, and performs learning through the objective function of the cycle GAN.
The calculation unit discriminates and compares the depth information generated by the image information learning unit, the segmentation image information, and the reconstructed RGB image against the standard database, and calculates a loss and a discrimination probability for each. The determination unit determines, based on the results calculated by the calculation unit, whether each loss and the discriminator's discrimination probability satisfy a preset reference convergence value.
The image input unit receives an RGB image. The image information estimation unit estimates depth information for the RGB image received from the image input unit, using the depth generator and the segmentation generator whose learning has been completed in the image information learning unit.
As described above, the depth estimation method and system using a cycle GAN and segmentation according to the present invention can extract 3D images inexpensively, because 3D information is generated from only a single image without special equipment, cameras, radar, ultrasound, or other sensors. Being highly scalable, it can also generate 3D information even when other inputs such as stereo images, optical flow, or point clouds are unavailable, and it is advantageous for miniaturizing the equipment needed to extract 3D information.
In addition, since the depth information of an image can be estimated from a single image, data for generating 3D image information can be obtained easily. Moreover, the data imbalance problem that occurs while estimating depth information can be solved by visualizing it on the basis of segmentation and by highlighting the small features that would otherwise be lost by being buried in relatively large ones.
FIG. 1 is a diagram illustrating a problem occurring in a conventional depth estimation process.
FIG. 2 is a configuration diagram illustrating a depth estimation system using a cycle GAN and segmentation according to an embodiment of the present invention.
FIG. 3 is a conceptual diagram illustrating a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
FIG. 4 is a flowchart illustrating a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating the operation sequence of a method for estimating the depth information of a single image according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating a segmentation estimation process according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating a depth estimation process according to an embodiment of the present invention.
FIG. 9 is a diagram illustrating the generation distribution of a generator and the discrimination probability of a discriminator.
FIG. 10 is a diagram illustrating a cycle-consistency loss according to an embodiment of the present invention.
FIG. 11 is a diagram illustrating the depth information estimation step of the execution step according to an embodiment of the present invention.
FIG. 12 is a diagram illustrating the segmentation information estimation step of the execution step according to an embodiment of the present invention.
FIGS. 13A and 13B are diagrams comparing the results before and after the segmentation process is used in the depth information estimation process.
FIG. 14 is a diagram illustrating the evaluation procedure of a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings, so that those of ordinary skill in the art to which the present invention pertains can easily implement them. However, the present invention may be embodied in various different forms and is not limited to the embodiments described herein. In the drawings, parts irrelevant to the description are omitted in order to describe the present invention clearly, and similar reference numerals are attached to similar parts throughout the specification.
Throughout the specification, when a part is said to "include" a certain element, this means that other elements may be further included rather than excluded, unless specifically stated otherwise. Terms such as "...unit" and "...module" described in the specification mean a unit that processes at least one function or operation, which may be implemented in hardware, software, or a combination of hardware and software.
Hereinafter, the present invention will be described in detail through preferred embodiments with reference to the accompanying drawings.
Like reference numerals in each figure indicate like elements.
FIG. 1 is a diagram illustrating a problem occurring in a conventional depth estimation process, and FIG. 2 is a configuration diagram illustrating a depth estimation system 10 using a cycle GAN and segmentation according to an embodiment of the present invention. That is, FIG. 1 illustrates the problem that depth information comes out ambiguous in a conventional depth estimation process for an image.
A GAN (Generative Adversarial Network) is an unsupervised generative model in which two neural networks with opposing roles compete with each other, producing a synergistic effect.
The GAN includes a generator that generates data instances and a discriminator that determines whether data is authentic. The generator receives random noise z sampled from a zero-mean Gaussian and generates fake data resembling the actual data distribution.
In contrast, the discriminator distinguishes whether the data generated by the generator is fake data or data from the training dataset, and indicates a probability for each. The discriminator therefore operates to decrease its probability of making a mistake, while the generator operates to increase the probability that the discriminator makes a mistake; this is called the minimax problem.
The depth estimation method and system 10 using a cycle GAN and segmentation according to an embodiment of the present invention improves on conventional depth estimation approaches that require special equipment or multiple images: depth information is estimated using a cycle GAN (Cycle Generative Adversarial Network) and segmentation.
In general, when depth information is estimated from RGB images through learning, data imbalance in the training data causes problems, as shown in FIG. 1: depth information for relatively under-learned features comes out ambiguous, or small features fade and are buried in larger ones.
Therefore, the depth estimation method and system 10 using a cycle GAN and segmentation according to an embodiment of the present invention introduces segmentation in order to visualize the data imbalance problem that occurs when estimating depth information conventionally, and to highlight the small features that would otherwise be buried and lost in relatively large ones.
The depth estimation system 10 using a cycle GAN and segmentation according to an embodiment of the present invention may include an image information learning unit 100, a calculation unit 200, a determination unit 300, a database 400, an image input unit 500, and an image information estimation unit 600. The database 400 includes a standard database 410.
The image information learning unit 100 receives an RGB image from the standard database 410 and estimates depth information using a depth generator and segmentation image information using a segmentation generator.
To convert RGB image information into segmentation information, the image information learning unit 100 obtains segmentation information for the input RGB image X through the segmentation generator and uses that information for depth estimation.
Likewise, the image information learning unit 100 obtains depth information for the input RGB image X through the depth generator and uses that information for estimating segmentation information. The image information learning unit 100 also restores the RGB image using the generated depth information and segmentation image information, and trains both generators through the objective function of the cycle GAN.
The calculation unit 200 discriminates and compares the generated depth information, the segmentation image information, and the reconstructed RGB image against the standard database 410, and calculates a loss and a discrimination probability for each. In doing so, the calculation unit 200 computes the numerical loss and discrimination probability values through the objective function of the cycle GAN.
The determination unit 300 determines, based on the results calculated by the calculation unit 200, whether each loss and the discriminator's discrimination probability satisfy a preset reference convergence value.
When they do not, the determination unit 300 adjusts the learning so that each loss and the discriminator's discrimination probability converge to the preset reference convergence value, and feeds back to the image information learning unit 100 to induce re-learning or re-estimation of the depth information and segmentation image information. That is, the determination unit 300 feeds the adjusted result back to the image information learning unit 100 so that re-learning is performed there.
The database 400 includes the standard database 410 used for learning in the image information learning unit 100; that is, the image information learning unit 100 receives the standard database 410 from the database 400 and estimates depth information and segmentation image information.
The standard database 410 includes RGB image information, depth information, and segmentation information. NYU Depth Dataset V2 may be used as the standard database 410.
The database 400 also stores the reference convergence value that serves as the determination criterion of the determination unit 300, as well as the depth generator and segmentation generator data used by the image information learning unit 100 to estimate depth information and segmentation information.
The image input unit 500 receives an RGB image. The image information estimation unit 600 estimates depth information or segmentation information for the RGB image received from the image input unit 500, using the depth generator and the segmentation generator whose learning process has been completed.
도 3은 본 발명의 실시 예에 따른 사이클 GAN과 세그맨테이션을 사용한 깊이 추정 방법을 나타내는 개념도이고, 도 4는 본 발명의 실시 예에 따른 사이클 GAN과 세그맨테이션을 사용한 깊이 추정 방법을 나타내는 흐름도이며, 도 5는 본 발명의 실시 예에 따른 사이클 GAN과 세그맨테이션을 사용한 깊이 추정 방법을 나타내는 순서도이다.3 is a conceptual diagram illustrating a depth estimation method using cycle GAN and segmentation according to an embodiment of the present invention, and FIG. 4 is a flowchart illustrating a depth estimation method using cycle GAN and segmentation according to an embodiment of the present invention. and FIG. 5 is a flowchart illustrating a depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention.
As shown in FIG. 4, the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention includes a learning stage and an execution stage. In the learning stage, segmentation and a depth estimation method based on it are learned, and training is adjusted by computing an objective function and discrimination probabilities.
In the execution stage, depth information is estimated from RGB image information alone, based on the results of the learning stage, using the generator that was trained for depth information during the learning stage.
The learning stage of the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention may include: generating depth information and segmentation image information for an input RGB image X of the standard database 410 using the generator (PCTKR2021001803-appb-I000019) and the generator (PCTKR2021001803-appb-I000020) (S10); reconstructing an RGB image using the generated depth information and segmentation image information (S20); and discriminating and comparing the generated depth information, the generated segmentation image information, and the reconstructed RGB image with the standard database 410, respectively, and calculating a loss and a discrimination probability for each (S30).
The learning stage may further include: determining, based on the calculated results, whether each loss and each discriminator's discrimination probability value satisfies a preset reference convergence value (S40); and, when they do not, adjusting training so that each loss and each discriminator's discrimination probability value converge to the preset reference convergence value, and repeating steps (S10) to (S40) (S50).
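For illustration only, the sketch below restates the loop of steps (S10) to (S50) in PyTorch-style Python. Every name in it — the domain-translation generators G_rgb2seg, G_rgb2depth, G_seg2rgb, G_depth2rgb, the discriminators D_seg, D_depth, the optimizer, and the thresholds — is a hypothetical stand-in rather than the exact configuration disclosed here, and the discriminators are assumed to output probabilities in (0, 1).

```python
import torch
import torch.nn.functional as F

def learning_stage(loader, G_rgb2seg, G_rgb2depth, G_seg2rgb, G_depth2rgb,
                   D_seg, D_depth, opt_G, lam=10.0, eps=1e-2, max_epochs=200):
    """Minimal sketch of steps S10-S50 (generator side only; the matching
    discriminator updates on seg_real/depth_real are omitted for brevity)."""
    for epoch in range(max_epochs):
        for rgb, seg_real, depth_real in loader:      # standard database 410
            seg_fake = G_rgb2seg(rgb)                 # S10: estimate segmentation
            depth_fake = G_rgb2depth(rgb)             # S10: estimate depth
            rgb_from_seg = G_seg2rgb(seg_fake)        # S20: restore RGB
            rgb_from_depth = G_depth2rgb(depth_fake)  # S20: restore RGB
            # S30: adversarial terms (generator wants D(...) -> 1) plus the
            # cycle-consistency term weighted by the hyperparameter lambda
            p_seg, p_depth = D_seg(seg_fake), D_depth(depth_fake)
            loss_adv = F.binary_cross_entropy(p_seg, torch.ones_like(p_seg)) \
                     + F.binary_cross_entropy(p_depth, torch.ones_like(p_depth))
            loss_cyc = F.l1_loss(rgb_from_seg, rgb) + F.l1_loss(rgb_from_depth, rgb)
            loss = loss_adv + lam * loss_cyc
            opt_G.zero_grad(); loss.backward(); opt_G.step()
        # S40/S50: stop once the loss is near 0 and the discriminators'
        # decisions on generated samples are near the 50 % equilibrium
        if loss.item() < eps and abs(p_depth.mean().item() - 0.5) < eps:
            break
```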
FIG. 6 shows the sequence of operations of a method for estimating depth information of a single image using a cycle GAN and segmentation according to an embodiment of the present invention, FIG. 7 shows the segmentation estimation process, and FIG. 8 shows the depth estimation process.
In the step (S10) of generating the depth information and segmentation image information, depth information and segmentation information are estimated from the input RGB image of the standard database 410, and the learning rate is computed through the objective function. The effect that combining the cycle-consistency losses of the individual domains has on depth estimation performance can also be evaluated.
As shown in FIGS. 7 and 8, the segmentation network that estimates segmentation information in the step (S10) of generating the depth information and segmentation image information and the depth network that estimates depth information have the same structure; only the roles of each generator and discriminator change.
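As a sketch of this shared structure, the Python snippet below builds one generator/discriminator pair per modality from the same factory functions; the layer widths and activations are assumptions, not the disclosed architecture.

```python
import torch.nn as nn

def make_generator(in_ch=3, out_ch=1):
    """Encoder-decoder image-to-image mapping (illustrative layer sizes)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        nn.ConvTranspose2d(64, out_ch, 4, stride=2, padding=1), nn.Sigmoid(),
    )

def make_discriminator(in_ch=1):
    """Real/fake critic emitting probabilities via a final sigmoid."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(64, 1, 4, stride=2, padding=1), nn.Sigmoid(),
    )

# Same structure instantiated twice; only the roles differ
# (RGB -> depth versus RGB -> segmentation).
G_depth, D_depth = make_generator(3, 1), make_discriminator(1)
G_seg, D_seg = make_generator(3, 1), make_discriminator(1)  # 1-channel map assumed
```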
The two networks operate as follows. First, the segmentation information of the standard database 410 can provide a hint about the depth information. To convert RGB image information into segmentation information, the generator (PCTKR2021001803-appb-I000021) obtains segmentation information for the input RGB image X, and that information can then be used for depth estimation.
Likewise, as shown in FIG. 8, the generator (PCTKR2021001803-appb-I000022) obtains depth information for the input RGB image X, and that information can be used for segmentation estimation. The generators of the two networks receive feedback from the step (S30) of calculating the loss and discrimination probability, as in FIG. 4, and are thereby adapted to estimate depth information and segmentation information from the RGB image.
The step (S30) of calculating the loss and discrimination probability computes the numerical values of the loss and discrimination probability through the objective function of the cycle GAN. Here, the objective function may consist of the adversarial loss function and the cycle-consistency loss function of the cycle GAN (Generative Adversarial Network).
The adversarial loss function drives training according to the minimax game between the generator and the discriminator: under the adversarial loss, the generator imitates the standard distribution of the data, and the discriminator computes the corresponding discrimination probability.
That is, the adversarial loss function computes only the adversarial loss among the objective-function terms of the step (S10) of generating the depth information and segmentation image information. At this stage there is no intersection between depth and segmentation, so depth information and segmentation information are estimated independently of each other.
Accordingly, the step (S30) of calculating the loss and discrimination probability can discriminate, through the discriminators, the depth image and segmentation image estimated in step (S10) and the RGB image reconstructed in step (S20), and compute a probability for each.
The objective function of the network according to the present invention can be expressed as [Equation 1] below.
[Equation 1]
Figure PCTKR2021001803-appb-I000023
Here, D denotes the discriminator's discrimination probability for each input, G denotes the generator's mapping of the input into the data space, and λ denotes the hyperparameter used for weighting.
[Equation 1] consists of the adversarial loss function and the cycle-consistency loss function of the cycle GAN (Generative Adversarial Network).
The adversarial loss function can be computed while estimating segmentation information and depth information from RGB image information, and can be expressed as [Equation 2] and [Equation 3] below.
[Equation 2]
Figure PCTKR2021001803-appb-I000024
[Equation 3]
Figure PCTKR2021001803-appb-I000025
Here, E denotes the expected value over the corresponding distribution, and P_i denotes the probability distribution of i. As shown in FIG. 5, training proceeds while the discriminator (PCTKR2021001803-appb-I000028) judges the fake data (PCTKR2021001803-appb-I000027) produced by the generator (PCTKR2021001803-appb-I000026), and the discriminator (PCTKR2021001803-appb-I000031) judges the fake data (PCTKR2021001803-appb-I000030) produced by the generator (PCTKR2021001803-appb-I000029).
The core of the adversarial loss function is to map the distribution generated by the GAN onto the real distribution. Training therefore follows the minimax game between the generator and the discriminator: the generator learns to produce a distribution closely matching the real one, and the discriminator's discrimination probability accordingly converges to 50%.
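The minimax update can be sketched as follows, assuming a discriminator that emits a probability in (0, 1); all names are illustrative. The discriminator is pushed toward 1 on real samples and 0 on generated ones, the generator is pushed to make the discriminator output 1 on generated samples, and at equilibrium the discriminator's output on generated samples hovers around 0.5.

```python
import torch
import torch.nn.functional as F

def minimax_round(real, inp, G, D, opt_G, opt_D):
    fake = G(inp)
    # Discriminator step: maximize log D(real) + log(1 - D(fake))
    p_real, p_fake = D(real), D(fake.detach())
    d_loss = F.binary_cross_entropy(p_real, torch.ones_like(p_real)) \
           + F.binary_cross_entropy(p_fake, torch.zeros_like(p_fake))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()
    # Generator step: maximize log D(fake), i.e. fool the discriminator
    p_fake = D(fake)
    g_loss = F.binary_cross_entropy(p_fake, torch.ones_like(p_fake))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
    return p_fake.mean().item()   # drifts toward 0.5 as training converges
```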
FIG. 9 shows the generator's generative distribution and the discriminator's discrimination probability, i.e., how both change as GAN training progresses during the learning stage.
In FIG. 9, the black dotted line represents the real data distribution, the green solid line represents the generator's generative distribution, and the blue dotted line represents the discriminator's discrimination probability. A small distance between the two distributions means they are very similar, which in turn means the discriminator cannot easily tell them apart.
Training therefore drives the discriminator's discrimination probability down (min) and the similarity of the generator's generative distribution up (max), so that the generative distribution becomes very similar to the real data distribution.
In addition, a reconstruction loss is added: by combining it with the loss used in conventional CNN-based training, the generator's generative distribution is guided to learn the standard distribution of the target. The reconstruction loss can be expressed as [Equation 4] below.
[Equation 4]
Figure PCTKR2021001803-appb-I000032
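[Equation 4] is reproduced above only as an image reference. A common form of such a reconstruction loss, assumed in the sketch below, is a direct pixel-wise distance between the generated output and the target drawn from the standard database, so that the generator is also pulled toward the target's standard distribution as in conventional CNN-based training.

```python
import torch.nn.functional as F

def reconstruction_loss(generated, target):
    # Supervised penalty against the standard-database target; an L1
    # (mean absolute error) distance is assumed here for illustration.
    return F.l1_loss(generated, target)
```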
After the step (S30) of calculating the loss and discrimination probability, the method may further include a cycle-consistency loss step (S31). The two branches produce different outputs for the same input RGB image X; when each output is later restored to an RGB image (PCTKR2021001803-appb-I000033), the original RGB image X is compared with the restored RGB image (PCTKR2021001803-appb-I000034), which guides the generation of depth information and segmentation images that preserve the shape of the original RGB image X.
The cycle-consistency loss calculation step (S31) evaluates, separately from the two objective functions, the similarity obtained when the two estimated image modalities are restored back to the original image. It induces the generators to attempt the domain translation while preserving the shape of each domain.
The depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention may use a cycle GAN composed of two domains, segmentation and depth. The cycle-consistency loss is therefore the sum of two losses, as expressed in [Equation 5] below.
[Equation 5]
Figure PCTKR2021001803-appb-I000035
That is, the error introduced by the restoration is taken as the cycle-consistency loss, so the loss decreases as the restoration improves.
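Since [Equation 5] survives only as an image reference, the sketch below writes the described two-term loss out explicitly: one restoration error for the RGB → depth → RGB cycle and one for the RGB → segmentation → RGB cycle, with the generator names being hypothetical stand-ins.

```python
import torch.nn.functional as F

def cycle_consistency_loss(rgb, G_rgb2depth, G_depth2rgb, G_rgb2seg, G_seg2rgb):
    # Sum of the two per-domain restoration errors; a good restoration in
    # both domains drives this loss toward zero.
    rgb_via_depth = G_depth2rgb(G_rgb2depth(rgb))
    rgb_via_seg = G_seg2rgb(G_rgb2seg(rgb))
    return F.l1_loss(rgb_via_depth, rgb) + F.l1_loss(rgb_via_seg, rgb)
```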
FIG. 10 shows the cycle-consistency loss of the cycle GAN model according to an embodiment of the present invention. As FIG. 10 illustrates, when restoring back to the RGB image, the generator must account not only for restoration from the depth information but also for restoration from the segmentation information.
Accordingly, by adding segmentation information to the conventional constraint, which restores from depth information alone, the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention can generate depth information that is better classified by object. Likewise, when segmentation information is restored, the restoration also takes the depth information into account, yielding a synergistic effect for tasks such as background separation.
In addition, an L1 loss is applied to the cycle-consistency loss to regularize the model by penalizing the L1 norm of the model weights (the sum of the absolute values of each weight element). The L1 loss is relatively robust compared with the L2 loss and resists the unstable solution problem.
The L1 loss can be expressed as [Equation 6] below.
[Equation 6]
Figure PCTKR2021001803-appb-I000036
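The weight-norm penalty of [Equation 6] can be sketched as below, assuming a PyTorch model: the L1 norm of all weight elements is scaled by a small coefficient and added to the cycle-consistency loss.

```python
def l1_weight_penalty(model, coeff=1e-5):
    # Sum of absolute values of every weight element (the L1 norm),
    # used as a regularization term added to the cycle-consistency loss.
    return coeff * sum(p.abs().sum() for p in model.parameters())
```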
In this way, the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention computes the adversarial loss and the cycle-consistency loss during the learning stage and, once training based on them is complete, finally estimates the fake depth information (PCTKR2021001803-appb-I000037).
That is, once the learning stage has trained the model to match the real depth information to a preset degree of similarity, the cycle GAN training process ends.
FIG. 11 shows the depth information estimation step of the execution stage according to an embodiment of the present invention, and FIG. 12 shows the segmentation information estimation step of the execution stage.
The execution stage of the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention may include: estimating depth information for an input RGB image of RGB data using the generator (PCTKR2021001803-appb-I000038) produced through steps (S10) to (S50) (S60); and estimating segmentation image information for the input RGB image of the RGB data using the generator (PCTKR2021001803-appb-I000039) produced through steps (S10) to (S50) (S70).
As shown in FIG. 11, the step (S60) of estimating the depth information estimates depth information using the generator (PCTKR2021001803-appb-I000040) produced in the learning stage.
That is, step (S60) performs depth estimation with the generator (PCTKR2021001803-appb-I000041) produced in the learning stage in order to generate, from the RGB image, a depth image similar to the real depth image.
Similarly, as shown in FIG. 12, the step (S70) of estimating the segmentation image information estimates segmentation information using the generator (PCTKR2021001803-appb-I000042) produced in the learning stage. That is, step (S70) converts the RGB image into segmentation information using the generator (PCTKR2021001803-appb-I000043) produced in the learning stage.
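At the execution stage the trained generators are used on their own; the sketch below illustrates this, with G_rgb2depth and G_rgb2seg as hypothetical handles for the trained generators of steps (S60) and (S70).

```python
import torch

@torch.no_grad()
def execute(rgb, G_rgb2depth, G_rgb2seg):
    # S60: RGB -> depth map; S70: RGB -> segmentation map.
    # Only a single RGB image is needed; no depth sensor is involved.
    return G_rgb2depth(rgb), G_rgb2seg(rgb)
```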
FIGS. 13a and 13b compare depth estimation before and after the segmentation process is used. FIG. 13a shows the uncertainty of an image estimated without the segmentation process, and FIG. 13b shows the result of estimating depth information for the input image with the segmentation process added.
The reason for adding the segmentation process to the conventional depth estimation process is to resolve the uncertainty of the depth information seen in the depth estimation result for the input image, as illustrated in FIG. 13a.
Since the conventional depth estimation process cannot estimate depth information perfectly, the uncertainty of the depth information can be reduced, as in FIG. 13b, by jointly applying the various variables generated through the segmentation estimation process.
In this way, the depth estimation method and system 10 using a cycle GAN and segmentation according to an embodiment of the present invention can improve the depth estimation result through a multi-task learning technique that improves performance by exploiting multiple variables.
FIG. 14 shows the evaluation procedure of the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention. As in FIG. 14, the method is evaluated on NYU Depth Dataset V2, a publicly available standard database, to assess its reliability.
The NYU Depth Dataset V2 database provides video sequence data centered on various indoor scenes captured with Microsoft's Kinect v1. Through its labeled dataset, it also provides depth information and segmentation information for the RGB images.
The segmentation and depth estimation step (S110) of FIG. 14 corresponds to the step (S10) of generating depth information and segmentation image information for the input RGB image X of the standard database 410 using the generator (PCTKR2021001803-appb-I000044) and the generator (PCTKR2021001803-appb-I000045).
That is, in the segmentation and depth estimation step (S110), depth information and segmentation information are estimated while the learning rate is computed through the objective function. The effect on performance of combining the cycle-consistency losses of the individual domains of the present invention is also evaluated.
The adversarial loss calculation step (S130) of FIG. 14 corresponds to the step (S30) of discriminating and comparing the generated depth information, the segmentation image information, and the reconstructed RGB image with the standard database 410, respectively, and calculating a loss and a discrimination probability for each.
That is, the adversarial loss calculation step (S130) computes only the adversarial loss among the objective-function terms calculated in the segmentation and depth estimation step (S110). Since there is no intersection between the depth information and the segmentation information at this point, depth and segmentation estimation proceed independently of each other.
The procedure may further include the step (S120) of reconstructing an RGB image using the depth information and segmentation image information generated after the segmentation and depth estimation step (S110).
The cycle-consistency loss calculation step (S131) of FIG. 14 corresponds to the cycle-consistency loss step (S31): the two branches produce different outputs for the same input RGB image X, but when each is later restored to an RGB image (PCTKR2021001803-appb-I000046), the original RGB image X is compared with the restored RGB image (PCTKR2021001803-appb-I000047), guiding the generation of depth information and segmentation images that preserve the shape of the original RGB image X.
That is, in the cycle-consistency loss calculation step (S131), the reconstructed RGB image is compared with the original RGB image to compute the cycle-consistency loss, and a penalty is then applied to the generator of each of the depth information and the segmentation information. Through this process each generator carries out depth and segmentation estimation while also accounting for the restoration of the depth and segmentation information.
The depth and segmentation evaluation step (S140) of FIG. 14 corresponds to the step (S40) of determining, based on the calculated results, whether each loss and each discriminator's discrimination probability value satisfies the preset reference convergence value.
In the depth and segmentation evaluation step (S140), the generated depth information is evaluated by measuring the RMSLE, a variant of the root mean square error (RMSE) that measures the numerical error of the generated result. The RMSLE can be expressed as [Equation 7] below.
[Equation 7]
Figure PCTKR2021001803-appb-I000048
Here, P_i and a_i, which are needed to compute the RMSLE, are normalized to values between 0 and 1 before being input. The RMSLE cost function is mainly used to penalize underestimated items more heavily than overestimated ones; it expresses the error against the ground truth as a number, and a larger value means a larger error.
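[Equation 7] is available only as an image reference; the standard RMSLE, assumed here, is the root of the mean squared difference of log(1 + x) terms, which matches the behavior described above:

```python
import torch

def rmsle(pred, target):
    # pred and target are assumed normalized to [0, 1] as stated above;
    # lower values indicate generated depth closer to the ground truth.
    return torch.sqrt(torch.mean((torch.log1p(pred) - torch.log1p(target)) ** 2))
```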
[Table 1] below shows the result of comparing and evaluating the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention against other techniques on the NYU Depth Dataset V2 database.
[Table 1] Comparison on NYU Depth Dataset V2 between the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention and other techniques
Figure PCTKR2021001803-appb-I000049
As shown in [Table 1], the RMSLE of the depth estimation method using a cycle GAN and segmentation according to an embodiment of the present invention is 0.220, lower than that of the other techniques. Since a lower RMSLE indicates a better depth estimation method, this confirms that the proposed method achieves higher similarity than the other techniques.
In this way, compared with conventional approaches that required special equipment or sensors to obtain 3D information, the depth estimation method and system 10 using a cycle GAN and segmentation according to an embodiment of the present invention can generate 3D information using only a single camera, making it less expensive, more scalable, and better suited to miniaturization; above all, data consisting of single images is easy to obtain.
The depth estimation method and system 10 using a cycle GAN and segmentation according to an embodiment of the present invention can also improve the precision of the depth information while estimating it. By introducing segmentation, it exposes the data imbalance problem that arises in the conventional process of estimating depth information for an input image and recovers small features that would otherwise be lost, buried under relatively large ones. In other words, it resolves the uncertainty of the depth information in the depth estimation result for the input image.
Although preferred embodiments of the present invention have been described above, the present invention is not limited to these embodiments and includes all modifications within the scope readily made by those of ordinary skill in the art to which the invention pertains and recognized as equivalent.
[Description of Reference Numerals]
10: depth estimation system    100: image information learning unit
200: operation unit    300: determination unit
400: database    410: standard database
500: image input unit    600: image information estimation unit

Claims (13)

  1. A depth estimation method for estimating depth information of an image using only a single image, the method comprising:
    generating depth information and segmentation image information for an input RGB image X of a standard database using a generator (PCTKR2021001803-appb-I000050) and a generator (PCTKR2021001803-appb-I000051) (S10);
    reconstructing an RGB image using the generated depth information and segmentation image information (S20);
    discriminating and comparing the generated depth information, the generated segmentation image information, and the reconstructed RGB image with the standard database, respectively, and calculating a loss and a discrimination probability for each (S30);
    determining, based on the calculated results, whether each loss and each discriminator's discrimination probability value satisfies a preset reference convergence value (S40);
    when the loss and discrimination probability values do not satisfy the preset reference convergence value, adjusting training so that each loss and each discriminator's discrimination probability value converge to the preset reference convergence value, and repeating steps (S10) to (S40) (S50); and
    estimating depth information for an input RGB image of RGB data using the generator (PCTKR2021001803-appb-I000052) produced through steps (S10) to (S50) (S60).
  2. The method of claim 1, further comprising estimating segmentation image information for the input RGB image of the RGB data using the generator (PCTKR2021001803-appb-I000053) produced through steps (S10) to (S50) (S70).
  3. The method of claim 1, wherein the standard database includes RGB image information, depth information, and segmentation information.
  4. The method of claim 1, wherein the standard database is NYU Depth Dataset V2.
  5. The method of claim 1, wherein, in the step (S60) of estimating depth information for the input RGB image, the RGB data includes only RGB image information.
  6. The method of claim 1, wherein the step (S30) of calculating the loss and discrimination probability calculates the numerical values of the loss and discrimination probability through an objective function of the cycle GAN.
  7. The method of claim 6, wherein the objective function is composed of an adversarial loss function and a cycle-consistency loss function of the cycle GAN (Generative Adversarial Network).
  8. The method of claim 1, wherein, in the step (S50) of repeating steps (S10) to (S40), training is adjusted so that the loss converges to 0 and the discriminator's discrimination probability value converges to 50%.
  9. The method of claim 1, further comprising, after the step (S30) of calculating the loss and discrimination probability, a cycle-consistency loss step in which the outputs for the same input RGB image X differ but, when each is later restored to an RGB image (PCTKR2021001803-appb-I000054), the original RGB image X is compared with the restored RGB image (PCTKR2021001803-appb-I000055) so as to guide the generation of depth information and a segmentation image that preserve the shape of the original RGB image X.
  10. The method of claim 9, wherein the cycle-consistency loss step estimates depth information and segmentation image information by feeding back (back-propagation) the generator (PCTKR2021001803-appb-I000056) and the generator (PCTKR2021001803-appb-I000057) produced through steps (S10) to (S30), so that training proceeds according to the minimax results of the generator and the discriminator.
  11. A depth estimation system for estimating depth information of an image using only a single image, the system comprising:
    an image information learning unit that receives an RGB image of a standard database, generates depth information using a generator (PCTKR2021001803-appb-I000058), generates segmentation image information using a generator (PCTKR2021001803-appb-I000059), reconstructs an RGB image using the generated depth information and segmentation image information, and performs training through an objective function of a cycle GAN;
    an operation unit that discriminates and compares the depth information, the segmentation image information, and the reconstructed RGB image generated by the image information learning unit with the standard database, respectively, and calculates a loss and a discrimination probability for each;
    a determination unit that determines, based on the results calculated by the operation unit, whether each loss and each discriminator's discrimination probability value satisfies a preset reference convergence value;
    an image input unit that receives an RGB image; and
    an image information estimation unit that estimates depth information for the RGB image received by the image input unit, using the generator (PCTKR2021001803-appb-I000060) and the generator (PCTKR2021001803-appb-I000061) whose training has been completed in the image information learning unit.
  12. The system of claim 11, wherein the image information learning unit obtains segmentation information for the input RGB image X by means of the generator (PCTKR2021001803-appb-I000062) in order to convert RGB image information into segmentation information and uses that information for depth information estimation, and obtains depth information for the input RGB image X by means of the generator (PCTKR2021001803-appb-I000063) and uses that information for segmentation information estimation.
  13. The system of claim 11, wherein, when the loss and discrimination probability values do not satisfy the preset reference convergence value based on the determination result, the determination unit adjusts training so that each loss and each discriminator's discrimination probability value converge to the preset reference convergence value, and feeds back to the image information learning unit so that retraining is performed.
PCT/KR2021/001803 2020-04-09 2021-02-10 Depth estimation method and system using cycle gan and segmentation WO2021206284A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020200043096A KR102127153B1 (en) 2020-04-09 2020-04-09 Depth estimation method and system using cycle GAN and segmentation
KR10-2020-0043096 2020-04-09

Publications (1)

Publication Number Publication Date
WO2021206284A1 true WO2021206284A1 (en) 2021-10-14

Family

ID=71136727

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2021/001803 WO2021206284A1 (en) 2020-04-09 2021-02-10 Depth estimation method and system using cycle gan and segmentation

Country Status (2)

Country Link
KR (1) KR102127153B1 (en)
WO (1) WO2021206284A1 (en)


Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102127153B1 (en) * 2020-04-09 2020-06-26 한밭대학교 산학협력단 Depth estimation method and system using cycle GAN and segmentation
CN112529978B (en) * 2020-12-07 2022-10-14 四川大学 Man-machine interactive abstract picture generation method
KR102617344B1 (en) * 2020-12-30 2023-12-28 한국기술교육대학교 산학협력단 Depth prediction method based on unsupervised learning and system using the same
CN112767418B (en) * 2021-01-21 2022-10-14 大连理工大学 Mirror image segmentation method based on depth perception
CN113468969B (en) * 2021-06-03 2024-05-14 江苏大学 Aliased electronic component space expression method based on improved monocular depth estimation
KR102477632B1 (en) * 2021-11-12 2022-12-13 프로메디우스 주식회사 Method and apparatus for training image using generative adversarial network
CN115292722B (en) * 2022-10-09 2022-12-27 浙江君同智能科技有限责任公司 Model safety detection method and device based on different color spaces


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102257827B (en) 2008-12-19 2014-10-01 皇家飞利浦电子股份有限公司 Creation of depth maps from images

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200074674A1 (en) * 2018-08-29 2020-03-05 Toyota Jidosha Kabushiki Kaisha Distance Estimation Using Machine Learning
KR102127153B1 (en) * 2020-04-09 2020-06-26 한밭대학교 산학협력단 Depth estimation method and system using cycle GAN and segmentation

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
GWN KIN; REDDY KISHORE; GIERING MICHAEL; BERNAL EDGAR A.: "Generative Adversarial Networks for Depth Map Estimation from RGB Video", 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), IEEE, 18 June 2018 (2018-06-18), pages 1258 - 12588, XP033475459, DOI: 10.1109/CVPRW.2018.00163 *
JAFARI OMID HOSSEINI; GROTH OLIVER; KIRILLOV ALEXANDER; YANG MICHAEL YING; ROTHER CARSTEN: "Analyzing modular CNN architectures for joint depth prediction and semantic segmentation", 2017 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), IEEE, 29 May 2017 (2017-05-29), pages 4620 - 4627, XP033127279, DOI: 10.1109/ICRA.2017.7989537 *
KWAK, DONG-HOON ET AL.: "A Technique for Generating Depth Information of RGB Images using a Learning-based Cycle GAN", INSTITUTE OF KOREAN ELECTRICAL AND ELECTRONICS ENGINEERS SUMMER CONFERENCE 2019, 8 August 2019 (2019-08-08), pages 29 - 32 *
KWAK, DONG-HOON: "A study on depth estimation method using cycle GAN and segmentation", MASTER THESIS, February 2020 (2020-02-01), Korea, pages 1 - 51, XP009531287 *
ZHAO SHANSHAN; FU HUAN; GONG MINGMING; TAO DACHENG: "Geometry-Aware Symmetric Domain Adaptation for Monocular Depth Estimation", 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE, 15 June 2019 (2019-06-15), pages 9780 - 9790, XP033686751, DOI: 10.1109/CVPR.2019.01002 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114240950A (en) * 2021-11-23 2022-03-25 电子科技大学 Brain tumor image generation and segmentation method based on deep neural network
CN114240950B (en) * 2021-11-23 2023-04-07 电子科技大学 Brain tumor image generation and segmentation method based on deep neural network
CN114359361A (en) * 2021-12-28 2022-04-15 Oppo广东移动通信有限公司 Depth estimation method, depth estimation device, electronic equipment and computer-readable storage medium
CN117830340A (en) * 2024-01-04 2024-04-05 中南大学 Ground penetrating radar target feature segmentation method, system, equipment and storage medium

Also Published As

Publication number Publication date
KR102127153B1 (en) 2020-06-26

Similar Documents

Publication Publication Date Title
WO2021206284A1 (en) Depth estimation method and system using cycle gan and segmentation
US20210065391A1 (en) Pseudo rgb-d for self-improving monocular slam and depth prediction
WO2020046066A1 (en) Method for training convolutional neural network to reconstruct an image and system for depth map generation from an image
US7715619B2 (en) Image collation system and image collation method
CN106875437B (en) RGBD three-dimensional reconstruction-oriented key frame extraction method
KR100560464B1 (en) Multi-view display system with viewpoint adaptation
CN110109535A (en) Augmented reality generation method and device
CN113012122A (en) Category-level 6D pose and size estimation method and device
CN109389156B (en) Training method and device of image positioning model and image positioning method
CN113015978B (en) Processing images to locate novel objects
US10755477B2 (en) Real-time face 3D reconstruction system and method on mobile device
CN105894443A (en) Method for splicing videos in real time based on SURF (Speeded UP Robust Features) algorithm
WO2023080266A1 (en) Face converting method and apparatus using deep learning network
CN113643342A (en) Image processing method and device, electronic equipment and storage medium
CN112016612A (en) Monocular depth estimation-based multi-sensor fusion SLAM method
KR20110043967A (en) System and method of camera tracking and live video compositing system using the same
CN113379877A (en) Face video generation method and device, electronic equipment and storage medium
US20240161254A1 (en) Information processing apparatus, information processing method, and program
CN116092178A (en) Gesture recognition and tracking method and system for mobile terminal
CN111754622A (en) Face three-dimensional image generation method and related equipment
Wofk et al. Monocular Visual-Inertial Depth Estimation
CN111582120A (en) Method and terminal device for capturing eyeball activity characteristics
WO2020085541A1 (en) Method and device for processing video
WO2022131793A1 (en) Method and apparatus for recognizing handwriting inputs in multiple-user environment
CN113657190A (en) Driving method of face picture, training method of related model and related device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21784937

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21784937

Country of ref document: EP

Kind code of ref document: A1