KR102364822B1

KR102364822B1 - Method and apparatus for recovering occluded area

Info

Publication number: KR102364822B1
Application number: KR1020200145929A
Authority: KR
Inventors: 박종빈; 박성주; 정종진; 김경원
Original assignee: 한국전자기술연구원
Priority date: 2020-11-04
Filing date: 2020-11-04
Publication date: 2022-02-18
Also published as: WO2022097766A1

Abstract

The present invention relates to a method and apparatus for recovering an occluded area to generate a restoration image suitable for an occluded area. According to one embodiment of the present invention, a recovery method is performed by an electronic device which recovers an object (occluded object) having an occluded area in an image to an object (restoration object) in which the occluded area is restored, wherein restoration is performed by using a restorer which is a model learned with a machine learning technique, to output a restoration object when receiving an occluded object and additional information related to the occluded object. The recovery method comprises the following steps: analyzing the type of an object occluded in the image; checking whether a pre-learned restorer exists based on the analyzed type of object; selecting a corresponding restorer when the restorer exists in the checking step; and determining whether re-learning is required for the selected restorer according to additional information related to the analyzed type of the object and using the selected restorer according to the determined necessity or using a restorer retrained from the selected restorer on the basis of the additional information to generate a restored object.

Description

METHOD AND APPARATUS FOR RECOVERING OCCLUDED AREA

The present invention relates to a method and apparatus for restoring an occluded area and, more particularly, to a method and apparatus for restoring a specific object, such as a person, when that object is occluded in an image captured by a camera.

The problem of restoring an occluded region in an image can be defined as “the problem of estimating the occluded image region from the image regions that are not occluded.”

Conventionally, to restore a region of an image occluded by a person or an object, methods have typically relied on the spatial correlation and feature information that exist among the pixels (picture elements) of the image.

For example, interpolation has been used to estimate occluded pixels from spatial correlation. That is, given pixels from the unoccluded region, the distribution of the surrounding pixels is assumed to follow a functional form such as a line, a curve, a plane, or a curved surface, and the occluded pixels are estimated by interpolation. This conventional technique is widely used because it is simple and easy to implement, but when the occluded region correlates poorly with its surroundings or is very large, the restoration result is more likely to be wrong. In particular, the exact functional form (e.g., line, curve, plane, or curved surface) cannot be determined from the known pixels alone and can only be predicted under some assumption; moreover, the detailed settings amount to hyperparameters that must be estimated in addition.
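As an illustration of the interpolation approach described above, the following minimal sketch fills occluded pixels in a single image row under the assumption that intensity varies linearly between the nearest known neighbors. The function name `interpolate_row` and the use of `None` to mark occluded pixels are hypothetical choices for this example and do not come from the patent.

```python
from typing import List, Optional


def interpolate_row(row: List[Optional[float]]) -> List[float]:
    """Fill occluded pixels (None) by linear interpolation along one row."""
    known = [i for i, v in enumerate(row) if v is not None]
    if not known:
        raise ValueError("row has no known pixels to interpolate from")
    filled = list(row)
    for i, value in enumerate(row):
        if value is not None:
            continue
        # Nearest known pixel on each side of the occluded position.
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = row[right]   # no left neighbor: copy the right one
        elif right is None:
            filled[i] = row[left]    # no right neighbor: copy the left one
        else:
            # Assumed functional form: a straight line between the neighbors.
            filled[i] = row[left] + (row[right] - row[left]) * (i - left) / (right - left)
    return filled
```

For instance, `interpolate_row([10.0, None, None, 40.0])` returns `[10.0, 20.0, 30.0, 40.0]`. If the true hidden values were `90.0` and `90.0`, the straight-line assumption would be far off, which is the weakness the paragraph above points out for regions that correlate poorly with their surroundings.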

A more recent approach to this problem uses artificial neural networks. That is, a neural network is trained on an acquired training data set so that the spatial correlation and feature information of images can be exploited. However, with conventional neural network techniques, the trained network frequently overfits the training data set.

In particular, additional information (side information) about the occluded object may become available at restoration time. However, conventional neural network techniques cannot anticipate such additional information during training, so the network is trained without it and cannot use it during restoration, which lowers both the usefulness of the technique and the accuracy of the restored object.

KR 10-2006-0112666 A

To solve the problems of the prior art described above, an object of the present invention is to provide a method and apparatus for restoring an occluded area that use additional information (side information) in the process of restoring an occluded area in an image.

However, the problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention pertains.

A restoration method according to an embodiment of the present invention for solving the above problems is performed by an electronic device that restores an object having an occluded area in an image (an occluded object) into an object in which the occluded area is restored (a restored object), using a restorer, which is a model trained by a machine learning technique to output a restored object when it receives an occluded object and additional information related to that object. The method includes: analyzing the type of the occluded object in the image; checking, based on the analyzed object type, whether a pre-trained restorer exists; selecting the corresponding restorer when the checking step finds that one exists; and determining, according to additional information related to the analyzed object type, whether the selected restorer needs to be retrained, and generating a restored object using, according to that determination, either the selected restorer as is or a restorer obtained by retraining the selected restorer based on the additional information.

The restoration method according to an embodiment of the present invention may further include, when no restorer exists in the selecting step, collecting objects of the analyzed type and generating a restored object using a restorer newly trained on the collected objects and the additional information related to them.
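The control flow of the steps described above (analyze the object type, look up a pre-trained restorer, decide whether to retrain, and fall back to new training when no restorer exists) can be sketched as follows. This is only an illustrative skeleton: `RestorerEntry`, `needs_retraining`, `restore`, and the callbacks passed in are hypothetical names, and the retraining criterion shown (the additional information contains a kind that the restorer was not trained with) is an assumption, since this passage does not specify how the need for retraining is decided.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set

# Hypothetical type: a restorer maps (occluded object, additional info)
# to a restored object. Strings stand in for image data here.
Restorer = Callable[[str, dict], str]


@dataclass
class RestorerEntry:
    restorer: Restorer
    known_info_keys: Set[str]  # kinds of additional information used in training


def needs_retraining(entry: RestorerEntry, info: dict) -> bool:
    # Assumed criterion (not from the source): retrain when the supplied
    # additional information has a kind the restorer has not been trained with.
    return not set(info) <= entry.known_info_keys


def restore(
    occluded: str,
    info: dict,
    registry: Dict[str, RestorerEntry],
    analyze_type: Callable[[str], str],
    retrain: Callable[[RestorerEntry, dict], RestorerEntry],
    train_new: Callable[[str, dict], RestorerEntry],
) -> str:
    obj_type = analyze_type(occluded)                         # analyze the type
    entry: Optional[RestorerEntry] = registry.get(obj_type)   # check existence
    if entry is None:
        entry = train_new(obj_type, info)                     # no restorer: train a new one
    elif needs_retraining(entry, info):
        entry = retrain(entry, info)                          # restorer exists but is retrained
    registry[obj_type] = entry                                # keep the restorer for reuse
    return entry.restorer(occluded, info)                     # generate the restored object
```

Keeping one registry entry per object type mirrors the idea of selecting a restorer by the analyzed type; storing the retrained restorer back into the registry is a design choice of this sketch, not something the text requires.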

The restoration method according to an embodiment of the present invention may further include setting a region in the image that contains the occluded object, and the analyzing step may perform the analysis on the occluded object within that restoration region.

The retrained restorer may be retrained using a first generator, which is trained by a machine learning technique to output, when an unoccluded object and additional information are input, an occluded object related to that additional information.

The generating step may include: when the selected restorer does not need retraining, generating a restored object by inputting the occluded object and its additional information into the selected restorer; and when the selected restorer needs retraining, collecting images related to unoccluded objects of the analyzed type, inputting the unoccluded objects of the collected images and their additional information into the first generator to generate occluded objects for the collected images, retraining the selected restorer using the occluded objects generated by the first generator and their additional information, and then generating a restored object by inputting the occluded object and its additional information into the retrained restorer.
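The retraining branch described above amounts to synthesizing training data: unoccluded objects are passed through the first generator together with the additional information to produce occluded counterparts, which are then used to retrain the restorer. The sketch below shows only that data-building step. `TrainingPair`, `build_retraining_pairs`, and `toy_first_generator` are hypothetical names; the toy generator simply zeroes the positions listed in `info["mask"]` and stands in for a trained model. Using the original unoccluded object as the training target is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List

Pixels = List[int]  # toy stand-in for an object image


@dataclass
class TrainingPair:
    occluded: Pixels  # restorer input, synthesized by the first generator
    info: dict        # additional information supplied with the input
    target: Pixels    # assumed training target: the original unoccluded object


def toy_first_generator(clean: Pixels, info: dict) -> Pixels:
    # Stand-in for the trained first generator: hide the positions in info["mask"].
    hidden = set(info["mask"])
    return [0 if i in hidden else p for i, p in enumerate(clean)]


def build_retraining_pairs(
    clean_objects: List[Pixels],
    info: dict,
    first_generator: Callable[[Pixels, dict], Pixels],
) -> List[TrainingPair]:
    # One synthesized (occluded, info, target) pair per collected unoccluded object.
    return [TrainingPair(first_generator(c, info), info, c) for c in clean_objects]
```

The resulting pairs would then be fed to whatever training routine updates the selected restorer; that routine is outside the scope of this sketch.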

The restorer may include a second generator that generates a restored object from an occluded object and a discriminator that judges the restored object generated by the second generator, and the second generator and the discriminator may be trained adversarially on the basis of a generative adversarial network (GAN).
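For orientation, adversarial training of a generator and a discriminator is commonly written as a minimax objective. The form below is the standard conditional GAN objective adapted to the notation of this description, not a formula given in the patent: $G$ denotes the second generator, $D$ the discriminator, $x$ an unoccluded object, $y$ an occluded object, and $s$ the additional information. Whether the discriminator also receives $s$ is an assumption made here.

```latex
\[
\min_{G}\,\max_{D}\;
\mathbb{E}_{(x,s)}\bigl[\log D(x, s)\bigr]
+ \mathbb{E}_{(y,s)}\bigl[\log\bigl(1 - D(G(y, s), s)\bigr)\bigr]
\]
```

The discriminator is rewarded for scoring real unoccluded objects high and restored objects low, while the generator is rewarded for producing restored objects that the discriminator scores high.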

The second generator may be implemented based on an auto-encoder, a variational auto-encoder, a U-Net generator, a ResNet generator, or a generative adversarial network.

The additional information may be provided through analysis of the region of the occluded object other than the occluded area.

The additional information may be provided through analysis of another image captured at the same location as the image.

The additional information may be entered or selected by a user.

The additional information may include attribute information about the analyzed object type, information about a plurality of sub-objects included in objects of the analyzed type, or three-dimensional information about the analyzed object type.

A restoration apparatus according to an embodiment of the present invention restores an object having an occluded area in an image (an occluded object) into an object in which the occluded area is restored (a restored object), and includes: a memory storing a restorer, which is a model trained by a machine learning technique to output a restored object when it receives an occluded object and additional information related to that object; and a control unit that controls generation of a restored object using the restorer stored in the memory.

A restoration apparatus according to another embodiment of the present invention restores an object having an occluded area in an image (an occluded object) into an object in which the occluded area is restored (a restored object), and includes: a communication unit that receives a restorer, which is a model trained by a machine learning technique to output a restored object when it receives an occluded object and additional information related to that object; and a control unit that controls generation of a restored object using the restorer received by the communication unit.

The control unit may analyze the type of the occluded object in the image, check, based on the analyzed object type, whether a pre-trained restorer exists, and select the corresponding restorer when one exists; it may then determine, according to additional information related to the analyzed object type, whether the selected restorer needs retraining, and control generation of a restored object using, according to that determination, either the selected restorer as is or a restorer obtained by retraining the selected restorer based on the additional information.

The present invention configured as described above restores an occluded region of an image using additional information (side information) at restoration time, and thus has the advantage of generating a restored image better suited to the occluded region.

In addition, the present invention can augment the training data using the available additional information and, at the same time, retrain a pre-trained model by modifying its input data, which has the advantage of improving restoration performance.

In addition, because the present invention generates a restored image according to the additional information at restoration time, it is more broadly useful and can be applied wherever an object has an occluded area that needs to be restored, in application fields such as CCTV control rooms, person matching, post-incident video restoration, biometric authentication on mobile terminals, and access control at building entrances.

The effects obtainable from the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention pertains.

FIG. 1 is a conceptual diagram of occluded-area restoration according to the present invention.
FIG. 2 shows an example of restoring an occluded object using ideal additional information.
FIG. 3 shows an example of restoring an occluded object using realistic additional information.
FIG. 4 is a block diagram of a restoration apparatus 100 according to an embodiment of the present invention.
FIG. 5 is a block diagram of the control unit 150 in the restoration apparatus 100 according to an embodiment of the present invention.
FIG. 6 is a flowchart of a restoration method according to an embodiment of the present invention.
FIG. 7 shows a specific example of FIG. 6 (i.e., an example of a case in which a pre-trained restorer needs retraining).
FIG. 8 is a conceptual diagram of training a restorer for each of various objects.
FIG. 9 shows the process of restoring an occluded object in terms of inputs and outputs.
FIG. 10 shows an example of a specific training process for a restorer.
FIG. 11 shows an example of providing additional information.
FIG. 12 shows an example of the training process of the second generator and the discriminator.

The above objects and means of the present invention, and the effects thereof, will become more apparent from the following detailed description taken in conjunction with the accompanying drawings, so that those of ordinary skill in the art to which the present invention pertains will be able to readily practice its technical idea. In describing the present invention, detailed descriptions of related known technologies are omitted where they would unnecessarily obscure the gist of the present invention.

The terminology used herein is for describing the embodiments and is not intended to limit the present invention. In this specification, the singular form also includes the plural form unless the context specifically states otherwise. Terms such as “include,” “comprise,” “provide,” or “have” do not exclude the presence or addition of one or more components other than those mentioned.

In this specification, terms such as “or” and “at least one” may denote one of the words listed together or a combination of two or more of them. For example, “A or B” and “at least one of A and B” may include only one of A or B, or both A and B.

In this specification, information presented in descriptions following “for example” and the like, such as recited characteristics, variables, or values, may not match exactly, and embodiments of the present invention should not be limited by effects such as variations arising from tolerances, measurement errors, limits of measurement accuracy, and other commonly known factors.

In this specification, when a component is described as being 'connected' or 'coupled' to another component, it may be directly connected or coupled to that other component, but it should be understood that other components may exist in between. In contrast, when a component is described as being 'directly connected' or 'directly coupled' to another component, it should be understood that no other component exists in between.

In this specification, when a component is described as being 'on' or 'in contact with' another component, it may be directly in contact with or connected to that other component, but it should be understood that another component may exist in between. In contrast, when a component is described as being 'directly on' or 'in direct contact with' another component, it may be understood that no other component exists in between. Other expressions describing relationships between components, such as 'between' and 'directly between', are to be interpreted in the same way.

In this specification, terms such as 'first' and 'second' may be used to describe various components, but the components should not be limited by these terms. Nor should these terms be construed as limiting the order of the components; they are used to distinguish one component from another. For example, a 'first component' may be named a 'second component', and similarly a 'second component' may be named a 'first component'.

Unless otherwise defined, all terms used herein have the meanings commonly understood by those of ordinary skill in the art to which the present invention pertains. In addition, terms defined in commonly used dictionaries are not to be interpreted in an idealized or excessive sense unless they are explicitly and specifically defined.

Hereinafter, a preferred embodiment of the present invention is described in detail with reference to the accompanying drawings.

FIG. 1 is a conceptual diagram of occluded-area restoration according to the present invention.

As shown in FIG. 1, the present invention relates to a technique for restoring an object having an occluded area in a camera-captured image (i.e., an occluded object) into an object in which the occluded area is restored (i.e., a restored object), and can be used in video surveillance systems and the like. Here, the restorer (i.e., the occluded object restorer) is a model trained by a machine learning technique and can restore an occluded object into a restored object using additional information.

A video surveillance system may be a system for protecting life and property, for purposes such as crime prevention and disaster monitoring, based on analysis of images collected from cameras such as CCTV. For example, the video surveillance system may recognize monitored targets such as people and vehicles in the collected images, analyze their behavior patterns, and, when an abnormal event occurs (e.g., intrusion, fire, violence, theft, or illegal dumping), deliver information about it to a monitoring operator through various warning devices, but it is not limited thereto.

In FIG. 1, applying the restorer to an image of an occluded object yields an image in which the occluded part is restored (the occlusion is removed). Using the available additional information makes a better restoration possible. In the example of FIG. 1, information about the contour of the face and the positions of the eyes, nose, and mouth is input as additional information, so the occluded area can be restored accurately.

The reason additional information is needed to restore an occluded object can be explained using the example of FIG. 1. About 60 to 90 key feature points can be extracted from a face, and when the face is covered by a mask or sunglasses, some of those feature points disappear. The missing feature points could be restored from the surrounding ones, but this typically amounts to a case in which the number of unknowns exceeds the number of equations. The same problem arises in most cases of restoring occluded objects other than faces, and it is also called an “ill-posed problem.”
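The remark about having more unknowns than equations can be made concrete with a linear model, which is used here purely as an illustrative assumption; the patent does not state that the restoration problem is linear. Let $x \in \mathbb{R}^{n}$ collect the unknown quantities to be restored, and let the visible feature points supply $m < n$ constraints $A x = b$. Assuming at least one solution $x_p$ exists, the solution set is

```latex
\[
A\,x = b, \qquad A \in \mathbb{R}^{m \times n}, \quad m < n
\]
\[
\{\, x : A x = b \,\} = \{\, x_p + z : z \in \ker A \,\},
\qquad \dim \ker A \;\ge\; n - m \;>\; 0
\]
\[
\begin{bmatrix} A \\ C \end{bmatrix} x = \begin{bmatrix} b \\ d \end{bmatrix}
\]
```

The second line shows that infinitely many restorations are consistent with the visible data. The third line shows the role additional information can play: it contributes extra constraints $C x = d$ (a hypothetical notation), which shrink the set of admissible solutions.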

This problem is very hard to solve. Moreover, care is needed when applying restoration technology to a particular application. For example, artificial neural network technology (hereinafter, “general artificial neural network technology”) can be trained on many faces to restore an occluded face or to generate an entirely new one. However, a face generated this way is a restoration of a “generic human face,” not of “a specific person's face.”

Accordingly, the present invention proposes using additional information to solve this problem. That is, to restore an occluded object, additional information about the specific object is provided so that the occluded area is restored while the characteristics of that specific object are preserved.

FIG. 2 shows an example of restoring an occluded object using ideal additional information, and FIG. 3 shows an example of restoring an occluded object using realistic additional information.

Referring to FIGS. 2 and 3, the restorer that restores an occluded object (i.e., the occluded object restorer) can generate a restored object using various kinds of additional information. For example, the additional information may be a feature such as the outline of the object, or information such as the color or material of its surface.

FIG. 2 shows a case in which the quality of the additional information is ideal; such information can usually be obtained only when original data or ground truth data is available. In reality, as in FIG. 3, the additional information provided may differ from the ideal. That is, FIG. 3 shows a case in which the surface material and shape of the restored object differ. Nevertheless, from the standpoint of using the restored result, it is known that the result may have been influenced by the additional information provided for restoration, so the result can be more useful in a given application than that of the prior art (i.e., a general artificial neural network, which generates its result without an explicit basis).

Meanwhile, the difficulties in restoring an occluded object using additional information are that the types of objects to be restored are diverse and that the types of available additional information are not fixed and may vary with the situation. To address this, the present invention configures a restorer for the occluded object at the time restoration is performed (the restoration time) and newly trains or retrains it, so that the input occluded object can be restored. That is, at restoration time, if a pre-trained restorer related to the additional information exists, the present invention uses that restorer as is or retrains it based on the additional information before use; if no such restorer exists, it trains a new restorer based on the additional information.

Accordingly, the features of the present invention can be summarized in two points. (1) Additional information related to the occluded object can be used at the time the occluded area is restored. (2) The available additional information can be used to increase the amount of training data, and a pre-trained restorer can be retrained with modified input data to improve its restoration performance.

FIG. 4 is a block diagram of a restoration apparatus 100 according to an embodiment of the present invention.

The restoration apparatus 100 according to an embodiment of the present invention restores an occluded area in an image captured by a camera, such as a camera of a video surveillance system (that is, it restores an occluded object to a restored object). It may be an electronic device capable of computing or a computing network.

For example, the electronic device may be a desktop personal computer (PC), a laptop PC, a tablet PC, a netbook computer, a workstation, a personal digital assistant (PDA), a smartphone, a smartpad, or a mobile phone, but is not limited thereto.

As shown in FIG. 4, the restoration apparatus 100 may include an input unit 110, a communication unit 120, a display 130, a memory 140, and a control unit 150.

The input unit 110 generates input data in response to inputs from various users (such as a monitoring operator) and may include various input means. For example, the input unit 110 may include a keyboard, a keypad, a dome switch, a touch panel, a touch key, a touch pad, a mouse, a menu button, and the like, but is not limited thereto.

The communication unit 120 communicates with other devices. For example, the communication unit 120 may perform wireless communication such as 5th generation communication (5G), long term evolution-advanced (LTE-A), long term evolution (LTE), Bluetooth, Bluetooth Low Energy (BLE), near field communication (NFC), or Wi-Fi, or wired communication such as cable communication, but is not limited thereto. For instance, the communication unit 120 may receive an image, a generator, a restorer, and the like from another device, and may transmit a restoration result (image), a generator, a restorer, and the like to another device. A generator or restorer may be received from another device when, in the restoration method described later, the training of that generator or restorer is performed on the other device.

The display 130 displays various image data on a screen and may be configured as a non-emissive panel or an emissive panel. For example, the display 130 may include a liquid crystal display (LCD), a light emitting diode (LED) display, an organic LED (OLED) display, a micro electro mechanical systems (MEMS) display, or an electronic paper display, but is not limited thereto. The display 130 may also be combined with the input unit 110 and implemented as a touch screen or the like.

The memory 140 stores various kinds of information required for the operation of the restoration apparatus 100. The stored information may include images, generators, restorers, and program information related to the restoration method described later, but is not limited thereto. Depending on its type, the memory 140 may include, for example, a hard disk type, a magnetic media type, a compact disc read only memory (CD-ROM), an optical media type, a magneto-optical media type, a multimedia card micro type, a flash memory type, a read only memory (ROM) type, or a random access memory (RAM) type, but is not limited thereto. Depending on its use or location, the memory 140 may be a cache, a buffer, a main memory, an auxiliary memory, or a separately provided storage system, but is not limited thereto.

The control unit 150 may perform various control operations of the restoration apparatus 100. That is, the control unit 150 may control the execution of the restoration method described later and may control the operation of the remaining components of the restoration apparatus 100, namely the input unit 110, the communication unit 120, the display 130, and the memory 140. For example, the control unit 150 may include a processor, which is hardware, or a process, which is software executed on that processor, but is not limited thereto.

FIG. 5 is a block diagram of the control unit 150 in the restoration apparatus 100 according to an embodiment of the present invention.

The control unit 150 controls the execution of the restoration method according to an embodiment of the present invention and, as shown in FIG. 5, may include an image analysis unit 151, a restorer selection unit 152, a learning unit 153, an additional information providing unit 154, and a restoration unit 155. For example, the image analysis unit 151, the restorer selection unit 152, the learning unit 153, the additional information providing unit 154, and the restoration unit 155 may be hardware components of the control unit 150 or processes, that is, software executed by the control unit 150, but are not limited thereto.

Meanwhile, the restorer may be a model trained by a machine learning technique on training data consisting of pairs of input data and output data (a data set). The restorer includes a plurality of layers and embodies a function that describes the relationship between the input data and the output data. Accordingly, when input data is fed to the restorer, output data determined by that function is produced.

Here, the input data may include an image of the occluded object and additional information related to that object, and the output data may include an image of the restored object. The restorer may therefore be a model trained by a machine learning technique to output a restored object when it receives an occluded object and additional information related to that object.

For example, the machine learning technique may include a deep learning technique, but is not limited thereto. That is, the restorer may be a deep learning model trained on the training data described above. In this case, the restorer represents the relationship between the input data and the output data with multiple layers, and this set of layers is also referred to as an “artificial neural network”. Each layer in the neural network consists of at least one filter, and each filter has a matrix of weights. That is, each element (pixel) of a filter's matrix corresponds to a weight value.
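As an illustration of what "a filter with a matrix of weights" can mean, the following is a minimal sketch of applying one filter to a small image by sliding the weight matrix over it, as a convolutional layer would. The function name `apply_filter` and the example values are assumptions introduced here, and the description does not state that the restorer uses this particular operation.

```python
from typing import List

Matrix = List[List[float]]


def apply_filter(image: Matrix, weights: Matrix) -> Matrix:
    """Slide a weight matrix over the image (no padding, stride 1).

    Each output element is the weighted sum of the image patch
    currently covered by the filter.
    """
    kh, kw = len(weights), len(weights[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                weights[i][j] * image[r + i][c + j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 3x3 image and 2x2 filter; each weight is one matrix element.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
weights = [[1, 0], [0, -1]]
response = apply_filter(image, weights)
```

Training a layer of this kind amounts to adjusting the values in `weights`; a real restorer would stack many such filters and layers.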

For example, the deep learning technique may include a Deep Neural Network (DNN), a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), a Restricted Boltzmann Machine (RBM), a Deep Belief Network (DBN), Deep Q-Networks, an auto-encoder, a variational auto-encoder, and the like, but is not limited thereto.

Hereinafter, the restoration method according to the present invention is described in more detail.

FIG. 6 is a flowchart of a restoration method according to an embodiment of the present invention, and FIG. 7 shows a specific example of FIG. 6 (that is, an example of the case in which a pre-trained restorer needs to be retrained).

The restoration method according to an embodiment of the present invention sets the area to be restored and analyzes the object; if the object is one that has already been learned, the corresponding occluded-object restorer is selected and restoration is performed. If the object has not been learned, image data associated with the occluded object is additionally collected, new training is performed on the collected object, and once training is complete the newly trained restorer is stored (registered). In addition, for a restorer selected from among the pre-trained restorers, if restoration is possible using the provided additional information, the occluded object is restored with the selected restorer; if the provided additional information cannot be used, the selected restorer is retrained on the provided additional information so that more accurate restoration becomes possible.

That is, referring to FIG. 6, the restoration method according to an embodiment of the present invention may include steps S201 to S212.

First, the image analysis unit 151 sets an area of the image that contains an occluded object (that is, the area to be restored) (S201) and analyzes the type of the occluded object within the restoration area (S202).

In S201, the area to be restored may be set manually or automatically. When the area is set manually, the user selects the area requiring restoration (that is, the area containing the occluded object) through the input unit 110. When the area is set automatically, various object detection techniques may be used. For example, the object detection techniques may include a Haar detector, Histogram of Oriented Gradient (HOG), Scale Invariant Feature Transform (SIFT), Local Binary Pattern (LBP), Modified Census Transform (MCT), Support Vector Machine (SVM), Adaboost, and the like, and methods based on deep learning such as YOLO may also be used, but the techniques are not limited thereto.

Although FIG. 7 shows the area to be restored as a rectangle, the area is not limited thereto and may of course be selected as an arbitrary polygon. For restoration, however, an image containing the occluded area must be input to a restorer such as an artificial neural network, and in this case resizing, cropping, blank filling, and the like may be performed to fit the structure of that restorer. For example, because an artificial neural network preferably takes its input in the form of a matrix (whose numbers of rows and columns may differ) or a tensor, which is a set of such matrices, a rectangular area may be resized or cropped to fit the network. When the area to be restored is selected as any other polygon, a bounding box enclosing the polygon may be applied, and resizing or cropping may then be performed on that box.
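The region handling described above can be sketched as follows, under the assumption that the image is a grayscale array stored as nested lists and that the restorer expects a fixed input size. The helper names `bounding_box`, `crop`, and `resize_nearest` are hypothetical, and nearest-neighbour resizing is only one of several possible resizing methods.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def bounding_box(polygon: List[Tuple[int, int]]) -> Box:
    """Return the axis-aligned box enclosing all polygon vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def crop(image: Image, box: Box) -> Image:
    """Cut out the box, clamped to the image borders."""
    x0, y0, x1, y1 = box
    height, width = len(image), len(image[0])
    x0, y0 = max(x0, 0), max(y0, 0)
    x1, y1 = min(x1, width - 1), min(y1, height - 1)
    return [row[x0:x1 + 1] for row in image[y0:y1 + 1]]


def resize_nearest(image: Image, out_h: int, out_w: int) -> Image:
    """Nearest-neighbour resize to the size the restorer expects."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# Hypothetical 6x8 image and a triangular region selected by the user.
image = [[10 * r + c for c in range(8)] for r in range(6)]
triangle = [(2, 1), (6, 2), (3, 4)]
box = bounding_box(triangle)
patch = resize_nearest(crop(image, box), 4, 4)
```

Pixels inside the bounding box but outside the polygon are kept here; blank filling of those pixels, which the description also mentions, is omitted from this sketch.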

In S202, the area set in the preceding step and the associated image patch are analyzed to determine what type of object the occluded object within that area is.

In S203, the restorer selection unit 152 checks whether a pre-trained restorer exists for the analyzed object type. If a pre-trained restorer exists, S207 is performed; if not, S204 is performed. That is, if the analyzed occluded object corresponds to a previously learned object, the restorer, which is the model for restoring that object, is already stored (registered) in the internal memory 140, an external database, an external file system, or the like (hereinafter referred to as the “restorer storage DB”).

Specifically, if no pre-trained restorer exists for the analyzed object type, the learning unit 153 collects images related to the object type analyzed in S202 (S204) and newly trains a restorer on the collected object images and the additional information related to that object (S205) (hereinafter, this restorer is referred to as the “newly trained restorer”). In S205, the learning unit 153 also stores the newly trained restorer in the restorer storage DB.

The restoration unit 155 then inputs the image corresponding to the restoration area of S201 (that is, the image containing the occluded object) and its additional information to the newly trained restorer to restore the occluded object (S206). That is, a restored object is generated. The additional information may be provided separately through the additional information providing unit 154.

On the other hand, if a pre-trained restorer exists for the analyzed object type, the restorer selection unit 152 selects that pre-trained restorer (S207) (hereinafter referred to as the “selected restorer”), receives additional information through the additional information providing unit 154 (S208), and determines whether the selected restorer needs to be retrained by examining whether the provided additional information is related to the selected restorer (S209).

That is, if the provided additional information matches the additional information included in the input data of the training data used when the selected restorer was previously trained (that is, if retraining of the selected restorer is unnecessary), the restoration unit 155 inputs the image corresponding to the restoration area of S201 (the image containing the occluded object) and its additional information (the additional information provided in S208) to the selected restorer to restore the occluded object (S210). In other words, if the selected restorer can restore the occluded object while taking the provided additional information into account, the occluded object is restored with the selected restorer without any further procedure.

Conversely, if the provided additional information does not match the additional information included in the input data of the training data used when the selected restorer was previously trained (that is, if retraining of the selected restorer is necessary), the learning unit 153 retrains the selected restorer on the additional information provided in S208 (S211) (hereinafter referred to as the “retrained restorer”). In S211, the learning unit 153 also stores the retrained restorer in the restorer storage DB. The restoration unit 155 then inputs the image corresponding to the restoration area of S201 (the image containing the occluded object) and its additional information (the additional information provided in S208) to the retrained restorer to restore the occluded object (S212).

For instance, in S211, the learning unit 153 may retrain the selected restorer on the object images collected when the selected restorer was newly trained, together with the additional information provided in S208 (that is, the updated additional information rather than the earlier additional information used when the selected restorer was newly trained). As a result, in S212, the retrained restorer can generate a restored object that better reflects the updated additional information.
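The branching among S203 to S212 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `Restorer` class, the `registry` dictionary standing in for the restorer storage DB, and the rule that additional information "matches" when every provided key was also present during training are all assumptions introduced here. Training and restoration are replaced by placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Restorer:
    """Hypothetical stand-in for a trained restoration model."""
    object_type: str
    trained_info_keys: Set[str]
    history: List[str] = field(default_factory=list)

    def restore(self, occluded_image: object, info: Dict[str, object]) -> str:
        # Placeholder for running the model on the image and its info.
        return f"restored {self.object_type} using {sorted(info)}"


def needs_retraining(restorer: Restorer, info: Dict[str, object]) -> bool:
    """S209 (assumed rule): retrain if any provided key was unseen in training."""
    return not set(info) <= restorer.trained_info_keys


def recover(object_type: str, occluded_image: object,
            info: Dict[str, object], registry: Dict[str, Restorer]) -> str:
    restorer = registry.get(object_type)               # S203: look up
    if restorer is None:
        # S204-S205: collect images, train a new restorer, register it.
        restorer = Restorer(object_type, set(info), ["trained"])
        registry[object_type] = restorer
    elif needs_retraining(restorer, info):             # S207-S209
        # S211: retrain on the updated additional information.
        restorer.trained_info_keys |= set(info)
        restorer.history.append("retrained")
    return restorer.restore(occluded_image, info)      # S206 / S210 / S212


registry: Dict[str, Restorer] = {}
recover("face", None, {"skin_color": "light"}, registry)               # new training
recover("face", None, {"skin_color": "dark"}, registry)                # used as is
recover("face", None, {"skin_color": "dark", "age": "30s"}, registry)  # retraining
```

Keeping the lookup, the retraining decision, and the restoration call separate mirrors the division of work among the restorer selection unit 152, the learning unit 153, and the restoration unit 155.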

Referring to FIG. 7 (an example of the case in which the selected restorer needs to be retrained), in an image containing multiple objects, the area containing object type 1, which has an occluded area, is set as the area to be restored, and the type of object type 1 is then analyzed. Next, the existing pre-trained restorer for object type 1 is selected and additional information about object type 1 is provided. The restorer is then retrained on the provided additional information, and the retrained restorer is used to generate a restored object for object type 1.

Meanwhile, the additional information may differ from one occluded object to another, and even for the same object only part of the information may be available. Preferably, the additional information is divided into attributes, features (sub-objects), and three-dimensional information. A feature is information about a smaller sub-object contained within the object, and an attribute is information, other than the features of such sub-objects, that distinguishes the object. The three-dimensional information may be depth information corresponding to each pixel of the two-dimensional image.

As one example, if the occluded object is a “human face”, information such as skin color, eye color, hair color, gender, age, height, and weight may correspond to attribute information. Key points that characterize a human face, such as the eyes, nose, lips, ears, forehead, and jawline, may correspond to feature information. Distance information from the camera to specific positions on the face may correspond to three-dimensional information.

As another example, if the object is a “car”, the color, vehicle type (for example, sedan, truck, or bus), manufacturer, and brand name may correspond to attribute information. Information such as the set of points representing the outline, the headlamp positions, and the side mirror positions may correspond to feature information. Distance information from the camera to specific positions on the car may correspond to three-dimensional information.
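One way to hold the three kinds of additional information is a small record type in which every part is optional, since only some of the information may be available for a given object. The class name `AdditionalInfo`, its field names, and the example values below are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class AdditionalInfo:
    """Hypothetical container for the three kinds of additional information."""
    # Attributes: properties that distinguish the object as a whole.
    attributes: Dict[str, str] = field(default_factory=dict)
    # Features: named sub-objects with their (x, y) pixel positions.
    features: Dict[str, Tuple[int, int]] = field(default_factory=dict)
    # Three-dimensional information: one depth value per pixel, if known.
    depth: Optional[List[List[float]]] = None

    def available_kinds(self) -> List[str]:
        """List which kinds of information are actually present."""
        kinds = []
        if self.attributes:
            kinds.append("attributes")
        if self.features:
            kinds.append("features")
        if self.depth is not None:
            kinds.append("depth")
        return kinds


face_info = AdditionalInfo(
    attributes={"hair_color": "black", "age": "30s"},
    features={"left_eye": (42, 30), "right_eye": (70, 31)},
)
car_info = AdditionalInfo(
    attributes={"color": "white", "vehicle_type": "sedan"},
    depth=[[12.5, 12.4], [12.6, 12.5]],
)
```

Here `face_info` carries attributes and features but no depth, while `car_info` carries attributes and depth but no features, reflecting that the available information varies by object.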

For example, the present invention can analyze the type of an occluded object, such as a “human face”, a “human whole body”, or a “car”, and then load the restorer that restores that type of object when it is occluded. The present invention places no particular limitation on the size or shape of the object. Accordingly, a new restorer can of course be created that specializes in restoring only, for example, the pupils, lips, or cheeks of a “human face”. In short, when an object is to be restored and it is an object that has already been learned, the corresponding model is retrieved and used for restoration as is or after retraining; if it has not been learned, a restorer model for the object can be newly trained through a predetermined method.

FIG. 8 is a conceptual diagram of training a restorer for each of various objects.

Referring to FIG. 8, training data can be constructed from images of various objects, either already available or acquired incrementally by a method such as web crawling, and used to train a restorer for each object. For instance, an object detector may be used to collect and classify images of various objects from the acquired images, and the classified object images may be used to newly train or retrain a restorer for each object. Such training may be performed when the method branches to S204 because the object was found in S203 not to be a previously learned object, or when it branches to S211 because retraining was found in S209 to be necessary.

The difference between the terms “new training” and “retraining” is as follows. In the present invention, the restorer must be capable of both classification (discrimination) and data generation through learning, so an approach based on an artificial neural network may be preferable. “New training” therefore means training from scratch on the given training data, starting from a state in which no training has been performed. In contrast, “retraining” means that the artificial neural network has already been trained and a model describing it has been obtained, and that the internal state of this pre-trained network is further improved (updated) using additional data (such as the provided additional information).
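The distinction can be illustrated with a deliberately tiny model: a single weight `w` fitted by stochastic gradient descent so that `w * x` approximates `y`. This one-parameter model is an assumption chosen only to keep the sketch short and is not the restorer itself. New training starts from an untrained weight, while retraining starts from the weight obtained earlier and continues on the additional data.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (input x, target y)


def train(data: List[Sample], weight: Optional[float] = None,
          lr: float = 0.05, epochs: int = 200) -> float:
    """Fit y = w * x by stochastic gradient descent on squared error.

    weight=None means new training (start from an untrained state);
    passing an existing weight means retraining (continue from it).
    """
    w = 0.0 if weight is None else weight
    for _ in range(epochs):
        for x, y in data:
            grad = 2.0 * (w * x - y) * x  # derivative of (w*x - y)^2
            w -= lr * grad
    return w


original_data = [(1.0, 2.0), (2.0, 4.0)]  # best fit: w = 2.0
updated_data = [(1.0, 2.5), (2.0, 5.0)]   # best fit: w = 2.5

w_new = train(original_data)                               # new training
w_retrained = train(updated_data, weight=w_new, epochs=20)  # retraining

# With the same small budget, continuing from the earlier weight ends
# closer to the new optimum than starting again from scratch.
err_retrain = abs(train(updated_data, weight=w_new, epochs=3) - 2.5)
err_scratch = abs(train(updated_data, epochs=3) - 2.5)
```

The comparison at the end shows the practical motivation for retraining: when the updated data is related to what the model already learned, fewer update steps are needed.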

FIG. 9 shows the process of restoring an occluded object from the standpoint of inputs and outputs.

Referring to FIG. 9, the restorer receives an image of the occluded object (for restoration) and the additional information provided by the additional information providing unit 154, and outputs a restored object. This is possible because the restorer is trained on training data whose input data consists of occluded object images and their additional information and whose output data consists of the restored images. In particular, as described above, the present invention adds a step of identifying the object in the occluded object image and selecting a model suitable for its restoration from the restorer storage DB.

In FIG. 9, the additional information providing unit 154 provides additional information to the restorer. As shown in FIG. 9, this additional information may be extracted from the occluded object image (for restoration) or may be provided through a reference DB or by a user.

Specifically, because an occluded object is usually not entirely hidden, additional information can be obtained from the partial information of the occluded object alone. In this case, the additional information may be extracted from the occluded object image (for restoration). That is, the additional information may be provided by analyzing the parts of the occluded object that lie outside the occluded area.

For example, if the nose and mouth of a human face are covered by a mask, feature points can be found using a cascade boosting method (which locates feature points progressively) or an artificial neural network, the skin color of the covered part can be estimated from the skin color of the uncovered part, and age or gender can be estimated from the uncovered skin features or the hairstyle.
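The skin color estimate mentioned above can be sketched in its simplest form as the mean color of the pixels that are not occluded. The function name `mean_visible_color`, the RGB tuple representation, and the boolean occlusion mask are assumptions, and a practical system would first restrict the average to skin pixels rather than all visible pixels.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B)


def mean_visible_color(pixels: List[List[Pixel]],
                       occluded: List[List[bool]]) -> Tuple[float, float, float]:
    """Average the color of all pixels whose mask entry is False."""
    totals = [0, 0, 0]
    count = 0
    for pixel_row, mask_row in zip(pixels, occluded):
        for pixel, hidden in zip(pixel_row, mask_row):
            if hidden:
                continue  # skip pixels covered by the occluding object
            for channel in range(3):
                totals[channel] += pixel[channel]
            count += 1
    if count == 0:
        raise ValueError("no visible pixels to estimate from")
    return (totals[0] / count, totals[1] / count, totals[2] / count)


# Hypothetical 2x2 face patch: the top row is visible skin,
# the bottom row is hidden by a blue mask.
pixels = [
    [(200, 160, 140), (210, 170, 150)],
    [(0, 0, 255), (0, 0, 255)],
]
occluded = [
    [False, False],
    [True, True],
]
estimated_skin_color = mean_visible_color(pixels, occluded)
```

The resulting color could then be supplied to the restorer as an attribute of the additional information.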

When the additional information is provided from a reference DB, it is obtained by analyzing other images stored in various databases. That is, the additional information may be provided by analyzing other images captured at the same location where the image of the occluded object was captured.

For example, if an occluded object appears in an image captured by a camera at a certain location, additional information about the object, such as brightness, color, and shape, can be obtained from the other images previously captured by that camera and stored in the reference DB.

When the additional information is entered or selected by a user, it is provided on the basis of the user's knowledge or hypothesis. This also covers the case in which the user reviews, and selectively corrects, additional information extracted from the occluded object image (for restoration) or provided through the reference DB.

FIG. 10 shows an example of a specific training process for the restorer.

Referring to FIG. 10, the learning unit 153 may train a first generator (i.e., an occluded object generator) in addition to the restorer (i.e., the occluded object restorer). The first generator is a model trained by a machine learning technique to output an occluded object related to the additional information when an unoccluded object image and its additional information are input. That is, the first generator receives an unoccluded object (for training) and generates an occluded object (for training). In the training data for the first generator, the input data may include an unoccluded object image and its additional information, and the output data may include an occluded object related to that additional information.

In FIG. 10, the additional information providing unit 154 used in FIG. 9 may be used as it is. That is, the two are interchangeable.

As an example, when the unoccluded object is a “human face”, the first generator may generate an occluded object (for training) image in which the face is arbitrarily covered by an object such as a “hat”, “sunglasses”, a “mask”, a “mustache”, or a “scarf”. Likewise, when the unoccluded object is a “car”, the first generator may generate an occluded object (for training) image in which the car is arbitrarily covered by “another car”, a “tree”, a “block”, or an “arbitrary polygon”. Alternatively, the first generator may fill the area to be covered with a random pattern such as Gaussian noise, salt-and-pepper noise, or uniform noise.
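
The random-pattern variant of the first generator can be sketched as follows. This is only an illustration under stated assumptions: the occluded area is a rectangle, the noise parameters (mean 0.5, standard deviation 0.2) are arbitrary, salt-and-pepper values are drawn per channel, and the function name `occlude_region` is hypothetical.

```python
import numpy as np

def occlude_region(image, top, left, height, width, pattern="gaussian", seed=0):
    """Return a copy of `image` with a rectangle overwritten by a random pattern.

    image: (H, W, C) float array with values in [0, 1].
    Also returns the boolean occlusion mask, True where pixels were replaced.
    """
    rng = np.random.default_rng(seed)
    out = image.copy()
    shape = (height, width, image.shape[2])
    if pattern == "gaussian":
        patch = np.clip(rng.normal(0.5, 0.2, size=shape), 0.0, 1.0)
    elif pattern == "salt_and_pepper":
        patch = rng.integers(0, 2, size=shape).astype(image.dtype)
    elif pattern == "uniform":
        patch = rng.uniform(0.0, 1.0, size=shape)
    else:
        raise ValueError(f"unknown pattern: {pattern}")
    out[top:top + height, left:left + width] = patch
    mask = np.zeros(image.shape[:2], dtype=bool)
    mask[top:top + height, left:left + width] = True
    return out, mask

clean = np.full((8, 8, 3), 0.5)
occluded, mask = occlude_region(clean, top=4, left=2, height=3, width=4,
                                pattern="salt_and_pepper")
```

The pair (`clean`, `occluded`) then serves as one training example, with the clean image as the target.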

FIG. 11 shows an example of providing additional information.

An important feature of the first generator (i.e., the occluded object generator) according to the present invention is that it can additionally receive additional information through the additional information providing unit 154 when generating an occluded object. FIG. 11 shows an example: if the additional information providing unit 154 provides the first generator with information about a mask covering a person's face as additional information, the first generator can synthesize, from the unoccluded object, an occluded object wearing the mask. When the restorer is trained through this process, it can be retrained so that it effectively restores an object even when the object is covered by an occluder it has never been trained on.

That is, the learning unit 153 may train the restorer (i.e., the occluded object restorer) using the first generator. In this case, the learning unit 153 may train the restorer using the unoccluded object (for training) image input to the first generator, the occluded object (for training) image output from the first generator, and the related additional information provided by the additional information providing unit 154 (i.e., the additional information also provided to the first generator). When training is complete, the restorer may be stored in the restorer storage DB. This process may be performed for new training of a restorer and, in particular, for retraining.

For instance, if there is no pre-trained restorer for the occluded object, then in S204 images related to unoccluded objects of the type analyzed in S202 are collected, the unoccluded object images of the collected images and their additional information are input to the first generator to generate occluded object images, and a restorer may be newly trained using the occluded objects generated by the first generator and their additional information.

In addition, if the selected restorer needs retraining, then in S211 images related to unoccluded objects of the type analyzed in S202 are collected, the unoccluded object images of the collected images and their additional information are input to the first generator to generate occluded object images, and the selected restorer may be retrained using the occluded objects generated by the first generator and their additional information.
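
The choice between reusing, newly training, and retraining a restorer can be sketched as a small control flow. The sketch makes several assumptions that the description does not specify: the restorer storage DB is modeled as a dictionary keyed by object type, retraining is triggered when the supplied additional information contains a kind the stored restorer has not seen, and `train_new` and `retrain` are hypothetical stand-ins for steps S204 and S211.

```python
def choose_restorer(object_type, side_info, registry, train_new, retrain):
    """Pick, create, or retrain a restorer for `object_type`.

    registry: dict mapping object type -> {"model": ..., "side_info": set of keys}
    train_new / retrain: callables standing in for steps S204 and S211.
    """
    entry = registry.get(object_type)
    if entry is None:
        # No pre-trained restorer: train one from scratch (S204).
        model = train_new(object_type, side_info)
        registry[object_type] = {"model": model, "side_info": set(side_info)}
        return model, "new"
    if set(side_info) - entry["side_info"]:
        # Side information the restorer has not seen: retrain it (S211).
        model = retrain(entry["model"], side_info)
        entry["model"] = model
        entry["side_info"] |= set(side_info)
        return model, "retrained"
    return entry["model"], "reused"

def train_new(kind, info):
    return f"{kind}-v1"            # placeholder for real training

def retrain(model, info):
    return model + "+retrained"    # placeholder for real retraining

registry = {"face": {"model": "face-v1", "side_info": {"landmarks"}}}
```

Calling `choose_restorer("face", {"mask_shape"}, registry, train_new, retrain)` would take the retraining branch, because `mask_shape` is not among the kinds of additional information recorded for the stored face restorer.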

As described above, the restorer is preferably built using an artificial neural network. The network preferably has a structure that internally includes a second generator and a discriminator.

In this case, the second generator preferably has a structure including an autoencoder, a variational autoencoder, a U-Net generator, or a ResNet generator. The second generator may also preferably be implemented by extracting the generator part trained through one of the various variants of the generative adversarial network (GAN). For example, Pix2Pix, CycleGAN, U-Net, ResNet, WGAN, Progressive GAN, World Models, Self-Attention GAN, BigGAN, BERT, and StyleGAN are well-known GAN variants and related architectures, and the generator part of these structures can be further modified and used.

In generative adversarial networks and their variants, the generator and the discriminator typically train by feeding each other's results back, and the variants differ in how they modify the feedback structure or the optimization method. The discriminator of a GAN usually makes a binary judgment such as “real” or “fake”, but it may also distinguish multiple classes, or output a value that can be interpreted as confidence in the result rather than a fixed value such as 0 or 1. The generator of a GAN may be an autoencoder or a variational autoencoder, or an autoencoder variant such as a U-Net generator or a ResNet generator. Through components such as downsampling, upsampling, and skip connections, a U-Net or ResNet generator can restore occluded parts and can also perform resolution enhancement, noise removal, style transfer, and the like.

In addition to the autoencoder, the principle by which a U-Net generator or a ResNet generator restores an occluded region is described further as follows. In these generators, downsampling may reduce the spatial dimensions and the channels together, or may reduce the spatial dimensions while expanding the channels. This is the approach used when building a typical classifier: spatially reduced hidden layers can detect what is in the image but lose the positional information about where it is. This can be interpreted in terms of the uncertainty principle of information found in the Fourier transform and similar tools. Upsampling, in contrast, restores the spatial information lost during downsampling. Skip connections compensate for the shortcomings of purely sequential upsampling by effectively mixing the latent information at various levels from each hidden layer obtained during downsampling into the output. In summary, a U-Net or ResNet generator works as the original autoencoder does: special phenomena that are uncommon in the given training set, such as noise or occlusion, disappear during encoding or downsampling, and decoding or upsampling then yields a result from which the occlusion or noise has been removed.
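
The roles of downsampling, upsampling, and a skip connection can be shown with a weight-free sketch. It assumes 2x2 average pooling, nearest-neighbor upsampling, and channel concatenation as the skip operation; a real U-Net or ResNet generator uses learned convolutions instead, and the function names here are hypothetical.

```python
import numpy as np

def downsample(x):
    """Halve height and width by 2x2 average pooling; x is (H, W, C)."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample(x):
    """Double height and width by nearest-neighbour repetition."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def tiny_unet_pass(x):
    """Encoder-decoder pass with one skip connection (no learned weights)."""
    skip = x                        # full-resolution features kept for later
    bottleneck = downsample(x)      # spatial detail is lost here
    decoded = upsample(bottleneck)  # resolution restored, detail still missing
    # The skip connection concatenates encoder features onto the decoder output.
    return np.concatenate([decoded, skip], axis=-1)

x = np.arange(4 * 4 * 2, dtype=float).reshape(4, 4, 2)
y = tiny_unet_pass(x)
```

The decoded half of `y` carries only block averages, while the skipped half retains the per-pixel detail that downsampling discarded.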

How the second generator learns to use additional information and applies it to restoration is described further as follows. The input data of the training data may include an occluded object image, and the output data may include an unoccluded object image (restored image). When trained in this way, the encoding between the occluded object (input) and the unoccluded object (output) is realized in the hidden layers of the artificial neural network. That is, hidden layers trained to reflect the additional information (i.e., information about the occluded region) can be realized.

As an example, suppose that for training the input consists of the R, G, and B channels of the occluded object, to which are added, individually or in combination, additional information channels such as a channel representing contours, a channel representing depth information, a channel representing the color of each region, and a channel representing the pattern of the image. It is of course also possible to add no additional information channel. The output data for training then includes the unoccluded object image, referred to as the ground truth.
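
Stacking additional information channels onto the R, G, and B channels can be sketched as a simple concatenation. The sketch assumes each additional information channel is a single plane with the same height and width as the image; the function name `build_restorer_input` is hypothetical.

```python
import numpy as np

def build_restorer_input(rgb, side_channels=()):
    """Concatenate an (H, W, 3) RGB image with optional (H, W) side channels."""
    planes = [rgb]
    for channel in side_channels:
        if channel.shape != rgb.shape[:2]:
            raise ValueError("side channel must match the image height and width")
        planes.append(channel[..., np.newaxis])
    return np.concatenate(planes, axis=-1)

rgb = np.zeros((16, 16, 3))
edges = np.ones((16, 16))          # stand-in for a contour channel
depth = np.full((16, 16), 2.0)     # stand-in for a depth map channel
rgb_only = build_restorer_input(rgb)
rgb_edge_depth = build_restorer_input(rgb, [edges, depth])
```

A model trained on the five-channel input expects the same channel order at restoration time, so the order of `side_channels` must be kept fixed.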

With the model trained as above, various situations in which it is actually used for restoration are described as follows.

(1) When no additional information other than the R, G, and B color information is available for restoring the occluded object, a model trained without additional information can be used. Even then, a model trained only on R, G, and B color information can restore scattered noise, clustered stains, and occluded parts. Even if certain color information is missing from the input, it can be restored to some extent. This is due to the characteristics of the autoencoder, the variational autoencoder, the GAN, and the various GAN variants mentioned above.

(2) When an additional information channel representing contours is input in addition to the R, G, and B color information, the corresponding model is used; even if the R, G, and B color information contains an occluder, the object is restored in accordance with the contours as long as the contour information is preserved. Similarly, when facial feature points are used as additional information, even if the nose or mouth is covered by a mask or the like, information about the eyes and jawline can be extracted from the occluded R, G, and B color information itself to estimate the positions of the covered nose and mouth; using this estimate as additional information on facial feature points enables effective restoration of the face shape.

(3) When a depth information channel is input in addition to the R, G, and B color information, the corresponding model is used; even if the R, G, and B color information contains an occluder, effective restoration is possible by estimating depth information from the occluded object itself or by using previously known information. As an example of using previously known information, for known car models, products, buildings, and the like, three-dimensional shape information can be obtained through various routes, converted into a depth map, and added to the R, G, and B color channels. To simplify the discussion, if the depth map has the same size as the R, G, and B color image, each pixel of the image corresponds one-to-one to a point of the depth map. In this case, even if the image of a car A is occluded, the model of car A can be identified from the unoccluded region at restoration time, and three-dimensional information for that model can be obtained through various routes and provided as additional information, so that the occluded part is restored in a way that reflects the three-dimensional information. An interesting feature here is that even if cars and car depth maps were used for training, buildings or other products can also be restored using a depth map as additional information, as long as the size and format of the image and the depth map are the same.

(4) Restoration is also possible, in a manner similar to cases (1) to (3), with an additional information channel representing the color of each region or an additional information channel representing the pattern of the image.

(5) It is also possible to combine the single additional information channels of (1) to (4) above for training and restoration. In this case, of course, the training data grows and its dimensionality increases, so training takes longer and the required computing resources may increase considerably; however, the more additional information channels are added, the more the restored result reflects the additional information, which is an important feature.

The second generator can be regarded as a device that internally implements a generative model. Here, a generative model can be defined as a model that, given predetermined parameter values, outputs corresponding data according to a specific probability distribution.

For example, a generative model with a one-dimensional Gaussian probability distribution generates data whose distribution is bell-shaped around the mean. Likewise, a generative model expressed as a sum of several two-dimensional Gaussian functions generates data whose distribution resembles several mountain peaks.
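
Both generative models can be sampled in a few lines. The sketch assumes an equally weighted mixture of isotropic two-dimensional Gaussians for the multi-peak case; the means, scale, and sample counts are arbitrary illustrative values.

```python
import numpy as np

rng = np.random.default_rng(0)

# One-dimensional Gaussian: samples cluster in a bell shape around the mean.
samples_1d = rng.normal(loc=5.0, scale=1.0, size=10_000)

def sample_mixture_2d(means, scale, n, rng):
    """Draw n points from an equally weighted mixture of isotropic 2-D Gaussians."""
    means = np.asarray(means, dtype=float)
    component = rng.integers(0, len(means), size=n)
    return means[component] + rng.normal(0.0, scale, size=(n, 2)), component

peaks = [(0, 0), (10, 0), (5, 8)]
points, component = sample_mixture_2d(peaks, 0.5, 3_000, rng)
```

A two-dimensional histogram of `points` would show three separate peaks, one around each entry of `peaks`.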

The autoencoder, used as an important technical element in generative adversarial networks, is a kind of generative model, and depending on how it is trained its output can be an image. An autoencoder typically performs dimensionality reduction through encoding and then decodes the result to expand it back to the original dimensionality. It therefore needs to be trained so that X is output when X is input. A trained autoencoder has the property that, even when X + delta is input, it produces an output nearly identical to X. Mathematically, this can be expressed as arbitrary multidimensional information being projected by the autoencoder onto a multidimensional manifold; from an application standpoint, delta can be thought of as noise, an occluded region, a stain, and so on. That is, X + delta is an image containing noise, occluded regions, or stains, and inputting it yields a clean output X from which these have been removed.
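
The X + delta behavior can be demonstrated with the simplest possible autoencoder. The sketch swaps in a linear autoencoder obtained by principal component analysis, which is an assumption made for brevity and not the neural network described here; the clean data are constructed to lie on a 2-D subspace of a 10-D space, so projection removes the part of delta that leaves that subspace.

```python
import numpy as np

rng = np.random.default_rng(0)

# Clean training data X lies on a 2-D subspace of a 10-D space.
basis = rng.normal(size=(2, 10))
X = rng.normal(size=(500, 2)) @ basis

# "Train" a linear autoencoder: the top-2 principal directions of X.
mean = X.mean(axis=0)
_, _, vt = np.linalg.svd(X - mean, full_matrices=False)
components = vt[:2]                      # (2, 10)

def encode(x):
    return (x - mean) @ components.T     # 10-D input -> 2-D code

def decode(z):
    return z @ components + mean         # 2-D code -> 10-D output

x = X[0]
delta = 0.5 * rng.normal(size=10)        # stand-in for noise or occlusion
restored = decode(encode(x + delta))

error_before = np.linalg.norm(delta)
error_after = np.linalg.norm(restored - x)
```

Only the component of `delta` lying inside the learned subspace survives, so `error_after` is smaller than `error_before`. A nonlinear autoencoder generalizes the same idea from a flat subspace to a curved manifold.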

This property is also an important one that the present invention relies on. That is, a generative model such as an autoencoder outputs an unoccluded object image (i.e., a restored object image) when an occluded object image is input, which can be interpreted as the high-dimensional manifold space having been trained to output unoccluded object images.

FIG. 12 shows an example of the training process of the second generator and the discriminator.

Meanwhile, the discriminator receives the unoccluded object image output by the second generator in response to an occluded object image, and determines how faithfully the generator has restored the occluded object (i.e., whether the restored object is acceptable). For instance, referring to FIG. 12, an occluded object image may be synthesized from an unoccluded object image (i.e., a first image) using the first generator or the like. When the synthesized occluded object image is then input to the second generator, the second generator outputs an unoccluded object image (i.e., a second image). The discriminator compares the first image with the second image and outputs the result of determining how faithfully the second generator has restored the second image (i.e., how similar the second image is to the first image). Likewise, when the synthesized occluded object image and additional information are input together to the second generator, the second generator outputs an unoccluded object image (i.e., a second image) that takes the additional information into account, and the discriminator again compares the first image with the second image and outputs how similar the second image is to the first image.

As described above, the second generator can be built by extracting and modifying the generator part of a conventional autoencoder, variational autoencoder, or generative adversarial network (GAN), or of variants and related architectures such as Pix2Pix, CycleGAN, U-Net, ResNet, WGAN, Progressive GAN, World Models, Self-Attention GAN, BigGAN, BERT, and StyleGAN. From the standpoint of retraining as well, training time can therefore be reduced by training a generator in the conventional manner and using the trained generator model as the initial value of the second generator.

As another way of retraining the second generator, when third additional information newly arises for a pre-trained generator and is to be reflected in training, using the pre-trained model as the initial value, rather than random weights for the artificial neural network, reduces the resources and time required for training.
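
One way to warm-start a generator when a new additional information channel appears can be sketched for a single layer. The sketch assumes the first layer is a plain weight matrix acting per pixel and that the rows for the new channel are initialized to zero; both choices, and the name `extend_first_layer`, are illustrative assumptions rather than the method of the disclosure.

```python
import numpy as np

def extend_first_layer(pretrained_weight, extra_channels):
    """Add input channels to a pre-trained first-layer weight matrix.

    pretrained_weight: (in_channels, out_features). The new rows start at zero,
    so the extended layer initially reproduces the pre-trained output.
    """
    extra = np.zeros((extra_channels, pretrained_weight.shape[1]))
    return np.concatenate([pretrained_weight, extra], axis=0)

rng = np.random.default_rng(0)
w_rgb = rng.normal(size=(3, 8))             # pretend these weights were learned
w_rgb_depth = extend_first_layer(w_rgb, 1)  # room for one depth channel

pixel_rgb = rng.uniform(size=3)
pixel_rgbd = np.append(pixel_rgb, 0.7)      # same pixel plus a depth value
```

Because the new rows are zero, `pixel_rgbd @ w_rgb_depth` equals `pixel_rgb @ w_rgb` before retraining starts, so training resumes from the pre-trained behavior instead of from random weights.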

That is, the second generator and the discriminator are trained adversarially (complementarily) on the basis of a GAN (Generative Adversarial Network). The second generator is trained to reflect the discriminator's result so that the second image becomes more similar to the first image, and the discriminator is trained to reflect the second generator's result so that its judgments become more accurate. Accordingly, the second generator improves the performance of the discriminator and the discriminator improves the performance of the second generator, so the two evolve together. A second generator that has evolved sufficiently can then be used as the newly trained or retrained restorer described above.
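
The opposing objectives of the two networks can be written as a pair of loss functions. The sketch assumes a discriminator that outputs a probability, binary cross-entropy for the adversarial terms, and an added L1 reconstruction term weighted by 100 in the style of Pix2Pix; none of these specifics is stated in the description, and the function names are hypothetical.

```python
import numpy as np

def bce(prob, target):
    """Binary cross-entropy between predicted probabilities and 0/1 targets."""
    prob = np.clip(prob, 1e-7, 1 - 1e-7)
    return float(np.mean(-(target * np.log(prob) + (1 - target) * np.log(1 - prob))))

def discriminator_loss(d_real, d_fake):
    """Discriminator wants first images scored 1 and restored images scored 0."""
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

def generator_loss(d_fake, restored, target, l1_weight=100.0):
    """Generator wants to fool the discriminator and stay close to the target."""
    adversarial = bce(d_fake, np.ones_like(d_fake))
    reconstruction = float(np.mean(np.abs(restored - target)))
    return adversarial + l1_weight * reconstruction

first_image = np.full((4, 4, 3), 0.5)
good_restore = first_image.copy()
poor_restore = first_image + 0.2
```

In each training step the discriminator is updated to lower `discriminator_loss` and the second generator is updated to lower `generator_loss`, which is what makes the two networks improve against each other.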

However, in FIG. 12, the additional information input to the first generator and the additional information input to the second generator need not be the same and may differ. That is, for restoration it may be more common that additional information is also input to the second generator. For example, a contour channel or a depth map channel, corresponding to an additional information channel, may be input to the second generator together with the R, G, and B color image. This additional information channel may not match the additional information input to the first generator. For example, the size, form, or data type of the additional information may change depending on the form of the artificial neural network.

The present invention configured as described above restores an occluded area of an image using additional information (side information) at restoration time, and thus has the advantage of generating a restored image better suited to the occluded area. In addition, the present invention can augment the training data using the available additional information and, at the same time, retrain a pre-trained model by modifying the input data, which has the advantage of improving restoration performance. Furthermore, because the present invention generates a restored image according to the additional information at restoration time, it is more widely applicable: it can be used to restore occluded areas of objects in fields such as CCTV monitoring, person matching, post-incident image restoration, biometric authentication on mobile terminals, and access control at building entrances.

Although specific embodiments have been described in the detailed description of the present invention, various modifications are of course possible without departing from the scope of the present invention. Therefore, the scope of the present invention is not limited to the described embodiments, and should be defined by the following claims and their equivalents.

100: restoration device 110: input unit
120: communication unit 130: display unit
140: memory 150: control unit
151: image analysis unit 152: restorer selection unit
153: learning unit 154: additional information providing unit
155: restoration unit

Claims

A method performed by an electronic device that restores an object having an occluded area in an image (occluded object) to an object in which the occluded area is restored (restored object), using a restorer which is a model trained by a machine learning technique to output a restored object when an occluded object and additional information related to the object are input, the method comprising:
analyzing the type of an occluded object in the image;
checking whether a pre-trained restorer exists based on the analyzed type of object;
selecting the corresponding restorer if a restorer exists in the checking step; and
determining, according to additional information related to the analyzed type of object, whether the selected restorer needs retraining, and generating a restored object by using the selected restorer, or a restorer retrained from the selected restorer on the basis of the additional information, according to the determined necessity,
wherein the retrained restorer is retrained using a first generator, which is a model trained by a machine learning technique to output an occluded object related to the additional information when an unoccluded object and the additional information are input.

The method of claim 1,
further comprising, if no restorer exists in the selecting step, collecting objects of the analyzed type, and generating a restored object using a restorer newly trained on the basis of the collected objects and additional information related to those objects.

The method of claim 1,
further comprising setting a region (restoration region) including the occluded object in the image,
wherein the analyzing is performed on the occluded object in the restoration region.

delete

The method of claim 1,
wherein the generating step includes:
generating a restored object by inputting the occluded object and its additional information to the selected restorer when the selected restorer does not need retraining; and
when the selected restorer needs retraining, collecting images related to unoccluded objects of the analyzed type, inputting the unoccluded objects of the collected images and their additional information to the first generator to generate occluded objects of the collected images, retraining the selected restorer using the occluded objects generated by the first generator and their additional information, and generating a restored object by inputting the occluded object and its additional information to the retrained restorer.

The method of claim 1,
wherein the restorer includes a second generator for generating a restored object from the occluded object and a discriminator for evaluating the restored object generated by the second generator, and the second generator and the discriminator are adversarially trained on the basis of a GAN (Generative Adversarial Network).

7. The method of claim 6,
wherein the second generator is implemented on the basis of an autoencoder, a variational autoencoder, a U-Net generator, a ResNet generator, or a generative adversarial network.

The method of claim 1,
wherein the additional information is provided through analysis of a region of the occluded object other than the occluded area.

The method of claim 1,
wherein the additional information is provided through analysis of other images captured at the same location where the image was captured.

The method of claim 1,
wherein the additional information is provided by being input or selected by a user.

The method of claim 1,
wherein the additional information includes attribute information on the analyzed type of object, information on a plurality of detailed objects included in the analyzed type of object, or 3D information on the analyzed type of object.

A device for restoring an object having an occluded area in an image (occluded object) to an object in which the occluded area is restored (restored object), the device comprising:
a memory for storing a restorer, which is a model trained by a machine learning technique to output a restored object when an occluded object and additional information related to the occluded object are input; and
a control unit for controlling generation of a restored object using the restorer stored in the memory,
wherein the control unit
analyzes the type of the occluded object in the image, and checks whether a pre-learned restorer exists based on the analyzed type of object, and
if the restorer exists, selects the corresponding restorer, determines whether re-learning is necessary for the selected restorer according to additional information related to the analyzed type of object, and, according to the determined necessity, controls generation of the restored object using either the selected restorer or a restorer re-learned from the selected restorer based on the additional information, and
wherein the re-learned restorer is re-learned using a first generator, which is trained by a machine learning technique to output an occluded object related to the additional information when an unoccluded object and its additional information are input.
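
The control unit's decision sequence (look up a restorer by object type, select it, decide whether re-learning is needed) can be sketched as follows. The registry layout, the function names, and in particular the re-learning criterion are hypothetical: here re-learning is assumed to be needed when the additional information mentions attributes the stored restorer was not trained on, which the claim does not specify.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class RestorerEntry:
    """Hypothetical record for one pre-learned restorer held in memory."""
    name: str
    known_attributes: Set[str] = field(default_factory=set)


def select_restorer(registry: Dict[str, RestorerEntry], object_type: str) -> Optional[RestorerEntry]:
    """Return the pre-learned restorer for the analyzed object type, if one exists."""
    return registry.get(object_type)


def needs_relearning(entry: RestorerEntry, info_attributes: Set[str]) -> bool:
    """Assumed criterion: re-learn when some attribute is unknown to the restorer."""
    return not info_attributes <= entry.known_attributes


def plan_restoration(registry: Dict[str, RestorerEntry], object_type: str,
                     info_attributes: Set[str]) -> str:
    """Describe which path the control unit would take for this object."""
    entry = select_restorer(registry, object_type)
    if entry is None:
        return "no restorer"
    if needs_relearning(entry, info_attributes):
        return "relearn:" + entry.name
    return "use:" + entry.name
```

With a registry holding a single `"person"` entry that knows the attribute `"coat"`, a request with `{"coat"}` would take the `use:` path, a request with `{"coat", "hat"}` would take the `relearn:` path, and a request for an unregistered type would return `"no restorer"`.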

A device for restoring an object having an occluded area in an image (occluded object) to an object in which the occluded area is restored (restored object), the device comprising:
a communication unit for receiving a restorer, which is a model trained by a machine learning technique to output a restored object when an occluded object and additional information related to the occluded object are input; and
a control unit for controlling generation of a restored object using the restorer received through the communication unit,
wherein the control unit
analyzes the type of the occluded object in the image, and checks whether a pre-learned restorer exists based on the analyzed type of object, and
if the restorer exists, selects the corresponding restorer, determines whether re-learning is necessary for the selected restorer according to additional information related to the analyzed type of object, and, according to the determined necessity, controls generation of the restored object using either the selected restorer or a restorer re-learned from the selected restorer based on the additional information, and
wherein the re-learned restorer is re-learned using a first generator, which is trained by a machine learning technique to output an occluded object related to the additional information when an unoccluded object and its additional information are input.

delete