KR102640653B1

KR102640653B1 - Apparatus and method for image processing

Info

Publication number: KR102640653B1
Application number: KR1020220102803A
Authority: KR
Inventors: 오병태
Original assignee: 한국항공대학교산학협력단
Priority date: 2022-08-17
Filing date: 2022-08-17
Publication date: 2024-02-23

Abstract

The present invention relates to an image processing apparatus and method. The disclosed apparatus edits or modifies an image acquired by an arbitrary camera while maintaining the pattern and characteristic information introduced during acquisition. It comprises a mosaicing unit that re-mosaics a demosaiced input image to return it to the state acquired at the sensor, an interpolation unit that interpolates the missing information in the mosaiced image, and a generator that is trained to have the same demosaicing pattern as the input image. The generator is trained by dividing the input image into a plurality of patches so that the characteristics within a generated patch become similar to the characteristics of other patches of the input image. A depth image processing apparatus and method with these features are provided. According to the present invention, an input image can be edited or modified while preserving, as far as possible, the patterns and characteristics that arise during image acquisition.

Description

Image processing apparatus and method {APPARATUS AND METHOD FOR IMAGE PROCESSING}

The present invention relates to an image processing apparatus and method, and more particularly to an image processing apparatus and method capable of editing or modifying an arbitrarily captured image, for which no additional information is available, while maintaining the demosaicing pattern of that image.

This research was supported by research funding from the Korea Aerospace University basic research project "Research on a counter-anti-forensic system capable of responding to generative models using the characteristics of AI-based image compression" (research period: 2022.3.1 to 2025.2.28; host institution: Korea Aerospace University).

The content described in this section merely provides background information on the present embodiment and does not constitute prior art.

In digital image acquisition, the image sensor must obtain color information such as R, G, and B at every sensing position; without special additional hardware, however, each sensor element captures only one color. To fill in the information that was not captured, a typical camera uses an interpolation technique that predicts the value at the current position from neighboring values, and this interpolation is called demosaicing. Because demosaicing has no particular standard and each manufacturer applies its own algorithm, analyzing the demosaicing pattern makes it possible to identify the type of camera that captured the image. Such a pattern can therefore serve as important information about the image and its authenticity.

Meanwhile, cases in which a given image is arbitrarily manipulated and distributed for a specific purpose have recently been increasing. To counter such indiscriminate manipulation, a variety of image forensic techniques for detecting manipulated images have been developed. As one example, a technique has been introduced that finds traces of manipulation by analyzing the demosaicing pattern in the image, as mentioned above.

On the other hand, when an image is modified or edited for various reasons, there is a need for the modified image to retain the demosaicing pattern of the original. The simplest approach is to analyze and model the characteristics of the camera that captured the image and to reproduce them in the demosaicing process of the modified image. For widely known cameras, techniques that reproduce the demosaicing pattern in this way have been introduced, provided that sufficient data is available.

However, when only an image is given and no camera information is available, the prior-art techniques for reproducing a demosaicing pattern make it difficult to edit the image while maintaining that pattern. One could analyze the image to trace the camera model, but analyzing every camera in actual use and building a database of the resulting models is generally impractical.

Accordingly, the present invention proposes an image processing apparatus and method that, compared with the prior art, can modify or edit an image while maintaining its inherent pattern.

Korean Patent Laid-Open Publication No. 10-2022-0084236, published June 21, 2022 (title: Improved detection system and method for manipulated images)
Korean Registered Patent Publication No. 10-18181181, published June 22, 2017 (title: Digital forensic image verification system)
Korean Registered Patent Publication No. 10-1687989, published December 20, 2016 (title: Digital forensic image verification system, and imaging device and image storage device used therein)

(Non-patent Document 1) Chen, Chen, Xinwei Zhao, and Matthew C. Stamm. "Mislgan: An anti-forensic camera model falsification framework using a generative adversarial network." 2018 25th IEEE International Conference on Image Processing (ICIP). IEEE, 2018.
(Non-patent Document 2) Andrews, Jerone T. A., Yidan Zhang, and Lewis D. Griffin. "Conditional Adversarial Camera Model Anonymization." European Conference on Computer Vision. Springer, Cham, 2020.

The present invention has been proposed to solve the problems of the prior art described above, and its main object is to provide an image processing apparatus and method that can edit or modify an input image while preserving, as far as possible, the patterns and characteristics that arise during image acquisition.

Another object of the present invention is to provide an image processing apparatus and method that can infer the authenticity of an image, the camera model, and the like by identifying the patterns and characteristics in the image.

The problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention pertains.

One aspect of the present invention for achieving the above object provides a depth image processing apparatus capable of editing or modifying an image acquired by an arbitrary camera while maintaining its pattern and characteristic information, the apparatus comprising: a mosaicing unit that re-mosaics a demosaiced input image to return it to the state acquired at the sensor; an interpolation unit that interpolates the missing information in the mosaiced image; and a generator that is trained to have the same demosaicing pattern as the input image, wherein the generator is trained by dividing the input image into a plurality of patches so that the characteristics within a generated patch become similar to the characteristics of other patches of the input image.

Another aspect of the present invention provides a depth image processing method capable of editing or modifying an image acquired by an arbitrary camera while maintaining its pattern and characteristic information, the method comprising: inputting a given demosaiced image to an image processing apparatus; re-mosaicing the input image to return it to the state acquired at the sensor; filling in the missing information of the input image using a commonly used interpolation method; and, so that a generator learns to have the same demosaicing pattern as the input image, dividing the input image into a plurality of patches and training the generator so that the characteristics within a generated patch become similar to the characteristics of other patches of the input image.

Another aspect of the present invention provides a computer-readable recording medium on which a program for executing the image processing method is recorded.

The image processing apparatus and method of the present invention make it possible to edit or modify an input image while preserving, as far as possible, the patterns and characteristics that arise during image acquisition.

In addition, the image processing apparatus and method make it possible to infer the authenticity of an image, the camera model, and the like by identifying the patterns and characteristics in the image.

The effects obtainable from the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention pertains.

The accompanying drawings, which are included as part of the detailed description to aid understanding of the present invention, provide embodiments of the present invention and, together with the detailed description, explain its technical features.
Figure 1 is a diagram illustrating the configuration of an image processing apparatus according to an embodiment of the present invention.
Figure 2 is a diagram illustrating an image processing method according to an embodiment of the present invention.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. The detailed description set forth below, together with the accompanying drawings, is intended to describe exemplary embodiments of the present invention and is not intended to represent the only embodiments in which the invention may be practiced. The following detailed description includes specific details to provide a thorough understanding of the present invention. However, those of ordinary skill in the art will recognize that the present invention may be practiced without these specific details.

In some cases, to avoid obscuring the concept of the present invention, well-known structures and devices may be omitted or shown in block diagram form centered on the core function of each structure and device.

Throughout the specification, when a part is said to "comprise" or "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. In addition, terms such as "... unit", "... device", and "module" used in the specification refer to a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software. Furthermore, "a" or "an", "one", "the", and similar terms may be used, in the context of describing the present invention (particularly in the context of the claims below), to cover both the singular and the plural unless otherwise indicated herein or clearly contradicted by context.

In describing the embodiments of the present invention, a detailed description of a known function or configuration will be omitted if it is judged that it would unnecessarily obscure the gist of the present invention. The terms used below are defined in consideration of their functions in the embodiments of the present invention and may vary depending on the intention or custom of a user or operator. Their definitions should therefore be based on the content of this specification as a whole.

The components in the drawings of the present invention are shown independently to represent different characteristic functions of the image processing apparatus and method; this does not mean that each component consists of separate hardware or a single software unit. That is, the components are listed separately for convenience of description: at least two components may be combined into one, or one component may be divided into several components that each perform part of the function. Such integrated and separated embodiments are also included in the scope of the present invention as long as they do not depart from its essence.

In addition, some components may not be essential components that perform essential functions of the present invention but merely optional components for improving performance. The present invention may be implemented with only the components essential to realizing its essence, excluding components used merely to improve performance, and a structure that includes only the essential components, excluding such optional components, is also included in the scope of the present invention.

Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings.

Figure 1 is a diagram illustrating the configuration of an image processing apparatus according to an embodiment of the present invention.

In one embodiment of the present invention, an image is edited while the demosaicing pattern of the input image is kept intact. Specifically, the embodiment is based on a deep-learning generative adversarial network (GAN) structure, which is widely used in recent image generation work.

Deep learning performs machine learning using an artificial neural network (ANN) with multiple layers.

GAN stands for generative adversarial network: a model (network) in which a generator and a discriminator compete with each other (adversarial) to generate data (generative). For example, if a GAN is used to generate portrait photographs, the component that produces the portraits is called the generator, and the component that evaluates the produced portraits is called the discriminator. The key idea of a GAN is that the generator and the discriminator oppose each other, and training proceeds so that each gradually improves the other's performance. Machine learning is broadly divided into three categories, namely supervised learning, reinforcement learning, and unsupervised learning, and a GAN falls under unsupervised learning.
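For reference, the generator/discriminator competition described above is commonly written as the following minimax objective. This is the standard textbook formulation, given here only as background; the patent does not state that its training uses exactly this form, and in the proposed system the generator receives an interpolated image rather than a noise vector $z$.

```latex
\min_{G}\,\max_{D}\;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

Here $D(x)$ is the discriminator's estimate that $x$ is a real sample, and $G(z)$ is the sample produced by the generator.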

In one embodiment of the present invention, the mosaicing unit 110 first re-mosaics the given demosaiced image to return it to the state actually acquired at the sensor.
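The re-mosaicing step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the RGGB Bayer layout and the function name `remosaic_rggb` are assumptions, since the source does not specify the color filter array.

```python
import numpy as np

def remosaic_rggb(rgb):
    """Keep one colour sample per pixel, following an ASSUMED RGGB Bayer layout.

    rgb: array of shape (H, W, 3). Returns a single-channel (H, W) mosaic.
    """
    h, w, _ = rgb.shape
    mosaic = np.zeros((h, w), dtype=rgb.dtype)
    mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]  # R at even rows, even columns
    mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]  # G at even rows, odd columns
    mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]  # G at odd rows, even columns
    mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]  # B at odd rows, odd columns
    return mosaic
```

Two thirds of the values in the demosaiced image are discarded here; those are exactly the values that the camera's demosaicing algorithm had interpolated.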

The interpolation unit 130 then fills in all of the missing information in the image using a commonly used interpolation method.
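The source says only that a commonly used interpolation method is applied. As one possible choice, the sketch below uses bilinear interpolation implemented as a normalized 3x3 weighted average; the RGGB layout, the kernel, and the function names are assumptions made for illustration, and the image is assumed to be at least 2x2 pixels.

```python
import numpy as np

def _weighted_sum_3x3(a, kernel):
    """3x3 weighted neighbourhood sum with zero padding at the borders."""
    h, w = a.shape
    padded = np.pad(a, 1)
    out = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def bilinear_fill_rggb(mosaic):
    """Fill the missing colour samples of an (H, W) mosaic, ASSUMING RGGB."""
    h, w = mosaic.shape
    masks = np.zeros((h, w, 3), dtype=float)
    masks[0::2, 0::2, 0] = 1  # R sample positions
    masks[0::2, 1::2, 1] = 1  # G sample positions (even rows)
    masks[1::2, 0::2, 1] = 1  # G sample positions (odd rows)
    masks[1::2, 1::2, 2] = 1  # B sample positions
    kernel = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float)
    out = np.empty((h, w, 3), dtype=float)
    for c in range(3):
        num = _weighted_sum_3x3(mosaic * masks[..., c], kernel)
        den = _weighted_sum_3x3(masks[..., c], kernel)
        # Measured samples are kept untouched; only the gaps are interpolated.
        out[..., c] = np.where(masks[..., c] == 1, mosaic, num / den)
    return out
```

The result is a full three-channel image whose interpolated values carry no camera-specific pattern; restoring that pattern is the job of the generator.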

The image is then restored using the proposed generative adversarial network (GAN) scheme. To train the generator 150 of the GAN system to have the same demosaicing pattern as the original image, a method that uses other patches of the same image is proposed, as shown in Figure 1.

For example, to generate the demosaicing pattern for patch x0 of the original image, patches at other positions in the original image, such as x1, are compared with one another, which guides the image processing apparatus 100 to generate the same demosaicing pattern.
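Dividing the image into patches such as x0 and x1 might look like the sketch below. The patch size, the non-overlapping tiling, and the requirement that the patch size be even (so that every patch starts at the same phase of a 2x2 color filter grid) are assumptions for illustration; the source does not specify them.

```python
import numpy as np

def split_into_patches(image, patch=64):
    """Cut non-overlapping patch x patch tiles from an (H, W, ...) image.

    The image must be at least patch x patch pixels; leftover border
    pixels that do not fill a whole tile are dropped.
    """
    if patch % 2:
        raise ValueError("patch size must be even to stay aligned with a 2x2 CFA grid")
    h, w = image.shape[:2]
    tiles = [image[y:y + patch, x:x + patch]
             for y in range(0, h - patch + 1, patch)
             for x in range(0, w - patch + 1, patch)]
    return np.stack(tiles)
```

With the stacked result, `patches[0]` could play the role of x0 and any other entry the role of x1.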

At this time, to minimize the relative distance between the original pattern and the generated pattern, training is guided so that the distance among the patterns produced by the proposed generator 150 (intra-distance) becomes smaller than their distance to the patterns produced by other devices (inter-distance). To this end, the proposed GAN system also includes a pre-trained generator 170 that models the demosaicing methods used in existing devices.
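One way to express "intra-distance smaller than inter-distance" as a trainable quantity is a margin-based hinge, sketched below on plain feature arrays. The Euclidean metric, the margin value, and the function names are assumptions; the source does not give the exact formula.

```python
import numpy as np

def mean_pairwise_distance(a, b):
    """Mean Euclidean distance between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return float(np.sqrt((diff ** 2).sum(axis=-1)).mean())

def intra_inter_hinge(proposed, others, margin=1.0):
    """Zero once intra-distance is at least `margin` below inter-distance.

    proposed: (N, D) features of patterns from the proposed generator.
    others:   (M, D) features of patterns attributed to other devices.
    """
    intra = mean_pairwise_distance(proposed, proposed)  # includes zero self-distances
    inter = mean_pairwise_distance(proposed, others)
    return max(0.0, intra - inter + margin)
```

A value of zero means the proposed generator's patterns already cluster more tightly with each other than with the other devices' patterns, by at least the margin.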

The embedder 190 extracts measurable feature vectors from the patterns generated by the generator 150 and the pre-trained generator 170.

Embedding refers to the value obtained by converting a word, sentence, or document into a vector, or to the conversion process itself; it transforms high-dimensional information into a lower-dimensional form while preserving the necessary information.

From the feature vectors extracted by the embedder 190, the four loss functions L1, L2, L3, and L4 below are constructed, and training proceeds based on them.

Deep learning trains by reducing the error between the prediction and the actual result, and the function that expresses this error is the loss function.

The loss function measures how well the neural network's predictions match. The loss value obtained from the loss function serves as an indicator of how well the neural network has been trained during the training process.

The loss functions of the image processing apparatus 100 of the present invention are as follows; individual terms may be added or removed.

L1: a loss function that distinguishes among m different pre-trained generators 170

L2: a loss function that classifies patches of the original image versus output patches of the proposed generator 150

L3: a loss function that measures the distance between features of the outputs of the proposed generator 150

L4: a loss function that measures the distance between features of the proposed generator 150 and the pre-trained generator 170
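Because the source notes that terms may be added or removed, the four losses can be thought of as named entries in a weighted sum. The sketch below is only an illustration of that bookkeeping: the function name `total_loss`, the default weight of 1.0, and the numeric values in the usage note are hypothetical, and the source does not say how the terms are actually weighted or combined.

```python
def total_loss(terms, weights=None):
    """Weighted sum of named loss terms; terms absent from `weights` get weight 1.0.

    terms:   mapping such as {"L1": ..., "L2": ..., "L3": ..., "L4": ...}
    weights: optional mapping from term name to a (hypothetical) weight
    """
    weights = weights or {}
    return sum(weights.get(name, 1.0) * value for name, value in terms.items())
```

For example, `total_loss({"L1": 1.0, "L2": 2.0, "L3": 0.5, "L4": 0.25})` sums all four terms, while leaving "L4" out of the mapping, or giving it weight 0.0, removes that term.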

Figure 2 is a diagram illustrating an image processing method according to an embodiment of the present invention.

First, a given demosaiced image is input to the image processing apparatus 100 (S201), and it is re-mosaiced to return it to the state actually acquired at the sensor (S203).

Next, all of the missing information in the image is filled in using a commonly used interpolation method (S203).

The image is then restored using the proposed generative adversarial network (GAN) scheme, in which the generator of the GAN system is trained to have the same demosaicing pattern as the original image (S205).

Specifically, other patches of the same image are used: to generate the demosaicing pattern for patch x0 of the original image, patches at other positions in the original image, such as x1, are compared with one another, which guides the image processing apparatus 100 to generate the same demosaicing pattern.

At this time, to minimize the relative distance between the original pattern and the generated pattern, training is guided so that the distance among the patterns produced by the proposed generator 150 (intra-distance) becomes smaller than their distance to the patterns produced by other devices (inter-distance).

To this end, a pre-trained generator that models the demosaicing methods used in existing devices is obtained (S207) and included in the proposed GAN system.

Embeddings, that is, measurable feature vectors, are extracted from the patterns generated by the generator and the pre-trained generator (S209).

From the extracted feature vectors, the four loss functions L1, L2, L3, and L4 below are constructed (S211), and training proceeds based on them.

The loss functions are as follows; individual terms may be added or removed.

L1: a loss function that distinguishes among m different pre-trained generators

L2: a loss function that classifies patches of the original image versus output patches of the proposed generator

L3: a loss function that measures the distance between features of the outputs of the proposed generator

L4: a loss function that measures the distance between features of the proposed generator and the pre-trained generator

Although Figure 2 describes steps S201 to S211 as being executed sequentially, this merely illustrates the technical idea of the present embodiment. Those of ordinary skill in the art to which the present embodiment pertains could apply various modifications and variations, such as changing the order shown in Figure 2 or executing one or more of steps S201 to S211 in parallel, without departing from the essential characteristics of the present embodiment. Figure 2 is therefore not limited to a time-series order.

Combinations of the blocks in the block diagram and the steps in the flowchart attached to this specification may be performed by computer program instructions. Because these computer program instructions can be loaded onto the processor of a general-purpose computer, a special-purpose computer, or other programmable data processing equipment, the instructions executed by that processor create means for performing the functions described in each block of the block diagram or each step of the flowchart. These computer program instructions can also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment to implement a function in a particular manner, so the instructions stored in that memory can produce an article of manufacture containing instruction means that perform the functions described in each block of the block diagram or each step of the flowchart. The computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps is performed on that equipment to create a computer-executed process, and the instructions that run on the equipment can provide steps for executing the functions described in each block of the block diagram and each step of the flowchart.

In addition, each block or step may represent a module, segment, or portion of code that includes one or more executable instructions for carrying out the specified logical function(s). It should also be noted that, in some alternative embodiments, the functions mentioned in the blocks or steps may occur out of order. For example, two blocks or steps shown in succession may in fact be performed substantially simultaneously, or they may sometimes be performed in reverse order, depending on the functions involved.

The above description merely illustrates the technical idea of the present invention, and a person of ordinary skill in the art to which the invention pertains will be able to make various modifications and variations without departing from its essential characteristics. Accordingly, the embodiments disclosed herein are intended to explain, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present invention shall be interpreted according to the claims below, and all technical ideas within a scope equivalent to them shall be construed as falling within the scope of rights of the present invention.

The image processing device and method of the present invention can serve as a solution for editing or modifying an input image while preserving, as far as possible, the patterns and characteristics that arise during image acquisition. Because the invention overcomes the limitations of existing technology, devices to which it is applied have sufficient prospects for sale or commercial use beyond mere use of the related technology, and the invention can clearly be put into practice; it is therefore industrially applicable.

100: Image processing device 110: Mosaicing unit 130: Interpolation unit
150: Generator 170: Pre-trained generator
190: Embedder

Claims

In an image processing device capable of editing or modifying an image acquired from an arbitrary camera while maintaining its pattern and characteristic information, a depth image processing device comprising:
a mosaicing unit that re-mosaics a demosaiced input image to return it to the state acquired at the sensor end;
an interpolation unit that interpolates the empty information of the mosaiced image; and
a generator that learns to produce the same demosaicing pattern as the input image,
wherein the learning of the generator divides the input image into a plurality of patches and, to generate a demosaicing pattern for a first patch of the input image, compares second patches at different positions of the input image, thereby inducing generation of the same demosaicing pattern.
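
The mosaicing and interpolation units recited above can be illustrated with a minimal NumPy sketch. The claim does not fix a color filter layout or an interpolation method, so the RGGB Bayer layout, the 3x3 neighborhood averaging, and the function names `remosaic_rggb` and `interpolate_missing` are all assumptions made for illustration only.

```python
import numpy as np

def remosaic_rggb(rgb):
    """Keep one color sample per pixel, following an ASSUMED RGGB Bayer layout.

    Returns the sparse 3-channel image and the boolean sampling mask.
    """
    h, w, _ = rgb.shape
    mask = np.zeros((h, w, 3), dtype=bool)
    mask[0::2, 0::2, 0] = True   # R at even rows, even columns
    mask[0::2, 1::2, 1] = True   # G at even rows, odd columns
    mask[1::2, 0::2, 1] = True   # G at odd rows, even columns
    mask[1::2, 1::2, 2] = True   # B at odd rows, odd columns
    return np.where(mask, rgb, 0.0), mask

def interpolate_missing(sparse, mask):
    """Fill empty positions with the mean of the sampled values in a 3x3 window
    (an assumed, simple interpolation; sampled positions are left unchanged)."""
    h, w, _ = sparse.shape
    pad_v = np.pad(sparse, ((1, 1), (1, 1), (0, 0)))
    pad_m = np.pad(mask.astype(float), ((1, 1), (1, 1), (0, 0)))
    total = np.zeros_like(sparse, dtype=float)
    count = np.zeros_like(sparse, dtype=float)
    for dy in range(3):
        for dx in range(3):
            total += pad_v[dy:dy + h, dx:dx + w]   # sum of sampled values
            count += pad_m[dy:dy + h, dx:dx + w]   # number of sampled neighbors
    filled = total / np.maximum(count, 1.0)
    return np.where(mask, sparse, filled)
```

In this sketch the interpolated image would be the input handed to the generator; the generator itself is not modeled here.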

The depth image processing device of claim 1,
wherein the generator performs learning with a deep learning-based generative adversarial network (GAN) structure.

(Deleted)

The depth image processing device of claim 1,
wherein inducing generation of the same demosaicing pattern comprises inducing learning such that, in order to minimize the relative distance between the original pattern and the pattern generated by the generator, the internal distance (intra-distance) among the generated patterns becomes relatively smaller than the distance (inter-distance) to patterns generated by other devices.
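
One way to read the intra-distance and inter-distance relationship is as a ratio that shrinks when generated features sit closer to the original pattern than to patterns from other devices. The sketch below is only an assumed formulation: the claim does not give a formula, and the ratio, the Euclidean metric, and the names `mean_pairwise_distance` and `relative_distance` are hypothetical.

```python
import numpy as np

def mean_pairwise_distance(a, b):
    """Mean Euclidean distance between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return float(np.linalg.norm(diff, axis=-1).mean())

def relative_distance(generated, original, others, eps=1e-8):
    """ASSUMED relative distance: intra / (intra + inter).

    generated: feature vectors of patterns produced by the generator
    original:  feature vectors of the original (input image) pattern
    others:    feature vectors of patterns produced by other devices
    Values below 0.5 mean the intra-distance is smaller than the inter-distance.
    """
    intra = mean_pairwise_distance(generated, original)
    inter = mean_pairwise_distance(generated, others)
    return intra / (intra + inter + eps)
```

Minimizing such a quantity during training would push the intra-distance down relative to the inter-distance, which is the behavior the claim describes.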

The depth image processing device of claim 4, further comprising, in order to minimize the relative distance:
a pre-trained generator that models the demosaicing method used in existing devices; and
an embedder that extracts measurable feature vectors from the patterns generated by the generator and the pre-trained generator.

The depth image processing device of claim 5,
wherein a loss function is generated from the feature vectors extracted by the embedder and learning is performed using the loss function.

The depth image processing device of claim 6,
wherein the loss function comprises one or more of:
a first loss function that distinguishes between m different pre-trained generators;
a second loss function that classifies the patch of the input image and the output patch of the generator;
a third loss function that measures distances between features of outputs of the generator; and
a fourth loss function that measures the distance between features of the generator and features of the pre-trained generator.
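
The four loss terms listed above can be combined in many ways; the claim only requires one or more of them. The sketch below assumes softmax cross-entropy for the two classification terms, mean Euclidean distance for the two feature-distance terms, and a weighted sum for the total. These choices, the term names, and the random stand-in features are hypothetical and are not taken from the patent.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy; logits is (n, k), labels is (n,) class ids."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def mean_distance(a, b):
    """Mean Euclidean distance between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return float(np.linalg.norm(diff, axis=-1).mean())

def combined_loss(terms, weights=None):
    """Weighted sum of whichever loss terms are supplied (one or more)."""
    if not terms:
        raise ValueError("at least one loss term is required")
    weights = weights or {}
    return sum(weights.get(name, 1.0) * value for name, value in terms.items())

rng = np.random.default_rng(0)
gen_feats = rng.normal(size=(4, 8))   # stand-in embedder features, generator outputs
pre_feats = rng.normal(size=(4, 8))   # stand-in embedder features, pre-trained generator
terms = {
    # first term: which of m = 3 pre-trained generators produced a pattern
    "which_pretrained": cross_entropy(rng.normal(size=(4, 3)), np.array([0, 1, 2, 0])),
    # second term: input-image patch (0) versus generator output patch (1)
    "input_vs_generated": cross_entropy(rng.normal(size=(8, 2)),
                                        np.array([0, 0, 0, 0, 1, 1, 1, 1])),
    # third term: distances between features of generator outputs
    "generator_spread": mean_distance(gen_feats, gen_feats),
    # fourth term: distance between generator and pre-trained generator features
    "generator_vs_pretrained": mean_distance(gen_feats, pre_feats),
}
loss = combined_loss(terms)
```

The weights, and whether a given distance term is minimized or maximized, are design decisions the claim leaves open.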

In an image processing method capable of editing or modifying an image acquired from an arbitrary camera while maintaining its pattern and characteristic information, a depth image processing method comprising:
inputting a given demosaiced image into an image processing device;
re-mosaicing the input image to return it to the state acquired at the sensor end;
filling in the empty information of the input image using an interpolation method; and
in order for a generator to learn to produce the same demosaicing pattern as the input image, dividing the input image into a plurality of patches and, to generate a demosaicing pattern for a first patch of the input image, comparing second patches at different positions of the input image to induce generation of the same demosaicing pattern.
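
The patch division in the final step can be sketched as follows. The claim does not state a patch size or whether patches overlap, so the non-overlapping square tiling and the names `split_into_patches` and `reference_patches` are assumptions for illustration.

```python
import numpy as np

def split_into_patches(image, size):
    """Return non-overlapping size x size patches and their top-left positions."""
    h, w = image.shape[:2]
    patches, positions = [], []
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            patches.append(image[y:y + size, x:x + size])
            positions.append((y, x))
    return patches, positions

def reference_patches(patches, first_index):
    """Second patches: every patch located somewhere other than the first patch."""
    return [p for i, p in enumerate(patches) if i != first_index]
```

For a chosen first patch, the patches returned by `reference_patches` would play the role of the second patches against which the generated pattern is compared.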

(Deleted)

The depth image processing method of claim 8,
wherein inducing generation of the same demosaicing pattern comprises inducing learning such that, in order to minimize the relative distance between the original pattern and the pattern generated by the generator, the internal distance (intra-distance) among the generated patterns becomes relatively smaller than the distance (inter-distance) to patterns generated by other devices.

The depth image processing method of claim 10, further comprising, in order to minimize the relative distance:
securing a pre-trained generator that models the demosaicing method used in existing devices; and
an embedding step of extracting measurable feature vectors from the patterns generated by the generator and the pre-trained generator.

The depth image processing method of claim 11,
wherein a loss function is generated from the feature vectors extracted by the embedder and learning is performed using the loss function.

The depth image processing method of claim 12,
wherein the loss function comprises one or more of:
a first loss function that distinguishes between m different pre-trained generators;
a second loss function that classifies the patch of the input image and the output patch of the generator;
a third loss function that measures distances between features of outputs of the generator; and
a fourth loss function that measures the distance between features of the generator and features of the pre-trained generator.

A computer-readable recording medium on which is recorded a program for executing a depth image processing method, the method being an image processing method capable of editing or modifying an image acquired from an arbitrary camera while maintaining its pattern and characteristic information and comprising:
inputting a given demosaiced image into an image processing device;
re-mosaicing the input image to return it to the state acquired at the sensor end;
filling in the empty information of the input image using an interpolation method; and
in order for a generator to learn to produce the same demosaicing pattern as the input image, dividing the input image into a plurality of patches and, to generate a demosaicing pattern for a first patch of the input image, comparing second patches at different positions of the input image to induce generation of the same demosaicing pattern.

(Deleted)