KR102192016B1

KR102192016B1 - Method and Apparatus for Image Adjustment Based on Semantics-Aware

Info

Publication number: KR102192016B1
Application number: KR1020190003662A
Authority: KR
Inventors: 김선주; 남성현
Original assignee: 연세대학교 산학협력단
Priority date: 2019-01-11
Filing date: 2019-01-11
Publication date: 2020-12-16
Also published as: KR20200092492A

Abstract

A semantics-aware image adjustment method and an apparatus therefor are disclosed.
An image adjustment apparatus according to an embodiment of the present invention may include: a color extraction unit that extracts a color feature map of an input image; a feature map extraction unit that extracts at least one convolutional feature map of the input image using a convolutional layer; a semantic adjustment map generation unit that generates a semantic adjustment map (SAM) based on the at least one convolutional feature map; and a color transform processing unit that generates color mapping information for color transformation based on the semantic adjustment map and the color feature map, and thereby predicts an output color for image adjustment.

Description

Method and Apparatus for Image Adjustment Based on Semantics-Aware

The present invention relates to a method of adjusting an image by recognizing the semantics of objects, and to an apparatus therefor.

The content described in this section merely provides background information on embodiments of the present invention and does not constitute prior art.

With digital cameras becoming ever more widespread, amateur photographers can easily take images anywhere.

Anyone can capture an image with a digital camera, but the captured image may not be visually satisfying. Many people therefore want to adjust the tone and color of captured images to obtain more visually impressive and stylized results.

However, image adjustment is a difficult task for amateur users without expertise in image editing. In addition, adjusting a large number of images requires considerable manpower.

For this reason, many techniques for automatic image adjustment are being studied. Automatic image adjustment enhances the tone and color of an image automatically, producing more visually impressive and stylized results without human intervention.

In general, automatic image adjustment techniques imitate an expert's adjustment style in order to deliver professional quality. Various methods are in use, such as adjusting the contrast/brightness and color/saturation of an image based on its low-level color histogram, brightness, and contrast.

However, these methods apply the same color mapping to every pixel and therefore adjust the colors of the image globally. As a result, colors may be over-adjusted, and the adjustment is applied uniformly regardless of the semantics of each object.

The main object of the present invention is to provide a semantics-aware image adjustment method, and an apparatus therefor, that generate a semantic adjustment map based on a convolutional feature map and a spatial feature map of an input image, generate color mapping information based on a color feature map of the input image and the semantic adjustment map, and thereby predict an output color for image adjustment.

According to one aspect of the present invention, an image adjustment apparatus for achieving the above object may include: a color extraction unit that extracts a color feature map of an input image; a feature map extraction unit that extracts at least one convolutional feature map of the input image using a convolutional layer; a semantic adjustment map generation unit that generates a semantic adjustment map (SAM) based on the at least one convolutional feature map; and a color transform processing unit that generates color mapping information for color transformation based on the semantic adjustment map and the color feature map, and thereby predicts an output color for image adjustment.

According to another aspect of the present invention, a semantics-aware image adjustment method for achieving the above object may include: an image acquisition step of acquiring an input image; a color extraction step of extracting a color feature map of the input image; a feature map extraction step of extracting at least one convolutional feature map of the input image using a convolutional layer; a semantic adjustment map generation step of generating a semantic adjustment map (SAM) based on the at least one convolutional feature map; and a color transform processing step of generating color mapping information for color transformation based on the semantic adjustment map and the color feature map, and thereby predicting an output color for image adjustment.

As described above, the present invention has the effect of automatically performing image color correction in the same manner as a professional photographer.

In addition, by applying a bilinear color transform network instead of an affine model, the present invention can perform image color correction that reflects nonlinear characteristics.

In addition, by generating a semantic adjustment map (SAM), the present invention can learn pixel-wise contextual features.

In addition, the present invention requires neither hand-designed features nor extensive pre-processing, and all features can be learned automatically in an end-to-end manner.

In addition, by using multi-scale convolutional neural network (CNN) features, the present invention can learn pixel-wise contextual features.

FIG. 1 is a view showing adjustment results of the prior art and of the present invention.
FIG. 2 is a block diagram schematically illustrating a semantics-aware image adjustment apparatus according to an embodiment of the present invention.
FIG. 3 is a diagram for explaining the operation of the semantics-aware image adjustment apparatus according to an embodiment of the present invention.
FIG. 4 is a flowchart for explaining a semantics-aware image adjustment method according to an embodiment of the present invention.
FIG. 5 is an exemplary view showing sample images of image adjustment according to an embodiment of the present invention.
FIG. 6 is an exemplary view showing sample images that compare the quality of image adjustment results according to an embodiment of the present invention.
FIGS. 7A and 7B show sample images to which the semantics-aware image adjustment scheme according to an embodiment of the present invention is applied.
FIGS. 8A and 8B are views showing a comparison of image adjustment results according to an embodiment of the present invention, and results with and without the recurrent neural network.
FIG. 9 is a view visualizing the bilinear transform operation of the image adjustment apparatus according to an embodiment of the present invention.
FIG. 10 is an exemplary view showing nonlinear color feature maps of the image adjustment apparatus according to an embodiment of the present invention.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. In describing the present invention, detailed descriptions of related known configurations or functions are omitted where they might obscure the gist of the invention. Although preferred embodiments are described below, the technical idea of the present invention is not limited to them and may be modified and practiced in various ways by those skilled in the art. The semantics-aware image adjustment method proposed by the present invention, and the apparatus therefor, are described in detail below with reference to the drawings.

FIG. 2 is a block diagram schematically illustrating a semantics-aware image adjustment apparatus according to an embodiment of the present invention, and FIG. 3 is a diagram for explaining the operation of that apparatus.

The image adjustment apparatus 200 according to the present embodiment includes a feature map extraction unit 210, a color extraction unit 220, a semantic adjustment map generation unit 230, a color transform processing unit 240, and an image color correction unit 260. The image adjustment apparatus 200 of FIG. 2 represents one embodiment; not all blocks shown in FIG. 2 are essential components, and in other embodiments some blocks of the image adjustment apparatus 200 may be added, changed, or removed.

In general, an expert's image adjustment can be reproduced only by understanding the semantics of the image, because the underlying color mapping varies spatially and depends on the objects in the scene. For this reason, conventional approaches relied heavily on a step of pre-processing hand-crafted features and, because of the excess of features, performed image adjustment using pixel-wise affine transforms that are spatially inconsistent.

The image adjustment apparatus 200 according to the present embodiment learns an expert's image adjustment method and automatically adjusts an input image by applying the learned adjustment style.

The image adjustment apparatus 200 accurately converts an image into the expert's adjustment style based on an end-to-end deep neural network.

The image adjustment apparatus 200 generates and learns a semantic adjustment map (SAM) for the expert's adjustment style. Here, the semantic adjustment map is a map generated by parsing the scene of an image to which the expert's adjustment style has been applied. The image adjustment apparatus 200 generates and learns a semantic adjustment map to which spatially consistent color mapping is applied.

To learn a single non-linear color transform within a semantic region, the image adjustment apparatus 200 uses a bilinear color transform instead of an affine model (affine transform). Here, a semantic region is a region of similar meaning obtained by dividing the objects in an image according to a predetermined criterion.

Because the image adjustment apparatus 200 according to the present embodiment performs image color correction automatically by applying an expert's adjustment style, it can produce adjustment results that are effective both quantitatively and qualitatively. The image adjustment apparatus 200 can also be extended to fields that provide image formats for interaction between users.

Each component of the image adjustment apparatus 200 according to the present embodiment is described below.

The feature map extraction unit 210 extracts at least one feature map of the input image. The feature map extraction unit 210 includes a plurality of layers that extract specific images from the input image per frame or per time period, filter them according to the metadata of the extracted images, and extract feature maps. Here, a feature map may include at least one feature value extracted from a specific image.

The feature map extraction unit 210 includes a convolutional layer unit 212 containing at least one convolution filter, and a recurrent neural network layer unit 214 that extracts a spatial feature map by applying recurrent neural network (RNN) layers.

The convolutional layer unit 212 includes at least one convolution filter 310, 320, 322, 324 and extracts a convolutional feature map for each of these filters.

The convolutional layer unit 212 extracts a nonlinear feature map by applying an initial convolution filter 310 for extracting nonlinear features of the input image. The nonlinear feature map extracted by the initial convolution filter 310 is passed to the color extraction unit 220.

The convolutional layer unit 212 extracts a first convolutional feature map by applying a first convolution filter 320 for extracting edge information from the output of the initial convolution filter 310. Here, edge information means feature values for the boundaries of foreground objects, background objects, and the like, and may be extracted using one of various schemes such as HOG (Histogram of Oriented Gradients), Canny edge detection, LoG (Laplacian of Gaussian), or the Laplacian. The first convolutional feature map extracted by the first convolution filter 320 is passed to the semantic adjustment map generation unit 230.
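For intuition about what an edge-sensitive filter computes, the following minimal sketch applies a fixed 3×3 Laplacian kernel to a small grayscale grid. In the apparatus the first convolution filter 320 would be learned rather than fixed; the kernel, the `convolve3x3` helper, and the toy image are assumptions made only for illustration.

```python
# Illustrative only: a fixed 3x3 Laplacian kernel standing in for a learned
# edge-sensitive convolution filter. All names here are hypothetical.
LAPLACIAN = [
    [0,  1, 0],
    [1, -4, 1],
    [0,  1, 0],
]


def convolve3x3(image, kernel):
    """Valid-mode 3x3 filtering (cross-correlation) of a 2-D list of numbers."""
    height, width = len(image), len(image[0])
    out = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            acc = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    acc += kernel[dr + 1][dc + 1] * image[r + dr][c + dc]
            row.append(acc)
        out.append(row)
    return out


# A 5x5 image with a vertical step edge between columns 1 and 2.
step_image = [[0, 0, 10, 10, 10] for _ in range(5)]
edge_map = convolve3x3(step_image, LAPLACIAN)
```

In `edge_map`, the flat region yields 0, while the two columns on either side of the step yield responses of opposite sign, which is the kind of boundary signal an edge feature map carries.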

The convolutional layer unit 212 extracts a second convolutional feature map by applying a second convolution filter 322 for extracting, from the output of the first convolution filter 320, object segmentation information for each part of an object. For example, the second convolution filter 322 extracts a second convolutional feature map containing feature values for each part of a person object, such as the head, arms, legs, and torso. The second convolutional feature map extracted by the second convolution filter 322 is passed to the semantic adjustment map generation unit 230.

The convolutional layer unit 212 extracts a third convolutional feature map by applying a third convolution filter 324 for extracting, from the output of the second convolution filter 322, object shape information for each object. For example, the third convolution filter 324 extracts a third convolutional feature map containing feature values for the overall shape of each object, such as a person or a dog. The third convolutional feature map extracted by the third convolution filter 324 is passed to the semantic adjustment map generation unit 230.

The recurrent neural network layer unit 214 is connected after the convolutional layer unit 212 and extracts a spatial feature map of object positional relationship information. It does so by applying recurrent neural network (RNN) layers 330, 332, 334 to the output of the third convolution filter 324. Here, the recurrent neural network layer unit 214 may comprise four-direction spatial RNN layers 330, 332 and a spatial convolution filter 334. The spatial convolution filter 334 may be a 1 × 1 convolution, and further convolutional layers may be added. For example, when objects such as a person and a dog are present, the recurrent neural network layer unit 214 can extract object positional relationship information indicating that the person object is located in a first region of the image and the dog object in a second region. The spatial feature map extracted by the recurrent neural network layer unit 214 is passed to the semantic adjustment map generation unit 230.
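The four-direction sweep followed by a 1 × 1 convolution can be sketched as below. This is a deliberately simplified, single-channel illustration: the recurrence `h[i] = relu(x[i] + weight * h[i-1])`, the scalar `weight`, and all function names are assumptions and are not taken from the patent.

```python
def sweep_row(values, weight):
    """One-directional recurrent pass: h[i] = relu(values[i] + weight * h[i-1])."""
    hidden, prev = [], 0.0
    for v in values:
        prev = max(0.0, v + weight * prev)
        hidden.append(prev)
    return hidden


def transpose(grid):
    return [list(col) for col in zip(*grid)]


def four_direction_rnn(feature_map, weight=0.5):
    """Return, for every pixel, the hidden states of the four sweeps."""
    left_to_right = [sweep_row(row, weight) for row in feature_map]
    right_to_left = [sweep_row(row[::-1], weight)[::-1] for row in feature_map]
    cols = transpose(feature_map)
    top_to_bottom = transpose([sweep_row(col, weight) for col in cols])
    bottom_to_top = transpose([sweep_row(col[::-1], weight)[::-1] for col in cols])
    height, width = len(feature_map), len(feature_map[0])
    return [[(left_to_right[r][c], right_to_left[r][c],
              top_to_bottom[r][c], bottom_to_top[r][c])
             for c in range(width)] for r in range(height)]


def pointwise_mix(states, mix_weights):
    """1x1 convolution: a per-pixel weighted sum over the four directions."""
    return [[sum(w * s for w, s in zip(mix_weights, pixel)) for pixel in row]
            for row in states]


feature_map = [[1.0, 0.0, 0.0],
               [0.0, 0.0, 0.0],
               [0.0, 0.0, 2.0]]
states = four_direction_rnn(feature_map, weight=0.5)
spatial_features = pointwise_mix(states, [0.25, 0.25, 0.25, 0.25])
```

After one such layer a pixel only receives information from its own row and column (the top-right pixel sees both non-zero corners, the center sees neither). Stacking a second four-direction layer, as layers 330 and 332 suggest, would let information reach every pixel from every other pixel.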

The color extraction unit 220 extracts a color feature map of the input image.

The color extraction unit 220 generates a color feature block containing the color feature map. It generates an input color feature block 350 containing the input color map of the input image, and a nonlinear feature block 352 containing the nonlinear feature map extracted by the initial convolution filter of the convolutional layer unit 212. It then generates a color feature block 354 containing a color feature map in which the nonlinear feature map is combined with the input color map of the input image.

The color extraction unit 220 converts the input color feature block 350 and the nonlinear feature block 352 into the color feature block 354 containing the color feature map by applying a predetermined convolution filter.

The semantic adjustment map generation unit 230 generates a semantic adjustment map (SAM) based on at least one feature map.

The semantic adjustment map generation unit 230 generates at least one residual block 340, 341, 342, each containing the convolutional feature map of one of the convolution filters, and a spatial recurrent neural network block 343 containing the spatial feature map.

The semantic adjustment map generation unit 230 may generate the semantic adjustment map 346 by reducing (downsampling) the at least one residual block 340, 341, 342 and the spatial recurrent neural network block 343 to a specific size. Specifically, the semantic adjustment map generation unit 230 upsamples the at least one residual block and the spatial recurrent neural network block with interpolation, concatenates them into a combined block 340, 341, 342, 343, and reduces the combined block to a block 344 of a specific size to generate the semantic adjustment map 346.
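The upsample-then-concatenate step can be sketched as follows for single-channel maps. The text only says the blocks are upsampled with interpolation; bilinear interpolation with corner-aligned sampling, and the helper names `bilinear_resize` and `stack_channels`, are assumptions chosen for illustration.

```python
def bilinear_resize(grid, out_h, out_w):
    """Resize a 2-D list to (out_h, out_w) with bilinear interpolation.

    Uses corner-aligned sampling, which is an assumed convention.
    """
    in_h, in_w = len(grid), len(grid[0])

    def source_coord(i, out_n, in_n):
        return 0.0 if out_n == 1 else i * (in_n - 1) / (out_n - 1)

    out = []
    for r in range(out_h):
        y = source_coord(r, out_h, in_h)
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        fy = y - y0
        row = []
        for c in range(out_w):
            x = source_coord(c, out_w, in_w)
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            fx = x - x0
            top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
            bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def stack_channels(maps):
    """Concatenate same-sized single-channel maps into per-pixel vectors."""
    height, width = len(maps[0]), len(maps[0][0])
    return [[[m[r][c] for m in maps] for c in range(width)]
            for r in range(height)]


fine = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
        [7.0, 8.0, 9.0]]   # stands in for a shallow, high-resolution map
coarse = [[0.0, 2.0],
          [4.0, 6.0]]      # stands in for a deep, low-resolution map

coarse_up = bilinear_resize(coarse, 3, 3)
combined = stack_channels([fine, coarse_up])
```

Each entry of `combined` is then a per-pixel vector holding one value from every scale, which is what the subsequent size reduction operates on.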

The color transform processing unit 240 generates color mapping information for color transformation based on the semantic adjustment map and the color feature map, and thereby predicts an output color for image adjustment.

The color transform processing unit 240 generates color matching information by applying bilinear pooling to the semantic adjustment map and the color feature map. Specifically, the color transform processing unit 240 applies bilinear pooling to the semantic adjustment map and the color feature map to generate per-object color transform information, and generates color matching information for the intrinsic color of each object based on that per-object color transform information.
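A bilinear combination of the two inputs can be sketched per pixel as below. The form `y[k] = sum_ij x[i] * W[k][i][j] * s[j]`, the toy weight tensor, and the "sky" and "skin" SAM vectors are assumptions for illustration; this section does not give the exact parameterization.

```python
def bilinear_transform(color_feature, sam_vector, weights):
    """Bilinear form per output channel k: y[k] = sum_ij x[i] * W[k][i][j] * s[j]."""
    return [
        sum(x_i * w_kij * s_j
            for x_i, w_row in zip(color_feature, w_k)
            for w_kij, s_j in zip(w_row, sam_vector))
        for w_k in weights
    ]


# Toy sizes: 2-D color feature, 2-D SAM vector, 3 output channels (R, G, B).
weights = [
    [[1.0, 0.0], [0.0, 1.0]],   # R
    [[0.0, 1.0], [1.0, 0.0]],   # G
    [[1.0, 1.0], [0.0, 0.0]],   # B
]
color_feature = [0.5, 0.25]   # the same input color feature at both pixels
sam_sky = [1.0, 0.0]          # hypothetical SAM vector in a "sky" region
sam_skin = [0.0, 1.0]         # hypothetical SAM vector in a "skin" region

out_sky = bilinear_transform(color_feature, sam_sky, weights)
out_skin = bilinear_transform(color_feature, sam_skin, weights)
```

With the SAM vector held fixed, the mapping is linear in the color feature; changing the SAM vector changes that mapping, so the same input color is sent to different output colors in different semantic regions.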

The image color correction unit 260 generates an output image by performing color correction on the input image based on the color mapping information. The image color correction unit 260 may perform color correction by applying different color mappings according to the semantics of the input image.

The operation of the image adjustment apparatus 200 is described in detail below.

The image adjustment apparatus 200 learns a regression model of the color mapping from an input color (x) to an output color (y) for color correction, according to the semantic context of each input pixel in the input image. The image adjustment apparatus 200 may apply a deep neural network for semantics-aware color mapping.

The image adjustment apparatus 200 according to the present embodiment may predict the output color of the color mapping by applying a deep neural network built on a convolutional neural network (CNN). The image adjustment apparatus 200 preferably uses ResNet as the CNN, but is not limited thereto.

The image adjustment apparatus 200 may perform pre-training, and may train using features at various levels that represent the colors, objects, semantics, and so on of the input image.

The image adjustment apparatus 200 according to the present embodiment necessarily analyzes the overall composition of the input image. That is, it analyzes overall composition information about the relationship and arrangement between the part of the input image being color-corrected and the rest of the image.

However, it is difficult for CNN-based convolutional feature maps to encode the overall composition of the input image at the pixel level. The image adjustment apparatus 200 may therefore additionally apply a spatial recurrent neural network (spatial RNN) after the CNN.

The spatial RNN consists of four-direction spatial RNN layers covering up, down, left, and right, followed by a convolution filter. The convolution filter may be a 1 × 1 convolution, and further convolutional layers may be added.

By applying the spatial RNN, the image adjustment apparatus 200 can extract feature maps with few resources and without losing spatial resolution. If additional CNN layers were applied instead of the spatial RNN, memory space and learnable weights for those convolutional layers would be required.

The image adjustment apparatus 200 extracts pixel-wise feature values using at least one of various techniques. For example, it may extract pixel-wise features through a sparse hypercolumn training method. The sparse hypercolumn training method extracts and processes hypercolumn features for characteristic local regions; at training time, the network randomly samples sparse pixels from the image for backpropagation, which can generate many training signals. The sparse hypercolumn training method requires far fewer parameters than conventional deconvolutional approaches.
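The random sparse-pixel sampling described above can be sketched as follows. The sketch assumes the feature maps have already been brought to a common resolution; the function name, the `seed` parameter, and the toy maps are hypothetical.

```python
import random


def sample_sparse_hypercolumns(feature_maps, num_samples, seed=0):
    """Pick random pixel locations and gather the per-pixel feature vector
    (one value per map) at those locations only."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    rng = random.Random(seed)
    all_pixels = [(r, c) for r in range(height) for c in range(width)]
    chosen = rng.sample(all_pixels, num_samples)  # sampling without replacement
    return [((r, c), [fm[r][c] for fm in feature_maps]) for r, c in chosen]


# Two toy single-channel maps of the same size (assumed already upsampled).
map_a = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0]]
map_b = [[0.1, 0.2, 0.3],
         [0.4, 0.5, 0.6]]

samples = sample_sparse_hypercolumns([map_a, map_b], num_samples=3, seed=0)
```

Only the sampled locations would contribute to the loss in a training step, so the full-resolution stack of features never has to be materialized for every pixel.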

Given predetermined data for an input image, the image adjustment apparatus 200 can apply the sparse hypercolumn training method to extract a variety of feature values from low level to high level. The image adjustment apparatus 200 generates residual blocks for storing the feature maps that contain the feature values extracted by the convolutional layers. The residual blocks may, for example, use 256, 512, and 1024 channels, respectively.

The image adjustment apparatus 200 additionally generates a spatial recurrent neural network block containing the spatial feature map extracted by the spatial RNN layers. The spatial recurrent neural network block may use 1024 channels.

The image adjustment apparatus 200 normalizes the convolutional feature maps and the spatial feature map. Here, the feature maps may be normalized through L2 normalization, described as a type of regularization that penalizes weights in proportion to the sum of the squared weights.
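One common way to apply L2 normalization to feature maps is to scale each per-pixel feature vector to unit Euclidean length, so that blocks with very different magnitudes become comparable before they are concatenated. The sketch below shows that reading; it is an assumption, since the wording above is closer to a weight penalty than to feature scaling, and the `eps` guard is likewise an added detail.

```python
import math


def l2_normalize(vector, eps=1e-12):
    """Scale a feature vector to unit Euclidean (L2) length.

    eps is an assumed safeguard against division by zero for all-zero vectors.
    """
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]


unit = l2_normalize([3.0, 4.0])
zero = l2_normalize([0.0, 0.0])
```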

The image correction apparatus 200 may concatenate the residual blocks and the spatial recurrent neural network block of the normalized feature maps and then downsample the result to a specific size to generate the semantic correction map. For example, the image correction apparatus 200 may use a 1 × 1 convolution filter to reduce the result to 512 channels.
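As a rough sketch of the concatenation and channel reduction described above: a 1 × 1 convolution is a per-pixel linear map over channels, so it can be written as a single `einsum`. The helper `conv1x1`, the 8 × 8 resolution, and the random weights are assumptions for illustration; the blocks are assumed to be already normalized and resized to a common resolution.

```python
import numpy as np

def conv1x1(block, weight, bias):
    """1x1 convolution: the same linear map over channels at every pixel.

    block: (C_in, H, W), weight: (C_out, C_in), bias: (C_out,)
    """
    return np.einsum("oc,chw->ohw", weight, block) + bias[:, None, None]

rng = np.random.default_rng(0)
h, w = 8, 8
# Three residual blocks (256, 512, 1024 channels) and one spatial RNN
# block (1024 channels), filled with random values as stand-ins.
blocks = [rng.standard_normal((c, h, w)) for c in (256, 512, 1024, 1024)]
stacked = np.concatenate(blocks, axis=0)       # concatenate along channels

weight = rng.standard_normal((512, stacked.shape[0])) * 0.01
bias = np.zeros(512)
reduced = conv1x1(stacked, weight, bias)       # reduce to 512 channels
print(stacked.shape, reduced.shape)  # (2816, 8, 8) (512, 8, 8)
```

The spatial size is unchanged here; only the channel dimension shrinks from 2816 to 512.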

The image correction apparatus 200 according to the present embodiment generates a color feature map and a semantic correction map (SAM) for an input image based on a bilinear color transform network, and predicts the output color using the color feature map and the semantic correction map. The image correction apparatus 200 based on the bilinear color transform network is described in detail below.

The convolution layer unit 212 includes at least one convolution filter and extracts a convolutional feature map for each of the at least one convolution filter.

The convolution layer unit 212 extracts a nonlinear feature map by applying an initial convolution filter for extracting nonlinear features of the input image. The nonlinear feature map extracted through the initial convolution filter is passed to the color extraction unit 220.

The convolution layer unit 212 extracts a first convolutional feature map by applying, to the output of the initial convolution filter, a first convolution filter for extracting edge information. The first convolutional feature map is passed to the semantic correction map generator 230.

The convolution layer unit 212 extracts a second convolutional feature map by applying, to the output of the first convolution filter, a second convolution filter for extracting object segmentation information for each part of an object. The second convolutional feature map is passed to the semantic correction map generator 230.

The convolution layer unit 212 extracts a third convolutional feature map by applying, to the output of the second convolution filter, a third convolution filter for extracting object shape information for each object. The third convolutional feature map is passed to the semantic correction map generator 230.

The semantic correction map generator 230 generates a semantic correction map (SAM) based on at least one feature map. Here, the semantic correction map (SAM) is a K-channel 2D semantic segmentation map.

So that the adjustment is composed of K different color operations depending on image context or region, as in an expert's retouching style, the semantic correction map generator 230 applies a feature cross to each pixel of the semantic correction map (SAM). The feature cross is preferably a one-hot encoding, but is not necessarily limited thereto.

The semantic correction map generator 230 generates, for each pixel, a one-hot vector f^SAM according to a categorical random variable. The one-hot vector f^SAM is defined as in [Equation 1].

Here, m is a one-hot vector sampled from the categorical probability distribution p(m_k = 1|x), where p(m_k = 1|x) is the probability of correcting pixel x using the k-th color mapping. Cat() denotes a function that combines the probabilities of correcting pixel x and outputs a one-hot vector.
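The per-pixel sampling of a one-hot vector from class probabilities can be illustrated with the sketch below. It assumes the probabilities p(m_k = 1|x) are given as a (K, H, W) array that sums to 1 over the first axis; the function name `sample_one_hot_sam` and the softmax used to build example probabilities are hypothetical and are not taken from the patent.

```python
import numpy as np

def sample_one_hot_sam(probs, rng):
    """Draw one one-hot vector per pixel from per-pixel class probabilities.

    probs: (K, H, W), non-negative and summing to 1 over axis 0.
    Returns an array of the same shape with exactly one 1 per pixel.
    """
    k, h, w = probs.shape
    flat = probs.reshape(k, -1).T              # (H*W, K)
    cdf = np.cumsum(flat, axis=1)
    u = rng.random((flat.shape[0], 1))         # uniform samples in [0, 1)
    # Inverse-CDF sampling: count how many cumulative masses u has passed.
    idx = np.minimum((u >= cdf).sum(axis=1), k - 1)
    one_hot = np.eye(k)[idx]                   # (H*W, K)
    return one_hot.T.reshape(k, h, w)

rng = np.random.default_rng(0)
logits = rng.standard_normal((3, 4, 4))        # K = 3 color mappings
probs = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
f_sam = sample_one_hot_sam(probs, rng)
print(f_sam.sum(axis=0))                       # every pixel sums to 1
```

Each pixel ends up assigned to exactly one of the K color mappings, which is what makes the resulting map discrete.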

The semantic correction map generator 230 may process each pixel through unsupervised training so that the color mapping is spatially uniform within a semantic region.

When the semantic correction map generator 230 performs color mapping to generate the semantic correction map, discrete color mapping can cause abrupt color changes around region boundaries. The semantic correction map generator 230 may therefore apply guided filtering to f^SAM to smooth the boundaries.

The semantic correction map generator 230 may apply guided filtering to f^SAM by applying a distributed top-down technique according to the regression loss log p(y|x). Here, the distributed top-down technique (L) may be defined by [Equation 2].

Here, L denotes the regression loss and E denotes the loss function. In general, if the pixels are assumed to be independent of one another, the K (number of channels) used in the semantic correction map generator 230 is very small, which makes it difficult to compute an exact expectation for a particular map.

The semantic correction map generator 230 can resolve the problem caused by the small K by classifying all correction styles that occur in practice into one or two classes.

For example, image optimization is dominated by a few large classes such as sky and ground. Conventionally, a class rebalancing trick has been used for class-balanced classification, whereas the semantic correction map generator 230 of the present invention multiplies each of the K loss terms by a different weight to mitigate the problem of insufficient K. Although the weights are described as being processed by the semantic correction map generator 230, this is not necessarily limiting, and they may instead be processed by the feature map extraction unit 210.

The semantic correction map generator 230 multiplies the loss term of a low-frequency class by a small weight so that small classes can be found easily despite their relatively small training signal. The weight may be defined as in [Equation 3].

Here, w_t denotes the weight, and α denotes a variable that controls the contribution of the weight to the loss of a. a_t denotes the moving average of the normalized soft frequencies of the K classes. Here, the K classes are computed at the t-th training batch, as defined in [Equation 4].

Here, the per-batch term in [Equation 4] denotes the average of p_t(m_k = 1|x) over all pixels in the t-th batch.

The final regression loss of the image correction apparatus 200, computed by the semantic correction map generator 230, is calculated as in [Equation 5].

Here, L is the final regression loss, E is the loss function, p(m_k = 1|x) is the probability of correcting pixel x using the k-th color mapping, and w_t is the weight.

The semantic correction map generator 230 may generate the semantic correction map based on the final regression loss.

The color conversion processing unit 240 analyzes the bilinear color transform based on the semantic correction map to generate color mapping information for color correction.

Using the semantic correction map generated by the semantic correction map generator 230, the color conversion processing unit 240 finds a global color transform and a nonlinear color transform for each channel of the semantic correction map.

The color conversion processing unit 240 generates color mapping information by applying a bilinear transformation, based on bilinear pooling, to the input-color-based color feature map (f^color) and the semantic correction map (f^SAM). Here, the color mapping information refers to the model of the bilinear transformation. The model can be computed through the outer product of the two vectors together with a linear matrix that induces pairwise multiplication of their elements, and can be defined as in [Equation 6].

Here, a_j is the bilinear transformation model, f^color ∈ R^I is the color feature, f^SAM ∈ R^K is the semantic correction map (SAM), and W_j is a variable that determines the interaction between the two vectors.
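A bilinear model with these symbols is commonly written as a_j = (f^color)^T W_j f^SAM, with one I × K matrix W_j per output channel j. The sketch below assumes that form; the sizes I = 6, K = 4, three output channels, and the random values are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
I, K, C = 6, 4, 3                    # color-feature size, SAM channels, outputs
f_color = rng.standard_normal(I)     # color feature of one pixel
f_sam = np.eye(K)[2]                 # one-hot SAM vector selecting mapping k = 2
W = rng.standard_normal((C, I, K))   # one I x K interaction matrix per output j

# a_j = f_color^T W_j f_sam, evaluated for every output channel j at once.
a = np.einsum("i,jik,k->j", f_color, W, f_sam)

# With a one-hot f_sam the model collapses to a linear map of f_color that
# uses only column k of each W_j, i.e. one color mapping per SAM class.
assert np.allclose(a, W[:, :, 2] @ f_color)
print(a.shape)  # (3,)
```

This makes the role of the SAM explicit under the assumed form: the one-hot vector selects which of the K color mappings is applied to the pixel.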

To learn a non-linear color transform through the bilinear transformation model, the color conversion processing unit 240 may obtain from the color extraction unit 220 a color feature map that additionally reflects the nonlinear feature map. Here, a nonlinear feature is a feature of a result that is not derived linearly from the image according to a predetermined criterion, and may be a nonlinear feature derived using a general low-rank bilinear pooling method.

The color conversion processing unit 240 may predict the output color (ŷ) by deriving color mapping information according to the bilinear transformation. The output color (ŷ) can be expressed as in [Equation 7].

Here, P ∈ R^(d×c), U ∈ R^(I×d), and V ∈ R^(K×d) are the factors of the decomposition of W, and b ∈ R^d, c ∈ R^d, and d ∈ R^c are additional bias terms. ∘ denotes element-wise multiplication, and tanh is used as the nonlinear function σ.
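Given these shapes, the prediction is consistent with the usual low-rank bilinear pooling form ŷ = P^T(σ(U^T f^color + b) ∘ σ(V^T f^SAM + c)) + d. The sketch below assumes that form rather than quoting [Equation 7]. Because the source reuses d and c for both a dimension and a bias, the code renames the rank to `rank`, the output size to `n_out`, and the biases to `b_color`, `b_sam`, and `b_out`; these names are hypothetical.

```python
import numpy as np

def predict_color(f_color, f_sam, P, U, V, b_color, b_sam, b_out):
    """Low-rank bilinear color transform for a single pixel (assumed form)."""
    color_branch = np.tanh(U.T @ f_color + b_color)   # shape (rank,)
    sam_branch = np.tanh(V.T @ f_sam + b_sam)         # shape (rank,)
    # Element-wise product fuses the two branches; P projects to the output.
    return P.T @ (color_branch * sam_branch) + b_out  # shape (n_out,)

rng = np.random.default_rng(0)
I, K, rank, n_out = 6, 4, 16, 3
P = rng.standard_normal((rank, n_out))
U = rng.standard_normal((I, rank))
V = rng.standard_normal((K, rank))
b_color = rng.standard_normal(rank)
b_sam = rng.standard_normal(rank)
b_out = rng.standard_normal(n_out)

f_color = rng.standard_normal(I)
f_sam = np.eye(K)[1]                  # pixel assigned to color mapping k = 1
y_hat = predict_color(f_color, f_sam, P, U, V, b_color, b_sam, b_out)
print(y_hat.shape)  # (3,)
```

The factorization replaces the I × K × n_out interaction tensor with three much smaller matrices, which is the main reason for using the low-rank form.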

For further improvement, the color conversion processing unit 240 may add nonlinear features to the color feature map (f^color). The color conversion processing unit 240 obtains the nonlinear feature map produced by the initial convolution filter of the feature map extraction unit 210, and may use a color feature map containing nonlinear features, formed by combining the nonlinear feature map with the input color map of the input image.

The color conversion processing unit 240 converts the input color from the original color space into a nonlinear space; because the color mapping is easier to model after this nonlinear conversion, the color mapping information can be generated.

The image color correction unit 260 generates an output image by performing color correction on the input image based on the color mapping information. During correction, inaccurate segmentation may leave some outliers around object boundaries, so the image color correction unit 260 optimizes the loss accordingly.

The image color correction unit 260 may minimize the loss caused by outliers by using the Huber loss. This can be defined through [Equation 8].

Here, L_huber() denotes the Huber loss value, e denotes the error (outlier), and δ denotes the change point between the two loss functions. According to the preset criterion, the loss is quadratic for small errors |e| ≤ δ and linear for large errors |e| > δ. Because the slope of the linear part is always δ, the contribution of outliers to the optimization is reduced.
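The behavior described above matches the standard Huber loss, which is e²/2 for |e| ≤ δ and δ(|e| − δ/2) otherwise. The sketch below uses that standard definition as an assumption about [Equation 8]; the function name `huber_loss` and the sample errors are illustrative.

```python
import numpy as np

def huber_loss(error, delta=1.0):
    """Standard Huber loss: quadratic near zero, linear in the tails."""
    abs_e = np.abs(error)
    quadratic = 0.5 * error ** 2                # used where |e| <= delta
    linear = delta * (abs_e - 0.5 * delta)      # used where |e| >  delta
    return np.where(abs_e <= delta, quadratic, linear)

errors = np.array([0.5, 3.0, -3.0])
losses = huber_loss(errors, delta=1.0)
print(losses)  # values 0.125, 2.5, 2.5
```

For the outlier error of 3.0, a squared loss would give 4.5, whereas the Huber loss gives 2.5, so boundary outliers pull less on the optimization.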

FIG. 4 is a flowchart illustrating a semantic recognition-based image correction method according to an embodiment of the present invention.

The image correction apparatus 200 acquires an input image (S410) and extracts a color feature map for the input image (S420).

The image correction apparatus 200 extracts a convolutional feature map through the convolution layer (S430).

The image correction apparatus 200 extracts a recurrent neural network feature map through the recurrent neural network layer (S440).

The image correction apparatus 200 generates a semantic correction map by interpolating the convolutional feature map and the recurrent neural network feature map (S450).

The image correction apparatus 200 generates color mapping information using the color feature map and the semantic correction map (S460).

The image correction apparatus 200 predicts the output color for photo correction based on the color mapping information (S470).

The image correction apparatus 200 checks whether a new input image for color correction exists (S480). If a new input image exists, the image correction apparatus 200 applies the output color to perform color correction on the new input image (S490).

Although FIG. 4 describes the steps as being executed sequentially, the method is not necessarily limited thereto. In other words, the steps shown in FIG. 4 may be executed in a different order, or one or more steps may be executed in parallel, so FIG. 4 is not limited to a time-series order.

The image correction method according to the present embodiment shown in FIG. 4 may be implemented as an application (or program) and recorded on a recording medium readable by a terminal device (or computer). Such a recording medium, on which the application (or program) implementing the image correction method is recorded and which is readable by a terminal device (or computer), includes every kind of recording device or medium that stores data readable by a computing system.

FIG. 5 is an exemplary view showing sample images of image correction according to an embodiment of the present invention.

FIG. 5 shows an example result of semantic recognition-based image correction. FIG. 5(a) is the input image, and FIG. 5(b) shows the semantic correction map (SAM). FIG. 5(c) shows the output image corrected using the color mapping information obtained by analyzing the color transform for each region (510, 520) of the semantic correction map (SAM).

In the example of FIG. 5, color correction is performed using a semantic correction map (SAM) that saturates the foreground object region 520 and desaturates the background object region 510.

Just as an expert distinguishes foreground from background and adjusts the tone and color of an image according to the meaning of each object, the image correction apparatus 200 according to the present embodiment can perform color correction by saturating the foreground object region 520 and desaturating the background object region 510. For each semantically recognized region, the image correction apparatus 200 applies a uniform color transform to all pixels within that region.

FIG. 6 is an exemplary view showing sample images that compare the quality of image correction results according to an embodiment of the present invention.

FIG. 6 shows qualitative results of semantic recognition-based image correction. FIG. 6(a) shows the input image, (b) shows the result of a conventional correction method (Zhu et al.), (c) shows the correction result of the present invention, and (d) shows the ground-truth image.

The rows of FIG. 6 show results for three photo adjustment styles: foreground pop-out (rows 1 and 2), local Xpro (rows 3 and 4), and watercolor (rows 5 and 6). FIG. 6(c), the result of the image correction apparatus 200 according to the present invention, can be seen to estimate spatially varying pixel colors more accurately.

FIGS. 7A and 7B show sample images to which the semantic recognition-based image correction method according to an embodiment of the present invention is applied.

FIGS. 7A and 7B show semantic correction maps (SAM) extracted by the image correction method according to the present invention.

In FIG. 7, (a) shows the input image, (b) shows the ground-truth image, (c) shows the correction result of the present invention, and (d) shows the semantic correction map (SAM) of the present invention.

The rows of FIG. 7 show results for four photo adjustment styles: foreground pop-out (rows 1 and 2 of FIG. 7A), local Xpro (rows 3 and 4 of FIG. 7A), watercolor (rows 1 and 2 of FIG. 7B), and golden (rows 3 and 4 of FIG. 7B).

The image correction apparatus 200 according to the present invention can effectively discover distinct color transforms. A discrete SAM can cause abrupt color changes around boundaries, but this problem can be effectively mitigated by applying guided feathering to the SAM.

FIGS. 8A and 8B show a comparison of image correction results according to an embodiment of the present invention and results with and without the recurrent neural network.

FIG. 8A shows the results of a preference survey comparing the color correction result of the present invention, the result of Zhu et al., and the result of Gharbi et al.; it can be seen that most users preferred the images corrected by the present invention. Here, the result of Zhu et al. refers to the result of the method described in "Exemplar-based image and video stylization using fully convolutional semantic features" (F. Zhu, Z. Yan, J. Bu, Y. Yu), and the result of Gharbi et al. refers to the result of the method described in "Deep bilateral learning for real-time image enhancement" (M. Gharbi, J. Chen, J. T. Barron, S. W. Hasinoff, F. Durand).

FIG. 8B illustrates the effect of the spatial RNN layer. FIG. 8B(a) shows the input image, (b) shows the semantic correction map (SAM) without the spatial RNN layer, and (c) shows the semantic correction map (SAM) with the spatial RNN layer applied.

In FIG. 8B(c), the person in the input image is classified into a single cluster, so the same color mapping is applied to the whole person and a uniform color correction result is produced. In FIG. 8B(b), by contrast, the person is split into several clusters, so different color mappings are applied to one person and the color correction result is non-uniform.

FIG. 9 is a diagram visualizing the bilinear transformation operation of the image correction apparatus according to an embodiment of the present invention.

FIG. 9 visualizes the bilinear color transform. FIG. 9(a) shows the input image, the semantic correction map (SAM), and the output image.

FIG. 9(b), (c), and (d) show the color mappings of the semantic correction regions A, B, and C of (a), respectively. In each of FIG. 9(b), (c), and (d), the blue dots denote the actual color mapping values and the red dots denote the output colors predicted by the color correction method of the present invention.

The semantic correction map (SAM) and the bilinear model of the image correction apparatus 200 according to the present invention can accurately predict the color transform and thereby improve color correction accuracy.

FIG. 10 is an exemplary view showing nonlinear color feature maps of the image correction apparatus according to an embodiment of the present invention.

FIG. 10(a) shows a t-SNE embedding based on second-order color, and (b) shows a t-SNE embedding of the nonlinear color features of the present invention. The red, green, and blue dots in FIG. 10 are pixels belonging to clusters A, B, and C of FIG. 9(a), respectively.

Together with the semantic correction map (SAM), the image correction apparatus 200 according to the present invention learns nonlinear color features (f^color) from the original color space for the bilinear color transform model. By learning colors, which are expressed without semantic information, in a semantically aware manner, the image correction apparatus 200 enables effective bilinear color transformation.

The above description merely illustrates the technical idea of the embodiments of the present invention, and those of ordinary skill in the art to which the embodiments belong will be able to make various modifications and variations without departing from the essential characteristics of the embodiments. Accordingly, the embodiments are intended to explain, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the embodiments should be interpreted according to the claims below, and all technical ideas within an equivalent scope should be construed as falling within the scope of rights of the embodiments.

200: image correction apparatus
210: feature map extraction unit 220: color extraction unit
230: semantic correction map generator 240: color conversion processing unit
250: output color prediction unit 260: image color correction unit

Claims

A color extracting unit for extracting a color feature map for the input image;
A feature map extraction unit for extracting at least one convolutional feature map of the input image using a convolutional layer;
A semantic correction map generator for generating a semantic correction map (SAM) based on the at least one convolutional feature map; And
A color conversion processor for predicting an output color for image correction by generating color mapping information for color conversion based on the semantic correction map and the color feature map,
The feature map extraction unit includes a convolution layer unit including at least one convolution filter, and the convolution layer unit includes: a first convolution filter for generating a first convolutional feature map by extracting, from a result of an initial convolution filter applied to the input image, edge information including feature values for boundary lines; a second convolution filter for generating a second convolutional feature map by extracting, from the first convolutional feature map, object segmentation information including feature values for each part of an object; and a third convolution filter for generating a third convolutional feature map by extracting, from the second convolutional feature map, object shape information including feature values for the entire shape of each object,
A semantic recognition-based image correction apparatus, characterized in that the semantic correction map generator forms at least one residual block based on the first convolutional feature map, the second convolutional feature map, and the third convolutional feature map, and generates the semantic correction map based on the at least one residual block.

The apparatus of claim 1,
wherein the color extraction unit
generates a color feature block including the color feature map,
and the color feature block includes the color feature map obtained by combining an input color map for the input image and a nonlinear feature map.

delete

The apparatus of claim 1,
The feature map extraction unit,
An image correction apparatus based on semantic recognition, characterized in that it further comprises a recurrent neural network layer unit for extracting a spatial feature map for object positional relationship information by applying a recurrent neural network (RNN) layer.

The apparatus of claim 5,
The semantic correction map generation unit,
At least one residual block including each of the convolution feature maps for each of the at least one convolution filter; And
And a spatial recurrent neural network block including the spatial feature map,
A semantic recognition-based image correction apparatus, characterized in that the at least one residual block and the spatial recurrent neural network block are reduced-sampled to a specific size to generate the semantic correction map.

The apparatus of claim 6,
The semantic correction map generation unit,
A semantic recognition-based image correction apparatus, characterized in that the at least one residual block and the spatial recurrent neural network block are upsampled and interpolated and then combined, and the combined block is reduced-sampled to the specific size to generate the semantic correction map.

The apparatus of claim 1,
The color conversion processing unit,
A semantic recognition-based image correction apparatus, characterized in that the semantic correction map and the color feature map are subjected to bilinear pooling to generate the color mapping information.

The apparatus of claim 8,
wherein the color conversion processing unit
generates color conversion information for each object by applying the bilinear pooling to the semantic correction map and the color feature map, and generates, based on the color conversion information for each object, the color mapping information for the intrinsic color of each object.
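Bilinear pooling of two per-pixel vectors is commonly realized as their outer product. The sketch below shows why that yields a color mapping that differs per object: when the semantic vector is close to one-hot, the outer product routes the color features into a slot specific to that object. The linear output layer, its `weights`, and the one-hot semantic vectors are hypothetical choices for illustration and are not taken from the claims.

```python
def bilinear_pool(sam_vec, color_vec):
    """Outer product of the two per-pixel vectors, flattened row-major."""
    return [s * c for s in sam_vec for c in color_vec]


def predict_color(sam_vec, color_vec, weights):
    """Map the pooled feature to an output color with an assumed linear layer.

    weights: 3 rows (R, G, B), each of length len(sam_vec) * len(color_vec).
    """
    pooled = bilinear_pool(sam_vec, color_vec)
    return [sum(w * p for w, p in zip(row, pooled)) for row in weights]


# Hypothetical weights: object A keeps its color, object B is darkened by half.
weights = [
    [1.0, 0.0, 0.0, 0.5, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.5],
]
color = [0.5, 0.25, 1.0]
out_a = predict_color([1.0, 0.0], color, weights)  # pixel belonging to object A
out_b = predict_color([0.0, 1.0], color, weights)  # pixel belonging to object B
```

The same input color therefore maps to different output colors depending on the semantic vector at that pixel.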

A method of correcting the color of an image in an image correction device, the method comprising:
an image acquisition step of obtaining an input image;
a color extraction step of extracting a color feature map for the input image;
a feature map extraction step of extracting at least one convolutional feature map of the input image using a convolution layer;
a semantic correction map generating step of generating a semantic correction map (SAM) based on the at least one convolutional feature map; and
a color conversion processing step of predicting an output color for image correction by generating color mapping information for color conversion based on the semantic correction map and the color feature map,
wherein the feature map extraction step includes a convolution layer step using at least one convolution filter, and the convolution layer step comprises: a first convolution filter step of generating a first convolutional feature map by extracting edge information, comprising feature values for boundary lines, from the result of an initial convolution filter applied to the input image; a second convolution filter step of generating a second convolutional feature map by extracting object segmentation information, including feature values for each part of an object, from the first convolutional feature map; and a third convolution filter step of generating a third convolutional feature map by extracting object shape information, including feature values for the entire shape of each object, from the second convolutional feature map, and
wherein, in the semantic correction map generating step, at least one residual block is formed based on the first convolutional feature map, the second convolutional feature map, and the third convolutional feature map, and the semantic correction map is generated based on the at least one residual block.

delete

The method of claim 10,
wherein the feature map extraction step
further comprises a recurrent neural network layer step of extracting a spatial feature map for object positional relationship information by applying a recurrent neural network (RNN) layer.

A computer program stored in a medium to execute, on a computer, the semantic recognition-based image correction method according to any one of claims 10 and 13.

delete