KR102246110B1 - Display apparatus and image processing method thereof

Info

Publication number: KR102246110B1
Application number: KR1020190080320A
Authority: KR
Inventors: 임형준; 문영수; 안태경
Original assignee: 삼성전자주식회사
Priority date: 2019-04-02
Filing date: 2019-07-03
Publication date: 2021-04-29
Also published as: KR20200116836A; TW202044199A

Abstract

An image processing apparatus is disclosed. The image processing apparatus includes a memory storing at least one instruction and a processor electrically connected to the memory. By executing the instruction, the processor applies an input image to a learning network model and applies, to a pixel block included in the input image, a texture patch corresponding to that pixel block, thereby obtaining an output image. The learning network model stores a texture patch corresponding to each of a plurality of classes classified based on characteristics of an image, and learns the texture patch corresponding to each of the plurality of classes based on the input image.

Description

Image processing apparatus and image processing method thereof [Display apparatus and image processing method thereof]

The present invention relates to an image processing apparatus and a control method thereof, and more particularly, to an image processing apparatus that restores a texture component of an input image, and an image processing method thereof.

The present invention also relates to an artificial intelligence (AI) system that uses a learning network model to simulate functions of the human brain such as cognition and judgment, and to applications of such a system.

With the development of electronic technology, various types of electronic devices are being developed and distributed. In particular, image processing apparatuses used in various places such as homes, offices, and public places have been continuously developed in recent years.

Recently, high-resolution display panels such as 4K UHD TVs have been released and widely distributed. However, high-quality, high-resolution content is still scarce, so various technologies for generating high-resolution content from low-resolution content are required. In addition, image compression such as MPEG/H.264/HEVC may cause texture loss in content, and accordingly a technique for restoring the lost texture component is required.

In addition, artificial intelligence systems that implement human-level intelligence have recently been used in various fields. Unlike an existing rule-based smart system, an artificial intelligence system is a system in which a machine learns, makes judgments, and becomes smarter on its own. The more an artificial intelligence system is used, the more its recognition rate improves and the more accurately it understands user preferences, so existing rule-based smart systems are gradually being replaced by deep learning-based artificial intelligence systems.

Artificial intelligence technology consists of machine learning (for example, deep learning) and element technologies that use machine learning.

Machine learning is an algorithmic technology that classifies and learns the features of input data by itself. Element technologies use machine learning algorithms such as deep learning to simulate functions of the human brain such as cognition and judgment, and consist of technical fields such as linguistic understanding, visual understanding, reasoning/prediction, knowledge representation, and motion control.

The various fields to which artificial intelligence technology is applied are as follows. Linguistic understanding is a technology for recognizing and applying/processing human language and text, and includes natural language processing, machine translation, dialogue systems, question answering, and speech recognition/synthesis. Visual understanding is a technology for recognizing and processing objects in the way human vision does, and includes object recognition, object tracking, image search, person recognition, scene understanding, spatial understanding, and image enhancement. Reasoning/prediction is a technology for judging information and logically inferring and predicting from it, and includes knowledge/probability-based reasoning, optimization prediction, preference-based planning, and recommendation. Knowledge representation is a technology for automatically processing human experience information into knowledge data, and includes knowledge construction (data generation/classification) and knowledge management (data utilization). Motion control is a technology for controlling the autonomous driving of a vehicle and the movement of a robot, and includes movement control (navigation, collision, driving) and manipulation control (behavior control).

Meanwhile, a conventional image processing apparatus has the problem that, to restore a lost texture component, it applies either a fixed texture patch or a texture patch that is poorly suited to the image. Accordingly, there has been a demand for a technique that generates a texture suited to the image.

The present invention addresses the above-described need, and an object of the present invention is to provide an image processing apparatus, and an image processing method thereof, that improve the detail of an input image by using a texture patch learned based on the characteristics of the input image.

To achieve the above object, an image processing apparatus according to an embodiment of the present disclosure includes a memory storing at least one instruction and a processor electrically connected to the memory. By executing the instruction, the processor applies an input image to a learning network model to obtain a texture patch corresponding to a pixel block included in the input image, and applies the obtained texture patch to the pixel block to obtain an output image. The learning network model stores a texture patch corresponding to each of a plurality of classes classified based on characteristics of an image, and learns the texture patch corresponding to each of the plurality of classes based on the input image.

Here, the learning network model may identify one of the plurality of classes based on a characteristic of the pixel block, output a texture patch corresponding to the identified class, and identify whether to update the texture patch by comparing a first similarity between the pixel block and the identified class with a second similarity between the texture patch and the identified class.

In addition, based on the first and second similarities, the learning network model may replace the texture patch corresponding to the identified class with the pixel block, or add the pixel block as a texture patch corresponding to the identified class.

In addition, if the comparison shows that the first similarity is smaller than the second similarity, the learning network model may maintain the texture patch corresponding to the identified class, and if the comparison shows that the first similarity is larger than the second similarity, the learning network model may update the texture patch based on the pixel block.
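The keep-or-update decision can be illustrated with a minimal Python sketch. This passage does not define how similarity is measured or how a class is represented, so the `similarity` function (negative sum of squared differences against a per-class reference vector `class_ref`) and the choice to keep the stored patch on a tie are assumptions made only for illustration.

```python
from typing import List, Sequence


def similarity(values: Sequence[float], class_ref: Sequence[float]) -> float:
    """Hypothetical similarity: negative sum of squared differences.

    Higher means closer to the class reference. The measure and the
    class reference vector are assumptions, not taken from the text.
    """
    return -sum((v - c) ** 2 for v, c in zip(values, class_ref))


def maybe_update_patch(
    pixel_block: Sequence[float],
    stored_patch: Sequence[float],
    class_ref: Sequence[float],
) -> List[float]:
    """Return the texture patch to store for the identified class."""
    first = similarity(pixel_block, class_ref)    # pixel block vs. class
    second = similarity(stored_patch, class_ref)  # stored patch vs. class
    if first > second:
        # The pixel block represents the class better: update the patch.
        return list(pixel_block)
    # first < second: keep the stored patch (a tie also keeps it, by assumption).
    return list(stored_patch)
```

Here the update simply replaces the stored patch with the pixel block; adding the block as an additional patch for the class is the other option described above.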

In addition, when a plurality of texture patches correspond to the identified class, the learning network model may identify one of the plurality of texture patches based on the correlation between the pixel block and each of the plurality of texture patches.
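As a rough illustration of correlation-based selection, the sketch below picks the candidate patch with the highest Pearson correlation to the pixel block. The text only says that the choice is based on correlation, so using Pearson correlation and taking the maximum are assumptions, as are the function names.

```python
import math
from typing import Sequence


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom > 0 else 0.0


def select_patch(pixel_block: Sequence[float],
                 patches: Sequence[Sequence[float]]) -> int:
    """Return the index of the patch most correlated with the pixel block."""
    return max(range(len(patches)),
               key=lambda i: correlation(pixel_block, patches[i]))
```

Blocks and patches are flattened to one-dimensional sequences here purely to keep the example short.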

In addition, the learning network model may learn the texture patch based on at least one of the time at which the texture patch corresponding to each of the plurality of classes was stored or the frequency with which the texture patch has been applied.

In addition, if the learning network model identifies, based on the characteristic of the pixel block, that the pixel block does not correspond to any of the plurality of classes, the learning network model may generate a new class based on the characteristic of the pixel block and store the pixel block mapped to the new class.

In addition, the plurality of classes may be distinguished based on at least one of an average pixel value, pixel coordinates, variance, edge strength, edge direction, or color.
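Several of these classification criteria can be computed directly from a pixel block. The following sketch, which assumes a 3x3 grayscale block and uses a Sobel operator at the block center for the edge measurements, derives the average pixel value, variance, edge strength, and edge direction. The Sobel operator, the block size, and the function name are illustrative assumptions; pixel coordinates and color are omitted.

```python
import math
from typing import Dict, Sequence


def block_features(block: Sequence[Sequence[float]]) -> Dict[str, float]:
    """Compute simple class criteria for a 3x3 grayscale pixel block."""
    pixels = [p for row in block for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n

    # Sobel gradients evaluated at the block center (assumed edge detector).
    gx = (block[0][2] + 2 * block[1][2] + block[2][2]) - \
         (block[0][0] + 2 * block[1][0] + block[2][0])
    gy = (block[2][0] + 2 * block[2][1] + block[2][2]) - \
         (block[0][0] + 2 * block[0][1] + block[0][2])

    return {
        "mean": mean,
        "variance": variance,
        "edge_strength": math.hypot(gx, gy),
        "edge_direction_deg": math.degrees(math.atan2(gy, gx)),
    }
```

A class could then be keyed on quantized versions of these values, for example by binning the edge direction into a fixed number of angle ranges.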

In addition, the processor may obtain a weight for the texture patch based on the correlation between the obtained texture patch and the pixel block, and may obtain the output image by applying the weighted texture patch to the pixel block.
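A minimal sketch of correlation-weighted application follows. This passage does not say how the weight is derived from the correlation or how the patch is combined with the block, so clamping the Pearson correlation to [0, 1], the optional `gain` parameter, and additive blending are all assumptions.

```python
import math
from typing import List, Sequence


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom > 0 else 0.0


def apply_weighted_patch(pixel_block: Sequence[float],
                         patch: Sequence[float],
                         gain: float = 1.0) -> List[float]:
    """Add the texture patch to the pixel block, scaled by a correlation weight."""
    # Assumed weighting: negative correlation contributes nothing.
    weight = gain * max(0.0, correlation(pixel_block, patch))
    return [p + weight * t for p, t in zip(pixel_block, patch)]
```

With this weighting, a patch that matches the local structure is applied at full strength, while a patch that is uncorrelated or anti-correlated with the block leaves it unchanged.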

In addition, the output image may be a 4K Ultra High Definition (UHD) image or an 8K UHD image.

Meanwhile, an image processing method of an image processing apparatus according to an embodiment of the present disclosure includes applying an input image to a learning network model to obtain a texture patch corresponding to a pixel block included in the input image, and applying the obtained texture patch to the pixel block to obtain an output image. The learning network model stores a texture patch corresponding to each of a plurality of classes classified based on characteristics of an image, and learns the texture patch corresponding to each of the plurality of classes based on the input image.

Here, based on the first and second similarities, the learning network model may replace the texture patch corresponding to the identified class with the pixel block, or add the pixel block as a texture patch corresponding to the identified class.

In addition, the learning network model may identify a class corresponding to each of a plurality of pixel blocks included in the input image, and may change the size of the storage space of the memory corresponding to at least one of the plurality of classes based on how frequently each of the plurality of classes is identified.

Here, based on the identification frequency, the learning network model may delete from the memory the texture patch corresponding to a class identified fewer than a preset number of times, and may allocate the storage space freed by the deletion to the remaining classes.
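The frequency-based pruning and reallocation can be sketched as follows. Storage is modeled as a number of patch slots per class, and the freed slots are split evenly among the surviving classes with any remainder going to the most frequently identified ones. The slot model, the even split, and the names `rebalance`, `counts`, `slots`, and `min_count` are assumptions; the text only states that the freed space is allocated to the remaining classes.

```python
from typing import Dict


def rebalance(counts: Dict[str, int],
              slots: Dict[str, int],
              min_count: int) -> Dict[str, int]:
    """Drop rarely identified classes and hand their slots to the rest.

    counts: how many times each class was identified.
    slots: patch slots currently allocated to each class.
    min_count: classes identified fewer times than this are deleted.
    """
    kept = [c for c in slots if counts.get(c, 0) >= min_count]
    freed = sum(slots[c] for c in slots if c not in kept)
    if not kept:
        return {}

    # Assumed policy: even split, remainder to the most frequent classes.
    share, extra = divmod(freed, len(kept))
    result = {c: slots[c] + share for c in kept}
    by_frequency = sorted(kept, key=lambda c: counts.get(c, 0), reverse=True)
    for c in by_frequency[:extra]:
        result[c] += 1
    return result
```

For example, with three classes holding four slots each and one class identified only once against a threshold of two, the two remaining classes end up with six slots each.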

In addition, obtaining the output image may include obtaining a weight for the texture patch based on the correlation between the obtained texture patch and the pixel block, and obtaining the output image by applying the weighted texture patch to the pixel block.

In addition, according to an embodiment of the present disclosure, a non-transitory computer-readable medium stores computer instructions that, when executed by a processor of an image processing apparatus, cause the image processing apparatus to perform operations. The operations include applying an input image to a learning network model to obtain a texture patch corresponding to a pixel block included in the input image, and applying the obtained texture patch to the pixel block to obtain an output image. The learning network model stores a texture patch corresponding to each of a plurality of classes classified based on characteristics of an image, and learns the texture patch corresponding to each of the plurality of classes based on the input image.

As described above, according to various embodiments of the present disclosure, the detail of the entire image can be improved by generating texture with a texture patch learned based on the characteristics of the image.

FIG. 1 is a diagram for describing an implementation example of an image processing apparatus according to an embodiment of the present disclosure.
FIG. 2 is a block diagram illustrating a configuration of an image processing apparatus according to an embodiment of the present disclosure.
FIG. 3 is a diagram for describing a pixel block according to an embodiment of the present disclosure.
FIG. 4 is a diagram for describing a texture patch according to an embodiment of the present disclosure.
FIG. 5 is a diagram for describing a learning network model according to an embodiment of the present disclosure.
FIG. 6 is a diagram for describing a class and a texture patch according to an embodiment of the present disclosure.
FIG. 7 is a diagram for describing a model that learns an input image according to an embodiment of the present disclosure.
FIG. 8 is a diagram for describing a class according to another embodiment of the present disclosure.
FIG. 9 is a diagram for describing a learning result according to an embodiment of the present disclosure.
FIG. 10 is a diagram for describing a class according to another embodiment of the present disclosure.
FIG. 11 is a block diagram illustrating a detailed configuration of the electronic device shown in FIG. 2.
FIG. 12 is a block diagram illustrating a configuration of an image processing apparatus for training and using a learning network model according to an embodiment of the present disclosure.
FIG. 13 is a flowchart for describing an image processing method according to an embodiment of the present disclosure.

Hereinafter, the present disclosure will be described in detail with reference to the accompanying drawings.

The terms used in the embodiments of the present disclosure are, as far as possible, general terms that are currently in wide use, selected in consideration of their functions in the present disclosure. However, these terms may vary according to the intention of those skilled in the art, legal precedent, the emergence of new technologies, and the like. In certain cases, there are also terms arbitrarily selected by the applicant, and in such cases their meaning will be described in detail in the corresponding part of the disclosure. Therefore, the terms used in the present disclosure should be defined based on their meaning and the overall content of the present disclosure, not simply on the name of the term.

In this specification, expressions such as "have," "may have," "include," or "may include" indicate the presence of a corresponding feature (e.g., a numerical value, a function, an operation, or a component such as a part) and do not exclude the presence of additional features.

The expression "at least one of A and/or B" should be understood to mean "A," "B," or "A and B."

Expressions such as "first" and "second" used in this specification may modify various components regardless of order and/or importance, and are used only to distinguish one component from another without limiting the components.

When a component (e.g., a first component) is referred to as being "(operatively or communicatively) coupled with/to" or "connected to" another component (e.g., a second component), it should be understood that the component may be connected to the other component directly or through yet another component (e.g., a third component).

Singular expressions include plural expressions unless the context clearly indicates otherwise. In the present application, terms such as "include" or "consist of" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

In the present disclosure, a "module" or "unit" performs at least one function or operation, and may be implemented as hardware, as software, or as a combination of hardware and software. In addition, a plurality of "modules" or a plurality of "units" may be integrated into at least one module and implemented by at least one processor (not shown), except for a "module" or "unit" that needs to be implemented with specific hardware.

In this specification, the term "user" may refer to a person who uses an electronic device or to a device that uses an electronic device (e.g., an artificial intelligence electronic device).

Hereinafter, an embodiment of the present disclosure will be described in more detail with reference to the accompanying drawings.

FIG. 1 is a diagram for describing an implementation example of an image processing apparatus according to an embodiment of the present disclosure.

The image processing apparatus 100 may be implemented as a TV as shown in FIG. 1, but is not limited thereto. It may be any device having a display function, such as a smartphone, a tablet PC, a notebook PC, a head mounted display (HMD), a near eye display (NED), a large format display (LFD), digital signage, a digital information display (DID), a video wall, or a projector display.

The image processing apparatus 100 may receive images of various resolutions or various compressed images. For example, the image processing apparatus 100 may receive any one of Standard Definition (SD), High Definition (HD), Full HD, and Ultra HD images. The image processing apparatus 100 may also receive an image compressed with MPEG (for example, MP2, MP4, MP7, etc.), AVC, H.264, HEVC, or the like.

According to an embodiment, even if the image processing apparatus 100 is implemented as a UHD TV, images such as Standard Definition (SD), High Definition (HD), and Full HD images (hereinafter referred to as low-resolution images) are often input because UHD content itself is scarce. In this case, the input low-resolution image may be enlarged into a UHD image (hereinafter referred to as a high-resolution image) and provided. However, during enlargement the texture of the image is blurred, so the detail is degraded. Here, the texture of an image means the distinctive pattern or shape of a region of the image that is regarded as the same feature.

According to another embodiment, even when a high-resolution image is input, texture is lost because of image compression and the like, so the detail is degraded. This is because a digital image requires more data as the number of pixels increases, and when a large amount of data is compressed, loss of texture due to the compression is inevitable.

Accordingly, various embodiments that improve the detail of an image by restoring the texture component lost in the various cases described above are described below.

FIG. 2 is a block diagram illustrating a configuration of an image processing apparatus according to an embodiment of the present disclosure.

Referring to FIG. 2, the image processing apparatus 100 includes a memory 110 and a processor 120.

The memory 110 is electrically connected to the processor 120 and may store data necessary for various embodiments of the present disclosure. For example, the memory 110 may be implemented as an internal memory included in the processor 120, such as a ROM (for example, an electrically erasable programmable read-only memory (EEPROM)) or a RAM, or as a memory separate from the processor 120. In this case, depending on the purpose of data storage, the memory 110 may be implemented as a memory embedded in the image processing apparatus 100 or as a memory detachable from the image processing apparatus 100. For example, data for driving the image processing apparatus 100 may be stored in the embedded memory, and data for extended functions of the image processing apparatus 100 may be stored in the detachable memory. The embedded memory may be implemented as at least one of a volatile memory (e.g., dynamic RAM (DRAM), static RAM (SRAM), or synchronous dynamic RAM (SDRAM)), a non-volatile memory (e.g., one-time programmable ROM (OTPROM), programmable ROM (PROM), erasable and programmable ROM (EPROM), electrically erasable and programmable ROM (EEPROM), mask ROM, flash ROM, or flash memory such as NAND flash or NOR flash), a hard drive, or a solid state drive (SSD). The detachable memory may be implemented as a memory card (e.g., compact flash (CF), secure digital (SD), micro secure digital (Micro-SD), mini secure digital (Mini-SD), extreme digital (xD), or multi-media card (MMC)) or as an external memory connectable to a USB port (e.g., a USB memory).

According to an embodiment, the memory 110 may store a learning network model used to obtain a texture patch corresponding to a pixel block included in the input image 10. Here, the learning network model may be a model trained by machine learning on a plurality of images. For example, the learning network model may be a convolutional neural network (CNN) model trained on a plurality of sample images and the input image 10. Here, a CNN is a multilayer neural network with a special connection structure designed for speech processing, image processing, and the like. In particular, a CNN can filter an image in various ways by preprocessing its pixels and can recognize characteristics of the image. For example, it can recognize the characteristics of a pixel block of a preset size included in the input image 10. The learning network model is of course not limited to a CNN. For example, the image processing apparatus 100 may use a learning network model based on various neural networks, such as a recurrent neural network (RNN) or a deep neural network (DNN).

Meanwhile, a "texture patch" may mean a patch applied to a pixel block in order to improve the texture of that pixel block. Here, "patch" is a term chosen for convenience in consideration of its function, and various other terms may be used in the present embodiment. For example, since each patch has a structure in which a plurality of patch values are arranged in a pixel-wise matrix, it may also be referred to as a mask in consideration of this form. As described later, applying the texture patch to the corresponding pixel block improves the texture of the pixel block and increases its detail. The image processing apparatus 100 does not apply a predetermined texture patch to a pixel block regardless of the characteristics of the pixel block; instead, it may apply a texture patch updated by using the learning network model.

The processor 120 is electrically connected to the memory 110 and controls the overall operation of the image processing apparatus 100.

According to an embodiment, the processor 120 may be implemented as a digital signal processor (DSP) that processes digital image signals, a microprocessor, or a timing controller (T-CON). However, the processor 120 is not limited thereto, and may include, or be defined by the corresponding term for, one or more of a central processing unit (CPU), a micro controller unit (MCU), a micro processing unit (MPU), a controller, an application processor (AP), a communication processor (CP), or an ARM processor. In addition, the processor 120 may be implemented as a system on chip (SoC) or large scale integration (LSI) with a built-in processing algorithm, or in the form of a field programmable gate array (FPGA).

The processor 120 performs image processing on the input image to obtain an output image. Specifically, the processor 120 may obtain the output image by performing texture enhancement processing on the input image. Here, the output image may be an Ultra High Definition (UHD) image, in particular a 4K UHD image or an 8K UHD image, but is not limited thereto.

In particular, the processor 120 according to an embodiment of the present disclosure may obtain a texture patch to be used for the texture enhancement processing. Specifically, the processor 120 may apply the input image 10 to the learning network model to obtain a texture patch corresponding to a pixel block included in the input image 10. Here, a pixel block means a set of adjacent pixels including at least one pixel.

FIG. 3 is a diagram for describing a pixel block according to an embodiment of the present disclosure.

Referring to FIG. 3, for an image frame constituting the input image 10, the processor 120 may divide the plurality of pixels included in the image frame into units of pixel blocks 20 and input them to the learning network model. According to an embodiment, the processor 120 may sequentially input the plurality of pixel blocks 20 constituting the image frame to the learning network model. The learning network model may then output texture patches 30-1, ..., 30-n corresponding respectively to the input pixel blocks 20-1, ..., 20-n.

The processor 120 according to an embodiment of the present disclosure may identify the input image 10 as pixel blocks 20 of 5*5 size, but the pixel block size is not limited thereto and may be any N*N size, such as 3*3 or 4*4. For example, the processor 120 may divide the input image 10 into pixel blocks 20 of various sizes according to the purpose of the image processing, the resolution of the input image 10 (for example, FHD), the resolution of the output image (UHD, 8K), and the like. Hereinafter, for convenience of description, a pixel group of a preset size arranged in matrix form in an image frame constituting the input image 10 is assumed to be a pixel block 20 obtained from the input image 10.
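The division of an image frame into N*N pixel blocks can be illustrated with a minimal Python sketch. It is not taken from the disclosure: the function name `split_into_blocks`, the use of non-overlapping blocks in raster order, and the dropping of partial blocks at the right and bottom edges are all assumptions made for illustration.

```python
from typing import List

Frame = List[List[int]]  # rows of gray-level pixel values (assumed representation)


def split_into_blocks(frame: Frame, n: int = 5) -> List[Frame]:
    """Split a frame into non-overlapping n*n pixel blocks in raster order.

    Assumption: rows and columns that do not fill a whole block are dropped;
    the disclosure does not say how frame borders are handled.
    """
    height = len(frame)
    width = len(frame[0]) if frame else 0
    blocks = []
    for top in range(0, height - n + 1, n):
        for left in range(0, width - n + 1, n):
            # Copy n consecutive pixels from each of n consecutive rows.
            blocks.append([row[left:left + n] for row in frame[top:top + n]])
    return blocks


# Hypothetical 10*10 frame whose pixel value encodes its position.
frame = [[r * 10 + c for c in range(10)] for r in range(10)]
blocks = split_into_blocks(frame, n=5)
```

Each element of `blocks` could then be fed to the learning network model in sequence, matching the sequential input described with reference to FIG. 3.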

Returning to FIG. 2, the processor 120 according to an embodiment of the present disclosure may obtain a texture patch corresponding to the pixel block 20 by applying the input image 10 to the learning network model. This is described in detail with reference to FIG. 4.

FIG. 4 is a diagram for describing a texture patch according to an embodiment of the present disclosure.

FIG. 4 represents each of the pixels constituting the input image 10 as a pixel value. The processor 120 according to an embodiment of the present disclosure may obtain a texture patch 30 corresponding to the pixel block 20 by applying the input image 10 to the learning network model. Here, "applying" means inputting the input image 10 to the learning network model, and the output of the learning network model may be the texture patch 30.

Here, the learning network model may output the texture patch 30 corresponding to the pixel block 20 included in the input image 10, and may perform learning based on the pixel block 20.

The learning network model according to an embodiment may include a plurality of classes classified according to any one of various characteristics of an image, and may include a texture patch 30 corresponding to each of the classes. For example, the learning network model may store a plurality of classes classified by edge direction, one of the characteristics of an image, and may include a texture patch 30 matched to each of the classes. As another example, the learning network model may store a plurality of classes classified by the average gray-level value of each pixel block 20, and may include a texture patch 30 matched to each of the classes.

Meanwhile, the image processing apparatus 100 according to an embodiment of the present disclosure may of course include a plurality of learning network models. For example, the image processing apparatus 100 may include a first learning network model that divides classes by edge direction and learns the texture patches 30, a second learning network model that divides classes by average gray-level value and performs learning, a third learning network model that divides classes by color coordinates and performs learning, and so on. The image processing apparatus 100 according to an embodiment may identify one of the plurality of learning network models based on the characteristics of the input image 10 and apply the input image 10 to the identified learning network model to obtain the texture patch 30. For example, the image processing apparatus 100 may include a preprocessing learning network model that identifies, based on the characteristics of the input image 10, which of the plurality of learning network models is to be used to obtain a suitable texture patch 30. Based on the characteristics of the image, for example when the colors of the pixels constituting the input image 10 are distributed within a range of similar colors, the preprocessing learning network model may identify the first learning network model, which divides classes by edge direction and outputs the texture patch 30.

The learning network model according to an embodiment of the present disclosure may perform learning based on the input image 10. For example, the learning network model may identify a first similarity, between the pixel block 20 included in the input image 10 and the class corresponding to that pixel block, and a second similarity, between the texture patch 30 matched to that class and the class. It may then identify, based on the first and second similarities, whether to update the texture patch 30. For example, if the first similarity is greater than the second similarity, the learning network model may determine that the obtained texture patch 30 is not suitable for enhancing the texture of the input image 10 and may update the texture patch based on the pixel block 20 of the input image 10. When the learning network model subsequently outputs a texture patch for another pixel block 20' that belongs to the same class as the pixel block 20, it may output the texture patch 30' updated based on the pixel block 20 rather than the pre-update texture patch 30. The texture patch output by the learning network model may therefore be suited to enhancing the texture of the input image 10. As another example, if the second similarity is greater than the first similarity, the learning network model may determine that the obtained texture patch 30 is suitable for enhancing the texture of the input image 10 and may keep the texture patch 30.

Meanwhile, the operation of the learning network model that classifies (or identifies), among the plurality of classes, the class corresponding to the pixel block 20 may be referred to as a classifier, a class identifier, or the like. When a pixel block 20 included in the input image 10 is input, the classifier may identify the class among the plurality of classes that is suitable for the pixel block 20. For example, the classifier may identify the edge direction of the pixel block 20 and identify the similarity between the identified edge direction and the edge direction that defines each of the classes. The classifier may then identify the class with the greatest similarity as the class corresponding to the pixel block 20.
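The classifier's selection of the most similar class can be sketched in Python as follows. This is only an illustration under stated assumptions: the eight class angles spaced 22.5° apart are inferred from the values given elsewhere in the description (0° for Class #1, 67.5° for Class #4, 90° for Class #5), the similarity formula in `angular_similarity` is a hypothetical choice since the disclosure does not fix one, and the edge direction of the block is taken as already estimated.

```python
from typing import List, Tuple

# Assumed class definitions: Class #k is defined by an edge direction of (k - 1) * 22.5 degrees.
CLASS_ANGLES: List[float] = [k * 22.5 for k in range(8)]


def angular_similarity(a: float, b: float) -> float:
    """Similarity in [0, 1] between two edge directions given in degrees.

    Assumption: edge directions are treated as equivalent modulo 180 degrees,
    and similarity falls linearly from 1 (same direction) to 0 (perpendicular).
    """
    diff = abs(a - b) % 180.0
    diff = min(diff, 180.0 - diff)
    return 1.0 - diff / 90.0


def classify(edge_angle: float) -> Tuple[int, float]:
    """Return (1-based class number, similarity) of the best-matching class."""
    sims = [angular_similarity(edge_angle, c) for c in CLASS_ANGLES]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best + 1, sims[best]


class_number, similarity = classify(65.0)  # block with a 65-degree edge direction
```

With these assumed definitions, a block whose edge direction is 65° is assigned to Class #4, the class defined by 67.5°.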

The learning network model according to an embodiment of the present disclosure may mean a combination of a model that identifies the class corresponding to the pixel block 20 (for example, a classifier model) and a model that performs self-learning on the texture patch 30 by comparing the similarity of the pixel block 20 with that of the texture patch 30 corresponding to the pixel block 20. The learning network model according to an embodiment may be an on-device machine learning model, in which the image processing apparatus 100 performs learning by itself without depending on an external device. This is only one embodiment; the learning network model may also be implemented such that the classifier model operates on-device while the model that learns the texture patches operates on an external server.

Accordingly, the learning network model may store a texture patch 30 corresponding to each of a plurality of classes that have been classified and learned based on the characteristics of images. While outputting the texture patch corresponding to the input image 10, the learning network model may at the same time learn the texture patch 30 corresponding to each of the classes based on the pixel values included in the input image 10.

Referring to FIG. 4, the learning network model may identify, based on the characteristics of the pixel block 20, the one class among the plurality of classes that corresponds to the pixel block 20. For example, the learning network model may store a plurality of classes classified by edge direction (or edge pattern), one of the various characteristics of an image. Here, an edge may mean a point at which the pixel value (or pixel brightness) changes from a low value to a high value or from a high value to a low value. An edge may also mean a boundary between objects, produced by the various objects included in the image. The learning network model according to an example may identify the one class among the plurality of classes that corresponds to the edge direction (or boundary direction) of the pixel block 20, that is, the class that is most similar (or most suitable) to the edge direction of the pixel block 20. The learning network model may then output the texture patch 30 corresponding to the identified class. Returning to FIG. 2, the processor 120 according to an embodiment of the present disclosure may perform texture enhancement processing by applying the texture patch output from the learning network model to the input image 10.

FIG. 5 is a diagram for describing a learning network model according to an embodiment of the present disclosure.

The learning network model according to an embodiment of the present disclosure may store a plurality of classes classified based on the characteristics of an image and at least one texture patch corresponding to each of the classes. Referring to FIG. 5, the learning network model may include first to nth classes classified by edge direction, and may include a texture patch corresponding to each of the first to nth classes. Here, the characteristics of an image may mean at least one of the average or variance of the pixel values included in the pixel block 20, pixel coordinates, edge strength, edge direction, or color. Accordingly, the learning network model according to an embodiment may include a plurality of classes divided according to at least one of the average of pixel values, variance, pixel coordinates, edge strength, edge direction, or color. Beyond these examples, the learning network model may of course generate a plurality of classes based on various features identifiable from the pixel block 20 and identify which of the classes the pixel block 20 corresponds to. For example, the learning network model may divide classes by color coordinates and identify the class corresponding to the pixel block 20 based on the average of the color coordinates of the pixels included in the pixel block 20.

Referring to FIG. 5, for an image frame constituting the input image 10, the processor 120 according to an embodiment may divide the plurality of pixels included in the image frame into units of pixel blocks 20 and input them to the learning network model. According to an embodiment, the processor 120 may sequentially input the plurality of pixel blocks 20 constituting the image frame to the learning network model. The learning network model may then output texture patches 30-1, ..., 30-n corresponding respectively to the input pixel blocks 20-1, ..., 20-n.

For example, the learning network model may identify, based on the characteristics of the first pixel block 20-1, the class among the plurality of classes that corresponds to the first pixel block 20-1. For example, the learning network model may identify the edge direction of the first pixel block 20-1 from the pixels constituting it and identify which of the classes the identified edge direction corresponds to. Specifically, the learning network model may identify the similarity between each of the classes and the first pixel block 20-1. For example, if the edge direction of the first pixel block 20-1 is 0°, the learning network model may obtain a higher similarity (or goodness of fit) for the first class (Class #1) than for the second to eighth classes (Class #2 to Class #8). Here, the first class (Class #1) may mean the class defined by an edge direction of 0°. The learning network model may then identify the first class (Class #1) as the class corresponding to the first pixel block 20-1, and the processor 120 may obtain, through the learning network model, the first texture patch 30-1 corresponding to the first class (Class #1).

As another example, if the second pixel block 20-2 is identified as corresponding to the second class (Class #2) among the plurality of classes, the learning network model may provide the second texture patch 30-2 corresponding to the second class (Class #2).

For convenience of description, FIG. 5 shows the learning network model as including first to eighth classes divided by edge direction, with each class including one texture patch, that is, first to eighth texture patches 30-1, ..., 30-8. The learning network model is of course not limited thereto.

If the learning network model according to an embodiment identifies, based on the characteristics of the pixel block 20, that the pixel block does not correspond to any of the plurality of classes, it may generate a new class based on the characteristics of the pixel block 20 and store the pixel block mapped to the new class. For example, if the similarities between the pixel block 20 and the plurality of classes are all below a threshold value, the learning network model may generate a new class, in addition to the existing classes, based on the characteristics of the pixel block 20.

Referring to FIG. 5, according to an embodiment, if the similarities between the first to eighth classes and the fourth pixel block 20-4 are below the threshold value, that is, if no class corresponding to the fourth pixel block 20-4 is identified, the learning network model may generate a ninth class based on the characteristics of the fourth pixel block 20-4. For example, if the classes are classified by edge direction, the learning network model may identify the edge direction of the pixels constituting the fourth pixel block 20-4 and generate the ninth class based on the identified edge direction. The learning network model may then store the fourth pixel block 20-4 mapped to the ninth class. For example, the learning network model may store the fourth pixel block 20-4 as the texture patch corresponding to the newly generated ninth class.
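The creation of a new class when no existing class is similar enough can be sketched as below. This is a hedged illustration, not the disclosed implementation: `TextureClass`, `assign_or_create`, the linear angular similarity, the threshold of 0.9, and the four example class angles are all hypothetical choices made so that the threshold branch is actually exercised.

```python
from dataclasses import dataclass
from typing import List, Optional

Block = List[List[int]]  # pixel block as rows of pixel values (assumed representation)


@dataclass
class TextureClass:
    angle: float                   # edge direction that defines the class, in degrees
    patch: Optional[Block] = None  # texture patch currently matched to the class


def angular_similarity(a: float, b: float) -> float:
    """Assumed similarity in [0, 1] between two edge directions (modulo 180 degrees)."""
    diff = abs(a - b) % 180.0
    diff = min(diff, 180.0 - diff)
    return 1.0 - diff / 90.0


def assign_or_create(classes: List[TextureClass], block: Block,
                     block_angle: float, threshold: float = 0.9) -> int:
    """Return the index of the class for the block, creating a class if none fits."""
    best_index, best_sim = -1, -1.0
    for i, cls in enumerate(classes):
        sim = angular_similarity(block_angle, cls.angle)
        if sim > best_sim:
            best_index, best_sim = i, sim
    if best_sim < threshold:
        # No existing class is similar enough: define a new class from the block's
        # own edge direction and store the block itself as that class's patch.
        classes.append(TextureClass(angle=block_angle, patch=[row[:] for row in block]))
        return len(classes) - 1
    return best_index


classes = [TextureClass(a) for a in (0.0, 45.0, 90.0, 135.0)]  # hypothetical class set
block = [[1, 2], [3, 4]]
new_index = assign_or_create(classes, block, block_angle=65.0)
```

In this example the 65° block is 20° away from the nearest class, so its best similarity falls below the assumed threshold and a fifth class is appended with the block stored as its texture patch.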

Returning to FIG. 2, once the texture patch 30 matched to the class corresponding to the pixel block 20 has been identified, the learning network model according to an embodiment of the present disclosure may identify whether to update the texture patch 30 based on the similarity between the pixel block 20 and the class and the similarity between the texture patch 30 and the class. Here, the learning network model may decide whether to update by comparing the similarity (or goodness of fit) between the criterion that defines the class and the pixel block 20 with the similarity between that criterion and the texture patch 30 matched to the class. Referring to FIG. 5, the learning network model may include a plurality of classes classified by edge direction. Among these, the first class (Class #1) may be the class defined by an edge direction of 0°, and the fifth class (Class #5) may be the class defined by an edge direction of 90°. When the first pixel block 20-1 is input, the learning network model may identify, based on the edge direction of the first pixel block 20-1, the first class (Class #1) as the class with the greatest similarity. It may then compare the similarity between the first class (Class #1) and the first pixel block 20-1 with the similarity between the first class (Class #1) and the first texture patch 30-1 to identify whether to update the first texture patch 30-1.

This is described in detail with reference to FIG. 6.

FIG. 6 is a diagram for describing a class and a texture patch according to an embodiment of the present disclosure.

Referring to FIG. 6, the learning network model may identify, based on the characteristics of the pixel block 20, the class among the plurality of classes that corresponds to the pixel block 20. For example, if the pixel block 20 contains an edge direction of 65°, the learning network model may identify, among the first to eighth classes (Class #1 to Class #8), the fourth class (Class #4), which is defined by an edge direction of 67.5°. The learning network model may then obtain the texture patch 30 corresponding to the identified fourth class (Class #4).

The learning network model may then identify whether to update the texture patch 30 based on the similarity between the pixel block 20 and the fourth class (Class #4) and the similarity between the texture patch 30 and the fourth class (Class #4). Here, the similarity may of course be measured using various kinds of similarity measurement algorithms, goodness-of-fit measurement algorithms, and machine learning algorithms. For example, the similarity may be identified by comparing histograms based on gray-level values or by calculating a Euclidean distance, and as another example it may be identified based on an algorithm model trained as a convolutional neural network (CNN).
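The description names histogram comparison and Euclidean distance as possible similarity measures without giving formulas. The following Python sketch shows one plausible form of each for two equally sized, non-empty blocks of 8-bit gray-level values; the normalization to the range [0, 1], the use of histogram intersection, the bin count, and the function names are all assumptions.

```python
import math
from typing import List

Block = List[List[int]]  # rows of 8-bit gray-level values (assumed representation)


def flatten(block: Block) -> List[int]:
    return [v for row in block for v in row]


def euclidean_similarity(a: Block, b: Block, max_value: int = 255) -> float:
    """1.0 for identical blocks, 0.0 for the largest possible pixel-wise distance."""
    xs, ys = flatten(a), flatten(b)
    if len(xs) != len(ys):
        raise ValueError("blocks must have the same number of pixels")
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)))
    max_dist = max_value * math.sqrt(len(xs))
    return 1.0 - dist / max_dist


def histogram_similarity(a: Block, b: Block, bins: int = 8, max_value: int = 255) -> float:
    """Histogram intersection of the normalized gray-level histograms, in [0, 1]."""
    def histogram(block: Block) -> List[float]:
        counts = [0] * bins
        values = flatten(block)
        for v in values:
            counts[min(v * bins // (max_value + 1), bins - 1)] += 1
        return [c / len(values) for c in counts]

    return sum(min(p, q) for p, q in zip(histogram(a), histogram(b)))


dark_top = [[0, 0], [255, 255]]
dark_bottom = [[255, 255], [0, 0]]
```

The two example blocks contain the same gray levels in different positions, so the histogram measure treats them as identical while the pixel-wise Euclidean measure treats them as maximally different. This illustrates why the choice of measure matters for a direction-based class criterion.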

For example, suppose that, as a result of the learning network model's earlier learning based on another input image 10', sample images, and the like, the texture patch 30 matched to the fourth class (Class #4) has an edge direction of 50°. Because the edge direction that defines the fourth class (Class #4) is 67.5°, the learning network model may identify that the first similarity, that of the pixel block 20 with its 65° edge direction, is greater than the second similarity, that of the texture patch 30 with its 50° edge direction, and that the pixel block 20 is therefore the better fit for the fourth class (Class #4). The learning network model may replace the texture patch 30 based on the pixel block 20. Then, when another pixel block 20' included in the input image 10 is input and corresponds to the fourth class (Class #4), the learning network model may output the updated texture patch 30', which is based on the pixel block 20 with the 65° edge direction. The processor 120 may then generate the texture of the other pixel block 20' based on the texture patch 30'.

As another example, the second similarity, between the class and the texture patch 30 matched to the class, may be greater than the first similarity, between the class corresponding to the pixel block and the pixel block 20. In this case, the learning network model may identify that the texture patch 30 is suitable for generating the texture of the input image 10 and the pixel block 20, and may keep the texture patch 30 as it is.
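The update decision from the two similarities can be worked through in a short Python sketch using the angles from the example (class 67.5°, pixel block 65°, stored patch 50°). The linear angular similarity and the function names are assumptions, as is keeping the stored patch when the two similarities are equal, a case the description does not address.

```python
from typing import List

Block = List[List[int]]  # pixel block or texture patch as rows of pixel values


def angular_similarity(a: float, b: float) -> float:
    """Assumed similarity in [0, 1] between two edge directions (modulo 180 degrees)."""
    diff = abs(a - b) % 180.0
    diff = min(diff, 180.0 - diff)
    return 1.0 - diff / 90.0


def should_update(class_angle: float, block_angle: float, patch_angle: float) -> bool:
    """True if the pixel block fits the class better than the stored patch does."""
    first = angular_similarity(block_angle, class_angle)   # first similarity (block vs class)
    second = angular_similarity(patch_angle, class_angle)  # second similarity (patch vs class)
    return first > second  # assumption: a tie keeps the stored patch


def update_patch(stored_patch: Block, block: Block, class_angle: float,
                 block_angle: float, patch_angle: float) -> Block:
    """Return the patch to keep for the class after seeing the pixel block."""
    if should_update(class_angle, block_angle, patch_angle):
        return [row[:] for row in block]  # replace the patch with a copy of the block
    return stored_patch


# Worked example: the block is 2.5 degrees from the class, the stored patch 17.5 degrees.
decision = should_update(class_angle=67.5, block_angle=65.0, patch_angle=50.0)
```

Here the first similarity is 1 - 2.5/90 and the second is 1 - 17.5/90, so the stored patch is replaced; swapping the two angles gives the opposite case, in which the stored patch is kept.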

Meanwhile, according to an embodiment of the present disclosure, the learning network model updates the texture patch 30 in the course of obtaining the texture patch 30 corresponding to the pixel block 20 included in the input image 10, and can therefore generate an image processing model that includes texture patches 30 suited to enhancing the texture of the input image 10.

For example, when the learning network model is applied to an input image 10 containing objects such as a forest or grass, the learning network model may compare the similarity between a pixel block 20 of the input image 10 and its class with the similarity between the previously stored texture patch 30 and that class, and either keep the stored texture patch 30 or replace it with the pixel block 20. According to an embodiment, when the learning network model is then applied to another pixel block 20' included in the input image 10, it may identify the texture patch 30' that was updated based on the pixel block 20 in the preceding step. Because the updated texture patch 30' was obtained from the input image 10 itself, it may have a high correlation with, and be highly suitable for, the other pixel block 20' in the same input image 10. Accordingly, the processor 120 may apply the updated texture patch 30' to the other pixel block 20' to obtain an output image in which texture is generated and detail is improved.

Returning to FIG. 2, the learning network model according to an embodiment of the present disclosure may learn the texture patch 30 based on at least one of the time at which the texture patch 30 corresponding to each of the plurality of classes was stored or the frequency with which the texture patch 30 has been applied.

As an example, the learning network model may learn the texture patch 30 based on the input image 10 and may additionally take into account when the previously stored texture patch 30 was stored. For example, if the learning network model identifies that a certain period has elapsed since the texture patch 30 corresponding to the pixel block 20 of the input image 10 was stored, it may replace the texture patch 30 with the pixel block 20. An old texture patch 30 may be less suitable for the input image 10 and less similar to the class it is mapped to, so the learning network model may perform learning based on the pixel block 20 included in the input image 10 and update the texture patch 30. The learning network model may map the pixel block 20 included in the input image 10 as the texture patch 30 of the class corresponding to that pixel block 20, and use the newly mapped texture patch 30 to generate the texture of the input image 10.

As another example, when the first similarity between the pixel block 20 and the class equals the second similarity between the texture patch 30 and the class, the learning network model according to an embodiment of the present disclosure may update the texture patch 30 based on when the texture patch 30 was stored, how often it has been used, and the like. For example, when the first and second similarities are equal, the pixel block 20 may be better suited than the previously stored texture patch 30 for generating the texture of the input image 10, so the texture patch 30 may be updated based on the pixel block 20. As yet another example, when the first and second similarities are equal, the learning network model may add the pixel block 20 in addition to the texture patch 30.

However, this is merely one embodiment; it does not mean that the texture patch 30 must be updated whenever a certain period has elapsed since it was stored.

As another example, the learning network model may learn the texture patch 30 based on how frequently the texture patch 30 has been applied. For example, if a specific texture patch 30 is identified as having been used frequently to generate texture not only for the image 10 currently being input but also for other, earlier input images 10', this may mean that the texture patch 30 fits its class well and is useful for texture generation. Conversely, if a specific texture patch 30 is identified as having been used only rarely to generate the texture of the input image 10, the learning network model may identify that the texture patch 30 fits its mapped class poorly. In this case, the learning network model may replace the texture patch 30 with the pixel block 20 included in the input image 10. For example, when a specific class among the plurality of classes is identified, based on the characteristics of the pixel block 20, as the class corresponding to that pixel block 20, and either a certain period has elapsed since the texture patch 30 corresponding to the identified class was stored or the number of times it has been applied to images is below a threshold count, the learning network model may replace the texture patch 30 with the pixel block 20.
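The storage-time and application-frequency criteria described above might be sketched as below. The `StoredPatch` record, the `should_replace` helper, and the threshold parameters `max_age` and `min_uses` are all hypothetical names introduced for illustration; the description does not specify units or values.

```python
from dataclasses import dataclass, field


@dataclass
class StoredPatch:
    """A texture patch mapped to a class, with bookkeeping for the update rule."""
    values: list = field(default_factory=list)
    stored_at: float = 0.0   # time the patch was stored (assumed: seconds)
    use_count: int = 0       # how many times the patch has been applied


def should_replace(patch, now, max_age, min_uses):
    """True if the patch is older than max_age or has been applied fewer than min_uses times."""
    too_old = (now - patch.stored_at) > max_age
    rarely_used = patch.use_count < min_uses
    return too_old or rarely_used


patch = StoredPatch(values=[0, 1, 0, -1], stored_at=0.0, use_count=5)
print(should_replace(patch, now=100.0, max_age=60.0, min_uses=3))
```

As the description notes, exceeding the age limit need not force an update in every embodiment; a caller could combine this check with the similarity comparison instead of using it alone.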

FIG. 7 is a diagram illustrating a model that learns from an input image according to an embodiment of the present disclosure.

Referring to FIG. 7, the learning network model may not store a texture patch 30 for some of the plurality of classes. For example, rather than storing first to eighth texture patches 30-1, ..., 30-8 corresponding to the first to eighth classes, respectively, as shown in FIG. 5, some of the classes may have a mapped texture patch 30 stored while the remaining classes have none. In this case, the learning network model may obtain and store a texture patch 30 based on the input image 10. As an example, suppose that the learning network model identifies the class corresponding to a pixel block 20 included in the input image 10 and holds no texture patch 30 for the identified class. The learning network model may then map that pixel block 20 to the identified class and store it.

Meanwhile, although each class has been described as including only one texture patch 30, the disclosure is not limited to this. For example, the first class may include at least two texture patches 30 corresponding to the first class. According to an embodiment, the learning network model may identify the class of a pixel block 20 included in the input image 10 and add the pixel block 20 to the identified class as a texture patch 30. Here, instead of deleting or replacing the previously stored texture patch 30, the learning network model may map both to the class and store them, with the previously stored texture patch 30 as a first texture patch and the pixel block 20 as a second texture patch.

When a plurality of texture patches 30 exist for the class identified as corresponding to the pixel block 20, the learning network model according to an embodiment may identify one of the plurality of texture patches based on the correlation between the pixel block 20 and each of the texture patches 30. For example, suppose that the class corresponding to the pixel block 20 is the fourth class and that first to third texture patches are mapped to the fourth class. The learning network model may identify the correlation between the pixel block 20 and each of the first to third texture patches, and identify the texture patch with the highest correlation value. The texture patch with the highest correlation value may be regarded as the patch best suited to generating the texture of the pixel block 20. The learning network model may then generate texture by applying the identified texture patch to the pixel block 20.
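Selecting one patch among several mapped to the same class can be sketched as follows. This is an illustrative sketch only: it assumes flattened, equal-length blocks and patches, and it uses the mean of element-wise products as the correlation value; `correlation` and `best_patch` are hypothetical helper names.

```python
def correlation(block, patch):
    """Mean of element-wise products of a pixel block and a texture patch (same length)."""
    return sum(i * r for i, r in zip(block, patch)) / len(block)


def best_patch(block, patches):
    """Index of the patch with the highest correlation to the block."""
    return max(range(len(patches)), key=lambda n: correlation(block, patches[n]))


block = [1.0, -1.0, 1.0, -1.0]
class4_patches = [
    [1.0, 1.0, -1.0, -1.0],   # first texture patch
    [1.0, -1.0, 1.0, -1.0],   # second texture patch
    [0.0, 0.0, 0.0, 0.0],     # third texture patch
]
print(best_patch(block, class4_patches))
```

Dividing by the block length does not change which patch ranks highest, so an unnormalized sum of products would select the same patch.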

FIG. 8 is a diagram illustrating classes according to another embodiment of the present disclosure.

Referring to FIG. 8, the learning network model according to an embodiment may classify the pixel block 20 into one of first to sixteenth classes based on the characteristics of the image. The learning network model may then identify the texture patch 30 mapped to that class, and the identified texture patch 30 may be applied to the pixel block 20.

Meanwhile, the learning network model may distinguish classes according to various criteria. The number of classes is neither fixed nor limited; the learning network model may delete a specific class from the plurality of classes or create additional classes beyond them.

For convenience, the description has assumed that classes are distinguished by edge direction and that each class includes a single texture patch, but the disclosure is not limited to this. For example, the learning network model according to an embodiment may define first to n-th classes based on the distribution of color coordinates, and identify the corresponding class among the first to n-th classes based on the color coordinate distribution of the pixel block 20 included in the input image 10. As another example, the learning network model may define the first to n-th classes based on the average gray level, the variance of gray levels, and the like.
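For the edge-direction criterion, one possible quantization of a direction into class numbers is sketched below. It is an assumption, not something this section states, that the classes are evenly spaced over 180° with the first class at 0°; that spacing is at least consistent with the fourth of eight classes being defined by 67.5°. The function name `edge_class` is hypothetical.

```python
def edge_class(angle_deg, n_classes=8):
    """Map an edge direction in degrees to a 1-based class number.

    Assumption: class k is centred on (k - 1) * 180 / n_classes degrees,
    and directions wrap around every 180 degrees.
    """
    step = 180.0 / n_classes
    index = int((angle_deg % 180.0) / step + 0.5) % n_classes  # nearest class centre
    return index + 1


print(edge_class(65.0))        # nearest centre is 67.5 degrees
print(edge_class(67.5, 16))    # same direction with sixteen classes
```

A classifier based on color coordinate distribution or gray-level statistics would replace the angle with a different feature but could keep the same "nearest class centre" structure.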

FIG. 9 is a diagram illustrating a learning result according to an embodiment of the present disclosure.

Referring to FIG. 9, the learning network model provides a texture patch 30 corresponding to each of the plurality of pixel blocks 20 that make up the input image 10, and the processor 120 may apply the texture patch 30 to the pixel block 20 to obtain an output image with improved detail.

Because the learning network model learns based on the pixel blocks 20 included in the input image 10, the plurality of classes and texture patches 30 held by the learning network model may differ before and after the image 10 is input. For example, before the image is input, the learning network model may include texture patches 30 learned from other, previously input images or from sample images. The learning network model may identify the similarity between a pixel block 20 included in the input image 10 and the class corresponding to that pixel block 20, and the similarity between the texture patch 30 mapped to that class and the class, and update the texture patch 30 based on the result. For example, the learning network model may replace the texture patch 30 with the pixel block 20 or keep the texture patch 30.

FIG. 9 shows the result of the learning network model learning from the input image 10. Referring to FIG. 9, for some of the plurality of classes included in the learning network model, the mapped texture patch 30 has been replaced with a pixel block 20 included in the input image 10. For the remaining classes, the mapped texture patch 30 has been kept.

In FIGS. 5, 6, and 7, the arrow drawn from a pixel block 20 indicates the class corresponding to that pixel block 20, whereas in FIG. 9 the arrow drawn from a pixel block 20 indicates that, as a result of learning by the learning network model, the texture patch 30 has been replaced with that pixel block 20. For example, referring to FIG. 9, the texture patches 30 corresponding to classes 2, 4, and 6 have each been replaced with a pixel block 20 included in the input image 10.

Returning to FIG. 2, the processor 120 according to an embodiment of the present disclosure may obtain a weight for the texture patch 30 based on the correlation between the texture patch 30 and the pixel block 20. The processor 120 may then apply the weighted texture patch 30' to the pixel block 20 to obtain an output image.

The correlation (also called the degree of correlation or degree of association) between the pixel block 20 included in the input image 10 and the texture patch 30 obtained from the learning network model is a numerically computed relationship in which two variables x and y are presumed to be related to each other, and the degree of the relationship may be expressed by a number called the correlation coefficient. For example, the correlation coefficient may be expressed as a number between -1.0 and +1.0, and regardless of sign, a larger absolute value indicates a stronger relationship. A negative (-) value indicates a negative correlation, and a positive (+) value indicates a positive correlation.

For example, if the pixel values included in the pixel block 20 are $I = [i_0, i_1, \ldots, i_{n-1}]$ and the values included in texture patch $R[n]$ are $R[n] = [r_0, r_1, \ldots, r_{n-1}]$, the correlation value $C[n]$ may be obtained as $C[n] = E[I \cdot R[n]] = \sum_i i_i \, r_i$.

Alternatively, if the mean of the pixel values included in the target pixel block is m(I) and the mean of the values included in texture patch R[n] is m(R[n]), the correlation value may be obtained based on Equation 1 below.

Meanwhile, according to another embodiment of the present disclosure, the mean of the texture patch 30 may be 0. This keeps the overall brightness of the input image 10 unchanged even after the texture patch 30 is applied. According to an embodiment, when the mean of the texture patch 30 is 0, Equation 1 may be reduced to Equation 2 below.
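Equations 1 and 2 themselves are not reproduced in this text. Purely as a sketch, and assuming that Equation 1 is the usual mean-removed (covariance) form, the reduction for a zero-mean texture patch would proceed as follows; the published equations may differ in form or normalization.

```latex
\begin{aligned}
C[n] &= E\big[(I - m(I))\,(R[n] - m(R[n]))\big] \\
     &= E\big[I \cdot R[n]\big] - m(I)\,m(R[n]) \\
     &= E\big[I \cdot R[n]\big] \qquad \text{when } m(R[n]) = 0 .
\end{aligned}
```

Under that assumption, a zero-mean patch makes the mean-removed correlation coincide with the plain expectation $E[I \cdot R[n]]$ given earlier, so the block mean m(I) need not be computed.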

When the correlation between the pixel block 20 and the texture patch 30 corresponding to that pixel block 20 is greater than or equal to a threshold, the learning network model according to an embodiment of the present disclosure may keep the texture patch 30 corresponding to the class of that pixel block 20. As another example, when the correlation between the pixel block 20 and the texture patch 30 corresponding to that pixel block 20 is below the threshold, the learning network model may update the texture patch 30 based on the pixel block 20. As another example, the processor 120 may multiply the obtained correlation value by a preset proportionality constant and use the result as the weight for the texture patch 30. For example, the processor 120 may obtain a weight in the range of 0 to 1 based on the correlation value. When a weight of 0 is applied to the texture patch 30 according to the correlation, the texture patch 30 is not added to the target pixel block 20. For example, in a flat region or a region containing a strong edge, the correlation with every class and every texture patch is likely to be very low, so no texture is generated. This prevents ringing that could occur in edge regions and prevents unnecessary texture from being added to flat regions.

However, according to another embodiment of the present disclosure, similarity information between the pixel block 20 and the texture patch 30 may be obtained using various cost functions other than the correlation described above. For example, MSE (Mean Square Error), SAD (Sum of Absolute Difference), MAD (Median Absolute Deviation), correlation, and the like may be used as the cost function for determining similarity. When MSE is applied, for instance, the MSE of the target pixel block may be computed and the similarity between the target pixel block 20 and the texture patch 30 obtained in terms of MSE. For example, a similarity weight may be determined based on the MSE difference.
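Two of the cost functions named above, MSE and SAD, can be sketched on flattened blocks as follows. The helper `most_similar` is a hypothetical name, and picking the patch with the lowest cost is an assumption about how a cost would be turned into a similarity decision; the description does not specify that mapping.

```python
def mse(a, b):
    """Mean square error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def sad(a, b):
    """Sum of absolute differences between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def most_similar(block, patches, cost=mse):
    """Index of the patch with the lowest cost relative to the block (lower cost = more similar)."""
    return min(range(len(patches)), key=lambda n: cost(block, patches[n]))


block = [1.0, 2.0, 3.0]
patches = [[1.0, 2.0, 5.0], [1.0, 2.0, 3.5]]
print(most_similar(block, patches), most_similar(block, patches, cost=sad))
```

Unlike correlation, these costs decrease as similarity increases, so any weight derived from them would need an inverse or decreasing mapping.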

The processor 120 may apply the obtained weight to each texture patch 30 and apply the weighted texture patch 30 to the target pixel block 20 to obtain an output image. Here, applying may mean adding, to each pixel value included in the target pixel block 20, the value at the corresponding position of the weighted texture patch 30. However, the disclosure is not limited to this, and additional processing other than simple addition may be performed.
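The additive application described above can be sketched as follows, combined with the proportional-constant weight mentioned earlier in the description. The names `weight_from_correlation` and `apply_patch`, the constant `k`, and the clamping to the 0 to 1 range are illustrative assumptions.

```python
def weight_from_correlation(corr, k=1.0):
    """Scale a correlation value by a preset constant k and clamp the result to [0, 1]."""
    return min(1.0, max(0.0, k * corr))


def apply_patch(block, patch, weight):
    """Add the weighted patch value to each pixel at the corresponding position."""
    return [p + weight * t for p, t in zip(block, patch)]


block = [10.0, 20.0, 30.0, 40.0]
patch = [2.0, -2.0, 2.0, -2.0]          # zero-mean, so average brightness is preserved
w = weight_from_correlation(0.25, k=2.0)
print(apply_patch(block, patch, w))
```

With a weight of 0 the block is returned unchanged, which matches the behaviour described for flat regions and strong edges.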

According to another embodiment of the present disclosure, once the texture patch 30 is obtained, the processor 120 may apply frequency filtering to the texture patch 30 and apply the filtered texture patch 30 to the target pixel block. That is, the processor 120 may alter the frequency range of the texture patch 30 by applying frequency filtering before adding the texture patch 30 to the input image. For example, the processor 120 may generate a high-frequency texture using a high-pass filter or a low-frequency texture using a low-pass filter. Equation 3 below represents the process of adding the filtered texture (Filter(T)) to the input image (I) to obtain the output image (O).

For example, the processor 120 may apply a low-pass filter such as Gaussian blurring (or Gaussian filtering) to the texture patch 30. Gaussian blurring blurs an image using a Gaussian filter based on the Gaussian probability distribution; when a Gaussian filter is applied to the texture patch 30, high-frequency components are blocked and the patch is blurred. The processor 120 according to an embodiment may perform Gaussian filtering on all pixel values included in the texture patch 30 to obtain a blurred texture patch 30'. The processor 120 may then apply the blurred texture patch 30' to the corresponding pixel block 20 to obtain an output image.
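The filter-then-add step, O = I + Filter(T), can be sketched with a small separable Gaussian blur. This is a minimal illustration under stated assumptions: the kernel radius, sigma, and edge-replicating border handling are choices made here, not values from the description, and all function names are hypothetical.

```python
import math


def gaussian_kernel(radius, sigma):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    k = [math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k]


def blur_rows(img, kernel):
    """Convolve each row with the kernel, replicating edge pixels at the borders."""
    r = len(kernel) // 2
    out = []
    for row in img:
        n = len(row)
        out.append([
            sum(kernel[j + r] * row[min(max(x + j, 0), n - 1)] for j in range(-r, r + 1))
            for x in range(n)
        ])
    return out


def transpose(img):
    return [list(col) for col in zip(*img)]


def gaussian_blur(img, radius=1, sigma=1.0):
    """Separable 2-D Gaussian blur: filter the rows, then the columns."""
    k = gaussian_kernel(radius, sigma)
    return transpose(blur_rows(transpose(blur_rows(img, k)), k))


def enhance(block, patch, radius=1, sigma=1.0):
    """O = I + Filter(T): add the low-pass filtered texture patch to the pixel block."""
    filtered = gaussian_blur(patch, radius, sigma)
    return [[p + t for p, t in zip(prow, trow)] for prow, trow in zip(block, filtered)]


checker = [[1.0, -1.0, 1.0], [-1.0, 1.0, -1.0], [1.0, -1.0, 1.0]]
flat_block = [[100.0] * 3 for _ in range(3)]
print(enhance(flat_block, checker))
```

A high-pass variant could be sketched as the patch minus its blurred version, which keeps only the fine detail instead of removing it.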

Meanwhile, depending on the embodiment, the image processing described above, that is, the texture enhancement processing, may be performed before or after scaling of the image. For example, the image processing may be performed after scaling that enlarges a low-resolution image to a high-resolution image, or scaling may be performed after the image processing has been carried out in the course of decoding a compressed image.

The learning network model according to another embodiment of the present disclosure may obtain a plurality of texture patches to which different weights are applied.

For example, the learning network model may identify the class corresponding to the pixel block 20 and obtain first to n-th texture patches corresponding to that class. The learning network model may then identify the correlation between the pixel block 20 and each of the first to n-th texture patches. For example, the learning network model may obtain a first weight based on the correlation between the pixel block 20 and the first texture patch, and a second weight based on the correlation between the pixel block 20 and the second texture patch. The learning network model may then multiply the first texture patch by the first weight and the second texture patch by the second weight, and apply both weighted texture patches to the target pixel block 20 to obtain an output image.

According to an embodiment of the present disclosure, the weight may be determined according to the correlation within a preset range, for example between 0 and 1. For example, the learning network model may set the weight to 0 when the correlation between the pixel block 20 and the obtained texture patch 30 is at its minimum, set the weight to 1 when the correlation is at its maximum, and increase the weight linearly between the minimum and the maximum.
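The linear weight mapping and the application of several differently weighted patches can be sketched together as below. The bounds `c_min` and `c_max`, the handling of a degenerate range, and the function names are assumptions for illustration.

```python
def linear_weight(corr, c_min, c_max):
    """Map a correlation to [0, 1]: 0 at c_min, 1 at c_max, linear in between."""
    if c_max <= c_min:
        return 0.0  # degenerate range; an assumption, not specified in the description
    t = (corr - c_min) / (c_max - c_min)
    return min(1.0, max(0.0, t))


def apply_weighted_patches(block, patches, weights):
    """Add each patch, scaled by its own weight, to the pixel block."""
    out = list(block)
    for patch, w in zip(patches, weights):
        out = [p + w * t for p, t in zip(out, patch)]
    return out


block = [1.0, 1.0]
patches = [[2.0, 0.0], [0.0, 4.0]]
weights = [linear_weight(5.0, 0.0, 10.0), linear_weight(2.5, 0.0, 10.0)]
print(apply_weighted_patches(block, patches, weights))
```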

FIG. 10 is a diagram illustrating classes according to another embodiment of the present disclosure.

Referring to FIG. 10, the learning network model may add or delete texture patches 30 for each class in the course of learning.

According to an embodiment, as the learning network model learns based on the plurality of pixel blocks included in the input image 10, it may delete a texture patch included in a specific class or store a plurality of texture patches in a specific class. Accordingly, the learning network model may allocate the same amount of storage space for texture patches to each of the plurality of classes, or may allocate more storage space to a specific class than to the remaining classes.

According to an embodiment, the learning network model may identify the class of each of the plurality of pixel blocks included in the input image 10, and change the size of the storage space in the memory 110 corresponding to at least one of the plurality of classes based on how frequently each class is identified. For example, the learning network model may increase the size of the storage space in the memory 110 by allocating additional space for texture patches to a class identified at or above a preset frequency. Here, the preset frequency may mean that a specific class is identified for 20% or more of the total number of pixel blocks. This is merely one embodiment, and the preset frequency may be set to other values, such as 10%. As another example, the learning network model may increase the size of the storage space corresponding to the most frequently identified class.
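The frequency-based criterion for growing a class's storage can be sketched as follows. The function name `classes_to_grow` and the representation of the input as a flat list of per-block class numbers are assumptions; the 20% ratio is the example value from the description.

```python
from collections import Counter


def classes_to_grow(block_classes, ratio=0.20):
    """Return the classes identified for at least `ratio` of all pixel blocks."""
    counts = Counter(block_classes)
    total = len(block_classes)
    return sorted(c for c, n in counts.items() if n >= ratio * total)


# Class identified for each of ten pixel blocks of one input image.
identified = [4, 4, 4, 2, 1, 3, 5, 6, 7, 8]
print(classes_to_grow(identified))
```

How much extra space a selected class receives is left open here, as it is in the description.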

For example, suppose that many of the plurality of pixel blocks included in the input image 10 correspond to the fourth class. The learning network model may increase the size of the storage space in the memory 110 corresponding to the fourth class.

According to an embodiment, when a pixel block is identified as corresponding to the fourth class, the learning network model may identify a first similarity between the pixel block and the fourth class and a second similarity between the texture patch previously stored in the fourth class and the fourth class. If the first similarity is smaller than the second similarity, the learning network model may keep the previously stored texture patch and additionally store the pixel block in the fourth class. In this case, the previously stored texture patch may have a higher priority than the pixel block.

As another example, if the first similarity is greater than the second similarity, the learning network model may additionally store the pixel block in the fourth class. In this case, the previously stored texture patch is moved to a lower priority, and the pixel block may have a higher priority than the previously stored texture patch.
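The priority handling in the two cases above can be sketched as follows. This is only an illustration under stated assumptions: the list ordered from highest to lowest priority, the function name, and the treatment of equal similarities (which the text does not specify) are all hypothetical.

```python
def add_block_with_priority(patches, block, first_similarity, second_similarity):
    """Add block to a class's patch list, ordered from highest to lowest priority.

    first_similarity:  similarity between the pixel block and the class.
    second_similarity: similarity between the stored texture patch and the class.
    """
    if first_similarity > second_similarity:
        # The new block outranks the stored patch, which moves to a lower priority.
        return [block] + list(patches)
    # Otherwise the stored patch keeps its priority (assumption: ties also land here).
    return list(patches) + [block]

stored = ["patch_A"]
print(add_block_with_priority(stored, "block_B", 0.9, 0.7))  # block first
print(add_block_with_priority(stored, "block_B", 0.4, 0.7))  # stored patch first
```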

In another embodiment, based on the identification frequency of each of the plurality of classes, the learning network model may change the size of the storage space of the memory 110 so that a preset number of texture patches can be stored in the most frequently identified class, and so that fewer texture patches than the preset number can be stored in the second most frequently identified class. For example, the learning network model may resize the storage space so that up to 10 texture patches can be stored in the fourth class, which is identified most frequently, and up to 6 texture patches can be stored in the second class, which is identified second most frequently. These specific numbers are only examples, and the number of storable texture patches may of course be set in various ways.
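A rank-based capacity assignment of this kind might look like the sketch below. The values 10 and 6 are the example numbers from the text; the function name, the default capacity of one patch for all other classes, and the tie-breaking behavior are assumptions.

```python
from collections import Counter

def capacity_by_rank(block_classes, rank_capacities=(10, 6), default_capacity=1):
    """Assign patch capacities by how often each class was identified.

    The most frequent class gets rank_capacities[0], the second most
    frequent gets rank_capacities[1], and every other identified class
    gets default_capacity (a hypothetical fallback).
    """
    counts = Counter(block_classes)
    ranked = [c for c, _ in counts.most_common()]
    capacity = {c: default_capacity for c in ranked}
    for c, cap in zip(ranked, rank_capacities):
        capacity[c] = cap
    return capacity

# Fourth class identified 5 times, second class 3 times, first class once.
print(capacity_by_rank([4] * 5 + [2] * 3 + [1]))
```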

Meanwhile, the learning network model does not always add a pixel block as a texture patch of the identified class; if the similarity between the pixel block and the identified class is less than a preset value, the pixel block may not be added. For example, if the similarity between the pixel block and the identified class is less than 50%, the learning network model may not add the pixel block as a texture patch of the identified class.

As another example, when identifying the class of each of the plurality of pixel blocks included in the input image 10, the learning network model may delete from the memory 110 the texture patch corresponding to a class identified fewer than a preset number of times. The learning network model may then allocate the storage space of the memory 110 freed by the deletion to the remaining classes.

For example, if the result of identifying the class of each of the plurality of pixel blocks shows that the number of pixel blocks corresponding to the third class is less than a preset number, the learning network model may delete the texture patch previously stored in the third class and allocate the storage space for the third class's texture patches to the remaining classes. In this way, the learning network model may increase the storage space of a frequently identified class so that a plurality of texture patches can be stored in that class.
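The deletion and reallocation step can be sketched as below. The text only states that the freed space is allocated to the remaining classes; handing out freed slots one at a time, most frequent class first, is an assumed policy, and all names are hypothetical.

```python
from collections import Counter

def reallocate_storage(capacity, block_classes, min_count):
    """Drop classes identified fewer than min_count times and reuse their slots.

    capacity maps a class id to its number of patch slots. The total number
    of slots is preserved.
    """
    counts = Counter(block_classes)
    kept = [c for c in capacity if counts[c] >= min_count]
    removed = [c for c in capacity if counts[c] < min_count]
    if not kept:
        return dict(capacity)  # nothing left to receive the freed space
    updated = {c: capacity[c] for c in kept}
    freed = sum(capacity[c] for c in removed)
    # Assumed policy: distribute freed slots round-robin, most frequent class first.
    order = sorted(kept, key=lambda c: -counts[c])
    for i in range(freed):
        updated[order[i % len(order)]] += 1
    return updated

blocks = [4, 4, 4, 4, 2, 2, 2, 1, 1, 3]
print(reallocate_storage({1: 2, 2: 2, 3: 2, 4: 2}, blocks, min_count=2))
```

Here the third class is identified only once, so its two slots are released and go to the fourth and second classes.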

As another example, the learning network model may delete the least frequently identified class, based on the identification frequency, and reallocate the storage space previously allocated to that class to the remaining classes.

FIG. 11 is a block diagram illustrating a detailed configuration of the electronic device shown in FIG. 2.

Referring to FIG. 11, the image processing apparatus 100 includes a memory 110, a processor 120, an input unit 130, a display 140, an output unit 150, and a user interface 160. Detailed descriptions of the components in FIG. 11 that overlap with those shown in FIG. 2 are omitted.

According to an embodiment of the present disclosure, the memory 110 may be implemented as a single memory that stores data generated in the various operations according to the present disclosure.

According to another embodiment of the present disclosure, however, the memory 110 may be implemented to include first to third memories.

The first memory may store at least part of the image input through the input unit 130. In particular, the first memory may store at least a partial region of an input image frame. The partial region may be the region needed to perform image processing according to an embodiment of the present disclosure. According to an embodiment, the first memory may be implemented as an N-line memory. For example, the N-line memory may have a capacity equivalent to 17 lines in the vertical direction, but is not limited thereto. For example, when a 1080p Full HD image (1,920×1,080 resolution) is input, only a 17-line region of the Full HD image is stored in the first memory. The first memory is implemented as an N-line memory, and only part of the input image frame is stored for image processing, because the capacity of the first memory is limited by hardware constraints.
The second memory stores the at least one obtained texture patch 30 and the like, and may be implemented in various sizes according to various embodiments of the present disclosure. For example, when the texture components corresponding to every pixel value of the input image are all obtained and stored before being applied to the input image, the second memory may be implemented with a size equal to or larger than that of the input image. According to another embodiment, when texture components are applied in image units corresponding to the size of the first memory, or texture components obtained per pixel line are applied per pixel line, the second memory may be implemented with a size appropriate for that processing. The second memory may of course also refer to the memory area allocated to the learning network model within the entire area of the memory 110.
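The 17-line example given for the first memory can be checked with simple arithmetic. The sketch below is illustrative only; the three 8-bit channels per pixel are an assumption, since the text does not state a pixel format.

```python
def line_buffer_pixels(width, lines):
    """Number of pixels held by an N-line buffer."""
    return width * lines

def line_buffer_bytes(width, lines, channels=3, bits_per_channel=8):
    """Buffer size in bytes for an assumed pixel format (3 x 8-bit by default)."""
    return width * lines * channels * bits_per_channel // 8

width, height, n_lines = 1920, 1080, 17
print(line_buffer_pixels(width, n_lines))   # pixels in the 17-line buffer
print(line_buffer_pixels(width, height))    # pixels in the full frame
print(line_buffer_bytes(width, n_lines))    # bytes under the assumed format
```

Under these assumptions the line memory holds 32,640 pixels, about 1.6% of the 2,073,600 pixels in a Full HD frame.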

The third memory stores the output image obtained by applying the obtained texture components, and may be implemented in various sizes according to various embodiments of the present disclosure. For example, when the output image is obtained by applying all the texture components corresponding to each pixel value of the input image 10 and is then displayed, the third memory may be implemented with a size equal to or larger than that of the input image. According to another embodiment, when the image is output in image units corresponding to the size of the first memory, or in line units corresponding to the patch size, the third memory may be implemented with a size appropriate for storing that image.

However, the third memory may not be needed when the output image is overwritten to the first memory or the second memory, or when the output image is displayed directly without being stored.

The input unit 130 receives various types of content, for example, an image signal. For example, the input unit 130 may receive an image signal by streaming or download from an external device (e.g., a source device), an external storage medium (e.g., a USB memory), an external server (e.g., a web hard drive), or the like, through a communication method such as AP-based Wi-Fi (wireless LAN), Bluetooth, Zigbee, wired/wireless LAN (Local Area Network), WAN, Ethernet, IEEE 1394, HDMI (High-Definition Multimedia Interface), MHL (Mobile High-Definition Link), USB (Universal Serial Bus), DP (DisplayPort), Thunderbolt, a VGA (Video Graphics Array) port, an RGB port, D-SUB (D-subminiature), or DVI (Digital Visual Interface). The image signal may be a digital signal, but is not limited thereto.

The display 140 may be implemented in various forms, such as an LCD (liquid crystal display), OLED (organic light-emitting diode), LED (light-emitting diode), micro LED, LCoS (liquid crystal on silicon), DLP (digital light processing), or QD (quantum dot) display panel.

The output unit 150 outputs a sound signal.

For example, the output unit 150 may convert a digital sound signal processed by the processor 120 into an analog sound signal, amplify it, and output it. The output unit 150 may include at least one speaker unit, a D/A converter, an audio amplifier, and the like, capable of outputting at least one channel. As one example, the output unit 150 may include an L-channel speaker and an R-channel speaker that reproduce the L channel and the R channel, respectively. However, the output unit 150 is not limited thereto and may be implemented in various forms. As another example, the output unit 150 may be implemented as a sound bar that reproduces the L channel, the R channel, and the center channel.

The user interface 160 may be implemented as a device such as a button, a touch pad, a mouse, or a keyboard, or as a touch screen capable of performing both the display function described above and a manipulation input function, a remote control receiver, or the like. Here, the button may be any of various types of buttons, such as a mechanical button, a touch pad, or a wheel, formed in an arbitrary area of the exterior of the main body of the image processing apparatus 100, such as the front, side, or rear.

Meanwhile, although not shown in FIG. 11, pre-filtering that removes noise from the input image may additionally be applied before the image processing according to an embodiment of the present disclosure. For example, prominent noise may be removed by applying a smoothing filter such as a Gaussian filter, or a guided filter that filters the input image against a preset guidance.
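As a rough illustration of the smoothing pre-filter mentioned above, the following sketch applies a one-dimensional Gaussian kernel to a single pixel row. The kernel radius, the sigma value, and the border clamping are assumptions; a practical implementation would typically filter in two dimensions, and the guided filter is not covered here.

```python
import math

def gaussian_kernel(radius, sigma):
    """Normalized 1-D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_row(row, radius=2, sigma=1.0):
    """Smooth one pixel row; out-of-range indices are clamped to the border."""
    kernel = gaussian_kernel(radius, sigma)
    n = len(row)
    out = []
    for x in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(max(x + k - radius, 0), n - 1)
            acc += w * row[idx]
        out.append(acc)
    return out

# An isolated spike (noise) is attenuated and spread over its neighbors.
print(smooth_row([0, 0, 10, 0, 0]))
```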

FIG. 12 is a block diagram illustrating a configuration of an image processing apparatus for training and using a learning network model according to an embodiment of the present disclosure.

Referring to FIG. 12, the processor 120 may include at least one of a learning unit 1210 and a recognition unit 1220. The processor 120 of FIG. 12 may correspond to the processor 120 of the image processing apparatus 100 of FIG. 2 or to a processor of a data learning server (not shown).

The learning unit 1210 may generate or train a recognition model having a criterion for identifying the class of the pixel block 20 and a recognition model having a criterion for obtaining the texture patch 30 corresponding to the pixel block 20 according to the class. The learning unit 1210 may generate a recognition model having a determination criterion by using collected training data.

As one example, the learning unit 1210 may generate, train, or update a recognition model for determining the class corresponding to a pixel block 20 by using the pixel block 20 included in an image as training data.

As another example, the learning unit 1210 may generate, train, or update a recognition model that determines whether to update the texture patch 30 by comparing the similarity between the pixel block 20 and the class with the similarity between the texture patch 30 and the class.

The recognition unit 1220 may use predetermined data (e.g., an input image) as input data of the trained recognition model to estimate a recognition target or situation included in the predetermined data.

As one example, the recognition unit 1220 may use the pixel block 20 of the input image 10 as input data of the trained recognition model to identify the class and the texture patch 30 of the pixel block 20.

At least part of the learning unit 1210 and at least part of the recognition unit 1220 may be implemented as a software module, or manufactured in the form of at least one hardware chip, and mounted on the image processing apparatus. For example, at least one of the learning unit 1210 and the recognition unit 1220 may be manufactured as a dedicated hardware chip for artificial intelligence (AI), or as part of an existing general-purpose processor (e.g., a CPU or an application processor) or a graphics-dedicated processor (e.g., a GPU), and mounted on the various image processing apparatuses or object recognition apparatuses described above. Here, the dedicated hardware chip for artificial intelligence is a processor specialized in probability computation; it has higher parallel processing performance than existing general-purpose processors and can therefore quickly handle computations in artificial intelligence fields such as machine learning. When the learning unit 1210 and the recognition unit 1220 are implemented as a software module (or a program module including instructions), the software module may be stored in non-transitory computer-readable media. In this case, the software module may be provided by an operating system (OS) or by a predetermined application. Alternatively, part of the software module may be provided by the OS and the rest by a predetermined application.

In this case, the learning unit 1210 and the recognition unit 1220 may be mounted on a single image processing apparatus or on separate image processing apparatuses. For example, one of the learning unit 1210 and the recognition unit 1220 may be included in the image processing apparatus 100 and the other in an external server. In addition, through a wired or wireless connection, model information built by the learning unit 1210 may be provided to the recognition unit 1220, and data input to the recognition unit 1220 may be provided to the learning unit 1210 as additional training data.

FIG. 13 is a flowchart illustrating an image processing method according to an embodiment of the present disclosure.

According to the image processing method shown in FIG. 13, an input image is applied to a learning network model to obtain a texture patch corresponding to a pixel block included in the input image (S1310).

Next, the obtained texture patch is applied to the pixel block to obtain an output image (S1320).

Here, the learning network model stores a texture patch corresponding to each of a plurality of classes classified based on characteristics of an image, and may learn the texture patch corresponding to each of the plurality of classes based on the input image.

In addition, the learning network model according to an embodiment may identify one of the plurality of classes based on a characteristic of the pixel block, output the texture patch corresponding to the identified class, and compare a first similarity between the pixel block and the identified class with a second similarity between the texture patch and the identified class to identify whether to update the texture patch.

Based on the first and second similarities, the learning network model according to an embodiment may replace the texture patch corresponding to the identified class with the pixel block, or may add the pixel block as a texture patch corresponding to the identified class.

In addition, if the comparison shows that the first similarity is smaller than the second similarity, the learning network model may keep the texture patch corresponding to the identified class; if the comparison shows that the first similarity is greater than the second similarity, it may update the texture patch based on the pixel block.

When there are a plurality of texture patches corresponding to the identified class, the learning network model according to an embodiment of the present disclosure may identify one of the plurality of texture patches based on the correlation between the pixel block and each of the plurality of texture patches.
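One way to picture this selection is the sketch below, which picks the stored patch with the highest correlation to the pixel block. The text does not define how the correlation is computed; the Pearson correlation coefficient over flattened pixel values is an assumption, as are the function names.

```python
import math

def correlation(a, b):
    """Pearson correlation of two equal-length pixel sequences (assumed measure)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat block or patch carries no correlation information
    return cov / math.sqrt(var_a * var_b)

def select_patch(block, patches):
    """Return the index of the patch most correlated with the pixel block."""
    return max(range(len(patches)), key=lambda i: correlation(block, patches[i]))

block = [1, 2, 3, 4]
patches = [[4, 3, 2, 1], [2, 4, 6, 8], [1, 1, 1, 1]]
print(select_patch(block, patches))
```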

In addition, the learning network model may learn the texture patches based on at least one of the time at which the texture patch corresponding to each of the plurality of classes was stored or the frequency with which the texture patch has been applied.

When the learning network model according to an embodiment of the present disclosure identifies, based on the characteristic of the pixel block, that the pixel block does not correspond to any of the plurality of classes, it may generate a new class based on the characteristic of the pixel block and store the pixel block mapped to the new class.
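The creation of a new class can be sketched as follows. The representation of a class by a feature center, the use of mean and variance as the block characteristics, the Euclidean distance, and the max_distance cutoff are all assumptions chosen for illustration; the text only states that a new class is generated when the block matches no existing class.

```python
def block_features(block):
    """Characterize a pixel block by its mean and variance (assumed features)."""
    n = len(block)
    mean = sum(block) / n
    variance = sum((p - mean) ** 2 for p in block) / n
    return (mean, variance)

def assign_or_create_class(classes, block, max_distance):
    """Return the id of the matching class, creating a new class if none matches.

    classes maps a class id to {"center": feature tuple, "patches": list}.
    """
    features = block_features(block)
    best_id, best_dist = None, float("inf")
    for class_id, entry in classes.items():
        dist = sum((f - c) ** 2 for f, c in zip(features, entry["center"])) ** 0.5
        if dist < best_dist:
            best_id, best_dist = class_id, dist
    if best_id is not None and best_dist <= max_distance:
        return best_id
    # No class is close enough: create one and store the block mapped to it.
    new_id = max(classes, default=0) + 1
    classes[new_id] = {"center": features, "patches": [list(block)]}
    return new_id

classes = {1: {"center": (10.0, 0.0), "patches": []}}
print(assign_or_create_class(classes, [10, 10, 10, 10], max_distance=5.0))
print(assign_or_create_class(classes, [200, 220, 200, 220], max_distance=5.0))
```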

Here, the plurality of classes may be distinguished based on at least one of an average pixel value, pixel coordinates, variance, edge strength, edge direction, or color.

Step S1320 of obtaining the output image according to an embodiment of the present disclosure may include obtaining a weight for the texture patch based on the correlation between the obtained texture patch and the pixel block, and obtaining the output image by applying the weighted texture patch to the pixel block.
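A minimal sketch of the weighting step is shown below. The mapping from correlation to weight (negative correlations clipped to zero), the additive application of the patch, and the clamping to an 8-bit range are assumptions; the text specifies only that the weight is based on the correlation and that the weighted patch is applied to the pixel block.

```python
def weight_from_correlation(corr, max_weight=1.0):
    """Map a correlation in [-1, 1] to a weight in [0, max_weight] (assumed mapping)."""
    return max(0.0, min(1.0, corr)) * max_weight

def apply_patch(block, patch, weight, lo=0, hi=255):
    """Add the weighted texture patch to the block and clamp to the pixel range."""
    return [min(hi, max(lo, round(p + weight * t))) for p, t in zip(block, patch)]

w = weight_from_correlation(0.5)
print(apply_patch([100, 250], [10, 20], w))
```

With a correlation of 0.5, the first pixel rises from 100 to 105, and the second is clamped at 255.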

Here, the output image may be a 4K UHD (Ultra High Definition) image or an 8K UHD image.

However, the various embodiments of the present disclosure may of course be applied not only to an image processing apparatus but also to any electronic device capable of image processing, such as an image receiving device like a set-top box.

Meanwhile, the various embodiments described above may be implemented in a recording medium readable by a computer or a similar device, using software, hardware, or a combination thereof. In some cases, the embodiments described herein may be implemented by the processor 120 itself. In a software implementation, embodiments such as the procedures and functions described herein may be implemented as separate software modules. Each of the software modules may perform one or more of the functions and operations described herein.

Meanwhile, computer instructions for performing the processing operations of the image processing apparatus 100 according to the various embodiments of the present disclosure described above may be stored in a non-transitory computer-readable medium. When executed by a processor of a specific device, the computer instructions stored in the non-transitory computer-readable medium cause the specific device to perform the processing operations of the image processing apparatus 100 according to the various embodiments described above.

A non-transitory computer-readable medium is not a medium that stores data for a short moment, such as a register, cache, or memory, but a medium that stores data semi-permanently and can be read by a device. Specific examples of non-transitory computer-readable media include a CD, DVD, hard disk, Blu-ray disc, USB memory, memory card, and ROM.

Although preferred embodiments of the present disclosure have been shown and described above, the present disclosure is not limited to the specific embodiments described. Various modifications may of course be made by those of ordinary skill in the art to which the disclosure pertains without departing from the gist of the disclosure as claimed in the claims, and such modifications should not be understood separately from the technical idea or outlook of the present disclosure.

110: memory 120: processor

Claims

An image processing apparatus comprising:
a memory storing at least one instruction; and
a processor electrically connected to the memory,
wherein the processor,
by executing the instruction,
applies an input image to a learning network model,
obtains, from the learning network model, a texture patch corresponding to a pixel block included in the input image, and
obtains an output image by applying the texture patch to the pixel block, and
wherein the learning network model
stores a plurality of texture patches corresponding to each of a plurality of classes classified based on characteristics of pixel blocks,
identifies one of the plurality of classes based on a characteristic of the pixel block included in the input image,
outputs the texture patch corresponding to the identified class, and
updates at least one texture patch among the plurality of texture patches based on the input image.

The image processing apparatus of claim 1,
wherein the learning network model
compares a first similarity between the pixel block included in the input image and the identified class with a second similarity between the texture patch and the identified class to identify whether to update the texture patch.

The image processing apparatus of claim 2,
wherein the learning network model,
based on the first and second similarities, replaces the texture patch corresponding to the identified class with the pixel block, or adds the pixel block as a texture patch corresponding to the identified class.

The image processing apparatus of claim 2,
wherein the learning network model,
if the comparison shows that the first similarity is smaller than the second similarity, keeps the texture patch corresponding to the identified class, and
if the comparison shows that the first similarity is greater than the second similarity, updates the texture patch based on the pixel block.

The image processing apparatus of claim 2,
wherein the learning network model,
when there are a plurality of texture patches corresponding to the identified class, identifies one of the plurality of texture patches based on a correlation between the pixel block and each of the plurality of texture patches.

The image processing apparatus of claim 1,
wherein the learning network model
learns the texture patch based on at least one of a storage time of the texture patch corresponding to each of the plurality of classes or an application frequency of the texture patch.

The image processing apparatus of claim 1,
wherein the learning network model,
if the pixel block is identified as not corresponding to any of the plurality of classes based on the characteristic of the pixel block included in the input image, generates a new class based on the characteristic of the pixel block included in the input image and stores the pixel block mapped to the new class.

The image processing apparatus of claim 1,
wherein the learning network model identifies a class corresponding to each of a plurality of pixel blocks included in the input image, and
changes the size of the storage space of the memory corresponding to at least one of the plurality of classes based on the identification frequency of each of the plurality of classes.

The image processing apparatus of claim 8,
wherein the learning network model deletes from the memory a texture patch corresponding to a class identified fewer than a preset number of times based on the identification frequency, and allocates the storage space freed by the deletion of the texture patch to the remaining classes.
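
The frequency-based storage management can be sketched by modeling each class's storage as a number of patch slots. The slot abstraction, the round-robin distribution that favors the most frequently identified classes, and the name `reallocate` are assumptions; the claims only require that freed space go to the remaining classes.

```python
def reallocate(slots, hits, min_hits):
    """Drop rarely identified classes and hand their slots to the rest.

    slots: class id -> number of patch slots currently reserved
    hits:  class id -> how often the class was identified
    """
    keep = [c for c in slots if hits.get(c, 0) >= min_hits]
    if not keep:
        return dict(slots)  # nothing would remain to receive the space
    freed = sum(n for c, n in slots.items() if c not in keep)
    result = {c: slots[c] for c in keep}
    # Assumed policy: distribute freed slots one at a time, starting
    # with the most frequently identified class.
    order = sorted(keep, key=lambda c: hits[c], reverse=True)
    for i in range(freed):
        result[order[i % len(order)]] += 1
    return result
```

The total number of slots is preserved, so the overall memory footprint stays fixed while frequently seen classes gain capacity.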

The image processing apparatus of claim 1,
wherein the plurality of classes are classified based on at least one of an average pixel value, pixel coordinates, variance, edge intensity, edge direction, or color.
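
Several of the listed characteristics can be computed directly from a grayscale pixel block. The sketch below is illustrative only: it estimates edge intensity and direction from averaged forward differences, which is one simple choice among many, and the function name and dictionary keys are hypothetical. It assumes a block of at least 2x2 pixels.

```python
import math
from statistics import fmean, pvariance


def block_features(block):
    """Mean, variance, edge intensity and edge direction of a 2-D block."""
    h, w = len(block), len(block[0])
    flat = [p for row in block for p in row]
    # Average horizontal and vertical forward differences (assumed estimator).
    gx = fmean([block[y][x + 1] - block[y][x]
                for y in range(h) for x in range(w - 1)])
    gy = fmean([block[y + 1][x] - block[y][x]
                for y in range(h - 1) for x in range(w)])
    return {
        "mean": fmean(flat),
        "variance": float(pvariance(flat)),
        "edge_intensity": math.hypot(gx, gy),
        "edge_direction_deg": math.degrees(math.atan2(gy, gx)),
    }
```

A block whose values increase only from left to right yields a direction of 0 degrees under this convention, and one whose values increase only from top to bottom yields 90 degrees.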

The image processing apparatus of claim 1,
wherein the processor obtains a weight for the texture patch based on a correlation between the obtained texture patch and the pixel block, and
obtains the output image by applying the texture patch, to which the weight is applied, to the pixel block.
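
A minimal sketch of weighted patch application follows. It assumes that the weight is the Pearson correlation clipped to the range 0 to 1, that "applying" means adding the weighted patch to the block, and that pixels are 8-bit values; none of these details is fixed by the claim, and `pearson` and `apply_weighted_patch` are hypothetical names.

```python
import math


def pearson(block_a, block_b):
    """Pearson correlation between two equally sized 2-D blocks."""
    x = [float(p) for row in block_a for p in row]
    y = [float(p) for row in block_b for p in row]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return 0.0 if sx == 0.0 or sy == 0.0 else cov / (sx * sy)


def apply_weighted_patch(pixel_block, texture_patch):
    """Add the correlation-weighted patch to the block, clamped to 0..255."""
    weight = max(0.0, pearson(pixel_block, texture_patch))
    return [
        [min(255, max(0, round(p + weight * t))) for p, t in zip(prow, trow)]
        for prow, trow in zip(pixel_block, texture_patch)
    ]
```

Under these assumptions, a patch that is anti-correlated with the block receives a weight of zero and leaves the block unchanged, which avoids injecting texture that contradicts the local structure.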

The image processing apparatus of claim 1,
wherein the output image is a 4K UHD (Ultra High Definition) image or an 8K UHD image.

An image processing method of an image processing apparatus, the method comprising:
applying an input image to a learning network model to obtain, from the learning network model, a texture patch corresponding to a pixel block included in the input image; and
applying the texture patch to the pixel block to obtain an output image,
wherein the learning network model
stores a plurality of texture patches corresponding respectively to a plurality of classes classified based on a characteristic of a pixel block,
identifies one of the plurality of classes based on the characteristic of the pixel block included in the input image,
outputs the texture patch corresponding to the identified class, and
updates at least one texture patch among the plurality of texture patches based on the input image.
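
The overall flow of classifying a block, outputting the stored patch, applying it, and then updating the store from the input can be sketched with a toy stand-in for the learning network model. This is not the network described in the patent: classes are assumed to be bins of the block's mean value, one patch is kept per class, the patch is the block's zero-mean detail, and the update is unconditional. All names are hypothetical.

```python
from statistics import fmean


class TexturePatchModel:
    """Toy stand-in: one zero-mean texture patch per mean-value bin."""

    def __init__(self, bin_width=32):
        self.bin_width = bin_width
        self.patches = {}  # class id -> texture patch

    def classify(self, block):
        flat = [p for row in block for p in row]
        return int(fmean(flat) // self.bin_width)

    def lookup(self, block):
        """Output the patch of the identified class, or None if none is stored."""
        return self.patches.get(self.classify(block))

    def update(self, block):
        """Learn from the input: store the block's zero-mean detail."""
        mean = fmean([p for row in block for p in row])
        self.patches[self.classify(block)] = [[p - mean for p in row]
                                              for row in block]


def process_block(model, block):
    patch = model.lookup(block)
    if patch is None:
        out = [row[:] for row in block]
    else:
        out = [[min(255, max(0, round(p + t))) for p, t in zip(prow, trow)]
               for prow, trow in zip(block, patch)]
    model.update(block)  # updated after use, based on the input image
    return out
```

In this sketch a textured block seen first teaches the model a patch, and a later flat block of the same class has that texture added to it.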

The method of claim 13,
wherein the learning network model compares a first similarity, between the pixel block included in the input image and the identified class, with a second similarity, between the texture patch and the identified class, to identify whether to update the texture patch.

The method of claim 14,
wherein the learning network model, based on the first and second similarities, replaces the texture patch corresponding to the identified class with the pixel block, or adds the pixel block as a texture patch corresponding to the identified class.

The method of claim 14,
wherein the learning network model maintains the texture patch corresponding to the identified class if, based on the comparison result, the first similarity is smaller than the second similarity, and
updates the texture patch based on the pixel block if, based on the comparison result, the first similarity is greater than the second similarity.

The method of claim 14,
wherein the learning network model, when a plurality of texture patches correspond to the identified class, identifies one of the plurality of texture patches based on a correlation between the pixel block and each of the plurality of texture patches.

The method of claim 13,
wherein the learning network model learns the texture patch based on at least one of a storage time of the texture patch corresponding to each of the plurality of classes or an application frequency of the texture patch.

The method of claim 13,
wherein the learning network model, if the pixel block is identified, based on the characteristic of the pixel block included in the input image, as not corresponding to any one of the plurality of classes, generates a new class based on the characteristic of the pixel block, maps the pixel block to the new class, and stores the pixel block.

The method of claim 13,
wherein the plurality of classes are classified based on at least one of an average pixel value, pixel coordinates, variance, edge intensity, edge direction, or color.