KR20230009806A

KR20230009806A - An image processing apparatus and a method thereof

Info

Publication number: KR20230009806A
Application number: KR1020220039872A
Authority: KR
Inventors: 박재성; 김지만; 이천; 정성운
Original assignee: 삼성전자주식회사
Priority date: 2021-07-09
Filing date: 2022-03-30
Publication date: 2023-01-17

Abstract

Disclosed is an image processing method comprising: obtaining, from a first image, object information about an important object included in the first image; obtaining control information for image quality processing; and obtaining a second image by performing image quality processing on the important object in the first image based on the object information and the control information.

Description

An image processing apparatus and a method thereof

Various disclosed embodiments relate to an image processing device and an operating method thereof, and more particularly, to an image processing device that renders an image to suit a user's characteristics and an operating method thereof.

Visually impaired people have difficulty accurately recognizing the objects in an image when viewing it. They therefore often use a magnification function to view images. However, magnification also lowers the resolution of the image, so it cannot satisfy the viewing needs of visually impaired people.

In addition, visually impaired people tend to concentrate on recognizing the overall outline of an object in an image, or what the components within the object are, rather than on the fine detail of the entire image.

Accordingly, there is a need for a technology that enables visually impaired people to enjoy content by providing an image in which the objects can be recognized more accurately, according to the characteristics of their impairment.

Various embodiments are intended to provide an image processing device that performs image quality processing on an important object included in an image, and an operating method thereof.

Various embodiments are intended to provide an image processing device that performs image quality processing on an important object included in an image according to a user's preference, and an operating method thereof.

An image processing device according to an embodiment includes a memory that stores one or more instructions and a processor that executes the one or more instructions stored in the memory. By executing the one or more instructions, the processor may obtain, from a first image, object information about an important object included in the first image, obtain control information for image quality processing, and obtain a second image by performing image quality processing on the important object based on the object information and the control information.

In an embodiment, the object information may include information on at least one of the type, position, and size of the important object.
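The object information described above can be pictured as a small record. The following is a minimal sketch, assuming a rectangular bounding box in pixel coordinates; the class name `ObjectInfo` and its field names are hypothetical and do not come from the disclosure. Every field is optional because the text only requires at least one of type, position, and size.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ObjectInfo:
    """Hypothetical container for the object information (illustrative names only)."""
    kind: Optional[str] = None                  # type of the important object, e.g. "person"
    position: Optional[Tuple[int, int]] = None  # assumed (x, y) of the top-left corner, in pixels
    size: Optional[Tuple[int, int]] = None      # assumed (width, height), in pixels

    def bounding_box(self) -> Optional[Tuple[int, int, int, int]]:
        """Return (left, top, right, bottom), or None if position or size is missing."""
        if self.position is None or self.size is None:
            return None
        x, y = self.position
        w, h = self.size
        return (x, y, x + w, y + h)


info = ObjectInfo(kind="person", position=(40, 20), size=(100, 200))
print(info.bounding_box())
```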

In an embodiment, by executing the one or more instructions, the processor may detect a plurality of objects from the first image, output object identification information indicating the plurality of objects, and identify, as the important object, an object selected by a user in response to the output of the object identification information.

In an embodiment, the control information may include control information on at least one of whether to enlarge the object, the degree of enlargement of the object, contour processing, and flattening processing.

In an embodiment, by executing the one or more instructions, the processor may perform, according to the control information, at least one of upscaling the object, processing the contour around the object, and flattening the inside of the object.
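As an illustration of how control information could select among the three operations, the sketch below maps a control record to an ordered list of processing steps. The `ControlInfo` class, its field names, and the `plan_processing` helper are assumptions made for this example; the disclosure does not prescribe a data layout or an order of operations.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ControlInfo:
    """Hypothetical control information (field names are illustrative)."""
    enlarge: bool = False   # whether to enlarge the object
    scale: float = 1.0      # degree of enlargement
    contour: bool = False   # whether to apply contour processing
    flatten: bool = False   # whether to flatten the inside of the object


def plan_processing(control: ControlInfo) -> List[str]:
    """Return the processing steps selected by the control information."""
    steps: List[str] = []
    if control.enlarge and control.scale > 1.0:
        steps.append(f"upscale x{control.scale:g}")
    if control.contour:
        steps.append("contour")
    if control.flatten:
        steps.append("flatten")
    return steps


print(plan_processing(ControlInfo(enlarge=True, scale=2.0, flatten=True)))
```

An empty list means the object is left unprocessed, which matches the "at least one of" wording only when some control is actually set.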

In an embodiment, the control information may include at least one of inferred control information and real-time user control information. The inferred control information may be obtained from the user's previous control history information for previous images, and the real-time user control information may include the user's real-time control information for the first image.

In an embodiment, by executing the one or more instructions, the processor may obtain the second image from the first image using a neural network. The neural network may be a neural network trained on a training data set comprising an input image, an object region of interest to the user in the input image, and a ground truth image obtained by performing image quality processing on that object region.

In an embodiment, the neural network may obtain, from at least one of the first image, the control information, and the object information, a second image in which the object has undergone image quality processing.

In an embodiment, the neural network may obtain a flattening parameter and a contour parameter for image quality processing of the object from at least one of the first image, the user control information, and the object information. The processor may obtain the quality-processed second image by adjusting, according to a user control signal, the degree of blending between an image processed according to the flattening parameter and an image processed according to the contour parameter.

In an embodiment, the image quality processing may include at least one of contour processing of the object, flattening of the inside of the object, and upscaling of the object. The contour processing may include processing at least one of the detail, intensity, and color of the object's contour; the flattening may include adjusting the degree of flattening inside the object; and the upscaling may include enlarging the object while maintaining its resolution.
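To make the flattening and contour operations concrete, here is a minimal sketch on a grayscale image stored as nested lists. It is not the disclosed method, which may rely on a neural network: flattening is approximated by pulling masked pixels toward the region mean, and the contour is approximated by painting masked pixels that touch a non-masked 4-neighbour. The function names, the boolean object mask, and the `strength` and `value` parameters are all assumptions.

```python
from typing import List

Image = List[List[float]]
Mask = List[List[bool]]


def flatten_inside(image: Image, mask: Mask, strength: float) -> Image:
    """Pull masked pixels toward the region mean; strength 0 keeps detail, 1 removes it."""
    inside = [v for row, mrow in zip(image, mask) for v, m in zip(row, mrow) if m]
    if not inside:
        return [row[:] for row in image]
    mean = sum(inside) / len(inside)
    return [
        [(1 - strength) * v + strength * mean if m else v for v, m in zip(row, mrow)]
        for row, mrow in zip(image, mask)
    ]


def draw_contour(image: Image, mask: Mask, value: float) -> Image:
    """Paint masked pixels that touch a non-masked 4-neighbour or the image border."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(h):
        for c in range(w):
            if not mask[r][c]:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if any(not (0 <= nr < h and 0 <= nc < w) or not mask[nr][nc]
                   for nr, nc in neighbours):
                out[r][c] = value
    return out


# A 5x5 ramp image with a 3x3 object in the middle.
image = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
mask = [[1 <= r <= 3 and 1 <= c <= 3 for c in range(5)] for r in range(5)]
flat = flatten_inside(image, mask, strength=1.0)
outlined = draw_contour(flat, mask, value=255.0)
```

The contour intensity or color mentioned in the text would correspond to the `value` argument here; a thicker contour could be obtained by widening the neighbourhood test.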

An image processing method performed by an image processing device according to an embodiment may include: obtaining, from a first image, object information about an important object included in the first image; obtaining control information for image quality processing; and obtaining a second image by performing image quality processing on the important object in the first image based on the object information and the control information.

A computer-readable recording medium according to an embodiment may be a computer-readable recording medium on which is recorded a program for implementing an image processing method that includes: obtaining, from a first image, object information about an important object included in the first image; obtaining control information for image quality processing; and obtaining a second image by performing image quality processing on the important object in the first image based on the object information and the control information.

An image processing device and an operating method thereof according to an embodiment may perform image quality processing on an important object included in an image.

An image processing device and an operating method thereof according to an embodiment may perform image quality processing on an important object included in an image according to a user's preference.

FIG. 1 is a diagram for explaining how an image processing device outputs a quality-processed image, according to an embodiment.
FIG. 2 is an internal block diagram of an image processing device according to an embodiment.
FIG. 3 is a diagram for explaining how a user inputs visual impairment information using an image processing device, according to an embodiment.
FIG. 4 is an internal block diagram of the processor of FIG. 2 according to an embodiment.
FIG. 5 is a diagram for explaining how an object information acquisition unit obtains object information from an input image, according to an embodiment.
FIG. 6 is a diagram for explaining how a control information acquisition unit obtains inferred control information, according to an embodiment.
FIG. 7 is a diagram for explaining how an image quality processing unit obtains a quality-processed output image from an input image, according to an embodiment.
FIG. 8 is a diagram for explaining how an image processing device receives a user's selection of an important object, according to an embodiment.
FIG. 9 is an internal block diagram of an image processing device according to an embodiment.
FIG. 10 is a diagram for explaining the image quality processing that an image processing device performs on an input image, according to an embodiment.
FIG. 11 is a diagram illustrating a user interface screen that includes an image quality processing function, according to an embodiment.
FIG. 12 is a diagram illustrating a result image obtained when an image processing device performs image quality processing on an input image, according to an embodiment.
FIG. 13 is a flowchart illustrating a method of obtaining, from a first image, a second image in which an important object has undergone image quality processing, according to an embodiment.
FIG. 14 is a flowchart illustrating image quality processing performed according to user control information, according to an embodiment.

In this disclosure, the expression "at least one of a, b, or c" may refer to "a", "b", "c", "a and b", "a and c", "b and c", "all of a, b, and c", or variations thereof.

Hereinafter, embodiments of the present disclosure are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present disclosure pertains can readily implement them. However, the present disclosure may be implemented in many different forms and is not limited to the embodiments described herein.

The terms used in the present disclosure are general terms currently in use, chosen in consideration of the functions referred to in the present disclosure, but they may correspond to various other terms depending on the intention of those skilled in the art, legal precedent, the emergence of new technologies, and the like. Therefore, the terms used in the present disclosure should not be interpreted by their names alone, but on the basis of their meanings and the overall content of the present disclosure.

In addition, the terms used in the present disclosure are used only to describe specific embodiments and are not intended to limit the present disclosure.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "electrically connected" with another element interposed between them.

As used in this specification, and particularly in the claims, "the" and similar referring terms may refer to both the singular and the plural. In addition, unless the order of the steps describing a method according to the present disclosure is explicitly specified, the described steps may be performed in any suitable order. The present disclosure is not limited by the order in which the steps are described.

Phrases such as "in some embodiments" or "in an embodiment" that appear in various places in this specification do not necessarily all refer to the same embodiment.

Some embodiments of the present disclosure may be represented as functional block configurations and various processing steps. Some or all of these functional blocks may be implemented as any number of hardware and/or software components that perform specific functions. For example, the functional blocks of the present disclosure may be implemented by one or more microprocessors, or by circuit configurations for a given function. The functional blocks may also be implemented in various programming or scripting languages, and as algorithms running on one or more processors. In addition, the present disclosure may employ conventional techniques for electronic configuration, signal processing, and/or data processing. Terms such as "mechanism", "element", "means", and "configuration" are used broadly and are not limited to mechanical and physical components.

In addition, the connecting lines or connecting members between components shown in the drawings merely illustrate functional connections and/or physical or circuit connections. In an actual device, connections between components may be represented by various replaceable or additional functional, physical, or circuit connections.

In addition, terms such as "... unit" and "module" used in the specification refer to a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software.

In addition, the term "user" in the specification refers to a person who uses the image processing device, and may include a consumer, an evaluator, a viewer, an administrator, or an installation engineer. Here, a consumer is a person who watches images using the image processing device, and may include a visually impaired person.

Hereinafter, the present disclosure is described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram for explaining how the image processing device 100 outputs a quality-processed image, according to an embodiment.

Referring to FIG. 1, the image processing device 100 may be an electronic device capable of processing and outputting an image. According to an example, the image processing device 100 may be implemented as any of various types of electronic devices that include a display. The image processing device 100 may be stationary or mobile, and may be a digital TV capable of receiving digital broadcasts, but is not limited thereto. The image processing device 100 may output video.

In an embodiment, the image processing device 100 may include an image quality processing module. The image quality processing module may provide a viewing assistance function. The viewing assistance function may refer to processing the image quality of a video and/or its frames so that a user with a visual impairment can better perceive the content.

The image quality processing module may be manufactured in the form of at least one hardware chip and mounted in the image processing device 100, or may be included in the image processing device 100 in the form of a chip or a device. Alternatively, the image quality processing module may be implemented as a software module in the image processing device 100.

According to an embodiment, the image processing device 100 may perform image quality processing using the image quality processing module. Before outputting the first image 110 included in a video through the display, the image processing device 100 may first perform image quality processing on the first image 110 using the image quality processing module. The image processing device 100 may perform image quality processing on each of the plurality of frames included in the video.

In an embodiment, the image processing device 100 includes a memory that stores one or more instructions and a processor that executes the one or more instructions stored in the memory. By executing the one or more instructions, the processor may perform image quality processing on an important object based on object information and control information.

To this end, the image processing device 100 may obtain, from the first image 110, object information about an important object included in the first image 110.

In an embodiment, an important object is an object of interest that draws the user's attention, and may refer to the object that is to undergo image quality processing.

In an embodiment, the object information may include information on at least one of the type, position, and size of the important object.

When an image includes a plurality of objects, the image processing device 100 may identify the important object in the image using various methods. For example, the image processing device 100 may identify, as the important object, the object that appears most often in the video containing the image. Alternatively, the image processing device 100 may identify the object that is currently speaking as the important object. Alternatively, the image processing device 100 may identify, as the important object, an object that is located at the center of the image and whose size is at or above a reference value.
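The three heuristics above can be sketched as interchangeable selection strategies. This is an illustrative sketch only: the `Detected` record, the strategy names, and the threshold defaults (`max_offset`, `min_area`) are assumptions, and the disclosure does not say how appearance counts, speaking state, or the reference size are measured.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detected:
    """Hypothetical per-object measurements for one frame."""
    label: str
    appearances: int      # frames of the video in which the object appears
    speaking: bool        # whether the object is currently speaking
    center_offset: float  # distance from the frame centre, normalised to [0, 1]
    area_ratio: float     # object area divided by frame area


def pick_important(objects: List[Detected], strategy: str,
                   max_offset: float = 0.2, min_area: float = 0.1) -> Optional[Detected]:
    """Select the important object using one of the three heuristics."""
    if not objects:
        return None
    if strategy == "most_frequent":
        return max(objects, key=lambda o: o.appearances)
    if strategy == "speaking":
        return next((o for o in objects if o.speaking), None)
    if strategy == "central_large":
        candidates = [o for o in objects
                      if o.center_offset <= max_offset and o.area_ratio >= min_area]
        return max(candidates, key=lambda o: o.area_ratio) if candidates else None
    raise ValueError(f"unknown strategy: {strategy}")


objects = [
    Detected("anchor", appearances=120, speaking=True, center_offset=0.05, area_ratio=0.30),
    Detected("guest", appearances=40, speaking=False, center_offset=0.40, area_ratio=0.25),
]
print(pick_important(objects, "speaking").label)
```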

In an embodiment, the image processing device 100 may also identify an object selected by the user as the important object. For example, when the image processing device 100 detects a plurality of objects in an image, it may output, near each detected object, object identification information that marks that object. In response to the object identification information being output on the screen, the user may select one of the plurality of pieces of object identification information. The image processing device 100 may identify the object corresponding to the object identification information selected by the user as the important object.
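The selection flow above amounts to a lookup from displayed identifiers to detected objects. A minimal sketch, assuming the identification information is a number shown next to each object; the function names and the numbering scheme are hypothetical.

```python
from typing import Dict, List, Optional


def assign_identifiers(detected: List[str]) -> Dict[int, str]:
    """Give each detected object a number that could be shown next to it on screen."""
    return {number: name for number, name in enumerate(detected, start=1)}


def resolve_selection(identifiers: Dict[int, str], selected: int) -> Optional[str]:
    """Return the object matching the identifier the user chose, or None if invalid."""
    return identifiers.get(selected)


identifiers = assign_identifiers(["person", "dog", "car"])
print(resolve_selection(identifiers, 2))
```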

In an embodiment, the image processing device 100 may obtain control information for image quality processing. In an embodiment, the control information may be information representing a user control command for the image quality processing of the important object. The control information may include control information on at least one of whether and how much to enlarge the important object, contour processing, and flattening processing.

In an embodiment, the image processing device 100 may perform image quality processing on the important object according to the control information.

In an embodiment, the image processing device 100 may perform image quality processing on the important object so that a user with a visual impairment can better perceive the content.

In an embodiment, the image quality processing may include at least one of upscaling the object, processing the contour around the object, and flattening the inside of the object.

In an embodiment, the control information may include at least one of inferred control information and real-time user control information.

In an embodiment, the inferred control information may refer to control information inferred in consideration of the tendencies of people with visual impairments or of the user.

In an embodiment, the inferred control information may be obtained based on the general cognitive characteristics or preferences of people with visual impairments. Although this may vary with the degree of impairment, people with visual impairments generally tend to want to see important objects enlarged, prefer that detailed rendering inside an object be omitted, and prefer images in which object contours are emphasized more clearly. Accordingly, the image processing device 100 may take these cognitive characteristics and preferences into account to infer control information on how to handle enlargement, contour processing, detail processing, and the like for the important object included in the image currently output on the screen, that is, the first image 110 of FIG. 1.

In some cases, a user may have tastes or preferences that differ in part from those of visually impaired people in general. Accordingly, in an embodiment, the inferred control information may also be obtained based solely on the past control history of the user of the image processing device 100, rather than on visually impaired people in general. The image processing device 100 may accumulate information about which control commands the user issued when particular images were output, and infer the user's preferences or tastes from it.
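One simple way to turn an accumulated control history into inferred control information is to summarise past commands statistically. The sketch below uses the median enlargement factor and a majority vote for the on/off controls; this choice, the dictionary keys, and the `infer_control` name are assumptions for illustration, and the disclosure leaves the inference method open (it could equally be a learned model).

```python
from statistics import median
from typing import Dict, List


def infer_control(history: List[Dict], default_scale: float = 1.0) -> Dict:
    """Summarise past user commands into one inferred control record."""
    if not history:
        # With no history, fall back to leaving the image unprocessed.
        return {"scale": default_scale, "contour": False, "flatten": False}

    def majority(key: str) -> bool:
        # True when the control was enabled in more than half of the past commands.
        return sum(1 for h in history if h[key]) * 2 > len(history)

    return {
        "scale": median(h["scale"] for h in history),
        "contour": majority("contour"),
        "flatten": majority("flatten"),
    }


history = [
    {"scale": 1.5, "contour": True, "flatten": False},
    {"scale": 2.0, "contour": True, "flatten": False},
    {"scale": 3.0, "contour": False, "flatten": True},
]
print(infer_control(history))
```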

Alternatively, in an embodiment, the inferred control information may be obtained based on information about the user's preferences that the user has entered in advance using the image processing device 100.

In an embodiment, the real-time user control information may refer to the user's control information for the image currently output on the screen. It differs from the inferred control information in that it represents the user's real-time control of the first image 110 currently output on the screen. The user may sometimes want to control the important object included in the currently output first image 110 differently than in the past. For example, the user may want to see the important object in the first image 110 enlarged more than before. In this case, the user may input user control information to the image processing device 100 in real time using a remote control or the like. The image processing device 100 may receive the real-time user control information from the user and accordingly upscale the important object further and output it.
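The relationship between the two kinds of control information can be sketched as an override: real-time input, where present, replaces the inferred value for the current image. The precedence rule, the dictionary keys, and the `effective_control` name are assumptions; the disclosure only states that the control information includes at least one of the two.

```python
from typing import Dict, Optional, Union

Value = Union[bool, float]


def effective_control(inferred: Dict[str, Value],
                      realtime: Optional[Dict[str, Optional[Value]]] = None) -> Dict[str, Value]:
    """Start from the inferred controls and apply any real-time user overrides."""
    merged = dict(inferred)  # copy so the inferred record is not modified
    for key, value in (realtime or {}).items():
        if value is not None:  # None means the user did not touch this control
            merged[key] = value
    return merged


inferred = {"scale": 1.5, "contour": True, "flatten": True}
# The user asks, via remote control, for a larger enlargement on this image only.
print(effective_control(inferred, {"scale": 2.5}))
```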

In an embodiment, the image processing device 100 may obtain, from the first image 110, a second image 120 in which the important object has undergone image quality processing.

In an embodiment, the image quality processing may include at least one of contour processing of the important object, flattening of the inside of the important object, and upscaling of the important object.

In an embodiment, the image processing device 100 may obtain the second image 120 from the first image 110 using artificial intelligence (AI) technology. AI technology may consist of machine learning (deep learning) and element technologies that utilize machine learning. AI technology may be implemented using algorithms. Here, an algorithm or set of algorithms for implementing AI technology is referred to as a neural network. A neural network may receive input data, perform computations for analysis and classification, and output result data.

In an embodiment, the image processing device 100 may use a neural network to perform image quality processing on the important object included in the first image 110.

In an embodiment, the neural network may be a neural network trained on a training data set comprising an input image, an object region of interest to the user in the input image, the user's control information for image quality processing, and a ground truth image obtained by performing image quality processing on the object region of interest to the user.

In an embodiment, the neural network may obtain, from at least one of the first image 110, the control information, and the object information, the second image 120 in which the important object has undergone image quality processing.

In an embodiment, the neural network may obtain the second image 120 by performing at least one of upscaling the important object, rendering the object outline thicker, rendering the object outline in a specific color, or flattening the inside of the object.

In an embodiment, when the entire first image 110 is selected as the important object and quality-processed, the second image 120 may be an image in which the entire first image 110 has been quality-processed. Alternatively, when only the important object region, rather than the entire first image 110, is quality-processed, the second image 120 may be an image including only the quality-processed important object region. Alternatively, the second image 120 may be an image in which only the important object region of the first image 110 is quality-processed and the remaining region outside the important object region is identical to the first image 110.

In an embodiment, the neural network may obtain a flattening parameter and an outline parameter for image quality processing of the object from at least one of the first image 110, user control information, and object information. The image processing device 100 may receive, from the user, a selection of a control signal for blending between the parameters, and may accordingly adjust the degree of blending between the image processed according to the flattening parameter and the image processed according to the outline parameter to obtain the quality-processed second image.
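By way of illustration only, the user-controlled blending described above might be sketched as a per-pixel linear mix. The function name `blend_images`, the `alpha` parameter, and the representation of an image as rows of grayscale intensities are assumptions made for this sketch and do not appear in the disclosure, which does not specify the blending formula.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities (assumed)


def blend_images(flattened: Image, outlined: Image, alpha: float) -> Image:
    """Linearly blend two same-sized images.

    alpha = 0.0 returns the flattened image, alpha = 1.0 the outlined image.
    alpha stands in for the user's blending control signal (hypothetical).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    same_rows = len(flattened) == len(outlined)
    if not same_rows or any(len(a) != len(b) for a, b in zip(flattened, outlined)):
        raise ValueError("images must have the same shape")
    return [
        [(1.0 - alpha) * f + alpha * o for f, o in zip(row_f, row_o)]
        for row_f, row_o in zip(flattened, outlined)
    ]
```

For example, `blend_images(flat, outl, 0.25)` weights the flattened result three times as heavily as the outlined result.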

Alternatively, in another embodiment, the neural network may obtain the flattening parameter and the outline parameter for image quality processing of the object and then automatically adjust the degree of blending of the flattening parameter and the outline parameter to obtain the quality-processed second image as a result.

이와 같이, 실시 예에 의하면, 영상 처리 장치(100)는 사용자 특성에 맞게 중요 객체를 화질 처리할 수 있다. 영상 처리 장치(100)는 중요 객체에 대한 객체 정보 및 화질 처리를 위한 제어 정보를 기반으로, 영상에 포함된 중요 객체를 렌더링할 수 있다. 따라서, 사용자는 관심 대상이 되는 중요 객체 영역이 화질 처리된 영상을 시청할 수 있게 되어, 영상에 대한 시청 만족도가 향상될 수 있다. In this way, according to the embodiment, the image processing device 100 may process the quality of important objects according to user characteristics. The image processing device 100 may render an important object included in an image based on object information about the important object and control information for image quality processing. Accordingly, the user can watch the image in which the quality of the important object region of interest has been processed, and thus the satisfaction with viewing the image can be improved.

However, this is merely one embodiment, and the image quality processing module may be implemented as a device separate from the image processing device 100 rather than being included in it. That is, the image processing device 100 may communicate with an external device or server including the image quality processing module through a communication network (not shown). In this case, the image processing device 100 may transmit video to the external device or server through the communication network. The image processing device 100 may also transmit a control command to the external device or server. The external device or server may receive a video including a plurality of frames from the image processing device 100 and detect an important object region from the image using the image quality processing module. The image quality processing module may perform image quality processing on the important object region according to the control command. The image quality processing module may perform image quality processing on an important object included in a frame using a neural network. The external device or server may transmit the quality-processed image back to the image processing device 100 through the communication network. The image processing device 100 may output the quality-processed image on a screen.

이와 같이, 실시 예에 의하면, 영상 처리 장치(100)는 비디오에서 중요 객체 영역이 화질 처리된 영상을 사용자에게 출력할 수 있다. In this way, according to an embodiment, the image processing device 100 may output an image in which the important object region in the video has been quality-processed to the user.

FIG. 2 is an internal block diagram of the image processing device 100 according to an embodiment.

도 2에 도시된 영상 처리 장치(100)는 도 1의 영상 처리 장치(100)일 수 있다.The image processing device 100 shown in FIG. 2 may be the image processing device 100 of FIG. 1 .

In an embodiment, the image processing device 100 may include at least one of a desktop computer, a smartphone, a tablet personal computer (tablet PC), a mobile phone, a video phone, an e-book reader, a laptop personal computer (laptop PC), a netbook computer, a digital camera, a personal digital assistant (PDA), a portable multimedia player (PMP), a camcorder, a navigation device, a wearable device, a smart watch, a home network system, a security system, and a medical device.

영상 처리 장치(100)는 평면(flat) 디스플레이 장치뿐 아니라, 곡률을 가지는 화면인 곡면(curved) 디스플레이 장치 또는 곡률을 조정 가능한 가변형(flexible) 디스플레이 장치로 구현될 수 있다. 영상 처리 장치(100)의 출력 해상도는 예를 들어, HD(High Definition), Full HD, Ultra HD, 또는 Ultra HD 보다 더 선명한 해상도 등과 같이 다양한 해상도를 가질 수 있다. The image processing device 100 may be implemented as a curved display device, which is a screen having a curvature, or a flexible display device capable of adjusting the curvature, as well as a flat display device. The output resolution of the image processing device 100 may have various resolutions, such as, for example, High Definition (HD), Full HD, Ultra HD, or a resolution sharper than Ultra HD.

영상 처리 장치(100)는 비디오를 출력할 수 있다. 비디오는 복수의 프레임들로 구성될 수 있다. 비디오는, 콘텐츠 프로바이더들(contents providers)이 제공하는 텔레비전 프로그램이나 VOD 서비스를 통한 각종 영화나 드라마 등의 아이템을 포함할 수 있다. 콘텐츠 프로바이더는 소비자에게 비디오를 포함한 각종 콘텐츠를 제공하는 지상파 방송국이나 케이블 방송국, 또는 OTT 서비스 제공자, IPTV 서비스 제공자를 의미할 수 있다.The image processing device 100 may output video. A video may consist of a plurality of frames. Videos may include items such as television programs provided by content providers or various movies or dramas through a VOD service. A content provider may mean a terrestrial broadcasting station or a cable broadcasting station that provides various contents including video to consumers, an OTT service provider, or an IPTV service provider.

Referring to FIG. 2, the image processing device 100 may include a processor 110 and a memory 120.

실시 예에 따른 메모리(120)는, 적어도 하나의 인스트럭션을 저장할 수 있다. 메모리(120)는 프로세서(110)가 실행하는 적어도 하나의 프로그램을 저장하고 있을 수 있다. 메모리(120)에는 적어도 하나의 뉴럴 네트워크 및/또는 기 정의된 동작 규칙이나 AI 모델이 저장될 수 있다. 또한 메모리(120)는 영상 처리 장치(100)로 입력되거나 영상 처리 장치(100)로부터 출력되는 데이터를 저장할 수 있다.The memory 120 according to an embodiment may store at least one instruction. The memory 120 may store at least one program executed by the processor 110 . At least one neural network and/or predefined operating rules or AI models may be stored in the memory 120 . Also, the memory 120 may store data input to or output from the image processing device 100 .

The memory 120 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card type memory (for example, SD or XD memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), a magnetic memory, a magnetic disk, and an optical disk.

실시 예에서, 메모리(120)에는 영상 처리를 수행하기 위한 하나 이상의 인스트럭션이 저장될 수 있다.In an embodiment, one or more instructions for performing image processing may be stored in the memory 120 .

실시 예에서, 메모리(120)에는 영상에 포함된 중요 객체에 대한 객체 정보를 획득하기 위한 하나 이상의 인스트럭션이 저장될 수 있다. In an embodiment, one or more instructions for obtaining object information about an important object included in an image may be stored in the memory 120 .

실시 예에서, 메모리(120)에는 화질 처리를 위한 제어 정보를 획득하기 위한 하나 이상의 인스트럭션이 저장될 수 있다.In an embodiment, one or more instructions for obtaining control information for picture quality processing may be stored in the memory 120 .

In an embodiment, the memory 120 may store one or more instructions for performing image quality processing on an important object based on the object information and the control information.

실시 예에서, 메모리(120)에는 적어도 하나의 뉴럴 네트워크 및/또는 기 정의된 동작 규칙이나 AI 모델이 저장될 수 있다. In an embodiment, at least one neural network and/or a predefined operating rule or AI model may be stored in the memory 120 .

In an embodiment, the neural network stored in the memory 120 may be a neural network trained on a training data set comprising an input image, an object region of interest to the user in the input image, and a ground truth image obtained by performing image quality processing on the object region of interest to the user.

실시 예에서, 메모리(120)에 저장된 뉴럴 네트워크는 제1 영상으로부터 중요 객체 영역을 화질 처리하여 제2 영상을 획득할 수 있다. In an embodiment, the neural network stored in the memory 120 may obtain a second image by performing image quality processing on the important object region from the first image.

프로세서(110)는 영상 처리 장치(100)의 전반적인 동작을 제어한다. 프로세서(110)는 메모리(120)에 저장된 하나 이상의 인스트럭션을 실행함으로써, 영상 처리 장치(100)가 기능하도록 제어할 수 있다. The processor 110 controls overall operations of the image processing device 100 . The processor 110 may control the image processing device 100 to function by executing one or more instructions stored in the memory 120 .

실시 예에서, 프로세서(110)는 영상으로부터 영상에 포함된 중요 객체에 대한 객체 정보를 획득할 수 있다. In an embodiment, the processor 110 may obtain object information about an important object included in the image from the image.

In an embodiment, when the image includes a plurality of objects, the processor 110 may identify, as the important object, the object that appears most frequently in the video, the object that is currently speaking, or an object located in the region of interest.

또는, 실시 예에서, 프로세서(110)는 복수 객체를 가리키는 객체 식별 정보가 화면에 출력되도록 하고, 사용자로부터 선택된 객체 식별 정보에 대응하는 객체를 중요 객체로 식별할 수 있다. Alternatively, in an embodiment, the processor 110 may display object identification information indicating a plurality of objects on a screen, and may identify an object corresponding to the object identification information selected by the user as an important object.

실시 예에서, 프로세서(110)는 화질 처리를 위한 제어 정보를 획득할 수 있다. In an embodiment, the processor 110 may obtain control information for image quality processing.

실시 예에서, 프로세서(110)는 객체 정보 및 제어 정보를 기반으로, 중요 객체에 대한 화질 처리를 수행하여 화질 처리된 영상을 획득할 수 있다. In an embodiment, the processor 110 may acquire a quality-processed image by performing quality processing on an important object based on object information and control information.

실시 예에서, 프로세서(110)는, 뉴럴 네트워크를 이용하여, 제1 영상, 제어 정보, 및 객체 정보 중 적어도 하나로부터, 중요 객체를 화질 처리할 수 있다. In an embodiment, the processor 110 may process the quality of the important object from at least one of the first image, control information, and object information using a neural network.

실시 예에서, 뉴럴 네트워크는 중요 객체의 업스케일링, 중요 객체 주위의 윤곽선 처리, 및 중요 객체 내부 평탄화 처리 중 적어도 하나를 수행할 수 있다. In an embodiment, the neural network may perform at least one of upscaling the important object, processing an outline around the important object, and flattening the inside of the important object.

FIG. 3 is a diagram for explaining how a user inputs visual impairment information using the image processing device 100 according to an embodiment.

In an embodiment, the image processing device 100 may determine the degree of the user's visual impairment in advance in order to provide a viewing assistance function. When the image processing device 100 obtains inference control information using a neural network, the neural network needs time to learn the user's interaction history. That is, the neural network cannot infer reliable control information until it has become an optimized model through training. Therefore, the image processing device 100 may train the learning model faster based on the visual impairment information input by the user, and thereby obtain more reliable inference control information in a short time.

실시 예에서, 영상 처리 장치(100)는 사용자의 시각 장애 정도를 파악하기 위해, 시각 장애가 있는 사용자에게 시각 장애 정보를 직접 입력하게 할 수 있다. In an embodiment, the image processing device 100 may allow a visually impaired user to directly input information about the visual impairment in order to determine the degree of the user's visual impairment.

또는, 실시 예에서, 영상 처리 장치(100)는 사용자에게 예시 화면을 제공하여 사용자가 선호하는 예시를 선택하게 함으로써 사용자의 시각 장애 수준을 파악할 수도 있다. Alternatively, in an embodiment, the image processing device 100 may determine the user's level of visual impairment by providing an example screen to the user and allowing the user to select a preferred example.

In an embodiment, when the user chooses to activate the viewing assistance function, the image processing device 100 may show the user various images in order to determine the degree of the user's visual impairment. The user may provide user interaction while viewing effect images to which the various functions provided by the image processing device 100 have been applied. For example, the user may input information about the image quality processing the user prefers with respect to object size, outline processing, flattening, and the like.

FIG. 3A illustrates the image processing device 100 outputting a first base image 300 and a plurality of images 301, 303, and 305 generated by processing the outline of an object included in the first base image 300.

In an embodiment, the image processing device 100 may variously modify the outline of the object included in the first base image 300 according to whether the outline is preserved, whether the texture inside the object is processed, whether noise is removed, the strength of the outline, the level of detail of the outline, and the like. The strength of the outline may refer to how thick or how dark the outline is.

사용자는 도 3a에 도시된 복수 영상들(301, 303, 305)을 보고, 이 중 인지하기 용이하거나 또는 사용자가 선호하는 윤곽선을 가진 영상을 선택할 수 있다. The user may view the plurality of images 301, 303, and 305 shown in FIG. 3A and select an image having an outline that is easy to recognize or the user prefers.

영상 처리 장치(100)는 사용자가 선택한 영상에 대해 수행된 윤곽선 처리에 대한 정보를 저장하고, 이를 추론 제어 정보로 이용하여, 향후 영상에 포함된 객체에 대한 윤곽선 처리 시 이용할 수 있다. The image processing device 100 may store information about contour processing performed on an image selected by a user, and use the information as inference control information to process contours of objects included in an image in the future.

Users with visual impairments tend to prefer that only the outline or overall shape of an object be preserved, rather than the fine details in the image. In an embodiment, the image processing device 100 may receive from the user a selection of the degree to which detail inside the object is removed, that is, the degree of flattening. Flattening the inside of an object may mean removing texture or detailed rendering inside the object so that the inside of the object becomes blurred or smeared. A greater degree of flattening may mean that more texture or detail is removed, so that the inside of the object is smeared more heavily.

The image processing device 100 may perform flattening on the second base image 310 according to the degree of flattening selected by the user and output the resulting image to the user.
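As a rough illustration of flattening with a selectable degree, the sketch below smooths only the pixels inside an object mask with a box filter whose radius grows with the selected degree. The box filter is a substitute chosen for simplicity; the disclosure performs the processing with a neural network and does not specify a filter. The names `flatten_object`, `mask`, and `degree` are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities (assumed)
Mask = List[List[bool]]    # True where a pixel belongs to the important object


def flatten_object(image: Image, mask: Mask, degree: int) -> Image:
    """Smooth pixels inside `mask` with a (2 * degree + 1)-wide box filter.

    Only masked neighbours contribute to each average, so the background does
    not bleed into the object. Pixels outside the mask are returned unchanged.
    degree = 0 leaves the image as it is.
    """
    if degree < 0:
        raise ValueError("degree must be non-negative")
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            total, count = 0.0, 0
            for ny in range(max(0, y - degree), min(height, y + degree + 1)):
                for nx in range(max(0, x - degree), min(width, x + degree + 1)):
                    if mask[ny][nx]:
                        total += image[ny][nx]
                        count += 1
            out[y][x] = total / count  # count >= 1: the pixel itself is masked
    return out
```

A larger `degree` averages over a wider neighbourhood and therefore removes more texture inside the object, which corresponds to the greater degree of flattening described above.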

FIG. 3B illustrates the image processing device 100 outputting a second base image 310 and an image 313 obtained by flattening an object included in the second base image 310 according to the degree of flattening selected by the user.

Referring to FIG. 3B, it can be seen that, unlike the second base image 310, the flattened image 313 has the texture and patterns inside the petals removed.

The user may view the flattened image 313 and either confirm the degree of flattening or select the degree of flattening again. When the user selects the degree of flattening again, the image processing device 100 may perform flattening on the second base image 310 again according to the newly selected degree and output the resulting image to the user.

The image processing device 100 may store information about the degree of flattening selected by the user and use it as inference control information. When performing image quality processing on an image in the future, the image processing device 100 may flatten an object included in the image using the information about the degree of flattening selected by the user.

Although not shown in FIG. 3, in an embodiment, the image processing device 100 may receive from the user a selection of the size to which an important object included in the base image is to be enlarged. The image processing device 100 may store information about the degree of enlargement selected by the user and use it as inference control information, so that when it later performs image quality processing on an important object included in an image, it upscales the important object to the size selected by the user.
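For illustration, enlarging an object region by a stored, user-selected factor could look like the following nearest-neighbour sketch. Nearest-neighbour replication is used here only because it is the simplest possible upscaler; the disclosure uses a neural network for upscaling. The name `upscale_nearest` and the integer `factor` are assumptions.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities (assumed)


def upscale_nearest(image: Image, factor: int) -> Image:
    """Enlarge an image by an integer factor by repeating each pixel.

    `factor` stands in for the user's stored enlargement preference
    (hypothetical); factor = 1 returns a copy of the input.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return [
        [pixel for pixel in row for _ in range(factor)]  # widen each row
        for row in image
        for _ in range(factor)                           # repeat each row
    ]
```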

이와 같이, 실시 예에 의하면, 영상 처리 장치(100)는 사용자로부터 시각 장애 정도나 선호도를 입력 받고, 사용자로부터 입력된 정보를 기반으로 제어 정보를 생성할 수 있다. In this way, according to an embodiment, the image processing device 100 may receive a degree of visual impairment or preference from a user and generate control information based on the information input from the user.

FIG. 4 is an internal block diagram of the processor 110 of FIG. 2 according to an embodiment.

도 4를 참조하면, 프로세서(110)는 객체 정보 획득부(111), 제어 정보 획득부(113) 및 화질 처리부(115)를 포함할 수 있다. Referring to FIG. 4 , the processor 110 may include an object information acquisition unit 111 , a control information acquisition unit 113 , and an image quality processing unit 115 .

In an embodiment, the object information acquisition unit 111, the control information acquisition unit 113, and the image quality processing unit 115 may be included in the processor 110 in the form of modules. A module may mean a functional and structural combination of hardware for carrying out the technical idea of the present disclosure and software for driving that hardware. For example, a module may mean a logical unit of predetermined code and the hardware resources for executing that code; it does not necessarily mean physically connected code, nor is it limited to one type of hardware.

실시 예에서, 객체 정보 획득부(111)는 입력 영상을 분석하여 객체를 검출하고, 객체에 대한 객체 정보를 획득할 수 있다. In an embodiment, the object information acquisition unit 111 may analyze an input image to detect an object and obtain object information about the object.

중요 객체는 관심 대상이 되는 객체이므로, 주로 사용자의 주의를 끄는 관심 영역에 위치할 수 있다. 관심 영역은 사용자의 시각 특성 등에 따라 달라질 수 있다. 예컨대, 사람은 일반적으로 화면의 중앙 부분을 가장자리 부분보다 더 많이 보는 경향이 있기 때문에 통상 화면의 중앙 부분이 관심 영역이 될 수 있다. 또한, 사람은 일반적으로 전경(foreground)에 위치한 객체를 배경(background)에 위치한 객체보다 중요하게 여기는 경향이 있다. 또한, 사람은 정지된 물체보다 움직임이 큰 물체에 더 집중하는 경향이 있다. Since the important object is an object of interest, it may be located in a region of interest that mainly attracts the user's attention. The region of interest may vary according to user's visual characteristics and the like. For example, since people generally tend to view the central portion of the screen more than the edge portion, the central portion of the screen may be a region of interest. In addition, people generally tend to give more importance to objects located in the foreground than objects located in the background. Also, people tend to focus more on moving objects than on still objects.

실시 예에서, 객체 정보 획득부(111)는 입력 영상의 가운데 영역에 위치한 객체를 중요 객체로 식별할 수 있다. In an embodiment, the object information acquisition unit 111 may identify an object located in the middle area of the input image as an important object.

실시 예에서, 객체 정보 획득부(111)는 전경의 객체를 배경의 객체보다 중요 객체로 식별할 수 있다. In an embodiment, the object information obtaining unit 111 may identify a foreground object as a more important object than a background object.

실시 예에서, 객체 정보 획득부(111)는 복수의 프레임들을 기반으로 객체가 움직이는 지 여부를 고려하여, 정지된 객체보다 움직임이 큰 객체를 중요 객체로 식별할 수 있다. In an embodiment, the object information acquisition unit 111 may consider whether the object is moving based on a plurality of frames and identify an object having greater motion than a stationary object as an important object.
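The three cues described above (central position, foreground placement, and motion) could be combined into a single score, as in the following sketch. The equal weighting of the cues, the normalized coordinates, and the names `Candidate`, `importance`, and `pick_important` are assumptions for illustration; the disclosure does not state how the cues are combined.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str
    center_x: float      # object centre, normalized to [0, 1] across the frame
    center_y: float
    is_foreground: bool  # True if the object lies in the foreground
    motion: float        # motion magnitude across frames, normalized to [0, 1]


def importance(c: Candidate) -> float:
    """Score a candidate; higher means more likely to be the important object."""
    dist = math.hypot(c.center_x - 0.5, c.center_y - 0.5)
    centrality = 1.0 - dist / math.sqrt(0.5)  # 1 at the centre, 0 at a corner
    foreground = 1.0 if c.is_foreground else 0.0
    return centrality + foreground + c.motion  # equal weights (assumed)


def pick_important(candidates: List[Candidate]) -> Candidate:
    """Return the highest-scoring candidate."""
    return max(candidates, key=importance)
```

With these assumed weights, a moving foreground object near a corner can outrank a static background object at the centre of the screen.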

입력 영상에 객체가 하나만 포함되어 있는 경우, 객체 정보 획득부(111)는 하나의 객체를 중요 객체로 식별할 수 있다. When only one object is included in the input image, the object information obtaining unit 111 may identify one object as an important object.

입력 영상에 복수개의 객체가 포함되어 있는 경우, 객체 정보 획득부(111)는 복수개의 객체 중에서 중요 객체를 식별할 수 있다. 중요 객체는 입력 영상에 포함된 복수개의 객체들 중에 일부일 수 있다. 중요 객체는 하나일 수도 있으나, 이에 한정되는 것은 아니며, 영상의 종류나 특성, 사용자의 선택 등에 따라 복수개가 될 수도 있다. When a plurality of objects are included in the input image, the object information obtaining unit 111 may identify an important object from among the plurality of objects. The important object may be a part of a plurality of objects included in the input image. The important object may be one, but is not limited thereto, and may be plural depending on the type or characteristic of an image, user's selection, and the like.

관심 영역은 영상의 종류에 따라 달라질 수 있다. 실시 예에서, 객체 정보 획득부(111)는 입력 영상을 분석하여 입력 영상의 장르를 식별할 수 있다. 실시 예에서, 객체 정보 획득부(111)는 입력 영상의 장르에 따라 관심 영역을 다르게 식별할 수 있다. The region of interest may vary according to the type of image. In an embodiment, the object information acquisition unit 111 may identify the genre of the input image by analyzing the input image. In an embodiment, the object information acquisition unit 111 may differently identify the region of interest according to the genre of the input image.

For example, when the image belongs to a genre such as drama or film, the user tends to watch the main character or the current speaker with interest, so the main character or the speaker may be the important object. In addition, when a person appears in the image, viewers have a strong desire to recognize the shape of that person's face, so the face region may be identified as the region of interest.

실시 예에서, 객체 정보 획득부(111)는 입력 영상이 영화나 드라마인 경우, 입력 영상에 가장 많이 나오는 인물을 중요 객체로 식별할 수 있다. 또는, 실시 예에서, 객체 정보 획득부(111)는 영상에 포함된 객체의 움직임, 예컨대, 얼굴에 포함된 입술의 움직임으로부터 발화자를 식별하고, 발화자를 중요 객체로 식별할 수 있다. In an embodiment, when the input image is a movie or drama, the object information acquisition unit 111 may identify a person appearing most often in the input image as an important object. Alternatively, in an embodiment, the object information acquisition unit 111 may identify a speaker from the movement of an object included in an image, for example, a movement of lips included in a face, and identify the speaker as an important object.

입력 영상이 뉴스인 경우, 사용자는 보통 뉴스 화면 하단에 표시되는 자막을 주의 깊게 보는 경향이 있다. 따라서, 실시 예에서, 객체 정보 획득부(111)는 입력 영상의 장르가 뉴스이거나, 입력 영상에 자막이 포함되어 있는 경우, 화면의 하단 영역이나, 자막이 포함된 영역을 관심 영역으로 식별하고, 관심 영역에 포함된 문자를 중요 객체로 식별할 수 있다. When the input image is news, the user usually tends to pay close attention to subtitles displayed at the bottom of the news screen. Accordingly, in an embodiment, when the genre of the input video is news or the input video includes a subtitle, the object information acquisition unit 111 identifies the lower area of the screen or the area including the subtitle as an area of interest, A character included in the region of interest may be identified as an important object.

실시 예에서, 객체 정보 획득부(111)는 중요 객체의 유형이나 객체가 속한 부류가 무엇인지에 대한 정보를 객체 정보로 획득할 수 있다.In an embodiment, the object information acquisition unit 111 may acquire information about the type of important object or the class to which the object belongs as object information.

실시 예에서, 객체 정보는 중요 객체의 위치에 대한 정보를 포함할 수 있다. 예컨대, 객체 정보는 중요 객체가 화면 중앙에 위치하는지, 화면 하단에 위치하는지, 또는 화면의 우측 상단에 위치하는 지 등에 대한 정보를 포함할 수 있다. 중요 객체의 위치는 화면 내의 좌표 값 등으로 획득될 수 있다.In an embodiment, the object information may include information about the location of an important object. For example, the object information may include information about whether an important object is located in the center of the screen, at the bottom of the screen, or at the top right of the screen. The position of the important object may be acquired as a coordinate value in the screen.

In an embodiment, the object information may include information about the size of the important object. The size of the important object may mean the proportion of the image occupied by the important object, the horizontal or vertical length of the important object, or the length of its diagonal or diameter.
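The object information described above (type, position, and size) might be represented as a small record, as sketched below. The bounding-box representation and the names `ObjectInfo`, `center`, `area_ratio`, and `diagonal` are assumptions for illustration and are not taken from the disclosure.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ObjectInfo:
    category: str  # type or class of the important object, e.g. "face"
    x: int         # left edge of the bounding box, in pixels
    y: int         # top edge of the bounding box, in pixels
    width: int     # horizontal length, in pixels
    height: int    # vertical length, in pixels

    def center(self) -> Tuple[float, float]:
        """Position of the object as the centre of its bounding box."""
        return (self.x + self.width / 2, self.y + self.height / 2)

    def area_ratio(self, frame_width: int, frame_height: int) -> float:
        """Proportion of the frame occupied by the bounding box."""
        return (self.width * self.height) / (frame_width * frame_height)

    def diagonal(self) -> float:
        """Diagonal length of the bounding box, in pixels."""
        return math.hypot(self.width, self.height)
```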

실시 예에서, 객체 정보 획득부(111)는 객체 정보를 제어 정보 획득부(113) 및 화질 처리부(115)로 전송할 수 있다. In an embodiment, the object information acquisition unit 111 may transmit object information to the control information acquisition unit 113 and the image quality processing unit 115 .

실시 예에서, 제어 정보 획득부(113)는 제어 정보를 획득할 수 있다. In an embodiment, the control information acquisition unit 113 may obtain control information.

In an embodiment, inference control information may mean control information inferred in consideration of the tendencies of visually impaired people or of the user. The inference control information may be control information obtained by inferring, according to the tendencies of visually impaired people and/or the user, how image quality processing should be performed on an important object included in the image currently to be output on the screen.

In an embodiment, the control information acquisition unit 113 may obtain inference control information based on the cognitive characteristics or preferences of visually impaired people. For example, the control information acquisition unit 113 may store information on the cognitive characteristics of visually impaired people in advance. Based on this pre-stored information, the control information acquisition unit 113 may infer control information for the input image currently to be output on the screen.

In addition to the general visual characteristics shared with people who have no visual impairment, visually impaired people may have visual characteristics particular to them. For example, visually impaired people tend to focus on recognizing the overall outline of an object, or the components within it, rather than on the fine detail of the entire image. Visually impaired people also tend to want to see a specific object enlarged rather than to view the entire image at once. In addition, visually impaired people tend to pay more attention to objects with high color contrast. For example, in an image of red maple leaves lying on a green lawn, visually impaired people have a strong desire to look more closely at the high-contrast objects. Accordingly, the control information acquisition unit 113 may obtain, as inference control information, the control information that visually impaired people prefer.

In an embodiment, the control information acquisition unit 113 may also obtain inferred control information from the user's interaction history. A user may have tastes or preferences that differ in part from those of visually impaired people in general. A visually impaired user typically uses the image processing device 100 alone, that is, as a personalized device. Accordingly, the image processing device 100 may infer the user's preferences or tastes based on the history of control commands that the user has input for image quality processing of objects.

The control information acquisition unit 113 may link the user's control history with images, accumulate information on which control command the user issued when a given image was output, and infer the user's preferences or tastes from it. The control information acquisition unit 113 may accumulate, in time series, the history of past control commands input by the user for a plurality of frames or a plurality of videos, identify patterns in the user's control commands, and infer the user's preferences or tastes from those patterns.
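The accumulation of control history linked to images can be illustrated with a minimal sketch. It assumes, purely for illustration, that each image is summarized by a genre string and that the "pattern" is simply the most frequent command for that genre; the names `ControlCommand`, `ControlHistory`, `record`, and `preferred` are hypothetical and do not appear in the disclosure.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlCommand:
    """One hypothetical user command for image quality processing."""
    kind: str   # e.g. "contour", "flatten", "upscale" (assumed labels)
    level: int  # requested strength, in arbitrary units


class ControlHistory:
    """Accumulates commands per image genre and reports the most frequent one."""

    def __init__(self):
        # genre -> Counter of commands issued while that genre was on screen
        self._by_genre = defaultdict(Counter)

    def record(self, genre, command):
        """Store one command together with the genre of the image it was issued for."""
        self._by_genre[genre][command] += 1

    def preferred(self, genre):
        """Return the most frequent command for the genre, or None if none was recorded."""
        counts = self._by_genre.get(genre)
        if not counts:
            return None
        return counts.most_common(1)[0][0]
```

A frequency count is only one possible reading of "identifying a pattern"; the disclosure also describes a neural network for this purpose.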

The control information acquisition unit 113 may infer an image quality processing command for an important object by considering the user's control history together with the cognitive characteristics and preferences of visually impaired people in general. Alternatively, the control information acquisition unit 113 may disregard the cognitive characteristics of visually impaired people in general and infer the image quality processing command for the important object from the user's control history alone.

Alternatively, in an embodiment, the control information acquisition unit 113 may obtain inferred control information based on user preference information that the user has input in advance through the image processing device 100. Using the image processing device 100, the user may input in advance information about the user's degree of visual impairment or the degree of image quality processing the user prefers. From this previously input preference information, the image processing device 100 may infer the control information to be applied to the current image and thereby obtain inferred control information.

In an embodiment, the control information acquisition unit 113 may receive real-time user control information from the user. The real-time user control information may refer to the user's real-time control information for the input image currently output on the screen. The control information acquisition unit 113 may update the pre-stored control information using the real-time user control information input by the user. In an embodiment, the control information acquisition unit 113 may transmit the updated control information to the image quality processing unit 115.
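How the stored control information is updated is not specified in the disclosure. The following sketch assumes that control information is a flat dictionary of settings and that a real-time value simply overrides the stored value for the same key; the function name `update_control_info` and the keys used with it are hypothetical.

```python
def update_control_info(stored, realtime):
    """Return new control information in which real-time values override stored ones.

    Both arguments are dictionaries mapping an assumed setting name to its value.
    The stored dictionary is left unmodified.
    """
    updated = dict(stored)
    updated.update(realtime)
    return updated
```

For example, a stored contour level could be replaced by the level the user has just requested, while settings the user did not touch keep their stored values.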

In an embodiment, the image quality processing unit 115 may receive object information from the object information acquisition unit 111 and control information from the control information acquisition unit 113.

In an embodiment, the image quality processing unit 115 may obtain a second image by performing image quality processing on the important object based on the object information and the control information.

In an embodiment, the image quality processing unit 115 may perform outline processing on the important object. Outline processing of the important object may include processing at least one of the detail of the outline, the intensity of the outline, the color of the outline, and the degree of contrast between the outline and other regions.

In an embodiment, the image quality processing unit 115 may perform flattening processing on the interior of the important object. The flattening processing may include adjusting the degree of flattening of the inner region of the important object, excluding its outline.
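The disclosure does not name a particular flattening operation. As one possible illustration, the sketch below blends each interior pixel toward the mean of the interior while leaving all other pixels, including the outline, untouched. The grayscale 2D-list representation, the `interior` pixel set, and the `strength` parameter are assumptions made for this example.

```python
def flatten_interior(image, interior, strength):
    """Blend interior pixels toward their mean value.

    image:    2D list of grayscale values.
    interior: set of (row, col) pixels inside the object, excluding its outline.
    strength: 0.0 leaves the pixels unchanged, 1.0 makes the interior fully flat.
    Returns a new image; the input is not modified.
    """
    out = [row[:] for row in image]
    if not interior:
        return out
    mean = sum(image[r][c] for r, c in interior) / len(interior)
    for r, c in interior:
        out[r][c] = (1.0 - strength) * image[r][c] + strength * mean
    return out
```

Here `strength` plays the role of the adjustable degree of flattening described above.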

In an embodiment, the image quality processing unit 115 may upscale the important object. The image quality processing unit 115 may not only increase the size of the important object but also perform sharpness processing on it. That is, by upscaling the important object, the image quality processing unit 115 may enlarge the important object while also improving its resolution.

In an embodiment, the image quality processing unit 115 may identify the target of image quality processing based on the information about the genre, position, and size of the important object obtained from the object information, match the control information to the identified region, and perform image quality processing on the identified region in accordance with the control information.

For example, the image quality processing unit 115 may obtain information indicating that the important object is a subtitle, that the subtitle is located at the bottom of the screen, and how much of the image the subtitle occupies. Based on a control command for image quality processing obtained from the control information, the image quality processing unit 115 may identify the subtitle region and perform image quality processing on it. For example, the image quality processing unit 115 may locate the subtitle, crop the region containing the subtitle, and upscale the cropped region to a predetermined size. The image quality processing unit 115 may perform image quality processing on the important object by rendering the border of the subtitle in purple, a color the user prefers, and making it thick and dark at or above a certain intensity.
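The crop-then-upscale step of the subtitle example can be sketched as follows. Nearest-neighbor replication is used here only as a stand-in: it enlarges the region but does not improve its resolution, whereas the disclosure describes upscaling that also sharpens. The function names, the 2D-list image representation, and the integer scale factor are assumptions for illustration.

```python
def crop(image, top, left, height, width):
    """Return the rectangular region of a 2D-list image as a new 2D list."""
    return [row[left:left + width] for row in image[top:top + height]]


def upscale_nearest(image, factor):
    """Enlarge a 2D-list image by an integer factor using pixel replication."""
    out = []
    for row in image:
        # Repeat every pixel horizontally, then repeat the widened row vertically.
        wide = [px for px in row for _ in range(factor)]
        out.extend(wide[:] for _ in range(factor))
    return out


def enlarge_subtitle(frame, top, left, height, width, factor):
    """Crop the subtitle region given by the object information and enlarge it."""
    return upscale_nearest(crop(frame, top, left, height, width), factor)
```

The position and size arguments correspond to the subtitle position and size carried in the object information, and `factor` stands for the predetermined target size.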

FIG. 5 is a diagram for explaining how the object information acquisition unit 111 obtains object information from an input image, according to an embodiment.

In an embodiment, the object information acquisition unit 111 may include appropriate logic, circuitry, interfaces, and/or code capable of obtaining object information 520 from an input image 510. In an embodiment, the object information acquisition unit 111 may analyze the input image 510 on a rule basis to obtain the object information 520 for an object included in the input image 510. When the data processing performance, operating system, CPU processing speed, or the like of the image processing device 100 makes it difficult to perform a large amount of computation in a short time, the object information acquisition unit 111 may be designed to obtain the object information from the input image 510 by rule-based analysis.

Alternatively, in an embodiment, the object information acquisition unit 111 may obtain the object information 520 from the input image 510 using a neural network. In an embodiment, the neural network included in the object information acquisition unit 111 is referred to as a first neural network 500.

In an embodiment, the first neural network 500 may be a deep neural network (DNN) including two or more hidden layers. The first neural network 500 may have a structure in which input data is received, processed by passing through the hidden layers, and output as processed data.

FIG. 5 illustrates, as an example, a case in which the first neural network 500 is a deep neural network (DNN) whose hidden layers have a depth of two.

The object information acquisition unit 111 may analyze the input image 510 by performing operations through the first neural network 500. The first neural network 500 may be trained using training data. The first neural network 500 may be designed in many different ways depending on the model implementation method, the accuracy and reliability of the results, the processing speed and capacity of the processor, and the like.

The first neural network 500 may include an input layer 501, a hidden layer 502, and an output layer 503, and may perform operations for genre determination and object detection. The first neural network 500 may be formed of a first layer (Layer 1) 504 formed between the input layer 501 and a first hidden layer (HIDDEN LAYER1), a second layer (Layer 2) 505 formed between the first hidden layer (HIDDEN LAYER1) and a second hidden layer (HIDDEN LAYER2), and a third layer (Layer 3) 506 formed between the second hidden layer (HIDDEN LAYER2) and the output layer (OUTPUT LAYER) 503.

Each of the plurality of layers forming the first neural network 500 may include one or more nodes. For example, the input layer 501 may include one or more nodes 530 that receive data. FIG. 5 illustrates, as an example, a case in which the input layer 501 includes a plurality of nodes. The input image 510 may be input to the plurality of nodes 530. As illustrated, two adjacent layers are connected by a plurality of edges (for example, 540). Each node has a corresponding weight value, so the first neural network 500 may obtain output data based on a value obtained by operating on the input signal and the weight value, for example by multiplying them.
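The layered structure described above, with an input layer, two hidden layers, and an output layer connected by three sets of weighted edges, can be sketched as a plain forward pass. The bias terms and the ReLU activation are assumptions added to make the example a conventional feed-forward network; the disclosure only states that input signals are combined with weight values, for example by multiplication.

```python
def dense(inputs, weights, biases):
    """Weighted sum for one layer: one row of weights and one bias per output node."""
    return [sum(x * w for x, w in zip(inputs, row)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Assumed activation: negative values are clamped to zero."""
    return [max(0.0, v) for v in values]


def forward(x, layers):
    """Propagate x through a list of (weights, biases) pairs.

    With three pairs this corresponds to Layer 1, Layer 2, and Layer 3:
    input -> hidden 1 -> hidden 2 -> output. The activation is applied after
    every layer except the last.
    """
    *hidden_layers, output_layer = layers
    for weights, biases in hidden_layers:
        x = relu(dense(x, weights, biases))
    weights, biases = output_layer
    return dense(x, weights, biases)
```

In an actual detector the output vector would encode the genre and the type, position, and size of the important object; that encoding is not specified here.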

The first neural network 500 may be trained on a plurality of training data and built as a model that infers the object information 520 from the input image 510.

In an embodiment, the first neural network 500 may be a neural network trained on a predetermined dataset by supervised learning. The neural network may be trained to detect object information regions that are the same as or similar to those in the dataset it has learned.

In an embodiment, the first neural network 500 may be an algorithm that takes a plurality of training data as input values and extracts features by analyzing and classifying the input data. The first neural network 500 may be a model trained to extract object information from input data. To this end, the first neural network 500 may learn how to detect an important object in an image. The first neural network 500 may be trained using, as a training dataset, images of various genres, the label of each image, and the type, position, and size of the important object included in each image.

In an embodiment, the first neural network 500 may analyze which objects in an image people consider important and what types those objects are. More specifically, the first neural network 500 may be trained using, as training data, information about important objects on which visually impaired people used an emphasis function in images, together with the types of those objects.

In an embodiment, the first neural network 500 may learn a saliency dataset in order to find a region of interest. A saliency dataset may also be referred to as a saliency map. A saliency map may refer to a map that represents the salient regions that attract people's attention so that they are distinguished from other regions. In an embodiment, the first neural network 500 may learn information about the salient regions in which visually impaired people are mainly interested. A salient region may indicate a region of a video frame that attracts the attention of visually impaired people, that is, a region with high visual concentration. For example, the first neural network 500 may be a model that has learned in advance the salient regions obtained by tracking the gaze of visually impaired people. The first neural network 500 may be trained to obtain a saliency map for an input video frame by considering the color change or distribution, edges, spatial frequency, structure, distribution, histogram, texture, and the like of each pixel included in the input video frame, or of each pixel group containing a plurality of pixels with similar features. In an embodiment, the first neural network 500 may obtain object information about the important object by assigning a high weight to the saliency map region.
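One simple way to read "assigning a high weight to the saliency map region" is to score candidate object regions by the saliency they cover and keep the highest-scoring one. The sketch below assumes a saliency map given as a 2D list of values in [0, 1] and candidate boxes given as `(top, left, height, width)` tuples; both representations and the function names are hypothetical, and in the disclosure this weighting is learned by the network rather than computed by hand.

```python
def mean_saliency(saliency, box):
    """Average saliency over a non-empty box given as (top, left, height, width)."""
    top, left, height, width = box
    values = [saliency[r][c]
              for r in range(top, top + height)
              for c in range(left, left + width)]
    return sum(values) / len(values)


def most_salient_box(saliency, boxes):
    """Return the candidate box that covers the highest average saliency."""
    return max(boxes, key=lambda box: mean_saliency(saliency, box))
```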

In an embodiment, in addition to the saliency dataset, the first neural network 500 may learn a segmentation dataset as training data. The first neural network 500 may identify the type of the important object included in the image by considering the semantic information of the object. The first neural network 500 may obtain object information about the important object by assigning different weights according to the type of the important object.

In addition, the first neural network 500 may be trained using an image understanding dataset as training data. By interpreting an object or a region of an object, the first neural network 500 may learn what the object is and what the spatial relationships between objects are. The first neural network 500 may obtain object information about the important object according to the image understanding dataset.

In an embodiment, the first neural network 500 may learn images and the label of each image, information on whether each image is one on which visually impaired people used image quality processing, for example an emphasis function, and information about the object that was quality-processed, for example emphasized, in the image. In an embodiment, the first neural network 500 may learn the relationship between the images on which visually impaired people used the emphasis function and the object information of the objects that were quality-processed in those images, and may be trained to obtain object information from an image.

In an embodiment, the first neural network 500 may be trained using, as a ground-truth set, the types of objects on which visually impaired people performed image quality processing or the regions containing such objects. The first neural network 500 may infer or predict the type, position, and size of an object from an input image, and may be trained repeatedly so that the predicted result matches the ground-truth set.

Specifically, to increase the accuracy of the result output through the first neural network 500, training may be performed repeatedly in the direction from the output layer 503 toward the input layer 501 based on a plurality of training data, and the weight values may be modified so that the accuracy of the output result increases.
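Repeated training from the output layer toward the input layer corresponds to what is commonly called backpropagation. The sketch below shows it on a deliberately tiny chain with one weight per layer and no activation, using a squared-error loss and plain gradient descent. The loss function, the learning rate, and all function names are assumptions for illustration, not details taken from the disclosure.

```python
def forward_chain(w1, w2, x):
    """Input -> hidden -> output, with one weight per layer."""
    hidden = w1 * x
    output = w2 * hidden
    return hidden, output


def loss(output, target):
    """Assumed squared-error loss."""
    return 0.5 * (output - target) ** 2


def backward_step(w1, w2, x, target, lr):
    """One update: gradients are computed at the output first, then passed back."""
    hidden, output = forward_chain(w1, w2, x)
    grad_output = output - target   # d(loss)/d(output)
    grad_w2 = grad_output * hidden  # weight nearest the output layer
    grad_hidden = grad_output * w2  # error propagated toward the input
    grad_w1 = grad_hidden * x       # weight nearest the input layer
    return w1 - lr * grad_w1, w2 - lr * grad_w2


def train(w1, w2, x, target, lr, steps):
    """Repeat the update so that the output moves toward the target."""
    for _ in range(steps):
        w1, w2 = backward_step(w1, w2, x, target, lr)
    return w1, w2
```

Each call to `backward_step` modifies both weights so that the prediction moves closer to the target, which is the sense in which the weight values are modified to increase accuracy.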

The first neural network 500 having the finally modified weight values may then be used as a model for obtaining object information. Specifically, the first neural network 500 may analyze the information contained in the input image 510, which is the input data, and output the object information 520 as a result.

The trained first neural network 500 may be installed in the object information acquisition unit 111 and used to obtain the object information 520 from the input image 510. When the input image 510 is input, the first neural network 500 may detect the label of the input image 510 and the important object included in the input image 510, and may obtain object information including information about at least one of the type, position, and size of the important object.

In an embodiment, the object information acquisition unit 111 may transmit the object information obtained using the first neural network 500 to the control information acquisition unit 113 and the image quality processing unit 115.

In an embodiment, the object information that the object information acquisition unit 111 transmits to the control information acquisition unit 113 may differ from the object information it transmits to the image quality processing unit 115. If the object information transmitted to the control information acquisition unit 113 is referred to as first object information, the first object information may be information obtained from the input image 510, that is, from a single frame. Because the control information acquisition unit 113 obtains control information from the association between the genre of an image and the user's control commands, the object information acquisition unit 111 may transmit to the control information acquisition unit 113 only the genre information obtained from each frame. For example, the first object information may include only information about the genre of the input image 510, such as whether the input image 510 is news or a drama.

In an embodiment, if the object information that the object information acquisition unit 111 transmits to the image quality processing unit 115 is referred to as second object information, the second object information may include information obtained by analyzing not only the input image 510 but also a plurality of frames or a video that includes the input image 510. The second object information may include information about the genre of the important object within the region of interest identified according to the genre of the input image 510, and about the position and size of the important object. Because the image quality processing unit 115 performs image quality processing on the important object included in the input image 510, it needs information such as the type, position, and size of the important object. Accordingly, in an embodiment, the object information acquisition unit 111 may analyze the video that includes the input image 510, obtain object information about the important object from it, and transmit that information to the image quality processing unit 115. For example, when the genre of the input image 510 is news, the second object information may include information indicating that the genre of the important object is a subtitle, where the subtitle is located in the input image 510, and how large the subtitle is.

However, this is only one embodiment, and the object information that the object information acquisition unit 111 transmits to the control information acquisition unit 113 may be the same as the object information it transmits to the image quality processing unit 115. For example, the object information acquisition unit 111 may transmit the genre of the image and the object information, that is, information about the type, position, and size of the object, to both the control information acquisition unit 113 and the image quality processing unit 115.
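The two payloads described above, genre-only first object information and fuller second object information, can be modeled as two small record types. The field names, the pixel-coordinate tuples, and the helper `to_first_info` are hypothetical choices made for this sketch.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FirstObjectInfo:
    """Sent to the control information acquisition unit: image genre only."""
    image_genre: str  # e.g. "news" or "drama"


@dataclass(frozen=True)
class SecondObjectInfo:
    """Sent to the image quality processing unit: genre plus object details."""
    image_genre: str
    object_genre: str          # e.g. "subtitle"
    position: Tuple[int, int]  # assumed (top, left) in pixels
    size: Tuple[int, int]      # assumed (height, width) in pixels


def to_first_info(info: SecondObjectInfo) -> FirstObjectInfo:
    """Reduce the full object information to the genre-only payload."""
    return FirstObjectInfo(image_genre=info.image_genre)
```

In the alternative embodiment where both units receive the same information, a single `SecondObjectInfo` value could be sent to each of them unchanged.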

FIG. 6 is a diagram for explaining how the control information acquisition unit 113 obtains inferred control information, according to an embodiment.

In an embodiment, the control information acquisition unit 113 may include appropriate logic, circuitry, interfaces, and/or code capable of obtaining inferred control information from input data.

In an embodiment, the control information acquisition unit 113 may use various algorithms to infer control information for an input image from user control information and object information about the input image.

In an embodiment, the control information acquisition unit 113 may infer control information on a rule basis, such as with a program or instructions, or may obtain inferred control information using a neural network.

In an embodiment, FIG. 6 illustrates a case in which the control information acquisition unit 113 obtains inferred control information using a neural network. When the neural network included in the control information acquisition unit 113 is referred to as a second neural network 600, in an embodiment the second neural network 600 may learn the user's interaction history for important objects.

In an embodiment, the second neural network 600 may be implemented as a recurrent neural network (RNN).

In an embodiment, the second neural network 600 may learn information about the image quality processing that visually impaired people mainly use for particular images. The second neural network 600 may be trained using, as training data, the type, degree, and frequency of the control commands that visually impaired people have input in relation to image quality processing of images. When the second neural network 600, having already learned the data of visually impaired people in general, additionally learns the type, degree, and frequency of the control commands that the user of the image processing device 100 inputs in relation to image quality processing of images, the second neural network 600 may be a semi-supervised learning algorithm.

Alternatively, in another embodiment, the second neural network 600 may learn only the type, degree, and frequency of the control commands that the user of the image processing device 100 inputs in relation to image quality processing of images. In this case, the second neural network 600 may be an unsupervised learning algorithm that learns the user's patterns from the user's own interaction history.

In an embodiment, the user's interaction history may include a history of control information for at least one of outline processing, flattening processing, and upscaling processing of important objects. The user's interactions may be input through a user interface such as a remote control or a keypad provided on the image processing device 100.

In an embodiment, the second neural network 600 may accumulate the user interaction history and learn the user's preferences, habits, usage patterns, and the like from it. The user interaction history may include a history of at least one of: the type of image for which the user selected image quality processing; the type, size, and position of the object that was the target of image quality processing in that image; the type or degree of image quality processing the user selected; and information about the time at which the image quality processing function was used.

The type or degree of image quality processing selected by the user may include information about the outline thickness or level of detail of the object, the color of the outline, and the degree of flattening inside the object that the user requested, as well as information about the size to which the object was enlarged. The information about the time at which the image quality processing function was used may include information such as whether the user used the function before video playback, after a certain time had elapsed since playback began, or only when a particular image was output.

In an embodiment, when the user inputs a new control command, the second neural network 600 may additionally learn that control command.

In an embodiment, the second neural network 600 may receive object information about the input image from the object information acquisition unit 111. The object information about the input image may include information about the type of the input image.

In an embodiment, the second neural network 600 may receive user control information for the corresponding image.

In an embodiment, the second neural network 600 may identify the type of the input image from the object information about the input image and, using the user control information for that input image, may learn whether the user used the image quality processing function for the input image and which control commands the user issued for it.

In an embodiment, the second neural network 600 may output inferred control information from usage patterns obtained by statistically learning the user interaction history. The inferred control information may be control information that infers, based on the user's interaction history, which image quality processing is to be performed on the current image. The control information obtained from the second neural network 600 may be transmitted to the image quality processing unit 115.
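
The idea of inferring control information from accumulated usage statistics can be illustrated with a minimal sketch. It is not the second neural network 600 itself: a simple frequency count stands in for the learned model, and the record layout, the label strings, and the function names `build_usage_pattern` and `infer_control` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical interaction records: (image type, object type, chosen processing).
history = [
    ("drama", "person", "thick_outline"),
    ("drama", "person", "thick_outline"),
    ("drama", "person", "flatten_strong"),
    ("sports", "ball", "upscale_2x"),
]

def build_usage_pattern(records):
    """Count how often each processing choice was used per (image type, object type)."""
    pattern = defaultdict(Counter)
    for image_type, object_type, choice in records:
        pattern[(image_type, object_type)][choice] += 1
    return pattern

def infer_control(pattern, image_type, object_type, default=None):
    """Return the most frequent past choice for this context, or a default if unseen."""
    counts = pattern.get((image_type, object_type))
    if not counts:
        return default
    return counts.most_common(1)[0][0]

pattern = build_usage_pattern(history)
inferred = infer_control(pattern, "drama", "person")
```

Here the most frequent past choice for the current image type and object type plays the role of the inferred control information passed on to the image quality processing unit 115; a trained network could weigh more factors, such as object size or the time at which the function was used.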

FIG. 7 is a diagram for explaining how the image quality processing unit 115 obtains an image-quality-processed output image from an input image, according to an embodiment.

Referring to FIG. 7, the image quality processing unit 115 may obtain output data from input data using a neural network.

If the neural network used by the image quality processing unit 115 is referred to as the third neural network 700, then in an embodiment the third neural network 700 may be a neural network based on a convolutional neural network (CNN), a deep convolutional neural network (DCNN), or a capsule network (CapsNet).

In an embodiment, the third neural network 700 may be trained to receive various data and to discover or learn on its own how to analyze the input data, how to classify the input data, and/or how to extract from the input data the features needed to generate result data. The third neural network 700 may be made into an artificial intelligence model with desired characteristics by applying a learning algorithm to a large amount of training data. Such training may be performed in the image processing device 100 itself or through a separate server or system. Here, a learning algorithm is a method of training a given target device (for example, a robot) using a large amount of training data so that the target device can make decisions or predictions on its own.

In an embodiment, the third neural network 700 may be trained as a data inference model through supervised learning that takes training data as input values.

Referring to FIG. 7, the third neural network 700 may include an input layer 710, a hidden layer 720, and an output layer 730. In an embodiment, the hidden layer 720 may include a plurality of hidden layers. The third neural network 700 may include one or more hidden layers. For example, the third neural network 700 may be a deep neural network (DNN) including two or more hidden layers. A deep neural network (DNN) is a neural network that performs computation through a plurality of layers, and the depth of the network increases with the number of internal layers that perform computation. Deep neural network (DNN) computation may include convolutional neural network (CNN) computation and the like.

In the third neural network 700, the hidden layer 720 between the input layer 710 and the output layer 730 may be formed of a plurality of layers. The depth and form of the layers of the third neural network 700 may be designed in various ways in consideration of the accuracy of the result, the reliability of the result, the processing speed and capacity of the processor, and the like.

Each of the plurality of layers forming the third neural network 700 may include one or more nodes. For example, the input layer 710 may include one or more nodes that receive data. Here, the number of nodes included in the input layer 710 of the third neural network 700 may be the same as the number of nodes included in the output layer 730. FIG. 7 shows a case in which the first hidden layer of the third neural network 700 has 50 nodes, the second hidden layer has 100 nodes, and the third hidden layer has 50 nodes. However, this is only one embodiment, and the number of nodes of the third neural network 700 may be designed in various ways.
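
As a rough illustration of the layer sizes shown in FIG. 7, the sketch below lists the node counts and counts the edges that would result if every pair of adjacent layers were fully connected. The size of the input and output layers is not given in the description, so `io_nodes = 10` is a purely hypothetical value, and full connectivity is assumed here only for the arithmetic.

```python
def count_edges(layer_sizes):
    """Number of edges when each layer is fully connected to the next one."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

io_nodes = 10  # hypothetical: the description only says input and output sizes may match
layer_sizes = [io_nodes, 50, 100, 50, io_nodes]  # input, three hidden layers, output

edges = count_edges(layer_sizes)  # 10*50 + 50*100 + 100*50 + 50*10
```

With these assumed sizes the network has 11,000 weighted edges, which gives a sense of how the node counts drive the amount of computation mentioned as a design consideration.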

In an embodiment, input data may be input to the plurality of nodes included in the input layer 710.

People with visual impairments cannot perceive high-frequency textures and therefore do not need images rich in such textures. Because they place more importance on perceiving the large outlines of large objects, it is desirable to generate an image in which the important object is large, the interior of the important object is flat, and the outline of the important object is emphasized.

Accordingly, in an embodiment, the third neural network 700 may use as training data: images of various genres, object information about the regions or objects in each image that are of interest to people with visual impairments, control information input by people with visual impairments for image quality processing, and a ground-truth set reflecting the visual characteristics of people with visual impairments.

In an embodiment, the third neural network 700 may receive object information from the object information acquisition unit 111 and control information from the control information acquisition unit 113. The third neural network 700 may receive, as input data, images of various genres together with, for each image, the object information obtained by the object information acquisition unit 111 and the control information obtained by the control information acquisition unit 113.

The nodes of two adjacent layers may be connected by a plurality of edges. Each edge has a corresponding weight value and operation information such as multiplication or addition. The third neural network 700 may perform an operation by multiplying the input data by, or adding to it, the weight value of an edge, and may output the result as the value of the node in the next layer connected to that edge. In an embodiment, the layers included in the third neural network 700 may be formed as fully connected layers, in which every node of the previous layer is connected to every node of the next layer.
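
The edge computation described above can be sketched for a single fully connected layer as follows. This is an illustrative sketch only: the weight and input values are made up, and the additive term per output node (`biases`) is an assumption about how the "addition" operation might be realized.

```python
def dense_forward(inputs, weights, biases):
    """One fully connected layer.

    weights[i][j] is the weight of the edge from input node i to output node j;
    biases[j] is an assumed additive term for output node j.
    """
    outputs = []
    for j in range(len(biases)):
        total = biases[j]
        for i, x in enumerate(inputs):
            total += x * weights[i][j]  # multiply by the edge weight, then accumulate
        outputs.append(total)
    return outputs

inputs = [1.0, 2.0]                      # values of two nodes in the previous layer
weights = [[0.5, -1.0, 0.0],             # edges leaving input node 0
           [0.25, 0.5, 1.0]]             # edges leaving input node 1
biases = [0.0, 0.1, -0.5]
next_layer = dense_forward(inputs, weights, biases)  # three node values
```

Every input node contributes to every output node, which is what distinguishes a fully connected layer from, for example, a convolutional layer with local connections.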

The third neural network 700 passes the values input to a node through a function before transferring them to the next layer; the function that determines the output passed to the next layer is called an activation function. The activation function may be a function that determines how input data is transferred to the next layer. In an embodiment, the third neural network 700 may use a rectified linear unit (ReLU) as the activation function of the hidden layers. ReLU is a nonlinear activation function with the advantages of fast training and simple implementation. However, the disclosure is not limited thereto, and the third neural network 700 may use another nonlinear activation function such as the sigmoid or hyperbolic tangent (tanh) function. Alternatively, the third neural network 700 may use a binary activation function or a linear activation function as the activation function instead of these nonlinear functions.
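
For reference, the activation functions named above have the following standard definitions. The sketch uses only the Python standard library; the function names are chosen for illustration.

```python
import math

def relu(x):
    """Rectified linear unit: passes positive values, clamps negatives to zero."""
    return max(0.0, x)

def sigmoid(x):
    """Logistic sigmoid: squashes any real value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Hyperbolic tangent: squashes any real value into the range (-1, 1)."""
    return math.tanh(x)

def binary_step(x):
    """Binary activation: outputs 1 for non-negative input, otherwise 0."""
    return 1.0 if x >= 0 else 0.0

def linear(x):
    """Linear (identity) activation: passes the value through unchanged."""
    return x

node_value = -0.75
activated = [f(node_value) for f in (relu, sigmoid, tanh, binary_step, linear)]
```

ReLU needs only a comparison per node, which is consistent with the description's remark that it is simple to implement.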

In an embodiment, the third neural network 700 may receive input data at the nodes included in the input layer 710, perform operations between the layers on the input data, and obtain the resulting values as output data. That is, the third neural network 700 may analyze and classify the input data, extract and process the features needed to perform image quality processing, and obtain the image-quality-processed result data as output data.

In an embodiment, the third neural network 700 may identify an important object region in the input image based on the object information. In an embodiment, the third neural network 700 may perform image quality processing on the important object region by performing, based on the user control information, at least one of outline processing of the important object, flattening of the interior of the important object, and upscaling of the important object.

In an embodiment, the third neural network 700 may obtain parameters for image quality processing of an object from at least one of images of various genres, user control information, and object information. The parameters for image quality processing may include at least one of a flattening parameter, an outline parameter, and an upscaling parameter.

In an embodiment, the third neural network 700 may adjust the parameter values using the user control information so that the important object is image-quality-processed accordingly.

In an embodiment, the third neural network 700 may perform outline processing of the important object according to the outline parameter. Based on the user control information, the third neural network 700 may determine how much detail of the important object's outline to retain, how strong the outline should be (that is, how thick or dark), what color the outline should be, and how much color contrast there should be between the background region and the outline.

In an embodiment, the third neural network 700 may flatten the interior of the important object according to the flattening parameter. The third neural network 700 may, in the manner the user prefers according to the user control information, adjust the degree of flattening of the interior region of the important object, excluding its outline.

In an embodiment, the third neural network 700 may upscale the important object according to the upscaling parameter. The third neural network 700 may enlarge the important object to the size preferred by the user according to the user control information. At the same time, the third neural network 700 may perform sharpness processing on the enlarged important object.

In an embodiment, the third neural network 700 may upscale the important object by generating new pixels between existing pixels, rather than increasing the pixel count by simply copying pixels. The third neural network 700 may generate the new pixels in various ways, for example using upsampling methods such as the nearest, bilinear, or joint bilateral method. That is, the third neural network 700 may increase the size of the important object while adjusting its resolution at the same time, so that the resolution improves as the object becomes larger.
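
The difference between copying pixels and generating new pixels between them can be seen in a small sketch that compares nearest and bilinear upsampling on a 2x2 grayscale patch. It only illustrates the two classical methods named above and makes no claim about how the third neural network 700 implements upscaling; the corner-aligned sampling grid is one common convention chosen here as an assumption.

```python
def upscale_nearest(img, factor):
    """Repeat each pixel `factor` times in both directions (pure copying)."""
    return [[img[y // factor][x // factor]
             for x in range(len(img[0]) * factor)]
            for y in range(len(img) * factor)]

def upscale_bilinear(img, new_h, new_w):
    """Interpolate new pixel values between neighbours (corner-aligned grid)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(new_h):
        sy = y * (h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(new_w):
            sx = x * (w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out

patch = [[0.0, 100.0],
         [100.0, 200.0]]
nearest = upscale_nearest(patch, 2)        # 4x4, blocks of identical values
bilinear = upscale_bilinear(patch, 3, 3)   # 3x3, centre pixel is the average, 100.0
```

The nearest result contains only the four original values, while the bilinear result introduces intermediate values such as 50.0 and 150.0, which is what smooths the enlarged object.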

In an embodiment, the third neural network 700 may learn, based on the user control information, the degree of blending between the result of applying the flattening parameter and the result of applying the outline parameter. By learning the control information, the third neural network 700 learns the degree of blending the user prefers and blends the two results accordingly, thereby obtaining a final result in which the interior of the important object is flattened and the outline of the important object has undergone outline processing.

In another embodiment, instead of obtaining the image-quality-processed result for the important object as output data, the third neural network 700 may output the flattening parameter and the outline parameter for the important object as output data. In this case, the image quality processing unit 115 may additionally receive from the user blending information for the result of applying the flattening parameter and the result of applying the outline parameter. If the user feels that the flattening is too strong and the outline processing too weak, the user may adjust the blending information for the flattening parameter and the outline parameter through the user interface. The image quality processing unit 115 may adjust the degree of blending according to the control signal from the user and thereby obtain the final image-quality-processed image.
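
One simple way to picture an adjustable degree of blending is a per-pixel weighted average of the flattened result and the outline-processed result. The description does not specify the blending formula, so the linear mix below, the function name `blend`, and the parameter `outline_weight` are assumptions made for illustration.

```python
def blend(flattened, outlined, outline_weight):
    """Per-pixel weighted mix of two processed results.

    outline_weight = 0.0 keeps only the flattened result,
    outline_weight = 1.0 keeps only the outline-processed result.
    """
    if not 0.0 <= outline_weight <= 1.0:
        raise ValueError("outline_weight must be between 0 and 1")
    return [[(1.0 - outline_weight) * f + outline_weight * o
             for f, o in zip(f_row, o_row)]
            for f_row, o_row in zip(flattened, outlined)]

flattened = [[100.0, 100.0]]   # smooth interior values (illustrative)
outlined = [[0.0, 200.0]]      # strong edge contrast (illustrative)
mixed = blend(flattened, outlined, 0.25)
```

Under this assumed scheme, a user who finds the flattening too strong and the outline too weak would raise `outline_weight` through the user interface, shifting the final image toward the outline-processed result.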

In an embodiment, to increase the accuracy of the result, the third neural network 700 may repeatedly perform training in the direction from the output layer 730 to the input layer 710 based on a plurality of training data, modifying the weight values so that the accuracy of the output increases.

In an embodiment, the third neural network 700 may obtain, as a loss function, the difference between the image-quality-processed result data output from the output layer 730 and the ground truth. The ground truth may be data in which an important object of interest to the user in the image has been processed according to the image quality processing method preferred by the user. The third neural network 700 may receive the loss function as feedback and keep modifying the weight values of the edges included in the hidden layer 720 so that the loss function is minimized. The weight values of the edges may be optimized through iterative training and may be modified repeatedly until the accuracy of the result satisfies a predetermined reliability. The third neural network 700 may be formed by the finally set weight values of the edges.
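
The loop of computing a loss against the ground truth and repeatedly correcting the weights can be sketched with a single weight and plain gradient descent. This is a deliberately tiny stand-in for training the third neural network 700: the mean squared error loss, the learning rate, the stopping tolerance, and the data values are all assumptions chosen for illustration.

```python
def mse(predictions, targets):
    """Mean squared error between network outputs and ground truth."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

def train_weight(xs, targets, w=0.0, lr=0.1, tolerance=1e-6, max_steps=1000):
    """Adjust a single edge weight until the loss falls below `tolerance`."""
    loss = float("inf")
    for _ in range(max_steps):
        preds = [w * x for x in xs]
        loss = mse(preds, targets)
        if loss < tolerance:  # stop once the result is accurate enough
            break
        # Gradient of the MSE with respect to w, then a step against it.
        grad = sum(2 * (p - t) * x for p, t, x in zip(preds, targets, xs)) / len(xs)
        w -= lr * grad
    return w, loss

xs = [1.0, 2.0, 3.0]
ground_truth = [2.0, 4.0, 6.0]   # generated by the "true" weight 2.0
w, loss = train_weight(xs, ground_truth)
```

The learned weight approaches 2.0 as the loss shrinks; in a multi-layer network the same correction is propagated from the output layer back toward the input layer, one layer of edge weights at a time.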

According to an embodiment, the operation of learning, using the third neural network 700, how to perform image quality processing on an important object included in an image may be performed in advance, before the network is installed in the image processing device 100. When some of the training data changes, the learning model may also be updated. When new training data is used or added at predetermined intervals, the third neural network 700 may relearn how to perform image quality processing from the new training data, and the learning model may be updated accordingly.

In an embodiment, the operation of learning how to perform image quality processing using the third neural network 700 may be performed in an external computing device (not shown). Learning how to perform image quality processing on an object included in an image using the third neural network 700 may require a relatively large amount of computation. Accordingly, the computing device may perform the learning operation, and the image processing device 100 may receive the learning model from the computing device through a communication network. Alternatively, the manufacturer of the image processing device 100 may install the third neural network 700 trained by the computing device in the image processing device 100, so that the learning model is used by the image processing device 100 to obtain image-quality-processed images.

Alternatively, in another embodiment of the present disclosure, the image processing device 100, rather than the computing device, may directly perform the learning operation through the third neural network 700. In this case, the image processing device 100 may obtain training data, train the third neural network 700 with the training data to determine a learning model, and obtain image-quality-processed images through the determined learning model.

In an embodiment, the trained third neural network 700 may receive a real-time image, object information, and control information as input data. The third neural network 700 may receive, from the object information acquisition unit 111, object information for the input image input in real time. Based on the object information, the third neural network 700 may identify the type, position, and size of the important object in the real-time image. The third neural network 700 may receive from the control information acquisition unit 113 the control information to be applied to the real-time image, generated based on the user's interaction history, and may perform image quality processing on the important object included in the real-time image according to the control information, thereby obtaining as output data an image in which the important object has been image-quality-processed.

FIG. 8 is a diagram for explaining how the image processing device 100 receives a selection of an important object from a user, according to an embodiment.

In an embodiment, the image processing device 100 may detect objects in an input image. When the input image includes a plurality of objects, the image processing device 100 may identify an important object among them. The important object may be some of the plurality of objects included in the input image.

In an embodiment, the image processing device 100 may receive directly from the user a selection of which of the plurality of objects is to be processed as the important object. To this end, the image processing device 100 may output, near each of the plurality of objects, object identification information for identifying that object.

FIG. 8 illustrates the image processing device 100, having detected a plurality of objects in an image, outputting object identification information on the screen so that the user can select which of the objects is to be determined as the important object.

FIG. 8A shows object identification information output on the screen when the image currently output on the screen is a still image. When the user is about to watch a video, the image processing device 100 may output a still image including a plurality of people who appear in the video and may output object identification information on the still image. The image processing device 100 may output object identification information 811, 812, 813, and 814 for the respective objects near each object. The user may select one of the plurality of pieces of object identification information using a user interface such as a remote control. For example, when the user selects object identification information 812, the image processing device 100 may display the color, thickness, transparency, or the like of the selected object identification information 812 differently from the unselected object identification information, thereby informing the user that a specific object has been selected.

FIGS. 8B and 8C illustrate the image processing device 100 identifying a speaker during video playback and outputting object identification information corresponding to the speaker near the speaker. The image processing device 100 may analyze the currently output image to identify the speaker in the image. For example, when an object included in the video being played is a person, the image processing device 100 may analyze the person's face and identify the person whose lips are moving as the speaker. Alternatively, the image processing device 100 may receive video frames and audio frames together, analyze and classify the features of the input video frames and audio frames, and detect the speaker by identifying the position of the current speaker.

FIG. 8B shows the image processing device 100 outputting, at a specific point in time such as time t0, object identification information 821 indicating the speaker near the speaker. The user may select the object identification information 821 near the speaker using a remote control or the like, so that the speaker is identified as the important object.

If the user does not select the object identification information 821 output at time t0, the image processing device 100 may output on the screen, at time t1, object identification information 831 indicating a speaker different from the speaker at time t0. FIG. 8C shows the image processing device 100 outputting on the screen the object identification information 831 indicating the speaker at time t1. By selecting the object identification information 831 output on the screen at time t1, the user may select the corresponding object as the important object.

As described above, according to an embodiment, the image processing device 100 may receive directly from the user a selection of the important object, among a plurality of objects, on which image quality processing is to be performed. Accordingly, by performing image quality processing on the important object selected by the user, the image processing device 100 may make the object the user wants easier for the user to perceive.

FIG. 9 is an internal block diagram of an image processing device 100a according to an embodiment.

The image processing device 100a of FIG. 9 may be an example of the image processing device 100 of FIG. 2. Descriptions that overlap with those given for FIG. 2 are omitted below.

Referring to FIG. 9, in addition to the processor 210 and the memory 220, the image processing device 100a may further include a tuner unit 910, a communication unit 920, a sensing unit 930, an input/output unit 940, a video processing unit 950, a display unit 960, an audio processing unit 970, an audio output unit 980, and a user interface 990.

The tuner unit 910 may, through amplification, mixing, resonance, and the like of broadcast content received by wire or wirelessly, tune to and select only the frequency of the channel to be received by the image processing device 100a from among many radio-wave components. Content received through the tuner unit 910 is decoded and separated into audio, video, and/or additional information. The separated audio, video, and/or additional information may be stored in the memory 220 under the control of the processor 210.

The communication unit 920 may connect the image processing device 100a to a peripheral device, an external device, a server, a mobile terminal, or the like under the control of the processor 210. The communication unit 920 may include at least one communication module capable of performing wireless communication. Depending on the performance and structure of the image processing device 100a, the communication unit 920 may include at least one of a wireless LAN module 921, a Bluetooth module 922, and a wired Ethernet module 923.

The Bluetooth module 922 may receive a Bluetooth signal transmitted from a peripheral device according to the Bluetooth communication standard. The Bluetooth module 922 may be a Bluetooth Low Energy (BLE) communication module and may receive a BLE signal. The Bluetooth module 922 may scan for a BLE signal continuously or temporarily to detect whether a BLE signal is received. The wireless LAN module 921 may exchange Wi-Fi signals with peripheral devices according to the Wi-Fi communication standard.

The sensing unit 930 senses a user's voice, a user's image, or a user's interaction, and may include a microphone 931, a camera unit 932, a light receiving unit 933, and a sensor unit 934. The microphone 931 may receive an audio signal that includes the user's uttered voice or noise, convert the received audio signal into an electrical signal, and output the electrical signal to the processor 210.

The camera unit 932 may include a sensor (not shown) and a lens (not shown), capture an image formed on the screen, and transmit the captured image to the processor 210.

The light receiving unit 933 may receive an optical signal (including a control signal). The light receiving unit 933 may receive, from a control device such as a remote control or a mobile phone, an optical signal corresponding to a user input (for example, a touch, a press, a touch gesture, a voice, or a motion).

Under the control of the processor 210, the input/output unit 940 may receive video (for example, a moving image signal or a still image signal), audio (for example, a voice signal or a music signal), additional information, and the like from a device external to the image processing device 100a.

The input/output unit 940 may include one of a High-Definition Multimedia Interface (HDMI) port 941, a component jack 942, a PC port 943, and a USB port 944. The input/output unit 940 may also include a combination of the HDMI port 941, the component jack 942, the PC port 943, and the USB port 944.

The video processing unit 950 processes image data to be displayed by the display unit 960 and may perform various image processing operations on the image data, such as decoding, rendering, scaling, noise filtering, frame rate conversion, and resolution conversion.

In an embodiment, the video processing unit 950 may perform image quality processing on an important object according to control information. The video processing unit 950 may perform image quality processing such as enlarging the important object, converting its resolution, flattening the inside of the important object, and processing the outline of the important object.

The display unit 960 may output, on a screen, content received from a broadcasting station, an external server, an external storage medium, or the like. The content is a media signal and may include a video signal, an image, a text signal, and the like.

In an embodiment, the display unit 960 may output, on the screen, an image in which the important object has been quality-processed by the video processing unit 950.

In an embodiment, the display unit 960 may output a screen in which object identification information is displayed around an object.

In an embodiment, the display unit 960 may output an interface screen through which the user selects a viewing assistance function.

The audio processing unit 970 processes audio data. The audio processing unit 970 may perform various kinds of processing on the audio data, such as decoding, amplification, and noise filtering.

Under the control of the processor 210, the audio output unit 980 may output audio included in content received through the tuner unit 910, audio input through the communication unit 920 or the input/output unit 940, and audio stored in the memory 220. The audio output unit 980 may include at least one of a speaker 981, headphones 982, and a Sony/Philips Digital Interface (S/PDIF) output terminal 983.

The user interface 990 may receive a user input for controlling the image processing device 100a. The user interface 990 may include various types of user input devices, including but not limited to a touch panel that senses a user's touch, a button that receives a user's push operation, a wheel that receives a user's rotation operation, a keyboard, a dome switch, a microphone for voice recognition, and a motion sensor that senses motion. When a remote control or another mobile terminal controls the image processing device 100a, the user interface 990 may receive a control signal from the remote control or mobile terminal.

In an embodiment, the user interface 990 may receive, from the user, a selection of whether to use the image quality processing function. In an embodiment, the user interface 990 may receive a selection of which of the image quality processing functions to use and to what degree.

In an embodiment, the user interface 990 may receive, from the user, a selection of object identification information on the screen.

FIG. 10 is a diagram for describing image quality processing that the image processing device 100 performs on an input image, according to an embodiment.

Referring to FIG. 10, the image processing device 100 may receive an input image 1010 and process it in two different ways to obtain a first processed image 1021 and a second processed image 1025, respectively.

In an embodiment, the image processing device 100 may adjust the degree to which the inside of an object included in the input image 1010 is flattened. The image processing device 100 may determine how much of the detail inside the object to remove according to the user's interaction history or real-time control information. The image processing device 100 may use the third neural network 700 to obtain, from at least one of the image, the user control information, and the object information, a flattening parameter for image quality processing of the important object. The flattening parameter may be a parameter indicating the degree of flattening.

In an embodiment, the image processing device 100 may flatten the inside of an object included in the input image 1010 using various filters. For example, the image processing device 100 may use an edge-preserving smoothing filter to adjust how strongly the details inside the object are smoothed out.
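As a non-limiting illustration of edge-preserving smoothing, the following minimal sketch applies a bilateral-style filter to a single row of intensities. The disclosure does not name a specific filter; the bilateral weighting, the function name `bilateral_smooth_1d`, and the parameters `radius`, `sigma_space`, and `sigma_range` are assumptions chosen for illustration, with `sigma_range` standing in for a flattening parameter.

```python
import math


def bilateral_smooth_1d(signal, radius=2, sigma_space=1.5, sigma_range=20.0):
    """Smooth a 1-D row of intensities while keeping large jumps (edges).

    Hypothetical helper: a larger sigma_range flattens more aggressively.
    """
    n = len(signal)
    out = []
    for i in range(n):
        weight_sum = 0.0
        value_sum = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Spatial weight: nearby samples count more.
            spatial = math.exp(-((i - j) ** 2) / (2 * sigma_space ** 2))
            # Range weight: samples with similar intensity count more,
            # so values across an edge barely contribute.
            tonal = math.exp(-((signal[i] - signal[j]) ** 2) / (2 * sigma_range ** 2))
            weight = spatial * tonal
            weight_sum += weight
            value_sum += weight * signal[j]
        out.append(value_sum / weight_sum)
    return out


# Fine texture on both sides of a strong edge between index 3 and index 4.
row = [10, 14, 8, 12, 200, 204, 198, 202]
smoothed = bilateral_smooth_1d(row)
```

In this sketch the small variations on each side of the edge are averaged out, while the jump between the dark and bright halves is retained.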

In an embodiment, the image processing device 100 may adjust the degree of flattening of the important object by adjusting the flattening parameter of the filter based on the user control information.

In an embodiment, the first processed image 1021 shows an image in which flattening has been performed on the object included in the input image 1010. In the first processed image 1021, detailed features such as the patterns inside the petals, the stamens, and the pollen have all disappeared and been smoothed flat.

In an embodiment, the image processing device 100 may process the outline of an object included in the input image 1010. In an embodiment, the image processing device 100 may use the third neural network 700 to obtain, from at least one of the image, the user control information, and the object information, an outline parameter for image quality processing of the important object. The outline parameter may be a parameter relating to the degree to which the outline is preserved, removal of surrounding noise, the thickness and darkness of the outline, whether to remove detail from the outline, and the like.

In an embodiment, the image processing device 100 may process the outline of the important object by adjusting the outline parameter based on the user control information.

In an embodiment, the second processed image 1025 shows an image in which the outline of the object included in the input image 1010 has been processed. In the second processed image 1025, only the shape of the petals remains, drawn as a thick outline, and all other surrounding detail has been removed. The color of the outline also differs from that of the background and of the inside of the object.
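As a non-limiting illustration of an outline thickness parameter, the sketch below marks the pixels of a binary object mask that lie within a given distance of the background. The mask representation, the function name `outline_mask`, and the `thickness` argument are assumptions for illustration; the disclosure does not specify how the outline is computed.

```python
def outline_mask(mask, thickness=1):
    """Return a mask of object pixels within `thickness` pixels of the background.

    Hypothetical helper: `mask` is a list of rows of 0/1 values, where 1 marks
    the important object. Pixels outside the image count as background.
    """
    height, width = len(mask), len(mask[0])

    def near_background(y, x):
        for dy in range(-thickness, thickness + 1):
            for dx in range(-thickness, thickness + 1):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < height and 0 <= nx < width)
                if outside or not mask[ny][nx]:
                    return True
        return False

    return [
        [bool(mask[y][x]) and near_background(y, x) for x in range(width)]
        for y in range(height)
    ]


# A 3x3 object centered in a 5x5 frame.
object_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
outline = outline_mask(object_mask, thickness=1)
```

With `thickness=1` only the ring of object pixels touching the background is marked; a larger value would produce a thicker outline.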

In an embodiment, the third neural network 700 included in the image processing device 100 may learn, based on the user control information, the degree of blending between the flattened image and the outline-processed image. The third neural network 700 learns the user's preferred degree of blending from the user control information and blends the image to which the flattening parameter is applied with the image to which the outline parameter is applied accordingly, thereby obtaining a final result in which the inside of the important object is flattened and the outline of the important object is outline-processed. That is, in FIG. 10, the third neural network 700 may automatically adjust the degree of blending between the first processed image 1021 and the second processed image 1025 and obtain the final image accordingly.

In another embodiment, the image processing device 100 may receive, from the user, a control command specifying the degree of blending between the image to which the flattening parameter is applied and the image to which the outline parameter is applied.

In an embodiment, the image processing device 100 may obtain the final result by adjusting, according to the user's control command, the degree of blending between the image to which the flattening parameter is applied and the image to which the outline parameter is applied, thereby adjusting the strength of the flattening and the strength of the outline processing.

In this case, instead of outputting the result of image quality processing on the important object as output data, the third neural network 700 may output, as output data, the first processed image 1021 obtained by performing only flattening on the important object and the second processed image 1025 obtained by performing only outline processing.

The image processing device 100 may additionally receive, from the user, blending information for the image on which only flattening has been performed and the image on which only outline processing has been performed. When the user feels that the flattening is too strong and the outline processing is too weak, the user may use the user interface to adjust the degree of blending between the flattened image and the outline-processed image. The image quality processing unit 115 may adjust the degree of blending between the images according to a control signal from the user to obtain the final quality-processed image.
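As a non-limiting illustration of the blending described above, the sketch below combines a flattened image and an outline-processed image with a single blending weight. A per-pixel linear blend is an assumption; the disclosure does not state the blending formula. The function name `blend_images` and the parameter `alpha` are hypothetical.

```python
def blend_images(flattened, outlined, alpha):
    """Linearly blend two same-sized grayscale images.

    Hypothetical helper: alpha=1.0 keeps only the flattened image,
    alpha=0.0 keeps only the outline-processed image.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return [
        [alpha * f + (1.0 - alpha) * o for f, o in zip(flat_row, outline_row)]
        for flat_row, outline_row in zip(flattened, outlined)
    ]


first_processed = [[100.0, 100.0], [100.0, 100.0]]  # stand-in for image 1021
second_processed = [[0.0, 255.0], [255.0, 0.0]]     # stand-in for image 1025

# A user who finds the flattening too strong lowers alpha toward the outline image.
final_image = blend_images(first_processed, second_processed, alpha=0.25)
```

Under this assumption, the blending degree learned by the third neural network 700 and the blending degree set directly by the user would both map to the same `alpha` value.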

FIG. 11 is a diagram illustrating a user interface screen that includes an image quality processing function, according to an embodiment.

Before watching an image, the user may select the viewing assistance function in advance to choose whether to apply image quality processing and to what degree. Alternatively, when the user wants image quality processing to be performed on a specific object while watching an image, the user may activate the viewing assistance function so that image quality processing is performed on that object in real time.

The image processing device 100 may output an interface screen for providing the viewing assistance function. In an embodiment, the image processing device 100 may output, from among its various settings functions, a screen setting interface screen for configuring various functions of the screen.

Referring to FIG. 11(a), the screen setting interface screen 1110 may include a menu 1115 for selecting the viewing assistance function. The menu 1115 may include a bar-shaped menu bar for adjusting whether the viewing assistance function is applied and the degree to which it is applied. This is only one embodiment, however, and the screen setting interface screen 1110 and the menu 1115 may have various structures, arrangements, and forms.

FIG. 11(b) shows an example of another screen setting interface screen 1120. The screen setting interface screen 1120 may include a menu 1121 for outline processing, a menu 1123 for size adjustment, and a menu 1125 for flattening. Using the screen setting interface screen 1120, the user may select the type of image quality processing to be performed on the important object.

In addition, although not shown in FIG. 11, when the user selects a type of image quality processing, the image processing device 100 may output a user interface screen through which the user can enter the degree to which the selected type of image quality processing is to be performed. The user may select and enter the degree of outline processing, the degree of flattening, and the degree of enlargement.

FIG. 12 is a diagram illustrating result images obtained when the image processing device 100 performs image quality processing on an input image, according to an embodiment.

Referring to FIG. 12, the image processing device 100 may analyze an input image 1210 and detect an important object included in the input image 1210. The image processing device 100 may analyze the important object and identify that it is a person.

In an embodiment, the image processing device 100 may perform image quality processing on the whole person. Based on the user's control information, that is, according to the user's past history or preferences, the image processing device 100 may perform only flattening and outline processing on the object while keeping its size unchanged. For example, the image processing device 100 may identify, from the user interaction history, that the user did not use the enlargement function when the important object was at least a certain size, and may decide whether to use the enlargement function according to the size of the object included in the input image 1210. When the object is larger than that size, the image processing device 100 may perform only the other image quality processing without enlarging the object included in the input image 1210.
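As a non-limiting illustration of deciding whether to enlarge based on interaction history, the sketch below derives a size threshold from past sessions. The history format, the rule of using the smallest object size at which the user declined enlargement, and the names `should_enlarge` and `default_threshold` are assumptions for illustration only.

```python
def should_enlarge(object_area_ratio, history, default_threshold=0.25):
    """Decide whether to enlarge an object of the given relative size.

    Hypothetical helper: `history` is a list of (object_area_ratio, enlarged)
    pairs recorded from earlier viewing sessions, where object_area_ratio is
    the object's area divided by the frame area.
    """
    declined_sizes = [ratio for ratio, enlarged in history if not enlarged]
    # Assumed rule: objects at least as large as the smallest one the user
    # chose not to enlarge are left at their original size.
    threshold = min(declined_sizes) if declined_sizes else default_threshold
    return object_area_ratio < threshold


interaction_history = [(0.05, True), (0.10, True), (0.30, False), (0.45, False)]

enlarge_small_object = should_enlarge(0.08, interaction_history)
enlarge_large_object = should_enlarge(0.40, interaction_history)
```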

In an embodiment, the image processing device 100 may obtain a first result image 1220 by flattening the entire object and processing its outline. The object in the first result image 1220 has the same size as the object in the input image 1210. In the first result image 1220, outline processing and flattening have been applied to the object included in the input image 1210.

In another embodiment, when the object included in the input image 1210 is a person, the image processing device 100 may identify only the person's face as the important object. For example, when the user interaction history indicates that the user prefers to enlarge and watch only the person's face, the image processing device 100 may identify only the face region 1211 in the input image 1210 as the important object region. The image processing device 100 may crop the face region, which is the important object region, and perform image quality processing on the cropped region. The image processing device 100 may enlarge the cropped important object region to the size of the input image 1210 and, at the same time, adjust the resolution of the important object region. The image processing device 100 may also perform the flattening and outline processing preferred by the user on the important object region.
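As a non-limiting illustration of cropping the important object region and enlarging it to the size of the input image, the sketch below uses nearest-neighbor resampling. Nearest-neighbor scaling is a stand-in chosen for brevity and is not the resolution adjustment of the disclosure; the function names `crop` and `resize_nearest` and the region coordinates are hypothetical.

```python
def crop(image, top, left, height, width):
    """Cut a rectangular region out of an image stored as a list of rows."""
    return [row[left:left + width] for row in image[top:top + height]]


def resize_nearest(image, out_height, out_width):
    """Enlarge or shrink an image by nearest-neighbor sampling."""
    in_height, in_width = len(image), len(image[0])
    return [
        [image[y * in_height // out_height][x * in_width // out_width]
         for x in range(out_width)]
        for y in range(out_height)
    ]


# A 4x4 frame whose pixel value encodes its position (row * 10 + column).
frame = [[r * 10 + c for c in range(4)] for r in range(4)]

# Assume the face region occupies the central 2x2 block.
face_region = crop(frame, top=1, left=1, height=2, width=2)
enlarged_face = resize_nearest(face_region, len(frame), len(frame[0]))
```

The enlarged region has the same dimensions as the original frame, after which flattening and outline processing could be applied to it.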

In an embodiment, the image processing device 100 may obtain a second result image 1230 by performing image quality processing only on the important object. The second result image 1230 contains only the face region of the object included in the input image 1210. The outline of the face has been processed, and the regions inside the face other than the eyes, nose, and mouth have been flattened.

FIG. 13 is a flowchart illustrating a method of obtaining, from a first image, a second image in which an important object has been quality-processed, according to an embodiment.

Referring to FIG. 13, the image processing device 100 may receive a first image and obtain object information about an important object included in the first image (step 1310).

In an embodiment, the image processing device 100 may analyze the first image and detect the important object from the first image. The image processing device 100 may obtain, as the object information, information about at least one of the type, location, and size of the important object.

In an embodiment, the image processing device 100 may obtain user control information for image quality processing (step 1320).

In an embodiment, the user control information may include control information on at least one of whether to enlarge the object, the degree of enlargement of the object, outline processing, and flattening.

In an embodiment, the image processing device 100 may obtain a second image by performing image quality processing on the important object based on the object information and the control information (step 1330).

In an embodiment, the image processing device 100 may perform image quality processing on the object included in the first image by applying the user's control information to the location, size, type, and the like of the object obtained from the object information.
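As a non-limiting illustration of steps 1310 to 1330, the sketch below models the object information and the user control information as simple records and derives the list of processing steps to apply. The class names `ObjectInfo` and `ControlInfo`, their fields, and the function `plan_quality_processing` are hypothetical and only mirror the items listed in the description.

```python
from dataclasses import dataclass


@dataclass
class ObjectInfo:
    """Assumed result of step 1310: type, location, and size of the object."""
    kind: str
    top: int
    left: int
    height: int
    width: int


@dataclass
class ControlInfo:
    """Assumed result of step 1320: which processing to apply and how much."""
    enlarge: bool = False
    scale: float = 1.0
    outline: bool = False
    flatten: bool = False


def plan_quality_processing(obj, ctrl):
    """Step 1330 (sketch): list the operations to run on the important object."""
    steps = []
    if ctrl.enlarge and ctrl.scale > 1.0:
        steps.append(f"upscale {obj.kind} x{ctrl.scale}")
    if ctrl.flatten:
        steps.append(f"flatten inside {obj.kind}")
    if ctrl.outline:
        steps.append(f"outline {obj.kind}")
    return steps


person = ObjectInfo(kind="person", top=0, left=0, height=10, width=10)
controls = ControlInfo(outline=True, flatten=True)
plan = plan_quality_processing(person, controls)
```

Returning a plan rather than a processed image keeps the sketch independent of any particular filter or scaler.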

FIG. 14 is a flowchart illustrating image quality processing performed according to user control information, according to an embodiment.

In an embodiment, the user control information may include at least one of inferred control information and real-time user control information.

In an embodiment, the image processing device 100 may obtain the inferred control information based on the user's interaction history stored in advance in the image processing device 100.

In an embodiment, the image processing device 100 may perform image quality processing on the important object according to the inferred control information (step 1410).

In an embodiment, the image processing device 100 may identify whether the user's real-time control information has been received (step 1420).

In an embodiment, when the user's real-time control information is input, the image processing device 100 may additionally perform image quality processing on the object according to the real-time control information (step 1430). When the user's real-time control information conflicts with the inferred control information, the image processing device 100 may perform image quality processing on the object according to the user's real-time control information. The image processing device 100 may output the quality-processed image.
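As a non-limiting illustration of steps 1410 to 1430, the sketch below starts from the inferred control information and lets any real-time control information take precedence where the two conflict. Representing control information as a dictionary, and the names `resolve_control`, `inferred`, and `realtime`, are assumptions for illustration.

```python
def resolve_control(inferred, realtime=None):
    """Combine inferred and real-time control information.

    Hypothetical helper: settings from `realtime` override the matching
    settings in `inferred`; settings the user did not touch are kept.
    """
    resolved = dict(inferred)      # step 1410: start from the inferred settings
    if realtime:                   # step 1420: was real-time input received?
        resolved.update(realtime)  # step 1430: real-time input wins on conflict
    return resolved


inferred = {"enlarge": False, "flatten": 0.6, "outline": 0.4}
realtime = {"enlarge": True}

effective = resolve_control(inferred, realtime)
```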

The method of operating an image processing device, and the device, according to some embodiments may also be implemented in the form of a recording medium containing computer-executable instructions, such as a program module executed by a computer. A computer-readable medium may be any available medium that can be accessed by a computer and includes volatile and non-volatile media and removable and non-removable media. A computer-readable medium may also include both computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Communication media typically include computer-readable instructions, data structures, program modules, other data in a modulated data signal such as a carrier wave, or other transmission mechanisms, and include any information delivery medium.

In addition, the image processing device and its method of operation according to the embodiments of the present disclosure described above may be implemented as a computer program product including a computer-readable recording medium or storage medium on which is recorded a program for implementing an image processing method that includes obtaining, from a first image, object information about an important object included in the first image; obtaining user control information for image quality processing; and obtaining a second image by performing image quality processing on the important object from the first image based on the object information and the user control information.

A machine-readable storage medium may be provided in the form of a non-transitory storage medium. Here, "non-transitory storage medium" means only that the medium is a tangible device and does not include a signal (for example, an electromagnetic wave); the term does not distinguish between data being stored semi-permanently and data being stored temporarily on the storage medium. For example, a "non-transitory storage medium" may include a buffer in which data is temporarily stored.

According to an embodiment, the method according to the various embodiments disclosed in this document may be provided as part of a computer program product. A computer program product may be traded between a seller and a buyer as a commodity. A computer program product may be distributed in the form of a machine-readable storage medium (for example, a compact disc read-only memory (CD-ROM)), or distributed online (for example, downloaded or uploaded) through an application store or directly between two user devices (for example, smartphones). In the case of online distribution, at least part of the computer program product (for example, a downloadable app) may be at least temporarily stored in, or temporarily created on, a machine-readable storage medium such as the memory of a manufacturer's server 150, an application store's server 150, or a relay server 150.

The foregoing description is illustrative, and a person of ordinary skill in the art to which the invention pertains will understand that it can readily be modified into other specific forms without changing the technical idea or essential features of the invention. The embodiments described above should therefore be understood as illustrative in all respects and not limiting. For example, each component described as a single unit may be implemented in a distributed manner, and components described as distributed may likewise be implemented in a combined form.

Claims

1. An image processing apparatus comprising:
a memory storing one or more instructions; and
a processor configured to execute the one or more instructions stored in the memory,
wherein the processor, by executing the one or more instructions:
obtains, from a first image, object information about an important object included in the first image;
obtains control information for image quality processing; and
obtains a second image by performing image quality processing on the important object based on the object information and the control information.

2. The image processing apparatus of claim 1, wherein the object information includes information on at least one of a type, a location, and a size of the important object.

3. The image processing apparatus of claim 1, wherein the processor, by executing the one or more instructions:
detects a plurality of objects from the first image;
outputs object identification information indicating the plurality of objects; and
identifies, as the important object, an object selected by a user in response to the output of the object identification information.

4. The image processing apparatus of claim 1, wherein the control information includes control information on at least one of whether to enlarge an object, a degree of enlargement of the object, outline processing, and flattening processing.

5. The image processing apparatus of claim 4, wherein the processor, by executing the one or more instructions, performs, according to the control information, at least one of upscaling the object, processing an outline around the object, and flattening the inside of the object.
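
For illustration only, the object information and control information recited above could be represented as simple records, with the control information selecting which image quality operations to run. This is a minimal sketch and not the claimed implementation: the names `ObjectInfo`, `ControlInfo`, and `selected_operations`, the pixel coordinate convention, and the rule that a magnification below 1.0 is rejected are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ObjectInfo:
    """Object information: type, location and size of the important object."""
    kind: str                  # object type, e.g. "person" (label set is assumed)
    location: Tuple[int, int]  # top-left (x, y) in pixels (assumed convention)
    size: Tuple[int, int]      # (width, height) in pixels (assumed convention)


@dataclass(frozen=True)
class ControlInfo:
    """Control information for image quality processing."""
    enlarge: bool = False       # whether to enlarge the object
    magnification: float = 1.0  # degree of enlargement
    outline: bool = False       # whether to apply outline processing
    flatten: bool = False       # whether to apply flattening processing

    def __post_init__(self) -> None:
        # Assumption: enlargement never shrinks the object.
        if self.magnification < 1.0:
            raise ValueError("magnification must be >= 1.0")


def selected_operations(control: ControlInfo) -> List[str]:
    """Return the names of the operations that the control information requests."""
    ops: List[str] = []
    if control.enlarge and control.magnification > 1.0:
        ops.append("upscale")
    if control.outline:
        ops.append("outline")
    if control.flatten:
        ops.append("flatten")
    return ops
```

For example, `selected_operations(ControlInfo(enlarge=True, magnification=2.0, flatten=True))` yields the upscaling and flattening operations and skips outline processing.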

6. The image processing apparatus of claim 1, wherein the control information includes at least one of inference control information and real-time user control information,
the inference control information is obtained from a user's previous control history information for a previous image, and
the real-time user control information includes the user's real-time control information for the first image.
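
As a hedged illustration of the two sources of control information, the sketch below derives a setting from previous control history and lets a real-time user setting take precedence when one is present. The claim does not specify how the inference is made or how the two sources are combined; using the mean of the history, the override rule, and the names `infer_magnification` and `resolve_magnification` are assumptions made only for this example.

```python
from typing import Optional, Sequence


def infer_magnification(history: Sequence[float], default: float = 1.0) -> float:
    """Infer a setting from previous control history.

    Assumption: the mean of past values stands in for whatever
    inference the apparatus actually performs.
    """
    if not history:
        return default
    return sum(history) / len(history)


def resolve_magnification(history: Sequence[float],
                          realtime: Optional[float] = None) -> float:
    """Use the real-time user setting if given, else the inferred one (assumed rule)."""
    if realtime is not None:
        return realtime
    return infer_magnification(history)
```

With a history of `[2.0, 4.0]` and no real-time input, the resolved value is the inferred mean; passing `realtime=1.5` returns the user's live setting instead.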

7. The image processing apparatus of claim 1, wherein the processor, by executing the one or more instructions, obtains the second image from the first image using a neural network, and
the neural network is trained on a training data set comprising an input image, an object region of interest to a user in the input image, and a ground truth image obtained by performing image quality processing on the object region of interest to the user.

8. The image processing apparatus of claim 7, wherein the neural network obtains, from at least one of the first image, the control information, and the object information, the second image in which the object has undergone image quality processing.

9. The image processing apparatus of claim 7, wherein the neural network obtains a flattening parameter and a contour parameter for image quality processing of the object from at least one of the first image, the user control information, and the object information, and
the processor obtains the quality-processed second image by adjusting, according to a user control signal, a degree of blending between an image processed according to the flattening parameter and an image processed according to the contour parameter.
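
One simple way to picture an adjustable degree of blending is a per-pixel linear mix of the two processed images. The sketch below is illustrative only and assumes grayscale images stored as nested lists with values in [0.0, 1.0]; the function name `blend`, the `weight` parameter standing in for the user control signal, and the choice of linear interpolation are assumptions, since the claim does not fix a blending formula.

```python
from typing import List

Image = List[List[float]]  # grayscale image, pixel values in [0.0, 1.0]


def blend(flattened: Image, outlined: Image, weight: float) -> Image:
    """Linearly blend two same-sized images.

    weight = 0.0 returns the flattened image, weight = 1.0 the outlined image.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    same_rows = len(flattened) == len(outlined)
    if not same_rows or any(len(a) != len(b) for a, b in zip(flattened, outlined)):
        raise ValueError("images must have the same shape")
    return [
        [(1.0 - weight) * f + weight * o for f, o in zip(row_f, row_o)]
        for row_f, row_o in zip(flattened, outlined)
    ]
```

Under this assumption, raising `weight` shifts the result toward the outline-processed image and lowering it shifts the result toward the flattened image.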

10. The image processing apparatus of claim 8 or 9, wherein the image quality processing includes at least one of processing an outline of the object, flattening the inside of the object, and upscaling the object,
the processing of the outline of the object includes processing at least one of a detail, a strength, and a color of the outline of the object,
the flattening of the inside of the object includes adjusting a degree of flattening inside the object, and
the upscaling of the object includes enlarging a size of the object while maintaining a resolution of the object.

11. An image processing method performed by an image processing apparatus, the method comprising:
obtaining, from a first image, object information about an important object included in the first image;
obtaining control information for image quality processing; and
obtaining a second image by performing image quality processing on the important object from the first image based on the object information and the user control information.

12. The image processing method of claim 11, wherein the object information includes information on at least one of a type, a location, and a size of the important object.

13. The image processing method of claim 11, further comprising:
detecting a plurality of objects from the first image;
outputting object identification information indicating the plurality of objects; and
identifying, as the important object, an object selected by a user in response to the output of the object identification information.

14. The image processing method of claim 11, wherein the control information includes control information on at least one of whether to enlarge an object, a degree of enlargement of the object, outline processing, and flattening processing.

15. The image processing method of claim 14, wherein the performing of the image quality processing on the important object comprises performing, according to the control information, at least one of upscaling the object, processing an outline around the object, and flattening the inside of the object.

16. The image processing method of claim 11, wherein the control information includes at least one of inference control information and real-time user control information,
the inference control information is obtained from a user's previous control history information for a previous image, and
the real-time user control information includes the user's real-time control information for the first image.

17. The image processing method of claim 11, wherein the obtaining of the second image from the first image is performed using a neural network, and
the neural network is trained on a training data set comprising an input image, an object region of interest to a user in the input image, and a ground truth image obtained by performing image quality processing on the object region of interest to the user.

18. The image processing method of claim 17, wherein the neural network obtains, from at least one of the first image, the control information, and the object information, the second image in which the object has undergone image quality processing.

19. The image processing method of claim 17, wherein the neural network obtains a flattening parameter and a contour parameter for image quality processing of the object from at least one of the first image, the user control information, and the object information, and
the obtaining of the second image from the first image comprises obtaining the quality-processed second image by adjusting, under user control, a degree of blending between an image processed according to the flattening parameter and an image processed according to the contour parameter.

20. A computer-readable recording medium having recorded thereon a program for implementing an image processing method, the method comprising:
obtaining, from a first image, object information about an important object included in the first image;
obtaining control information for image quality processing; and
obtaining a second image by performing image quality processing on the important object from the first image based on the object information and the user control information.