KR102407815B1

KR102407815B1 - Apparatus and method for processing image

Info

Publication number: KR102407815B1
Application number: KR1020170135869A
Authority: KR
Inventors: 김예훈; 윤소정; 장준익
Original assignee: 삼성전자주식회사
Priority date: 2016-12-22
Filing date: 2017-10-19
Publication date: 2022-06-13
Also published as: KR102458358B1; CN110063053A; CN113114943B; CN113114943A; KR20220080731A; JP2020507228A; JP2023036778A; KR20180073432A; CN110063053B

Abstract

Disclosed are an image processing apparatus and an image processing method that estimate a region of interest and focus on it, based on interest information learned by a data recognition model that satisfies a predetermined condition among a plurality of data recognition models.
In this case, the image processing apparatus may estimate the region of interest using a rule-based or artificial intelligence algorithm. When an artificial intelligence algorithm is used, the image processing apparatus may estimate the region of interest using a machine learning, neural network, or deep learning algorithm.

Description

Image processing apparatus and method {Apparatus and method for processing image}

The present disclosure relates to an image processing apparatus and an image processing method.

The present disclosure also relates to an artificial intelligence (AI) system that uses a machine learning algorithm such as deep learning to simulate functions of the human brain such as cognition and judgment, and to applications of such a system.

Automatically focusing when processing an image is a task that can increase the user's satisfaction with the image. In general, there is an image processing method that regards a region of the image on which the user's gaze is likely to concentrate as a region of interest and focuses on that region.

In addition, artificial intelligence systems have recently been introduced into the image processing field.

An artificial intelligence system is a computer system that implements human-level intelligence. Unlike an existing rule-based smart system, it is a system in which a machine learns, judges, and becomes smarter by itself. The more an artificial intelligence system is used, the more its recognition rate improves and the more accurately it understands user preferences, so existing rule-based smart systems are gradually being replaced by deep-learning-based artificial intelligence systems.

Artificial intelligence technology consists of machine learning (deep learning) and element technologies that use machine learning.

Machine learning is an algorithmic technology that classifies and learns the features of input data by itself. Element technologies use machine learning algorithms such as deep learning to simulate functions of the human brain such as cognition and judgment, and consist of technical fields such as linguistic understanding, visual understanding, inference/prediction, knowledge representation, and motion control.

The various fields to which artificial intelligence technology is applied are as follows. Linguistic understanding is a technology for recognizing and applying/processing human language and text, and includes natural language processing, machine translation, dialogue systems, question answering, and speech recognition/synthesis. Visual understanding is a technology for recognizing and processing objects as human vision does, and includes object recognition, object tracking, image search, person recognition, scene understanding, spatial understanding, and image enhancement. Inference/prediction is a technology for judging information and logically inferring and predicting from it, and includes knowledge/probability-based inference, optimization prediction, preference-based planning, and recommendation. Knowledge representation is a technology for automatically processing human experience information into knowledge data, and includes knowledge construction (data generation/classification) and knowledge management (data utilization). Motion control is a technology for controlling the autonomous driving of vehicles and the movement of robots, and includes movement control (navigation, collision, driving) and manipulation control (behavior control).

An object of the present disclosure is to provide an image processing apparatus and an image processing method that estimate a region of interest and focus on it, based on interest information learned by a data recognition model that satisfies a predetermined condition among a plurality of data recognition models.

An image processing apparatus according to a first aspect includes: a photographing unit that acquires a live view image including at least one subject; a memory that stores computer-executable instructions; at least one processor that, by executing the computer-executable instructions, estimates a user's region of interest in the acquired live view image, based on a data recognition model corresponding to a predetermined condition among data recognition models and according to a criterion for determining whether a region corresponds to interest information learned by the data recognition model, and focuses on the estimated region of interest; and an input/output unit that displays the live view image focused on the estimated region of interest.

An image processing method according to a second aspect includes: acquiring a live view image including at least one subject; estimating a user's region of interest in the acquired live view image, based on a data recognition model corresponding to a predetermined condition among data recognition models and according to a criterion for determining whether a region corresponds to interest information learned by the data recognition model; focusing on the estimated region of interest; and displaying the live view image focused on the estimated region of interest.

According to a third aspect, there is provided a computer-readable recording medium on which a program for executing the image processing method on a computer is recorded.

FIG. 1A is a block diagram illustrating an image processing apparatus according to an embodiment.
FIG. 1B is a diagram illustrating a configuration of an image processing apparatus according to various embodiments.
FIG. 2 is a block diagram of a photographing unit according to an embodiment.
FIG. 3 is a diagram for explaining an operation of a controller according to an embodiment.
FIGS. 4A and 4B are diagrams for explaining an example in which an image processing apparatus according to an embodiment estimates a salient region as a user's region of interest and focuses on it.
FIGS. 5A and 5B are diagrams for explaining an example in which an image processing apparatus according to an embodiment estimates a region corresponding to personalized interest information as a user's region of interest and focuses on it.
FIG. 6 is a block diagram of a controller according to an embodiment.
FIG. 7 is a block diagram of a data learning unit according to an embodiment.
FIG. 8 is a block diagram of a data recognition unit according to an embodiment.
FIG. 9 is a block diagram illustrating an image processing apparatus according to another embodiment.
FIG. 10 is a diagram illustrating an example in which an image processing apparatus and a server according to some embodiments learn and recognize data by interworking with each other.
FIG. 11 is a block diagram illustrating an image processing apparatus according to yet another embodiment.
FIG. 12 is a flowchart illustrating an image processing method according to an embodiment.
FIG. 13 is a flowchart for explaining a situation in which an image processing apparatus according to an embodiment estimates a region of interest when it includes a first processor and a second processor.
FIG. 14 is a flowchart for explaining a situation in which an image processing apparatus according to an embodiment estimates a region of interest when it includes a first processor, a second processor, and a third processor.
FIG. 15 is a flowchart for explaining another situation in which an electronic device according to an embodiment estimates a region of interest when it includes a first processor, a second processor, and a third processor.
FIG. 16 is a flowchart for explaining a situation in which an image processing apparatus according to an embodiment estimates a region of interest using a server.

Hereinafter, embodiments provided solely by way of example will be described in detail with reference to the accompanying drawings. The following embodiments serve only to make the technical content concrete and do not restrict or limit the scope of rights. Anything that a person skilled in the art can readily infer from the detailed description and the embodiments is construed as falling within the scope of rights.

In this specification, when a component is said to be "connected" to another component, this includes not only the case where the two are 'directly connected' but also the case where they are 'connected with another component interposed between them'. In addition, when a component is said to "include" another component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

Terms that include ordinal numbers, such as 'first' or 'second', may be used in this specification to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another.

In this specification, "image processing apparatus" is a generic term for an electronic device that has a photographing function. For example, a device equipped with a camera module, such as a smartphone or a digital camera, may be an image processing apparatus.

The present embodiments relate to an image processing apparatus and an image processing method. Detailed descriptions of matters that are widely known to those of ordinary skill in the art to which the following embodiments belong are omitted.

FIG. 1A is a block diagram illustrating an image processing apparatus 1000 according to an embodiment. Those of ordinary skill in the art related to the present embodiment will understand that general-purpose components other than those shown in FIG. 1A may be further included.

As shown in FIG. 1A, the image processing apparatus 1000 according to an embodiment may include a memory 1100, a controller 1200, an input/output unit 1300, and a photographing unit 1610.

The photographing unit 1610 may acquire an image including at least one subject. For example, the photographing unit 1610 may acquire a live view image including at least one subject, and, when actual photographing is performed, may acquire a captured image to be stored in the image processing apparatus 1000. In response to a user's photographing command, the photographing unit 1610 may capture an image focused on the estimated region of interest. The photographing unit 1610 is described in detail with reference to FIG. 2.

The memory 1100 may store programs for processing and control by the controller 1200, and may also store data input to or output from the image processing apparatus 1000. The memory 1100 may store computer-executable instructions.

The controller 1200 typically controls the overall operation of the image processing apparatus 1000. The controller 1200 may include at least one processor. Depending on its function and role, the controller 1200 may include a plurality of processors or a single integrated processor.

By executing the computer-executable instructions stored in the memory 1100, the at least one processor constituting the controller 1200 may estimate a user's region of interest in the acquired live view image, based on a data recognition model corresponding to a predetermined condition among data recognition models and according to a criterion for determining whether a region corresponds to interest information learned by the data recognition model, and may focus on the estimated region of interest. The controller 1200 is described in detail with reference to FIGS. 3 to 8.

The input/output unit 1300 may display the live view image focused on the estimated region of interest. The input/output unit 1300 may display a live view image in which the operation of focusing on the region of interest is reflected in real time.

FIG. 1B is a diagram illustrating a configuration of the image processing apparatus 1000 according to various embodiments.

Referring to FIG. 1B, the image processing apparatus 1000 may include a controller 1200 having a first processor 1200a and a second processor 1200b.

The first processor 1200a may control the execution of at least one application installed on the image processing apparatus 1000 and may perform graphics processing on images acquired by the image processing apparatus 1000 (for example, live view images and captured images). The first processor 1200a may be implemented as a system on chip (SoC) that integrates functions such as a central processing unit (CPU), a graphics processing unit (GPU), a communication chip, and sensors. The first processor 1200a may also be referred to as an application processor (AP) in this specification.

The second processor 1200b may estimate the region of interest of an image using a data recognition model.

The second processor 1200b may be manufactured as a dedicated hardware chip for artificial intelligence (AI) that performs region-of-interest estimation using a data recognition model. According to various embodiments, for a data recognition model whose element technology is visual understanding, the dedicated AI hardware chip may include a GPU.

The image processing apparatus 1000 may further include a third processor, a fourth processor, and so on that perform the same function as the second processor 1200b. In this case, the processors may each perform region-of-interest estimation using a different data recognition model.

According to various embodiments of the present disclosure, the functions performed by the first processor 1200a may be performed for applications that are stored in the memory 1100 and perform various functions, and the functions performed by the second processor 1200b may be performed for the OS of the image processing apparatus 1000.

For example, a camera application may generate a live view image and determine a data recognition model corresponding to a predetermined condition. The camera application may transmit information related to the determined data recognition model and a region-of-interest estimation request to the OS and/or to a server located outside the image processing apparatus 1000.

The OS and/or the external server may each estimate the region of interest using the data recognition model it includes. In this case, the OS and/or the external server may focus on the estimated region of interest, but the present disclosure is not limited thereto.

FIG. 2 is a block diagram of a photographing unit according to an embodiment. Those of ordinary skill in the art related to the present embodiment will understand that general-purpose components other than those shown in FIG. 2 may be further included.

The photographing unit 1610 is a component that generates an image as an electrical signal from incident light, and includes a lens 1611, a lens driver 1612, an aperture 1613, an aperture driver 1614, an imaging device 1615, and an imaging device controller 1616.

The lens 1611 may include a plurality of lens groups and a plurality of lens elements. The position of the lens 1611 is adjusted by the lens driver 1612, which adjusts the position according to a control signal provided by the controller 1200.

The degree of opening of the aperture 1613 is adjusted by the aperture driver 1614, and the aperture 1613 regulates the amount of light incident on the imaging device 1615.

The optical signal that has passed through the lens 1611 and the aperture 1613 reaches the light-receiving surface of the imaging device 1615 and forms an image of the subject. The imaging device 1615 may be a Charge Coupled Device (CCD) image sensor or a Complementary Metal Oxide Semiconductor Image Sensor (CIS) that converts an optical signal into an electrical signal. The sensitivity and other properties of the imaging device 1615 may be adjusted by the imaging device controller 1616. The imaging device controller 1616 may control the imaging device 1615 according to a control signal generated automatically from the image signal input in real time, or according to a control signal input manually through user operation.

The exposure time of the imaging device 1615 is controlled by a shutter (not shown). Shutters include a mechanical shutter, which controls the incidence of light by moving a light-blocking shade, and an electronic shutter, which controls exposure by supplying an electrical signal to the imaging device 1615.

An analog signal processing unit (not shown) may perform noise reduction, gain adjustment, waveform shaping, analog-to-digital conversion, and the like on the analog signal supplied from the imaging device 1615. The image signal output from the analog signal processing unit (not shown) may be input to the controller 1200, where it may become a live view image through digital signal processing.

FIG. 3 is a diagram for explaining an operation of the controller according to an embodiment.

The at least one processor constituting the controller 1200 may acquire a live view image based on the image signal input from the photographing unit 1610. The at least one processor may estimate a user's region of interest in the acquired live view image, based on a data recognition model corresponding to a predetermined condition among a plurality of data recognition models and according to a criterion for determining whether a region corresponds to interest information learned by the data recognition model, and may focus on the estimated region of interest.

For example, when a first condition is satisfied, the at least one processor constituting the controller 1200 may estimate a salient region as the user's region of interest, according to a criterion, learned by a first data recognition model, for determining whether a region is a salient (saliency) region. A salient region is a region that is generally perceived as standing out or as distinctive in an image, and may be determined by a predetermined criterion concerning the area that a subject occupies in the image or the color distribution of the image. The first data recognition model may learn the criterion for determining whether a region is a salient region, and may use the learned criterion when determining a salient region in the live view image acquired by the photographing unit 1610.
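
The color-distribution criterion mentioned above can be illustrated with a minimal sketch. It is not the learned first data recognition model described in this disclosure; it is a hand-written stand-in that assumes candidate regions are already given as rectangles and scores each one by how far its mean color lies from the mean color of the whole image. The function names, the rectangle format, and the Euclidean color distance are all assumptions made for illustration.

```python
from statistics import mean


def mean_color(pixels):
    """Average an iterable of (r, g, b) tuples channel by channel."""
    pixels = list(pixels)
    return tuple(mean(p[c] for p in pixels) for c in range(3))


def color_distance(a, b):
    """Euclidean distance between two (r, g, b) colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def most_salient_region(image, regions):
    """Return the name of the region whose mean color differs most
    from the mean color of the whole image.

    image   -- list of rows, each row a list of (r, g, b) tuples
    regions -- dict mapping a hypothetical region name to a rectangle
               (top, left, bottom, right), bottom/right exclusive
    """
    global_mean = mean_color(p for row in image for p in row)

    def score(box):
        top, left, bottom, right = box
        region_pixels = [image[y][x]
                         for y in range(top, bottom)
                         for x in range(left, right)]
        return color_distance(mean_color(region_pixels), global_mean)

    return max(regions, key=lambda name: score(regions[name]))


# A 4x4 "flower field" of yellow pixels with a 2x2 dark-red "person".
yellow, red = (255, 255, 0), (200, 0, 0)
image = [[yellow] * 4 for _ in range(4)]
for y in range(2):
    for x in range(2):
        image[y][x] = red

regions = {"person": (0, 0, 2, 2), "flowers": (2, 2, 4, 4)}
salient = most_salient_region(image, regions)
```

In this toy image the red block is chosen because it deviates most from the predominantly yellow average, which mirrors the flower-field example of FIG. 4A.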

FIGS. 4A and 4B are diagrams for explaining an example in which the image processing apparatus 1000 according to an embodiment estimates a salient region as the user's region of interest and focuses on it.

Referring to FIG. 4A, a person walking through a field of single-colored flowers of the same variety stands out because the person's shape or area differs markedly from the flower field in the background and because the person wears clothes of a color different from the yellow flowers. The image processing apparatus 1000 may therefore determine that the region corresponding to the person is a salient region, estimate that region as the region of interest, and focus on it.

Referring to FIG. 4B, in an image in which a long fence runs between a long road and a forest, the image processing apparatus 1000 may determine that the region corresponding to a red public telephone booth standing alone is a salient region, estimate that region as the region of interest, and focus on it.

When a second condition is satisfied, the at least one processor constituting the controller 1200 may estimate a region corresponding to personalized interest information as the user's region of interest, according to a criterion, learned by a second data recognition model, for determining whether a region corresponds to personalized interest information. The personalized interest information may be determined by predetermined statistics on the user's images stored in the image processing apparatus 1000. The second data recognition model may learn the criterion for determining whether a region corresponds to personalized interest information, and may use the learned criterion when determining the region corresponding to personalized interest information in the live view image acquired by the photographing unit 1610.
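
The disclosure does not specify which statistics are used. As one hedged illustration, the sketch below assumes that each stored image has already been annotated with object labels, takes the label that appears in the most images as the personalized interest information, and uses the fraction of images containing that label as a simple reliability value. The label lists, the function name, and this definition of reliability are assumptions, not part of the source.

```python
from collections import Counter


def personalized_interest(stored_image_labels):
    """Return (label, reliability) for a gallery of labeled images.

    stored_image_labels -- list with one list of labels per stored image
    reliability         -- fraction of images that contain the label
    Returns (None, 0.0) when no labels are available.
    """
    counts = Counter()
    for labels in stored_image_labels:
        counts.update(set(labels))  # count each label once per image
    if not counts:
        return None, 0.0
    label, hits = counts.most_common(1)[0]
    return label, hits / len(stored_image_labels)


# Hypothetical gallery: the user mostly photographs cars.
gallery = [["car"], ["car", "person"], ["person"], ["car"]]
interest, reliability = personalized_interest(gallery)
```

With this gallery the result is the label "car" with a reliability of 0.75, which corresponds to the situation of FIG. 5A in which the car rather than the person is brought into focus.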

FIGS. 5A and 5B are diagrams for explaining an example in which the image processing apparatus 1000 according to an embodiment estimates a region corresponding to personalized interest information as the user's region of interest and focuses on it.

Referring to FIG. 5A, in an image that includes a person and a car, when the personalized interest information is a car, the image processing apparatus 1000 may estimate the region corresponding to the car as the region of interest and focus on it. That is, although it may be usual to focus on the person in an image that includes a person and a car, when the personalized interest information has been determined to be a car, the region corresponding to the car may be estimated as the user's region of interest and brought into focus.

Referring to FIG. 5B, in an image that includes several babies, when the personalized interest information is the user's own baby, the image processing apparatus 1000 may estimate the region corresponding to the user's own baby as the region of interest and focus on it. That is, although it may be usual to focus on all of the babies or on the nearest baby in such an image, when the personalized interest information has been determined to be the user's own baby, the region corresponding to that baby may be estimated as the user's region of interest and brought into focus.

Meanwhile, the second condition is satisfied when the number of images stored in the image processing apparatus 1000 is greater than a predetermined number and the reliability of the personalized interest information satisfies a predetermined condition. The first condition is satisfied when the second condition is not. If the number of stored images is not greater than the predetermined number, or the reliability of the personalized interest information does not satisfy the predetermined condition, the second condition is regarded as not satisfied; because the personalized interest information is then less accurate, the region of interest is estimated using the first data recognition model.
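
The choice between the two data recognition models can be sketched as a single predicate. The threshold values and the reading of "reliability satisfies a predetermined condition" as "reliability is at least a minimum value" are assumptions for illustration; the disclosure leaves both unspecified.

```python
def select_model(num_stored_images, reliability,
                 min_images=100, min_reliability=0.6):
    """Pick which data recognition model to use.

    The second condition holds only when BOTH of the following are true:
      * more images are stored than min_images (strictly greater), and
      * the reliability reaches min_reliability (assumed form of the
        otherwise unspecified reliability condition).
    The first condition is simply "the second condition does not hold".
    min_images and min_reliability are hypothetical example values.
    """
    second_condition = (num_stored_images > min_images
                        and reliability >= min_reliability)
    return "personalized" if second_condition else "saliency"
```

For example, `select_model(500, 0.9)` selects the personalized (second) model, while `select_model(100, 0.9)` and `select_model(500, 0.2)` both fall back to the saliency (first) model.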

Based on the priority of the interest information, at least one processor constituting the controller 1200 may estimate the region corresponding to high-priority interest information as the user's region of interest.

When a plurality of regions of interest are estimated, at least one processor constituting the controller 1200 may perform multi-focusing, in which all of the plurality of regions of interest are brought into focus. Alternatively, when a plurality of regions of interest are estimated, at least one processor constituting the controller 1200 may focus on a region of interest selected by the user from among the plurality of regions of interest.
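
As a non-limiting illustration, the three focusing behaviors described above (highest priority, multi-focus, and user selection) might be sketched as follows. The `Roi` type, the `mode` strings, and the convention that a smaller number means a higher priority are assumptions made for this sketch.

```python
from typing import List, NamedTuple, Optional, Tuple


class Roi(NamedTuple):
    label: str                      # kind of interest information, e.g. "car"
    box: Tuple[int, int, int, int]  # (x, y, width, height) in pixels
    priority: int                   # assumed: smaller value = higher priority


def choose_focus_targets(
    rois: List[Roi],
    mode: str = "priority",
    user_choice: Optional[str] = None,
) -> List[Roi]:
    """Pick which estimated regions of interest to focus on."""
    if not rois:
        return []
    if mode == "multi":
        # Multi-focus: every estimated region is brought into focus.
        return list(rois)
    if mode == "user":
        # Focus only on the region(s) the user selected.
        return [r for r in rois if r.label == user_choice]
    # Default: the region whose interest information has the highest priority.
    return [min(rois, key=lambda r: r.priority)]
```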

At least one processor constituting the controller 1200 may transmit, to the photographing unit 1610, a control signal for focusing on the estimated region of interest.

FIG. 6 is a block diagram of the controller 1200 according to an embodiment.

Referring to FIG. 6, the controller 1200 according to some embodiments may include a data learning unit 1210 and a data recognition unit 1220.

The data learning unit 1210 may train a data recognition model (e.g., the first data recognition model or the second data recognition model of FIG. 3) so that, in order to estimate the user's region of interest in an image, the model has a criterion for determining which region in the image corresponds to interest information. The data learning unit 1210 may train the data recognition model to have criteria regarding which data to use to determine correspondence to interest information and how to make that determination using the data. The data learning unit 1210 may acquire data to be used for training and apply the acquired data to the data recognition model, thereby learning the criterion for determining which region in the image corresponds to interest information.

The data recognition unit 1220 may determine which region in an image corresponds to interest information based on various types of data. The data recognition unit 1220 may make this determination using, from among the data recognition models, the data recognition model trained for the applicable predetermined condition. According to the criterion that each data recognition model acquired through training, the data recognition unit 1220 may apply the data recognition model to a live view image including at least one subject as an input value, and thereby determine which region in the image corresponds to interest information. Meanwhile, the result of estimating the user's region of interest in this way, together with the user's response to that result, may be used to update the data recognition model.

At least one of the data learning unit 1210 and the data recognition unit 1220 may be manufactured in the form of at least one hardware chip and mounted in the image processing apparatus. For example, at least one of the data learning unit 1210 and the data recognition unit 1220 may be manufactured as a dedicated hardware chip for artificial intelligence (AI), or may be manufactured as part of an existing general-purpose processor (e.g., a CPU or an application processor) or a graphics-dedicated processor (e.g., a GPU) and mounted in the various image processing apparatuses described above.

In this case, the data learning unit 1210 and the data recognition unit 1220 may be mounted in a single image processing apparatus or in separate image processing apparatuses. For example, one of the data learning unit 1210 and the data recognition unit 1220 may be included in the image processing apparatus and the other in a server. In addition, over a wired or wireless connection, model information built by the data learning unit 1210 may be provided to the data recognition unit 1220, and data input to the data recognition unit 1220 may be provided to the data learning unit 1210 as additional training data.

Meanwhile, at least one of the data learning unit 1210 and the data recognition unit 1220 may be implemented as a software module. When at least one of the data learning unit 1210 and the data recognition unit 1220 is implemented as a software module (or a program module including instructions), the software module may be stored in a non-transitory computer-readable recording medium. In this case, the at least one software module may be provided by an operating system (OS) or by a predetermined application. Alternatively, part of the at least one software module may be provided by the OS and the remainder by a predetermined application.

FIG. 7 is a block diagram of the data learning unit 1210 according to an embodiment.

Referring to FIG. 7, the data learning unit 1210 according to some embodiments may include a data acquisition unit 1210-1, a preprocessor 1210-2, a training data selection unit 1210-3, a model learning unit 1210-4, and a model evaluation unit 1210-5.

The data acquisition unit 1210-1 may acquire the data needed to train a data recognition model (e.g., the first data recognition model or the second data recognition model of FIG. 3) to have a criterion for determining which region in an image corresponds to interest information.

For example, the data acquisition unit 1210-1 may acquire image data such as still images and videos. The data acquisition unit 1210-1 may acquire data directly input to, or selected on, the image processing apparatus 1000. The data acquisition unit 1210-1 may also acquire various kinds of sensing information detected by the image processing apparatus 1000 using various sensors. In addition, the data acquisition unit 1210-1 may acquire data received from an external device, such as the server 2000, that communicates with the image processing apparatus 1000.

The data acquisition unit 1210-1 may acquire data input by the user, data captured by or previously stored in the image processing apparatus 1000, or data received from an external device such as a server, but is not limited thereto.

The preprocessor 1210-2 may preprocess the acquired data so that it can be used to train the data recognition model to determine which region in an image corresponds to interest information. The preprocessor 1210-2 may process the acquired data into a preset format so that the model learning unit 1210-4, described later, can use it for training for situation determination.

For example, the preprocessor 1210-2 may remove noise from the data acquired by the data acquisition unit 1210-1, such as still images and videos, or process that data into a predetermined form, so that meaningful data can be selected.

The training data selection unit 1210-3 may select, from the preprocessed data, the data needed to train the data recognition model to determine which region in an image corresponds to interest information. The selected data may be provided to the model learning unit 1210-4. The training data selection unit 1210-3 may make this selection according to a preset criterion for determining which region in an image corresponds to interest information. The training data selection unit 1210-3 may also select data according to a selection criterion established through training by the model learning unit 1210-4, described later.

The training data selection unit 1210-3 may have a selection criterion for each data type, such as still images and videos, and may use these criteria to select the data needed for training.

The training data selection unit 1210-3 may select the data needed for the data recognition model to learn which region in an image corresponds to interest information.

The model learning unit 1210-4 may train the data recognition model so that, based on the training data, the model has a criterion for determining which region in an image corresponds to interest information. The model learning unit 1210-4 may also learn a selection criterion regarding which training data the data recognition model should use to make that determination.

The model learning unit 1210-4 may train the data recognition model on how to determine which region in an image corresponds to interest information. For example, the model learning unit 1210-4 may train the first data recognition model on how to determine which region in an image is a salient region. The model learning unit 1210-4 may also train the second data recognition model on how to determine which region in an image corresponds to personalized interest information.

In addition, the model learning unit 1210-4 may use the training data to train the data recognition model used to determine which region in an image corresponds to interest information. In this case, the data recognition model may be a pre-built model. For example, the data recognition model may be a model built in advance from basic training data (e.g., sample text).

The model learning unit 1210-4 may train the data recognition model through, for example, supervised learning that uses training data as input values. The model learning unit 1210-4 may also train the data recognition model through, for example, unsupervised learning, in which the model learns by itself, without separate supervision, the types of data needed to determine which region in an image corresponds to interest information, and thereby discovers a criterion for that determination. Alternatively, the model learning unit 1210-4 may train the data recognition model through, for example, reinforcement learning that uses feedback on whether the result of determining which region in an image corresponds to interest information is correct.

According to an embodiment, the model learning unit 1210-4 may train the first data recognition model to have a criterion for estimating the region of interest, using training data that includes an image and coordinate information of a region in the image in which at least one of color or shape differs from the surroundings.

For example, the model learning unit 1210-4 may perform training by a supervised learning method using training data that includes an image such as that of FIG. 4A and coordinate information of the region in which the human figure in the image is located.

As a result, the first data recognition model trained by the model learning unit 1210-4 may recognize objects included in a live view image generated by the image capturing apparatus 1000 and may estimate, as the region of interest, the region containing a recognized object whose color and/or shape differs from the surrounding area.

Also, according to an embodiment, the model learning unit 1210-4 may train the second data recognition model to have a criterion for estimating the region of interest, using images that the user has captured with the image capturing apparatus 1000.

For example, using an unsupervised learning method, the model learning unit 1210-4 may receive user-captured images such as that of FIG. 5A and train the model to have a criterion for estimating the region of interest.

Specifically, the model learning unit 1210-4 may recognize objects in images captured by the user. For example, the model learning unit 1210-4 may recognize objects in each of a plurality of user-captured images stored in a gallery application. The model learning unit 1210-4 may sort objects of similar shape into groups according to the shape of the recognized objects. Alternatively, the model learning unit 1210-4 may rank the generated groups according to the frequency of the recognized objects. Accordingly, the model learning unit 1210-4 may perform training using the objects that the user of the image capturing apparatus 1000 photographs most often.
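
As a non-limiting illustration, the grouping and frequency ranking described above might be sketched as follows. The sketch assumes that an object recognizer has already produced one class label per detected object, and it simplifies "objects of similar shape" to "objects with the same label"; the function name and the sample labels are hypothetical.

```python
from collections import Counter
from typing import List


def rank_object_groups(labels_per_image: List[List[str]]) -> List[str]:
    """Rank object groups by how often they occur across the user's images."""
    # Count every recognized object over all images; one group per label.
    counts = Counter(
        label for labels in labels_per_image for label in labels
    )
    # Groups ordered from most to least frequently photographed.
    return [label for label, _ in counts.most_common()]


# Hypothetical recognizer output for four gallery images.
gallery = [["car", "person"], ["car"], ["car", "dog"], ["person"]]
ranking = rank_object_groups(gallery)
```

In this sketch the first entry of `ranking` would play the role of the group the user photographs most often.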

As a result, the second data recognition model trained by the model learning unit 1210-4 may estimate, as the region of interest, the region containing the object that the user of the image capturing apparatus 1000 prefers most among the objects included in a live view image generated by the image capturing apparatus 1000. That is, the second data recognition model may estimate a region of interest tailored to the preferences of the user of the image capturing apparatus 1000.

According to various embodiments, the image processing apparatus 1000 may generate a fourth data recognition model trained to estimate only some of the objects handled by the second data recognition model, namely those for which the user's preference is high. For example, as a result of learning from images captured by the user, the model learning unit 1210-4 may generate group A, group B, group C, and group D. The image processing apparatus 1000 may then generate a fourth data recognition model that detects only objects belonging to group A, the group with the highest frequency among the objects detected in the user-captured images. In this case, the image processing apparatus 1000 may use the second data recognition model and the fourth data recognition model selectively or sequentially, depending on the situation.
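
As a non-limiting illustration, restricting detection to the highest-frequency group, as the fourth data recognition model does, might be sketched as a filter over the detections of a more general model. The disclosure describes a separately trained model rather than a filter, so this is only a behavioral stand-in; the group counts and the detection tuples are hypothetical.

```python
from typing import Dict, List, Tuple

Detection = Tuple[str, Tuple[int, int, int, int]]  # (group, (x, y, w, h))


def keep_top_group(
    detections: List[Detection], group_counts: Dict[str, int]
) -> List[Detection]:
    """Keep only detections that belong to the most frequent group."""
    top_group = max(group_counts, key=group_counts.get)
    return [d for d in detections if d[0] == top_group]


# Hypothetical frequencies learned from the user's images.
counts = {"A": 40, "B": 25, "C": 10, "D": 5}
found = [("B", (0, 0, 8, 8)), ("A", (10, 10, 20, 20)), ("D", (5, 5, 4, 4))]
top_only = keep_top_group(found, counts)
```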

The data recognition model may be built in consideration of the application field of the recognition model, the purpose of training, the computing performance of the device, and the like. The data recognition model may be, for example, a model based on a neural network. For example, a model such as a deep neural network (DNN), a recurrent neural network (RNN), or a bidirectional recurrent deep neural network (BRDNN) may be used as the data recognition model, but the model is not limited thereto.

According to various embodiments, when a plurality of pre-built data recognition models exist, the model learning unit 1210-4 may select, as the data recognition model to be trained, the model whose basic training data is most closely related to the input training data. In this case, the basic training data may be pre-classified by data type, and a data recognition model may be built in advance for each data type. For example, the basic training data may be pre-classified by various criteria, such as the region where the training data was generated, the time at which it was generated, its size, its genre, its creator, and the types of objects it contains.
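
As a non-limiting illustration, choosing the pre-built model whose basic training data is most closely related to the input training data might be sketched as follows. The disclosure does not define how relatedness is measured; counting matching metadata fields is an assumption made for this sketch, and the model names and metadata keys are hypothetical.

```python
from typing import Dict


def pick_model_to_train(
    input_meta: Dict[str, str],
    prebuilt_models: Dict[str, Dict[str, str]],
) -> str:
    """Return the name of the pre-built model with the most related base data."""

    def relevance(base_meta: Dict[str, str]) -> int:
        # Assumed measure: number of classification criteria that match.
        return sum(1 for k, v in input_meta.items() if base_meta.get(k) == v)

    return max(prebuilt_models, key=lambda name: relevance(prebuilt_models[name]))


# Hypothetical metadata following the classification criteria in the text.
models = {
    "indoor_people": {"region": "KR", "genre": "portrait", "object": "person"},
    "outdoor_cars": {"region": "KR", "genre": "street", "object": "car"},
}
chosen = pick_model_to_train(
    {"region": "KR", "genre": "street", "object": "car"}, models
)
```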

In addition, the model learning unit 1210-4 may train the data recognition model using, for example, a learning algorithm that includes error back-propagation or gradient descent.
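
As a non-limiting illustration of gradient descent, the following sketch fits a one-variable linear model by repeatedly stepping against the gradient of the mean squared error. A linear model stands in for the neural network of the disclosure purely to keep the example short; the learning rate, step count, and sample data are assumptions.

```python
from typing import List, Tuple


def fit_linear(
    xs: List[float], ys: List[float], lr: float = 0.05, steps: int = 2000
) -> Tuple[float, float]:
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction errors under the current parameters.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Move the parameters a small step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Sample data generated from y = 2x + 1.
w, b = fit_linear([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

In a neural network, error back-propagation supplies the per-layer gradients that this loop computes directly for the two parameters.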

In addition, once the data recognition model has been trained, the model learning unit 1210-4 may store the trained data recognition model. In this case, the model learning unit 1210-4 may store the trained data recognition model in the memory of the image processing apparatus that includes the data recognition unit 1220, described later. Alternatively, the model learning unit 1210-4 may store the trained data recognition model in the memory of a server connected to the image processing apparatus over a wired or wireless network.

In this case, the memory in which the trained data recognition model is stored may also store, for example, commands or data related to at least one other component of the image processing apparatus. The memory may also store software and/or programs. The programs may include, for example, a kernel, middleware, an application programming interface (API), and/or an application program (or "application").

The model evaluation unit 1210-5 may input evaluation data to the data recognition model and, when the recognition results output for the evaluation data do not satisfy a predetermined criterion, may cause the model learning unit 1210-4 to train the model again. In this case, the evaluation data may be preset data for evaluating the data recognition model.

For example, the model evaluation unit 1210-5 may determine that the predetermined criterion is not satisfied when, among the trained data recognition model's recognition results for the evaluation data, the number or ratio of evaluation data items with inaccurate recognition results exceeds a preset threshold. For example, when the predetermined criterion is defined as a ratio of 2% and the trained data recognition model outputs incorrect recognition results for more than 20 of a total of 1,000 evaluation data items, the model evaluation unit 1210-5 may evaluate the trained data recognition model as unsuitable.
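
As a non-limiting illustration, the 2% example above might be sketched as follows. The function and parameter names are hypothetical; the threshold and the counts come from the example in the text.

```python
def fails_criterion(
    num_wrong: int, num_total: int, max_error_ratio: float = 0.02
) -> bool:
    """Return True when the share of wrong results exceeds the threshold."""
    if num_total <= 0:
        raise ValueError("num_total must be positive")
    # "Exceeds" is strict: exactly 2% wrong still satisfies the criterion.
    return num_wrong / num_total > max_error_ratio
```

With 1,000 evaluation data items, 20 wrong results leave the model acceptable, while 21 wrong results mark it as unsuitable and would trigger retraining.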

Meanwhile, when a plurality of trained data recognition models exist, the model evaluation unit 1210-5 may evaluate whether each trained data recognition model satisfies the predetermined criterion and may determine a model that satisfies the criterion to be the final data recognition model. When a plurality of models satisfy the predetermined criterion, the model evaluation unit 1210-5 may determine, in descending order of evaluation score, one model or a preset number of models to be the final data recognition model.
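
As a non-limiting illustration, selecting the final model or models from several trained candidates might be sketched as follows. The candidate names, the score values, and the parameter `k` (the preset number of models to keep) are assumptions made for this sketch.

```python
from typing import Dict, List


def select_final_models(
    scores: Dict[str, float], passed: Dict[str, bool], k: int = 1
) -> List[str]:
    """Keep the k best-scoring models among those meeting the criterion."""
    qualified = [name for name in scores if passed.get(name, False)]
    # Highest evaluation score first, then truncate to the preset count.
    return sorted(qualified, key=lambda name: scores[name], reverse=True)[:k]


# Hypothetical evaluation results for three trained models.
eval_scores = {"m1": 0.91, "m2": 0.97, "m3": 0.99}
eval_passed = {"m1": True, "m2": True, "m3": False}
final = select_final_models(eval_scores, eval_passed, k=1)
```

Here `m3` has the best score but is excluded because it did not satisfy the predetermined criterion.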

According to various embodiments, the data learning unit 1210 may include the data acquisition unit 1210-1 and the model learning unit 1210-4, and may optionally include the preprocessor 1210-2, the training data selection unit 1210-3, and the model evaluation unit 1210-5.
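
As a non-limiting illustration, a data learning unit with two required stages and three optional ones might be sketched as follows. The stages are passed in as plain functions, the "model" is reduced to a single number, and `max_rounds` bounds retraining; all of these are assumptions made for this sketch.

```python
from typing import Callable, List, Optional

Data = List[float]


def run_data_learning(
    acquire: Callable[[], Data],                      # required stage
    train: Callable[[Data], float],                   # required stage
    preprocess: Optional[Callable[[Data], Data]] = None,
    select: Optional[Callable[[Data], Data]] = None,
    evaluate: Optional[Callable[[float], bool]] = None,
    max_rounds: int = 3,
) -> float:
    """Run acquisition and training, applying optional stages when present."""
    data = acquire()
    if preprocess is not None:
        data = preprocess(data)
    if select is not None:
        data = select(data)
    model = train(data)
    rounds = 1
    # Retrain while an evaluator is present and rejects the model.
    while evaluate is not None and not evaluate(model) and rounds < max_rounds:
        model = train(data)
        rounds += 1
    return model


# Toy stages: the "model" is just the mean of the selected data.
mean = lambda d: sum(d) / len(d)
result = run_data_learning(
    acquire=lambda: [-1.0, 1.0, 3.0],
    train=mean,
    preprocess=lambda d: [x for x in d if x >= 0],  # drop "noise"
)
```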

Meanwhile, at least one of the data acquisition unit 1210-1, the preprocessor 1210-2, the training data selection unit 1210-3, the model learning unit 1210-4, and the model evaluation unit 1210-5 in the data learning unit 1210 may be manufactured in the form of at least one hardware chip and mounted in the image processing apparatus. For example, at least one of these units may be manufactured as a dedicated hardware chip for artificial intelligence (AI), or may be manufactured as part of an existing general-purpose processor (e.g., a CPU or an application processor) or a graphics-dedicated processor (e.g., a GPU) and mounted in the various image processing apparatuses described above.

In addition, the data acquisition unit 1210-1, the preprocessor 1210-2, the training data selection unit 1210-3, the model learning unit 1210-4, and the model evaluation unit 1210-5 may be mounted in a single image processing apparatus or in separate image processing apparatuses. For example, some of these units may be included in the image processing apparatus and the rest in a server.

In addition, at least one of the data acquisition unit 1210-1, the preprocessor 1210-2, the training data selection unit 1210-3, the model learning unit 1210-4, and the model evaluation unit 1210-5 may be implemented as a software module. When at least one of these units is implemented as a software module (or a program module including instructions), the software module may be stored in a non-transitory computer-readable recording medium. In this case, the at least one software module may be provided by an operating system (OS) or by a predetermined application. Alternatively, part of the at least one software module may be provided by the OS and the remainder by a predetermined application.

FIG. 8 is a block diagram of the data recognition unit 1220 according to an embodiment.

Referring to FIG. 8, the data recognition unit 1220 according to some embodiments may include a data acquisition unit 1220-1, a preprocessor 1220-2, a recognition data selection unit 1220-3, a recognition result providing unit 1220-4, and a model update unit 1220-5.

Using the data recognition model that corresponds to a predetermined condition from among a plurality of trained data recognition models, the data recognition unit 1220 may determine which region in an image corresponds to interest information, according to the criterion that the data recognition model has learned for determining correspondence to interest information.

The data acquisition unit 1220-1 may acquire various types of data that the data recognition model needs in order to determine which region in an image corresponds to the interest information. For example, the data acquisition unit 1220-1 may acquire image data such as still images or moving pictures. For example, the data acquisition unit 1220-1 may acquire data directly input or selected at the image processing apparatus 1000, or may acquire various kinds of sensing information detected by the image processing apparatus 1000 using various sensors. The data acquisition unit 1220-1 may also acquire data received from an external device, such as the server 2000, that communicates with the image processing apparatus 1000.

The preprocessor 1220-2 may preprocess the acquired data so that the data recognition model can use it to determine which region in the image corresponds to the interest information. The preprocessor 1220-2 may process the acquired data into a preset format so that the recognition result providing unit 1220-4, described below, can use the acquired data for that determination.

For example, the preprocessor 1220-2 may remove noise from the image data (still images, moving pictures, and the like) acquired by the data acquisition unit 1220-1, or may process the image data into a predetermined form, so that meaningful data can be selected.
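
The noise removal and format conversion described above can be illustrated with a minimal sketch. The specification does not name a particular filter or format, so the 3x3 median filter, the grayscale list-of-rows representation, the scaling to the range [0, 1], and the function names `denoise_median`, `to_preset_format`, and `preprocess` are all assumptions made for illustration.

```python
from statistics import median


def denoise_median(image):
    """Apply a 3x3 median filter to a grayscale image given as a list of rows.

    Border pixels use whatever part of the 3x3 window falls inside the image.
    """
    height, width = len(image), len(image[0])
    filtered = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(median(window))
        filtered.append(row)
    return filtered


def to_preset_format(image, max_value=255):
    """Convert pixel values to floats in [0, 1] (the assumed preset format)."""
    return [[pixel / max_value for pixel in row] for row in image]


def preprocess(image):
    """Denoise first, then convert to the preset format."""
    return to_preset_format(denoise_median(image))
```

In this sketch a single bright outlier pixel surrounded by uniform neighbors is replaced by the neighborhood median before scaling, which is one simple way of keeping only meaningful data for the later selection step.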

The recognition data selection unit 1220-3 may select, from the preprocessed data, the data that the data recognition model needs in order to determine which region in the image corresponds to the interest information. The selected data may be provided to the recognition result providing unit 1220-4. The recognition data selection unit 1220-3 may select some or all of the preprocessed data according to a preset criterion for determining which region in the image corresponds to the interest information. The recognition data selection unit 1220-3 may also select data according to a selection criterion preset through learning by the model learning unit 1210-4 described with reference to FIG. 7.

The recognition result providing unit 1220-4 may apply the selected data to the data recognition model to determine which region in the image corresponds to the interest information. The recognition result providing unit 1220-4 may provide the region corresponding to the interest information in the image according to the purpose of the data recognition. The recognition result providing unit 1220-4 may apply the selected data to the data recognition model by using the data selected by the recognition data selection unit 1220-3 as an input value. The recognition result may be determined by the data recognition model. The recognition result providing unit 1220-4 may determine which region in the image corresponds to the interest information based on a data recognition model that corresponds to a predetermined condition, from among a plurality of data recognition models.

When a first condition is satisfied, the recognition result providing unit 1220-4 may estimate a salient region as the user's region of interest, according to the criterion that a first data recognition model has learned for determining whether a region is a salient (saliency) region. When determining the salient region in a live view image acquired by the photographing unit 1200, the first data recognition model may use that learned criterion.

For example, when the image processing apparatus 1000 satisfies the first condition, the recognition result providing unit 1220-4 may use the first data recognition model to detect a phone booth in an input image such as that of FIG. 4B and estimate the region containing the phone booth as the region of interest.
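
As a rough illustration of turning a saliency result into a region of interest and a focus target, the sketch below assumes that the first data recognition model outputs a two-dimensional map of saliency scores in [0, 1]. The specification does not define the model's output format, so the score map, the threshold of 0.5, and the helper names `estimate_roi` and `focus_point` are hypothetical.

```python
def estimate_roi(saliency, threshold=0.5):
    """Return the bounding box (left, top, right, bottom) of all cells whose
    saliency score is at least `threshold`, or None if no cell qualifies."""
    hits = [
        (x, y)
        for y, row in enumerate(saliency)
        for x, score in enumerate(row)
        if score >= threshold
    ]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return (min(xs), min(ys), max(xs), max(ys))


def focus_point(roi):
    """Return the center of the region of interest as the focus target."""
    left, top, right, bottom = roi
    return ((left + right) / 2, (top + bottom) / 2)
```

A bounding box is only one possible representation of the region; a per-pixel mask would serve equally well if the focusing mechanism can consume it.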

When a second condition is satisfied, the recognition result providing unit 1220-4 may estimate the region corresponding to personalized interest information as the user's region of interest, according to the criterion that a second data recognition model has learned for determining whether a region corresponds to the personalized interest information. When determining the region corresponding to the personalized interest information in a live view image acquired by the photographing unit 1200, the second data recognition model may use that learned criterion.

For example, when the image processing apparatus 1000 satisfies the second condition, the recognition result providing unit 1220-4 may use the second data recognition model to detect a child's face in an input image such as that of FIG. 5B and estimate the region containing the child's face as the region of interest.

In addition, when the model learning unit (e.g., the model learning unit 1210-4 of FIG. 7) learns from images captured by the user and finds that images containing a car-like shape occur most frequently, the recognition result providing unit 1220-4 may use a fourth data recognition model to detect a car-like shape in an input image such as that of FIG. 5A and estimate the region containing the car-like shape as the region of interest.
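
The idea of finding the subject that appears most often in the user's captured images can be sketched as a simple frequency count. The specification does not describe how shapes are labeled, so the per-image label lists and the function name `most_frequent_subject` are assumptions; in practice the labels would come from whatever classifier the model learning unit uses.

```python
from collections import Counter


def most_frequent_subject(labels_per_image):
    """Return the label that appears in the largest number of images.

    Each element of `labels_per_image` is the list of labels found in one
    image. A label is counted at most once per image. Returns None when no
    labels are available.
    """
    counts = Counter(
        label for labels in labels_per_image for label in set(labels)
    )
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

Counting each label once per image keeps a single photo containing many cars from dominating the result, which matches the text's emphasis on how many images contain the shape.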

That is, when the second condition is not yet satisfied in the initial period after the user purchases the image processing apparatus 1000, the image processing apparatus 1000 may estimate the region of interest using the first data recognition model. Then, when the number of images captured with the image processing apparatus 1000 increases and the second condition is satisfied, the image processing apparatus 1000 may estimate the region of interest using the second data recognition model and/or the fourth data recognition model.
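
The switch from the first model to the second as captured images accumulate can be sketched as a threshold test. The passage only states that the second condition becomes satisfied as the number of captured images grows, so the concrete threshold `MIN_IMAGES_FOR_PERSONALIZATION`, the `DeviceState` type, and the function `select_model` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical threshold: the specification gives no concrete number.
MIN_IMAGES_FOR_PERSONALIZATION = 100


@dataclass
class DeviceState:
    captured_image_count: int


def select_model(state, saliency_model, personalized_model):
    """Pick the personalized model once enough user images exist;
    otherwise fall back to the general saliency model."""
    if state.captured_image_count >= MIN_IMAGES_FOR_PERSONALIZATION:
        return personalized_model
    return saliency_model
```

Under these assumptions a newly purchased device with only a few photos uses the saliency model, and the same call returns the personalized model after the count passes the threshold.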

The model update unit 1220-5 may cause the data recognition model to be updated based on an evaluation of the recognition result provided by the recognition result providing unit 1220-4. For example, the model update unit 1220-5 may provide the model learning unit 1210-4 with the result, produced by the recognition result providing unit 1220-4, of determining which region in the image corresponds to the interest information, so that the model learning unit 1210-4 updates the data recognition model.
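
One way to picture this feedback path is a small buffer that forwards evaluated recognition results to the learning side in batches. The specification does not say how results are evaluated or when retraining is triggered, so the `accepted` flag (for example, whether the user kept the automatic focus), the batch size, and the names `Feedback` and `ModelUpdater` are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Roi = Tuple[int, int, int, int]


@dataclass
class Feedback:
    roi: Roi          # region the model estimated
    accepted: bool    # hypothetical evaluation of that estimate


class ModelUpdater:
    """Collects evaluated results and hands them to a retraining callback."""

    def __init__(self, retrain: Callable[[List[Feedback]], None],
                 batch_size: int = 3):
        self._retrain = retrain
        self._batch_size = batch_size
        self._pending: List[Feedback] = []

    def report(self, roi: Roi, accepted: bool) -> None:
        """Record one evaluated result; trigger retraining on a full batch."""
        self._pending.append(Feedback(roi, accepted))
        if len(self._pending) >= self._batch_size:
            self._retrain(list(self._pending))
            self._pending.clear()
```

The `retrain` callback stands in for the model learning unit 1210-4; batching is a design choice of this sketch rather than something the text requires.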

According to various embodiments, the data recognition unit 1220 may include the data acquisition unit 1220-1 and the recognition result providing unit 1220-4, and may selectively include the preprocessor 1220-2, the recognition data selection unit 1220-3, and the model update unit 1220-5.
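
The composition of required and optional units can be sketched as a pipeline in which missing stages default to a pass-through. The class name `DataRecognitionUnit`, the callable-based interface, and the `identity` helper are illustrative assumptions and do not reflect an interface defined in the specification.

```python
from typing import Any, Callable, Optional


def identity(data: Any) -> Any:
    """Pass-through used when an optional stage is not present."""
    return data


class DataRecognitionUnit:
    """Acquire -> (preprocess) -> (select) -> recognize."""

    def __init__(
        self,
        acquire: Callable[[], Any],
        recognize: Callable[[Any], Any],
        preprocess: Optional[Callable[[Any], Any]] = None,
        select: Optional[Callable[[Any], Any]] = None,
    ):
        self._acquire = acquire
        self._recognize = recognize
        self._preprocess = preprocess or identity
        self._select = select or identity

    def run(self) -> Any:
        """Run every configured stage in order and return the result."""
        data = self._acquire()
        data = self._preprocess(data)
        data = self._select(data)
        return self._recognize(data)
```

With only the two required stages supplied, `run()` feeds the acquired data straight to the recognizer; adding a preprocessor or selector changes what the recognizer sees without changing the calling code.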

Meanwhile, at least one of the data acquisition unit 1220-1, the preprocessor 1220-2, the recognition data selection unit 1220-3, the recognition result providing unit 1220-4, and the model update unit 1220-5 in the data recognition unit 1220 may be manufactured in the form of at least one hardware chip and mounted on the image processing apparatus. For example, at least one of these units may be manufactured as a dedicated hardware chip for artificial intelligence (AI), or may be manufactured as part of an existing general-purpose processor (e.g., a CPU or an application processor) or a graphics-dedicated processor (e.g., a GPU) and mounted on the various image processing apparatuses described above.

In addition, the data acquisition unit 1220-1, the preprocessor 1220-2, the recognition data selection unit 1220-3, the recognition result providing unit 1220-4, and the model update unit 1220-5 may be mounted on a single image processing apparatus, or may each be mounted on separate image processing apparatuses. For example, some of these units may be included in the image processing apparatus, and the others may be included in a server.

In addition, at least one of the data acquisition unit 1220-1, the preprocessor 1220-2, the recognition data selection unit 1220-3, the recognition result providing unit 1220-4, and the model update unit 1220-5 may be implemented as a software module. When at least one of these units is implemented as a software module (or a program module including instructions), the software module may be stored in a non-transitory computer-readable recording medium. In this case, the at least one software module may be provided by an operating system (OS) or by a predetermined application. Alternatively, part of the at least one software module may be provided by the OS, and the remainder may be provided by a predetermined application.

FIG. 9 is a block diagram illustrating an image processing apparatus according to another embodiment. A person of ordinary skill in the art related to this embodiment will understand that general-purpose components other than those shown in FIG. 9 may be further included. Descriptions that overlap with the components described with reference to FIG. 1 are omitted.

The external server 2000 may classify images captured by the user according to a predetermined criterion and train a third data recognition model with the classified images, thereby obtaining a criterion for determining whether a region corresponds to personalized interest information.

According to an embodiment, the third data recognition model may be trained in a manner similar to the second data recognition model to obtain a criterion for determining interest information tailored to the user.

The image processing apparatus 1000 may further include a communication unit 1500 that receives, from the external server 2000, the criterion that the third data recognition model has learned, using the user's images stored in the external server 2000, for determining whether a region corresponds to personalized interest information.

At least one processor constituting the controller 1200 may estimate, as the user's region of interest in the live view image, the region corresponding to the personalized interest information. In doing so, the data recognition model provided in the image processing apparatus 1000 applies the criterion, received from the external server 2000, for determining whether a region corresponds to personalized interest information.

Meanwhile, the image processing apparatus 1000 may transmit the live view image that it has acquired to the server 2000 through the communication unit 1500. The server 2000 may estimate, as the user's region of interest, the region corresponding to the personalized interest information in the live view image transmitted from the image processing apparatus 1000, according to the criterion that the third data recognition model has learned for determining whether a region corresponds to personalized interest information, and may transmit information about the estimated region of interest to the image processing apparatus 1000.

FIG. 10 is a diagram illustrating an example in which the image processing apparatus 1000 and the server 2000 according to some embodiments interwork with each other to learn and recognize data.

Referring to FIG. 10, the server 2000 may learn a criterion for determining which region in an image corresponds to interest information, and the image processing apparatus 1000 may determine which region in an image corresponds to the interest information by using the data recognition model trained by the server 2000.

In this case, the data learning unit 2210 of the server 2000 may perform the function of the data learning unit 1210 illustrated in FIG. 7. The data learning unit 2210 of the server 2000 may learn criteria regarding which data to use to determine which region in an image corresponds to the interest information, and how to make that determination using the data. The data learning unit 2210 of the server 2000 may acquire data to be used for learning and apply the acquired data to a data recognition model to be described later, thereby learning the criterion for determining which region in an image corresponds to the interest information.

In addition, the recognition result providing unit 1220-4 of the image processing apparatus 1000 may determine which region in the image corresponds to the interest information by applying the data selected by the recognition data selection unit 1220-3 to the data recognition model generated by the server 2000. For example, the recognition result providing unit 1220-4 may transmit the data selected by the recognition data selection unit 1220-3 to the server 2000 and request that the server 2000 apply the selected data to the data recognition model to determine which region in the image corresponds to the interest information. The recognition result providing unit 1220-4 may then receive, from the server 2000, the region corresponding to the interest information in the image as determined by the server 2000.

For example, the image processing apparatus 1000 may transmit the live view image that it has acquired to the server 2000. The server 2000 may determine which region in the image corresponds to the interest information by applying the live view image received from the image processing apparatus 1000 to the data recognition model stored in the server 2000. The server 2000 may further take into account the user's images stored in the server 2000 when making this determination. The region corresponding to the interest information in the image, as determined by the server 2000, may be transmitted to the image processing apparatus 1000.

Alternatively, the recognition result providing unit 1220-4 of the image processing apparatus 1000 may receive the data recognition model generated by the server 2000 from the server 2000 and use the received data recognition model to determine which region in the image corresponds to the interest information. In this case, the recognition result providing unit 1220-4 of the image processing apparatus 1000 may apply the data selected by the recognition data selection unit 1220-3 to the data recognition model received from the server 2000 to make that determination.

For example, the image processing apparatus 1000 may determine which region in the image corresponds to the interest information by applying the live view image that it has acquired to the data recognition model received from the server 2000. The server 2000 may transmit the user's images stored in the server 2000 to the image processing apparatus 1000 so that the image processing apparatus 1000 can additionally use them when determining which region in the image corresponds to the interest information.
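
The two interworking modes described above (the server performs the recognition, or the device downloads the model and performs it locally) can be contrasted in a short sketch. The classes `Server` and `Device`, their method names, and the representation of a model as a plain callable are hypothetical; network transport, serialization, and error handling are omitted.

```python
from typing import Any, Callable, Optional, Tuple

Roi = Tuple[int, int, int, int]
Model = Callable[[Any], Roi]


class Server:
    """Holds the trained data recognition model."""

    def __init__(self, model: Model):
        self._model = model

    def estimate_roi(self, frame: Any) -> Roi:
        """Mode 1: the server applies its model to a frame sent by the device."""
        return self._model(frame)

    def download_model(self) -> Model:
        """Mode 2: the server hands the trained model to the device."""
        return self._model


class Device:
    def __init__(self, server: Server):
        self._server = server
        self._local_model: Optional[Model] = None

    def roi_via_server(self, frame: Any) -> Roi:
        """Send the live view frame to the server and receive the region."""
        return self._server.estimate_roi(frame)

    def roi_on_device(self, frame: Any) -> Roi:
        """Fetch the model once, then run recognition locally."""
        if self._local_model is None:
            self._local_model = self._server.download_model()
        return self._local_model(frame)
```

The first mode keeps the model and the user's stored images on the server at the cost of sending every frame over the network; the second avoids per-frame traffic but requires the device to hold and run the model.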

FIG. 11 is a block diagram illustrating the image processing apparatus 1000 according to another embodiment.

As shown in FIG. 11, the image processing apparatus 1000 according to another embodiment may include a memory 1100, a controller 1200, an input/output unit 1300, a sensing unit 1400, a communication unit 1500, and an A/V input unit 1600.

The memory 1100 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card-type memory (e.g., SD or XD memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), a magnetic memory, a magnetic disk, and an optical disk.

The programs stored in the memory 1100 may be classified into a plurality of modules according to their functions, for example, a UI module, a touch screen module, and a notification module.

The UI module may provide, for each application, a specialized UI, GUI, or the like that interworks with the image processing apparatus 1000. The touch screen module may detect a user's touch gesture on the touch screen and transmit information about the touch gesture to the controller 1200. The touch screen module according to some embodiments may recognize and analyze a touch code. The touch screen module may also be configured as separate hardware. The user's touch gestures may include tap, touch and hold, double tap, drag, pan, flick, drag and drop, swipe, and the like. The notification module may generate a signal for notifying the user of the occurrence of an event in the image processing apparatus 1000. Examples of events occurring in the image processing apparatus 1000 include message reception, key signal input, content input, content transmission, and detection of content that meets a predetermined condition. The notification module may output a notification signal in the form of a video signal through the display unit 1322, in the form of an audio signal through the sound output unit 1324, or in the form of a vibration signal through the vibration motor 1326.

The controller 1200 typically controls the overall operation of the image processing apparatus 1000. For example, the controller 1200 may control the input/output unit 1300, the sensing unit 1400, the communication unit 1500, the A/V input unit 1600, and the like by executing programs stored in the memory 1100.

Specifically, the controller 1200 may include at least one processor. Depending on its functions and roles, the controller 1200 may include a plurality of processors or a single integrated processor.

By executing computer-executable instructions stored in the memory 1100, at least one processor constituting the controller 1200 may estimate the user's region of interest in an acquired live view image and focus on the estimated region of interest. The estimation is based on a data recognition model that corresponds to a predetermined condition, from among the data recognition models, and follows the criterion that the data recognition model has learned for determining whether a region corresponds to interest information.

When the first condition is satisfied, at least one processor constituting the controller 1200 may estimate a salient region as the user's region of interest, according to the criterion that the first data recognition model has learned for determining whether a region is a salient (saliency) region. When the second condition is satisfied, the at least one processor may estimate the region corresponding to personalized interest information as the user's region of interest, according to the criterion that the second data recognition model has learned for determining whether a region corresponds to the personalized interest information.

The trained data recognition model may be stored in a server external to the image processing apparatus 1000 and may be received from the server at the request of the image processing apparatus 1000.

The input/output unit 1300 may include a user input unit 1310 and an output unit 1320. In the input/output unit 1300, the user input unit 1310 and the output unit 1320 may be separate, or may be integrated into a single unit such as a touch screen.

The input/output unit 1300 may display a live view image focused on the estimated region of interest.

The user input unit 1310 may refer to a means by which the user inputs data for controlling the image processing apparatus 1000. The user input unit 1310 may receive an abbreviation from the user and may receive the user's selection of a sentence corresponding to the abbreviation.

The user input unit 1310 may be a key pad 1312, a touch panel 1314 (of a contact capacitive type, a pressure resistive film type, an infrared sensing type, a surface ultrasonic conduction type, an integral tension measurement type, a piezo effect type, or the like), a pen recognition panel 1316, or the like. The user input unit 1310 may also include a jog wheel, a jog switch, and the like, but is not limited thereto.

The output unit 1320 may output the result of executing an application in the image processing apparatus 1000. The output unit 1320 may output the operation result of the image processing apparatus 1000 and, when there is a user input, may output the result as changed according to the user input.

The output unit 1320 may output an audio signal, a video signal, or a vibration signal, and may include a display unit 1322, a sound output unit 1324, and a vibration motor 1326.

The display unit 1322 displays information processed by the image processing apparatus 1000. For example, the display unit 1322 may display an execution screen of a camera application or a user interface for receiving the user's manipulation.

Meanwhile, when the display unit 1322 and a touch pad form a layered structure to constitute a touch screen, the display unit 1322 may be used as an input device as well as an output device. The display unit 1322 may include at least one of a liquid crystal display, a thin film transistor-liquid crystal display, an organic light-emitting diode display, a flexible display, a three-dimensional (3D) display, and an electrophoretic display. Depending on the implementation of the image processing apparatus 1000, the image processing apparatus 1000 may include two or more display units 1322. In this case, the two or more display units 1322 may be arranged to face each other by means of a hinge.

The sound output unit 1324 outputs audio data received from the communication unit 1500 or stored in the memory 1100. The sound output unit 1324 also outputs sound signals related to functions performed by the image processing apparatus 1000 (for example, a call signal reception sound, a message reception sound, and a notification sound). The sound output unit 1324 may include a speaker, a buzzer, and the like.

The vibration motor 1326 may output a vibration signal. For example, the vibration motor 1326 may output a vibration signal corresponding to the output of audio data or video data (for example, a call signal reception sound or a message reception sound). The vibration motor 1326 may also output a vibration signal when a touch is input to the touch screen.

The sensing unit 1400 may detect a state of the image processing apparatus 1000 or a state of its surroundings, and may transmit the detected information to the controller 1200.

The sensing unit 1400 may include, but is not limited to, at least one of a magnetic sensor 1410, an acceleration sensor 1420, a temperature/humidity sensor 1430, an infrared sensor 1440, a gyroscope sensor 1450, a position sensor (for example, GPS) 1460, a barometric pressure sensor 1470, a proximity sensor 1480, and an RGB sensor (illuminance sensor) 1490. Because a person skilled in the art can intuitively infer the function of each sensor from its name, a detailed description is omitted.

The communication unit 1500 may include one or more components that enable communication between the image processing apparatus 1000 and another device or a server. For example, the communication unit 1500 may include a short-range communication unit 1510, a mobile communication unit 1520, and a broadcast receiving unit 1530.

The short-range wireless communication unit 1510 may include, but is not limited to, a Bluetooth communication unit, a Bluetooth Low Energy (BLE) communication unit, a Near Field Communication (NFC) unit, a WLAN (Wi-Fi) communication unit, a Zigbee communication unit, an Infrared Data Association (IrDA) communication unit, a Wi-Fi Direct (WFD) communication unit, an ultra wideband (UWB) communication unit, and an Ant+ communication unit.

The mobile communication unit 1520 transmits and receives radio signals to and from at least one of a base station, an external terminal, and a server on a mobile communication network. Here, the radio signal may include a voice call signal, a video call signal, or various types of data associated with transmitting and receiving text/multimedia messages.

The broadcast receiving unit 1530 receives a broadcast signal and/or broadcast-related information from the outside through a broadcast channel. The broadcast channel may include a satellite channel and a terrestrial channel. Depending on the implementation, the image processing apparatus 1000 may not include the broadcast receiving unit 1530.

The communication unit 1500 may also communicate with other devices, servers, peripheral devices, and the like in order to transmit, receive, or upload content.

The A/V (Audio/Video) input unit 1600 is for inputting an audio signal or a video signal, and may include a photographing unit 1610, a microphone 1620, and the like. The photographing unit 1610 may obtain image frames, such as still images or moving images, through an image sensor in a video call mode or a photographing mode. An image captured through the image sensor may be processed by the controller 1200 or by a separate image processing unit (not shown).

An image frame processed by the photographing unit 1610 may be stored in the memory 1100 or transmitted to the outside through the communication unit 1500. Two or more photographing units 1610 may be provided, depending on the configuration of the terminal.

The microphone 1620 receives an external sound signal and processes it into electrical voice data. For example, the microphone 1620 may receive a sound signal from an external device or a speaker. The microphone 1620 may use various noise removal algorithms to remove noise generated while receiving the external sound signal.

FIG. 12 is a flowchart illustrating an image processing method according to an embodiment.

In operation 1210, the image processing apparatus 1000 acquires a live view image including at least one subject.

In operation 1220, the image processing apparatus 1000 estimates the user's region of interest in the acquired live view image. The estimation is based on the data recognition model that corresponds to a predetermined condition among a plurality of data recognition models, and follows a criterion for determining whether a region corresponds to the interest information learned by that data recognition model.

When a first condition is satisfied, the image processing apparatus 1000 may estimate a saliency region as the user's region of interest, according to a criterion for determining whether a region corresponds to the saliency region learned by a first data recognition model. When a second condition is satisfied, it may estimate a region corresponding to personalized interest information as the user's region of interest, according to a criterion for determining whether a region corresponds to the personalized interest information learned by a second data recognition model. The saliency region is determined by a predetermined criterion regarding the area that a subject occupies in the image or the color distribution of the image. The personalized interest information may be determined by predetermined statistics on the user's images stored in the image processing apparatus 1000 on which the image processing method is performed. Here, the second condition is the case in which the number of images stored in the image processing apparatus 1000 is greater than a predetermined number and the reliability of the personalized interest information satisfies a predetermined condition; the first condition is the case in which the second condition is not satisfied.
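The choice between the first and second conditions can be sketched in Python as follows. This is only an illustration under stated assumptions: the `DeviceState` fields, the two threshold constants, and the function names are hypothetical, since the description speaks only of a "predetermined" number and a "predetermined" reliability condition.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    stored_image_count: int     # number of user images stored on the apparatus
    interest_reliability: float  # reliability of the personalized interest information


# Hypothetical thresholds; the description only calls them "predetermined".
MIN_IMAGE_COUNT = 100
MIN_RELIABILITY = 0.8


def satisfies_second_condition(state: DeviceState) -> bool:
    """Second condition: enough stored images AND sufficiently reliable interest information."""
    return (state.stored_image_count > MIN_IMAGE_COUNT
            and state.interest_reliability >= MIN_RELIABILITY)


def select_model(state: DeviceState) -> str:
    """Return which data recognition model to use for this device state."""
    if satisfies_second_condition(state):
        return "second"  # personalized interest information
    # The first condition is defined as "second condition not satisfied".
    return "first"       # saliency region
```

Because the first condition is defined as the complement of the second, a single predicate is enough: a device with few stored images, or with unreliable interest statistics, falls back to the saliency-based model.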

Meanwhile, the image processing apparatus 1000 may communicate with an external server 2000 that classifies the user's images according to a predetermined criterion and trains a third data recognition model with the classified images, thereby obtaining a criterion for determining whether a region corresponds to personalized interest information. The image processing apparatus 1000 may receive from the external server 2000 the criterion that the third data recognition model has learned, using the user's images stored in the external server 2000, for determining whether a region corresponds to the personalized interest information. According to the received criterion, a data recognition model provided in the image processing apparatus 1000 may estimate a region corresponding to the personalized interest information as the user's region of interest.

Based on the priority of the interest information, the image processing apparatus 1000 may estimate a region corresponding to high-priority interest information as the user's region of interest. When a plurality of regions of interest are estimated, the image processing apparatus 1000 may perform multi-focusing, in which all of the regions of interest are brought into focus. Alternatively, when a plurality of regions of interest are estimated, the image processing apparatus 1000 may focus on the region of interest that the user selects from among them.
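The three behaviors in this paragraph (highest priority, multi-focus, and user selection) might be organized as in the sketch below. The `Roi` type, the convention that a smaller `priority` number means a higher priority, and the `mode` strings are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Roi:
    label: str                      # illustrative label, e.g. "person" or "car"
    box: Tuple[int, int, int, int]  # (x, y, width, height) in pixels
    priority: int                   # assumption: smaller number = higher priority


def focus_targets(rois: List[Roi], mode: str = "priority",
                  user_choice: Optional[int] = None) -> List[Roi]:
    """Return the regions to focus on among several estimated regions of interest."""
    if not rois:
        return []
    if mode == "multi":
        # Multi-focus: focus on every estimated region.
        return list(rois)
    if mode == "user" and user_choice is not None:
        # Focus on the region the user picked (by index into the list).
        return [rois[user_choice]]
    # Default: the region whose interest information has the highest priority.
    return [min(rois, key=lambda r: r.priority)]
```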

In operation 1230, the image processing apparatus 1000 focuses on the estimated region of interest.

In operation 1240, the image processing apparatus 1000 displays the live view image focused on the estimated region of interest. In response to a photographing command from the user, the image processing apparatus 1000 may capture an image with the focus on the estimated region of interest.

FIG. 13 is a flowchart illustrating how a region of interest is estimated when an image processing apparatus according to an embodiment includes a first processor and a second processor.

According to an embodiment, the image processing apparatus 1000 may include a first processor 1200a and a second processor 1200b.

In operation S1310, the first processor 1200a may acquire a live view image. The live view image may include, for example, at least one subject.

In operation S1320, the first processor 1200a may determine the data recognition model corresponding to a predetermined condition.

For example, when the image processing apparatus 1000 satisfies the second condition described above with reference to FIG. 3, the first processor 1200a may determine to use the second data recognition model or the fourth data recognition model. According to an embodiment, the first processor 1200a may select either the second data recognition model or the fourth data recognition model according to a default value set by the manufacturer of the image processing apparatus 1000 or according to the user's selection.

When the image processing apparatus 1000 does not satisfy the second condition, the first processor 1200a may determine that the first condition is satisfied and determine to use the first data recognition model.

In operation S1330, the first processor 1200a may request the second processor 1200b to estimate the region of interest using the determined data recognition model.

In operation S1340, the second processor 1200b may estimate the region of interest using the determined data recognition model.

For example, when the first data recognition model is determined, the second processor 1200b may estimate a saliency region as the user's region of interest, according to a criterion for determining whether a region corresponds to the saliency region learned by the first data recognition model. When the second data recognition model is determined, the second processor 1200b may estimate a region corresponding to personalized interest information as the user's region of interest, according to a criterion for determining whether a region corresponds to the personalized interest information learned by the second data recognition model.

When the fourth data recognition model is determined, the second processor 1200b may use the fourth data recognition model to estimate a region corresponding to personalized interest information as the user's region of interest.

In operation S1350, the second processor 1200b may transmit the estimated region of interest to the first processor 1200a.

In operation S1360, the first processor 1200a may focus on the estimated region of interest.

In operation S1370, the first processor 1200a may display the live view image focused on the estimated region of interest.

FIG. 14 is a flowchart illustrating how a region of interest is estimated when an image processing apparatus according to an embodiment includes a first processor, a second processor, and a third processor.

According to an embodiment, the image processing apparatus 1000 may include a first processor 1200a, a second processor 1200b, and a third processor 1200c. For example, the second processor 1200b may estimate the region of interest using the first data recognition model. The third processor 1200c may estimate the region of interest using the second data recognition model or the fourth data recognition model.

In operation S1410, the first processor 1200a may acquire a live view image. The live view image may include, for example, at least one subject.

In operation S1420, the first processor 1200a may determine the data recognition model corresponding to a predetermined condition.

For example, when the image processing apparatus 1000 satisfies the second condition described above with reference to FIG. 3, the first processor 1200a may determine to use the second data recognition model or the fourth data recognition model.

According to an embodiment, the first processor 1200a may set the fourth data recognition model as the default data recognition model for the second condition in order to speed up detection of the user's region of interest. However, the embodiment is not limited thereto. For example, the first processor 1200a may select either the second data recognition model or the fourth data recognition model according to a default value set by the manufacturer of the image processing apparatus 1000 or according to the user's selection.

Referring to operation S1430, when the first data recognition model is determined, the first processor 1200a may request the second processor 1200b to estimate the region of interest. When the second data recognition model or the fourth data recognition model is determined, the first processor 1200a may request the third processor 1200c to estimate the region of interest.

In operation S1440, the third processor 1200c may estimate the region of interest using the fourth data recognition model.

In operation S1450, the third processor 1200c may check whether estimation of the region of interest using the fourth data recognition model has been completed.

In operation S1470, when it is confirmed that the region of interest has been estimated using the fourth data recognition model, the third processor 1200c may transmit the estimated region of interest to the first processor 1200a.

In operation S1460, the second processor 1200b may estimate the region of interest using the first data recognition model. In addition, when it is confirmed in operation S1450 that the third processor 1200c has failed to estimate the region of interest using the fourth data recognition model, the second processor 1200b may estimate the region of interest using the first data recognition model.

In operation S1475, the second processor 1200b may transmit the region of interest estimated using the first data recognition model to the first processor 1200a.

In operation S1480, the first processor 1200a may focus on the estimated region of interest.

In operation S1490, the first processor 1200a may display the live view image focused on the estimated region of interest.
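The routing and fallback of FIG. 14 can be summarized in a short Python sketch. It is an illustration only: the processors are modeled as plain function calls, the estimators are hypothetical callables that return `None` when estimation fails, and the string names for the models are assumptions made for the example.

```python
from typing import Callable, Optional, Tuple

Roi = Tuple[int, int, int, int]                # (x, y, width, height)
Estimator = Callable[[object], Optional[Roi]]  # returns None when estimation fails


def estimate_roi_with_fallback(frame: object, chosen_model: str,
                               first_model: Estimator,
                               fourth_model: Estimator) -> Tuple[Optional[Roi], str]:
    """Return (region of interest, name of the model that produced it)."""
    if chosen_model == "fourth":
        roi = fourth_model(frame)        # S1440: third processor, fourth model
        if roi is not None:              # S1450: was the estimation completed?
            return roi, "fourth"         # S1470: send the result to the first processor
    # S1460, S1475: second processor estimates with the first (saliency) model,
    # either because it was chosen directly or because the fourth model failed.
    return first_model(frame), "first"
```

Returning the model name alongside the region makes it easy to see which path was taken when the personalized model finds nothing in the frame.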

FIG. 15 is a flowchart illustrating another situation in which a region of interest is estimated when an electronic device according to an embodiment includes a first processor, a second processor, and a third processor.

In operation S1510, the first processor 1200a may acquire a live view image. The live view image may include, for example, at least one subject.

In operation S1520, the first processor 1200a may determine the data recognition model corresponding to a predetermined condition.

Referring to operation S1530, when the first data recognition model is determined, the first processor 1200a may request the second processor 1200b to estimate the region of interest. When the second data recognition model or the fourth data recognition model is determined, the first processor 1200a may request the third processor 1200c to estimate the region of interest.

In operation S1540, the second processor 1200b may estimate the region of interest using the first data recognition model.

In operation S1560, the second processor 1200b may transmit the estimated region of interest to the first processor 1200a.

In operation S1550, the third processor 1200c may estimate the region of interest using the fourth data recognition model.

In operation S1570, the third processor 1200c may check whether estimation of the region of interest using the fourth data recognition model has been completed.

In operation S1565, when it is confirmed that the region of interest has been estimated using the fourth data recognition model, the third processor 1200c may transmit the estimated region of interest to the first processor 1200a.

Referring to operation S1580, when it is confirmed that estimation of the region of interest using the fourth data recognition model has failed, the third processor 1200c may estimate the region of interest using the second data recognition model.

For example, the fourth data recognition model may estimate a car-like shape as the region of interest. When none of the objects detected in the live view image has a car-like shape, the third processor 1200c may apply the second data recognition model and estimate another object customized to the user as the region of interest.

In operation S1590, the second processor 1200b may transmit the region of interest estimated using the first data recognition model to the first processor 1200a.

In operation S1593, the first processor 1200a may focus on the estimated region of interest.

In operation S1595, the first processor 1200a may display the live view image focused on the estimated region of interest.
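The car example for the third processor in FIG. 15 can be illustrated with the following Python sketch. It assumes, purely for illustration, that object detection has already produced labeled boxes; the `Detection` type, the label strings, and the two stand-in model functions are hypothetical and are not the data recognition models of the embodiment.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


@dataclass
class Detection:
    label: str  # illustrative object label
    box: Box


def fourth_model(detections: List[Detection]) -> Optional[Box]:
    """Stand-in for the fourth model: looks only for a car-like object."""
    for d in detections:
        if d.label == "car":
            return d.box
    return None  # estimation failed: nothing car-like in the frame


def second_model(detections: List[Detection], user_interests: Set[str]) -> Optional[Box]:
    """Stand-in for the second model: first object matching the user's interests."""
    for d in detections:
        if d.label in user_interests:
            return d.box
    return None


def third_processor_estimate(detections: List[Detection],
                             user_interests: Set[str]) -> Optional[Box]:
    roi = fourth_model(detections)                    # S1550
    if roi is not None:                               # S1570
        return roi                                    # S1565
    return second_model(detections, user_interests)  # S1580
```

For a frame containing only a dog and a tree, with "dog" among the user's interests, the fourth stand-in returns nothing and the second stand-in supplies the dog's box.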

FIG. 16 is a flowchart illustrating how an image processing apparatus according to an embodiment estimates a region of interest using a server.

In this case, an interface for transmitting and receiving data between the image processing apparatus 1000 and the server 2000 may be defined.

For example, an application program interface (API) that takes learning data to be applied to a data recognition model as an argument value (or parameter value or passed value) may be defined. An API may be defined as a set of subroutines or functions that one protocol (for example, a protocol defined in the image processing apparatus 1000) can call for some processing of another protocol (for example, a protocol defined in the server 2000). That is, the API may provide an environment in which an operation of another protocol can be performed from one protocol.

According to an embodiment, the server 2000 may include a third data recognition model.

In operation S1610, the image processing apparatus 1000 may acquire a live view image. The live view image may include, for example, at least one subject.

In operation S1620, the image processing apparatus 1000 may determine the data recognition model corresponding to a predetermined condition.

For example, when the second condition described above with reference to FIG. 3 is satisfied, the image processing apparatus 1000 may determine to use the second data recognition model or the fourth data recognition model. When the second condition is not satisfied, the image processing apparatus 1000 may determine that the first condition is satisfied and determine to use the first data recognition model.

Referring to operation S1630, the image processing apparatus 1000 may check whether the region of interest has been estimated using the data recognition model.

Referring to operation S1640, when the region of interest has been estimated using the data recognition model included in the image processing apparatus 1000, the image processing apparatus 1000 may focus on the estimated region of interest.

Referring to operation S1650, when the region of interest has not been estimated using the data recognition model included in the image processing apparatus 1000, the image processing apparatus 1000 may request the server 2000 to estimate the region of interest.

In operation S1660, the server 2000 may estimate the region of interest using the third data recognition model.

In operation S1670, the server 2000 may transmit the estimated region of interest to the image processing apparatus 1000.

In operation S1680, the image processing apparatus 1000 may focus on the estimated region of interest.

In operation S1690, the image processing apparatus 1000 may display the live view image focused on the estimated region of interest.
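The device-to-server exchange of FIG. 16 might look like the Python sketch below. Everything in it is an assumption made for illustration: the JSON payload fields (`frame_id`, `pixels`, `roi`), the function names, and the in-process `server_handle` function that stands in for the server 2000. The placeholder "third model" simply returns the bounds of the whole frame, and no real network call is made.

```python
import json
from typing import Callable, List, Optional, Tuple

Roi = Tuple[int, int, int, int]  # (x, y, width, height)
Pixels = List[List[int]]         # toy grayscale frame: rows of pixel values


def build_request(frame_id: str, pixels: Pixels) -> str:
    """Serialize the data passed as the API argument (hypothetical payload)."""
    return json.dumps({"frame_id": frame_id, "pixels": pixels})


def server_handle(request_json: str) -> str:
    """Stand-in for server 2000 (S1660, S1670) with a placeholder third model."""
    request = json.loads(request_json)
    height = len(request["pixels"])
    width = len(request["pixels"][0]) if height else 0
    return json.dumps({"frame_id": request["frame_id"], "roi": [0, 0, width, height]})


def estimate_roi(frame_id: str, pixels: Pixels,
                 local_model: Callable[[Pixels], Optional[Roi]],
                 send: Callable[[str], str] = server_handle) -> Tuple[Roi, str]:
    roi = local_model(pixels)                  # on-device estimation
    if roi is not None:                        # S1630: estimated on the device?
        return roi, "device"                   # S1640
    response = json.loads(send(build_request(frame_id, pixels)))  # S1650
    return tuple(response["roi"]), "server"    # S1670 result used in S1680
```

Passing the transport as the `send` parameter keeps the sketch testable without a network; a real deployment would replace it with whatever interface is defined between the apparatus and the server.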

Meanwhile, the image processing method described above can be written as a program executable on a computer, and can be implemented on a general-purpose digital computer that runs such a program using a computer-readable storage medium. The computer-readable storage medium may be read-only memory (ROM), random-access memory (RAM), flash memory, CD-ROMs, CD-Rs, CD+Rs, CD-RWs, CD+RWs, DVD-ROMs, DVD-Rs, DVD+Rs, DVD-RWs, DVD+RWs, DVD-RAMs, BD-ROMs, BD-Rs, BD-R LTHs, BD-REs, magnetic tape, a floppy disk, a magneto-optical data storage device, an optical data storage device, a hard disk, a solid-state disk (SSD), or any other device that can store instructions or software, related data, data files, and data structures, and can provide them to a processor or computer so that the processor or computer can execute the instructions.

In addition, the disclosed embodiments may be implemented as a S/W program including instructions stored in a computer-readable storage medium.

The computer is a device capable of calling stored instructions from the storage medium and operating according to the disclosed embodiments in accordance with the called instructions, and may include the image processing apparatus according to the disclosed embodiments.

The computer-readable storage medium may be provided in the form of a non-transitory storage medium. Here, "non-transitory" means only that the storage medium does not include a signal and is tangible; it does not distinguish between data being stored semi-permanently and data being stored temporarily in the storage medium.

In addition, the control method according to the disclosed embodiments may be provided as part of a computer program product. The computer program product may be traded between a seller and a buyer as a commodity.

The computer program product may include a S/W program and a computer-readable storage medium in which the S/W program is stored. For example, the computer program product may include a product in the form of a S/W program (for example, a downloadable app) that is distributed electronically by the manufacturer of the image processing apparatus or through an electronic market (for example, Google Play Store or App Store). For electronic distribution, at least part of the S/W program may be stored in a storage medium or generated temporarily. In this case, the storage medium may be a storage medium of a server of the manufacturer, a server of the electronic market, or a relay server that temporarily stores the S/W program.

In a system including a server and an image processing apparatus, the computer program product may include a storage medium of the server or a storage medium of the image processing apparatus. Alternatively, when there is a third device (e.g., a smartphone) communicatively connected to the server or the image processing apparatus, the computer program product may include a storage medium of the third device. Alternatively, the computer program product may include the S/W program itself, transmitted from the server to the image processing apparatus or the third device, or transmitted from the third device to the image processing apparatus.

In this case, one of the server, the image processing apparatus, and the third device may execute the computer program product to perform the method according to the disclosed embodiments. Alternatively, two or more of the server, the image processing apparatus, and the third device may execute the computer program product to perform the method according to the disclosed embodiments in a distributed manner.

For example, a server (e.g., a cloud server or an artificial intelligence server) may execute a computer program product stored in the server to control an image processing apparatus communicatively connected to the server to perform the method according to the disclosed embodiments.

As another example, the third device may execute the computer program product to control an image processing apparatus communicatively connected to the third device to perform the method according to the disclosed embodiments. When the third device executes the computer program product, the third device may download the computer program product from the server and execute the downloaded computer program product. Alternatively, the third device may execute a computer program product provided in a preloaded state to perform the method according to the disclosed embodiments.

The description so far has focused on the embodiments. Those of ordinary skill in the art to which the disclosed embodiments pertain will understand that the disclosed embodiments can be implemented in modified forms without departing from their essential characteristics. The disclosed embodiments should therefore be considered in an illustrative rather than a restrictive sense. The scope of the invention is indicated in the claims rather than in the foregoing description of the embodiments, and all differences within an equivalent range should be construed as being included in the scope of the invention.

Claims

An image processing apparatus comprising:
a photographing unit configured to acquire a live view image including at least one subject;
a memory storing computer-executable instructions;
at least one processor configured, by executing the computer-executable instructions, to estimate a user's region of interest from the acquired live view image, based on a data recognition model that meets a predetermined condition among a plurality of data recognition models and according to a criterion for determining correspondence to interest information learned by that data recognition model, and to focus on the estimated region of interest; and
an input/output unit configured to display the live view image focused on the estimated region of interest,
wherein the at least one processor is configured to:
when a first condition is satisfied, estimate a saliency region as the user's region of interest according to a criterion for determining correspondence to the saliency region learned by a first data recognition model; and when a second condition is satisfied, estimate a region corresponding to personalized interest information as the user's region of interest according to a criterion for determining correspondence to the personalized interest information learned by a second data recognition model.

(Deleted)

The image processing apparatus of claim 1,
wherein the saliency region is determined by a predetermined criterion regarding the area occupied by a subject in an image or the color distribution of the image, and the personalized interest information is determined by predetermined statistics on the user's images stored in the image processing apparatus.

The image processing apparatus of claim 1,
wherein the second condition is a case in which the number of images stored in the image processing apparatus is greater than a predetermined number and the reliability of the personalized interest information satisfies a predetermined condition, and the first condition is a case in which the second condition is not satisfied.

The image processing apparatus of claim 1,
further comprising a communication unit configured to receive, from an external server, a criterion for determining correspondence to personalized interest information learned by a third data recognition model using the user's images stored in the external server,
wherein the at least one processor is configured to
estimate a region corresponding to the personalized interest information as the user's region of interest, according to a criterion by which a data recognition model provided in the image processing apparatus determines correspondence to the received personalized interest information.

The image processing apparatus of claim 5,
wherein the external server classifies the user's images according to a predetermined criterion and trains the third data recognition model on the classified images to obtain the criterion for determining correspondence to the personalized interest information.

The image processing apparatus of claim 1,
wherein the at least one processor is configured to
estimate, as the user's region of interest, a region corresponding to the interest information having a higher priority, based on the priority of the interest information.

The image processing apparatus of claim 1,
wherein the at least one processor is configured to,
when a plurality of regions of interest are estimated, perform multi-focusing that focuses on all of the plurality of regions of interest.

The image processing apparatus of claim 1,
wherein the at least one processor is configured to,
when a plurality of regions of interest are estimated, focus on a region of interest selected by the user from among the plurality of regions of interest.

The image processing apparatus of claim 1,
wherein the photographing unit captures an image focused on the estimated region of interest in response to the user's photographing command.

An image processing method comprising:
acquiring a live view image including at least one subject;
estimating a user's region of interest from the acquired live view image, based on a data recognition model that meets a predetermined condition among a plurality of data recognition models and according to a criterion for determining correspondence to interest information learned by that data recognition model;
focusing on the estimated region of interest; and
displaying the live view image focused on the estimated region of interest,
wherein the estimating of the region of interest comprises:
when a first condition is satisfied, estimating a saliency region as the user's region of interest according to a criterion for determining correspondence to the saliency region learned by a first data recognition model; and when a second condition is satisfied, estimating a region corresponding to personalized interest information as the user's region of interest according to a criterion for determining correspondence to the personalized interest information learned by a second data recognition model.

(Deleted)

The image processing method of claim 11,
wherein the saliency region is determined by a predetermined criterion regarding the area occupied by a subject in an image or the color distribution of the image, and the personalized interest information is determined by predetermined statistics on the user's images stored in the image processing apparatus on which the image processing method is performed.

The image processing method of claim 11,
wherein the second condition is a case in which the number of images stored in the image processing apparatus on which the image processing method is performed is greater than a predetermined number and the reliability of the personalized interest information satisfies a predetermined condition, and the first condition is a case in which the second condition is not satisfied.

The image processing method of claim 14,
further comprising receiving, from an external server, a criterion for determining correspondence to personalized interest information learned by a third data recognition model using the user's images stored in the external server,
wherein the estimating of the region of interest comprises
estimating a region corresponding to the personalized interest information as the user's region of interest, according to a criterion by which a data recognition model provided in the image processing apparatus on which the image processing method is performed determines correspondence to the received personalized interest information.

The image processing method of claim 15,
wherein the external server classifies the user's images according to a predetermined criterion and trains the third data recognition model on the classified images to obtain the criterion for determining correspondence to the personalized interest information.

The image processing method of claim 11,
wherein the estimating of the region of interest comprises
estimating, as the user's region of interest, a region corresponding to the interest information having a higher priority, based on the priority of the interest information.

The image processing method of claim 11,
wherein the focusing comprises,
when a plurality of regions of interest are estimated, performing multi-focusing that focuses on all of the plurality of regions of interest.

The image processing method of claim 11,
wherein the focusing comprises,
when a plurality of regions of interest are estimated, focusing on a region of interest selected by the user from among the plurality of regions of interest.

A computer-readable recording medium storing instructions configured to perform an image processing method, the image processing method comprising:
acquiring a live view image including at least one subject;
estimating a user's region of interest from the acquired live view image, based on a data recognition model that meets a predetermined condition among a plurality of data recognition models and according to a criterion for determining correspondence to interest information learned by that data recognition model;
focusing on the estimated region of interest; and
displaying the live view image focused on the estimated region of interest,
wherein the estimating of the region of interest comprises:
when a first condition is satisfied, estimating a saliency region as the user's region of interest according to a criterion for determining correspondence to the saliency region learned by a first data recognition model; and when a second condition is satisfied, estimating a region corresponding to personalized interest information as the user's region of interest according to a criterion for determining correspondence to the personalized interest information learned by a second data recognition model.
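
The choice between the first and second data recognition models recited in the claims above can be illustrated with a minimal, non-limiting sketch. It assumes that the second condition is the conjunction of the two tests (stored-image count and reliability of the personalized interest information) and that the first condition is simply the case where the second condition is not satisfied. The names `DeviceState`, `second_condition`, and `select_model`, the reliability score in the range 0 to 1, and both threshold values are hypothetical; the claims state only that the thresholds are "predetermined".

```python
from dataclasses import dataclass
from enum import Enum


class Model(Enum):
    """Which data recognition model supplies the region-of-interest criterion."""
    SALIENCY = "first data recognition model (saliency region)"
    PERSONALIZED = "second data recognition model (personalized interest information)"


@dataclass
class DeviceState:
    stored_image_count: int          # number of user images stored on the apparatus
    personalized_reliability: float  # hypothetical reliability score in [0, 1]


# Hypothetical values; the claims only call these "predetermined".
MIN_STORED_IMAGES = 500
MIN_RELIABILITY = 0.8


def second_condition(state: DeviceState) -> bool:
    """True when enough images are stored AND the personalized info is reliable."""
    return (state.stored_image_count > MIN_STORED_IMAGES
            and state.personalized_reliability >= MIN_RELIABILITY)


def select_model(state: DeviceState) -> Model:
    """Pick the personalized model under the second condition, else saliency."""
    # The first condition is treated as "second condition not satisfied".
    return Model.PERSONALIZED if second_condition(state) else Model.SALIENCY
```

Under these assumptions, a newly purchased apparatus with few stored images would fall back to the saliency-based model, and would switch to the personalized model only once both tests pass.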