KR20210052153A - Electronic apparatus and method for controlling thereof

Info

Publication number: KR20210052153A
Application number: KR1020200036344A
Authority: KR
Inventors: Juan Perez-Rua; Tao Xiang; Timothy Hospedales; Xiatian Zhu
Original assignee: Samsung Electronics Co., Ltd.
Priority date: 2019-10-29
Filing date: 2020-03-25
Publication date: 2021-05-10
Also published as: GB201915637D0; GB2588614B; GB2588614A

Abstract

Disclosed is a method of controlling an electronic device. The control method includes: obtaining a neural network model trained to detect an object corresponding to at least one class; obtaining a user command for detecting a first object corresponding to a first class; and, if the first object does not correspond to the at least one class, obtaining a new neural network model based on the neural network model and information on the first object. The present disclosure provides an electronic device capable of obtaining a new neural network model by adding a new class corresponding to a user command to the neural network model, and of recognizing an object corresponding to the new class based on the obtained neural network model.

Description

Electronic device and its control method {ELECTRONIC APPARATUS AND METHOD FOR CONTROLLING THEREOF}

The present disclosure relates to an electronic device that customizes a neural network model and a control method thereof, and more particularly, to an electronic device that obtains a new neural network model by adding a new class according to a user command, and a control method thereof.

Conventional artificial intelligence (AI)-based object recognition models are trained offline on data for predetermined categories or classes and, once training is complete, are applied to devices such as smartphones, robots/robotic devices, or other image and/or voice recognition systems.

Once a trained object recognition model has been applied to a device, however, it is difficult to change the model, for example by adding a new recognizable class (or category) to it. Adding a new class to the object recognition model requires many samples of the new class and cloud computing to retrain the model, which is not practical in terms of time and cost.

Recently, as consumer demand for customizing object recognition models applied to smartphones and the like has grown, a need has emerged for a technology that customizes an object recognition model that has already been trained and applied to a product.

One technical problem to be solved by the present invention is to provide an electronic device capable of obtaining a new neural network model by adding a new class corresponding to a user command to a neural network model, and of recognizing an object corresponding to the new class based on the obtained neural network model.

The technical problems of the present invention are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood by those skilled in the art from the following description.

According to an exemplary embodiment of the present disclosure for solving the above technical problem, there may be provided a method for controlling an electronic device, the method comprising: obtaining a neural network model trained to detect an object corresponding to at least one class; obtaining a user command for detecting a first object corresponding to a first class; and, if the first object does not correspond to the at least one class, obtaining a new neural network model based on the neural network model and information on the first object.

According to another exemplary embodiment of the present disclosure for solving the above technical problem, there may be provided an electronic device comprising: a memory including at least one instruction; and a processor, wherein the processor obtains a neural network model trained to detect an object corresponding to at least one class, obtains a user command for detecting a first object corresponding to a first class, and, if the first object does not correspond to the at least one class, obtains a new neural network model based on the neural network model and information on the first object.

The means for solving the problems of the present disclosure are not limited to those described above, and means not mentioned will be clearly understood by those of ordinary skill in the art to which the present disclosure pertains from the present specification and the accompanying drawings.

According to the various embodiments of the present disclosure described above, the electronic device may recognize an object corresponding to a new class added to the neural network model according to a user command.

In addition, effects that can be obtained or predicted from the embodiments of the present disclosure will be disclosed directly or implicitly in the detailed description of the embodiments. For example, various effects predicted according to an embodiment of the present disclosure will be disclosed in the detailed description below.

FIG. 1 is a diagram for describing an electronic device according to an embodiment of the present disclosure.
FIG. 2 is a flowchart illustrating a method of controlling an electronic device according to an embodiment of the present disclosure.
FIG. 3A is a diagram illustrating a conventional object recognition model using few-shot learning.
FIG. 3B is a diagram illustrating a method of training a conventional object recognition model.
FIG. 4A is a diagram illustrating a method of obtaining a new neural network model according to an embodiment of the present disclosure.
FIG. 4B is a diagram illustrating a method of training a neural network model according to an embodiment of the present disclosure.
FIG. 5A is a block diagram illustrating a configuration of an electronic device according to an embodiment of the present disclosure.
FIG. 5B is a diagram illustrating a method of obtaining a new neural network model according to an embodiment of the present disclosure.
FIG. 6A is a diagram illustrating a method of customizing a neural network model using image samples according to an embodiment of the present disclosure.
FIG. 6B is a diagram illustrating customization of a neural network model using video frame samples according to an embodiment of the present disclosure.
FIG. 7 is a flowchart illustrating a method of controlling an electronic device according to an embodiment of the present disclosure.
FIG. 8 is a flowchart of an example of checking whether a user-requested class already exists in a neural network model.

The terms used in the present specification will be briefly described, and the present disclosure will then be described in detail.

The terms used in the embodiments of the present disclosure are general terms that are currently in wide use, selected in consideration of their functions in the present disclosure, but they may vary according to the intention of those working in the field, precedent, the emergence of new technologies, and the like. In certain cases, terms have been arbitrarily selected by the applicant, and in such cases their meaning will be described in detail in the corresponding part of the description. Therefore, the terms used in the present disclosure should be defined based on their meaning and the overall content of the present disclosure, not simply on their names.

Since the embodiments of the present disclosure may be variously modified and may take several forms, specific embodiments are illustrated in the drawings and described in detail in the detailed description. However, this is not intended to limit the scope to specific embodiments, and the disclosure should be understood to include all modifications, equivalents, and substitutes falling within the disclosed spirit and technical scope. In describing the embodiments, detailed descriptions of related known technologies are omitted where they are judged likely to obscure the subject matter.

Terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another.

Singular expressions include plural expressions unless the context clearly indicates otherwise. In the present application, terms such as "comprise" or "consist of" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present disclosure pertains can easily carry them out. However, the present disclosure may be implemented in various different forms and is not limited to the embodiments described herein. In the drawings, parts irrelevant to the description are omitted in order to describe the present disclosure clearly, and similar reference numerals are assigned to similar parts throughout the specification.

FIG. 1 is a diagram for describing an electronic device according to an embodiment of the present disclosure. The electronic device 100 may acquire an image 30 by capturing the surrounding environment. For example, the electronic device 100 may be a robot device. The electronic device 100 may recognize an object included in the acquired image 30. Specifically, the electronic device 100 may recognize an object included in the image 30 by inputting the acquired image 30 into a neural network model trained to recognize objects included in images. For example, the electronic device 100 may recognize a cup included in the image 30.

Meanwhile, the electronic device 100 may obtain a command from the user 10 to find the cup 20 of the user 10. For the electronic device 100 to recognize the cup 20 of the user 10, it must use a neural network model trained to recognize that cup 20. In general, however, the electronic device 100 distributed to the user 10 uses a neural network model trained for the most common, general purposes (e.g., identifying whether an object is a cup or a book). Accordingly, even if the electronic device 100 can recognize a plurality of cups included in the image 30 using a conventional object recognition model (or neural network model), it cannot recognize the cup 20 of the user 10 among the plurality of cups.

In contrast, when the electronic device 100 according to the present disclosure obtains from the user 10 a command related to the cup 20 of the user 10 (e.g., a command to find the cup 20), which is an object that cannot be identified with the pre-stored neural network model, it may extract features of the cup 20 and change the weight vector values of the pre-stored neural network model based on the extracted features to obtain a new neural network model. The electronic device 100 may then use the new neural network model to perform a function corresponding to the user command (e.g., providing the location of the cup 20 to the user).

Hereinafter, a method of controlling the electronic device 100 to perform such a function will be described.

FIG. 2 is a flowchart illustrating a method of controlling the electronic device 100 according to an embodiment of the present disclosure.

The electronic device 100 may obtain a neural network model trained to detect an object corresponding to at least one class (S210). In this case, the electronic device 100 may obtain the trained neural network model from an external server or an external device. The electronic device 100 may then detect an object corresponding to the at least one class using the obtained neural network model. For example, the electronic device 100 may acquire an image of its surroundings and input the acquired image into the neural network model to obtain type information and location information for an object included in the image. Meanwhile, in the present disclosure, a class is a term for classifying a plurality of objects and may be used interchangeably with "classification" and "category".

Meanwhile, the neural network model may include a feature extraction module (or feature extractor) that extracts a feature value (or feature) of an object, and a classification value acquisition module (or classifier) that obtains a classification value for the object based on the feature value obtained from the feature extraction module. The classification value acquisition module may include a weight vector that includes at least one column vector.

The neural network model may be trained by various learning methods. In particular, it may be trained based on a machine learning method. In this case, the neural network model may be trained based on a loss function that includes a predefined regularization function so that overfitting is prevented. In particular, the neural network model may be trained based on an orthogonality constraint.

The electronic device 100 may obtain a user command for detecting a first object corresponding to a first class (S220). The user command may take various forms. For example, the user command may include a command to store the first class in the electronic device 100 as a new class, or a command to detect the first object corresponding to the first class. The electronic device 100 may also obtain an image of the first object.

When the user command is obtained, the electronic device 100 may determine whether the first object corresponds to at least one previously learned class that the neural network model can identify. Specifically, the electronic device 100 may obtain an image of the first object and input it into the neural network model to obtain a first feature value for the first object. The electronic device 100 may then compare the first feature value with the weight vector of the trained neural network model to determine whether the first object corresponds to the at least one previously learned class.
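
The comparison described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the use of cosine similarity for the comparison, the threshold value 0.8, and the names `cosine_similarity` and `matches_known_class` are all assumptions introduced here, and the feature vectors are made-up toy values.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors; 0.0 if either has zero length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def matches_known_class(feature, class_vectors, threshold=0.8):
    # Hypothetical check: return the best-matching known class name, or None
    # when no class vector is similar enough (threshold is an assumed value).
    best_name, best_score = None, -1.0
    for name, vector in class_vectors.items():
        score = cosine_similarity(feature, vector)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None

# Toy per-class weight vectors standing in for the trained classifier.
known = {"cup": [1.0, 0.0, 0.0], "book": [0.0, 1.0, 0.0]}
print(matches_known_class([0.9, 0.1, 0.0], known))  # close to "cup"
print(matches_known_class([0.0, 0.0, 1.0], known))  # matches no known class
```

When the check returns no class, the device would proceed to the step of obtaining a new neural network model.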

If the first object does not correspond to the at least one class, the electronic device 100 may obtain a new neural network model based on the neural network model and information on the first object (S230), and may detect the first object using the new neural network model. In this case, the electronic device 100 may obtain the new neural network model based on few-shot learning. Few-shot learning refers to learning that uses a smaller amount of training data than general learning. Hereinafter, unless otherwise specified, "the neural network model" refers to the neural network model obtained in step S210, and the neural network model obtained through step S230 is referred to as "the new neural network model".

The electronic device 100 may obtain a first feature value for the first object by inputting an image of the first object into the feature extraction module of the neural network model. The electronic device 100 may then obtain a new classification value acquisition module based on the first feature value and the classification value acquisition module of the neural network model. Specifically, the electronic device 100 may generate a first column vector based on the average of the first feature values and add the first column vector to the weight vector as a new column vector, thereby obtaining the new classification value acquisition module. The electronic device 100 may also regularize the new classification value acquisition module based on a predefined regularization function.
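
The step of averaging the first feature values into a new column vector can be sketched as below. This is an illustration under stated assumptions: the source does not specify the predefined function applied to the new classifier, so the L2 normalization here is only a stand-in, and the function names and sample values are hypothetical.

```python
import math

def mean_vector(vectors):
    # Element-wise average of several feature vectors of equal length.
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]

def l2_normalize(vector):
    # Stand-in for the unspecified predefined function (an assumption).
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0.0 else list(vector)

def add_class(columns, class_names, new_name, sample_features):
    # Build the new column from the few samples and append it; the existing
    # columns are returned unchanged, so earlier classes stay recognizable.
    new_column = l2_normalize(mean_vector(sample_features))
    return columns + [new_column], class_names + [new_name]

base_columns = [[1.0, 0.0], [0.0, 1.0]]   # toy columns for "cup" and "book"
base_names = ["cup", "book"]
features = [[3.0, 4.0], [3.0, 4.0]]       # toy features of the user's object
new_columns, new_names = add_class(base_columns, base_names, "my_cup", features)
print(new_names, new_columns[-1])
```

Because only a new column is appended, no retraining of the feature extractor is needed in this sketch.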

Hereinafter, a method of training the neural network model and a method of obtaining a new neural network model based on the trained neural network model will be described in more detail.

FIG. 3A is a diagram illustrating a conventional object recognition model using few-shot learning. When a new class was added to a previously trained object recognition model, the model could detect only objects corresponding to the new class and could no longer detect objects corresponding to the classes learned before the new class was added. That is, the classifier 302 of the conventional object recognition model included only the weight vector for the new class and did not include the weight vectors for the existing classes. In addition, recognizing an object corresponding to the new class required training the classifier 302 on training data corresponding to the new class.

FIG. 3B is a diagram illustrating a method of training a conventional object recognition model. Referring to FIG. 3B, the object recognition model includes a feature extractor 300 and a classifier 302. The feature extractor 300 extracts features from an input training sample, and the classifier 302 outputs a classification value for the extracted features. The classifier 302 includes a weight vector W. The object recognition model was trained by backpropagation so as to minimize the error calculated from a loss function defined on the output classification value and the labeling data corresponding to the training sample.

FIG. 4A is a diagram illustrating a method of obtaining a new neural network model according to an embodiment of the present disclosure. The electronic device 100 may train a neural network model based on few-shot learning. The neural network model may include a feature extraction module 410 and a classification module (or classification value acquisition module) 420. The classification module 420 may include a weight vector composed of a plurality of column vectors or row vectors. Each column vector of the weight vector may correspond to one class. The classification module 420 may include a base portion 421 and a novel portion 422. The base portion 421 may include vectors corresponding to previously learned classes, and the novel portion 422 may include vectors corresponding to new classes added according to user input. The novel portion 422 may also be referred to as a local portion.
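
One way to picture the classification module with its base portion and novel portion is the small data structure below. It is a sketch only: the class name `ClassificationModule`, its field names, and the toy vectors are hypothetical and not taken from the source.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ClassificationModule:
    # Base portion: one column vector per previously learned class.
    base_names: List[str]
    base_columns: List[List[float]]
    # Novel portion: column vectors for classes added by user input.
    novel_names: List[str] = field(default_factory=list)
    novel_columns: List[List[float]] = field(default_factory=list)

    def add_novel_class(self, name, column):
        # Only the novel portion grows; the base portion is left untouched.
        self.novel_names.append(name)
        self.novel_columns.append(list(column))

    def class_names(self):
        return self.base_names + self.novel_names

    def columns(self):
        # The full weight vector is the base columns followed by novel ones.
        return self.base_columns + self.novel_columns

module = ClassificationModule(["cup", "book"], [[1.0, 0.0], [0.0, 1.0]])
module.add_novel_class("my_cup", [0.6, 0.8])
print(module.class_names())
```

Keeping the two portions separate makes it explicit that adding a class never overwrites the vectors of the previously learned classes.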

The electronic device 100 may input an object 41 that corresponds to a previously learned class (i.e., a class included in the base portion 421) into the feature extraction module 410 to obtain a feature value for the object 41. The electronic device 100 may input the obtained feature value into the classification module 420 to obtain a classification value for the object 41. In this case, the electronic device 100 may compute the dot product between the feature value of the object 41 and the weight vector included in the classification module 420. The neural network model may also output the classification value based on the cosine distance between the feature vector and the weight vector of the classification module 420.
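
Scoring a feature against each column of the weight vector with a cosine measure can be sketched as follows. The function names and numeric values are illustrative assumptions; the source states only that the classification value may be based on a dot product or on cosine distance.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_scores(feature, columns):
    # Cosine similarity between the feature and every class column.
    feature_norm = math.sqrt(dot(feature, feature))
    scores = []
    for column in columns:
        denom = feature_norm * math.sqrt(dot(column, column))
        scores.append(dot(feature, column) / denom if denom > 0.0 else 0.0)
    return scores

def classify(feature, columns, class_names):
    # Pick the class whose column points most nearly in the feature's direction.
    scores = cosine_scores(feature, columns)
    best = max(range(len(scores)), key=scores.__getitem__)
    return class_names[best], scores[best]

# The "cup" column has a larger magnitude, but cosine ignores magnitude.
columns = [[2.0, 0.0], [0.0, 0.5]]
names = ["cup", "book"]
label, score = classify([0.3, 3.0], columns, names)
print(label, round(score, 3))
```

A plain dot product would be influenced by the length of each column, whereas the cosine measure depends only on direction, which is one reason it can suit columns built from averaged features.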

Meanwhile, the electronic device 100 may receive a request from the user for a first object 42 that does not correspond to any previously learned class (i.e., is not included in the base portion 421). In this case, the electronic device 100 may input the first object 42 into the feature extraction module 410 to obtain first feature values 43. The electronic device 100 may obtain, based on the first feature values 43, a new weight vector (or column vector) to be allocated to or stored in the novel portion 422. For example, the electronic device 100 may average the first feature values 43 and store the averaged value in the novel portion 422. The novel portion 422 may thus include a weight vector for the new class corresponding to the first object 42. In this way, the electronic device 100 may obtain a new neural network model by updating the novel portion 422 of the classification module 420 based on the first feature values 43.

Referring again to FIG. 3A, when a conventional neural network model (or an electronic device to which it is applied) was trained to recognize an object of a new class, the classifier 302 was updated to include the weight vector for the new class and no longer included the weight vectors for the previously learned classes. As a result, the conventional neural network model could no longer recognize objects of the previously learned classes. In contrast, the new neural network model according to the present disclosure includes both the base portion 421 and the novel portion 422 even after it has learned to recognize an object of a new class. Accordingly, the electronic device 100 may use the new neural network model to recognize not only the first object 42 but also the previously learned object 41.

FIG. 4B is a diagram illustrating a method of training a neural network model according to an embodiment of the present disclosure.

The neural network model may include a feature extraction module 410 and a classification value acquisition module 420. The feature extraction module 410 is trained to extract a feature (or feature value) from an input training sample. The classification value acquisition module 420 is trained to output a classification value for the training sample based on the feature value output from the feature extraction module 410. The neural network model may be trained based on a loss function defined on an orthogonality score calculated by a predefined function, the classification value, and the labeling data corresponding to the training sample. The neural network model may be trained by backpropagation so as to minimize the error calculated from the loss function. Here, the orthogonality score may be a value obtained by regularizing the weight vector of the classification value acquisition module 420.
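
A loss of the kind described above can be sketched as a classification term plus an orthogonality term. The source does not give the predefined function for the orthogonality score, so the penalty used here (the squared entries of W^T W - I, where the columns of W are the class vectors), the cross-entropy classification term, and the weighting factor 0.1 are all assumptions made for illustration.

```python
import math

def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label_index):
    # Classification term comparing the output with the labeled class.
    return -math.log(softmax(logits)[label_index])

def orthogonality_penalty(columns):
    # Assumed score: sum of squared entries of (W^T W - I). It is zero when
    # the class columns are mutually orthogonal and of unit length.
    penalty = 0.0
    for i, a in enumerate(columns):
        for j, b in enumerate(columns):
            gram = sum(x * y for x, y in zip(a, b))
            target = 1.0 if i == j else 0.0
            penalty += (gram - target) ** 2
    return penalty

def total_loss(feature, columns, label_index, weight=0.1):
    # Logits are dot products of the feature with each class column.
    logits = [sum(x * y for x, y in zip(feature, c)) for c in columns]
    return cross_entropy(logits, label_index) + weight * orthogonality_penalty(columns)

identity = [[1.0, 0.0], [0.0, 1.0]]
loss = total_loss([1.0, 0.0], identity, label_index=0)
print(round(loss, 4))
```

In an actual training loop this scalar would be minimized by backpropagation; the sketch only evaluates it for fixed toy values.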

Meanwhile, since the conventional object recognition model of FIG. 3B is not trained based on an orthogonality score, no regularization was performed on the classifier 302. Consequently, the conventional object recognition model overfit to the previously learned classes and could not recognize objects of a new class. In contrast, because the neural network model according to the present disclosure is regularized based on the orthogonality score, it does not overfit to particular classes. In addition, compatibility between the weight vectors included in the base portion 421 and the weight vectors added to the novel portion 422 may be improved. That is, when the neural network model is trained based on an orthogonality constraint, the user can customize it more easily. The electronic device 100 may also regularize the weight vectors added to the novel portion 422 based on the orthogonality constraint.

FIG. 5A is a block diagram illustrating a configuration of an electronic device according to an embodiment of the present disclosure.

The electronic device 100 may include a camera 110, a microphone 120, a communication interface 130, a memory 140, a display 150, a speaker 160, and a processor 170. For example, the electronic device 100 may be any user device such as a smartphone, a tablet, a laptop, a computer or computing device, a virtual assistant device, a robot or robotic device, a consumer goods/home appliance device (e.g., a smart refrigerator), an Internet of Things device, or an image capturing system/device, but is not limited thereto. Each component is described below.

The camera 110 may acquire an image by capturing the surroundings of the electronic device 100. The camera 110 may also obtain a user command. For example, it may acquire an image of an object presented by the user or a video of the user's gesture. The camera 110 may be implemented as various types of camera. For example, the camera 110 may be either a 2D RGB camera or an IR camera. Alternatively, the camera 110 may be either a 3D Time of Flight (ToF) camera or a stereo camera.

The microphone 120 is a component for receiving the user's voice and may be provided inside the electronic device 100. However, this is only one embodiment, and the microphone 120 may instead be located outside the electronic device 100 and connected to it by wire or wirelessly. In particular, the microphone 120 may receive a user voice input for searching for a specific object.

The communication interface 130 includes at least one circuit and may communicate with various types of external devices. For example, the communication interface 130 may communicate with an external server or a user terminal. The communication interface 130 may communicate with external devices using various communication methods and may perform data communication wirelessly or by wire. When communicating with an external device wirelessly, the communication interface 130 may include at least one of a Wi-Fi communication module, a cellular communication module, a 3G (3rd generation) mobile communication module, a 4G (4th generation) mobile communication module, a 4th generation LTE (Long Term Evolution) communication module, and a 5G (5th generation) mobile communication module. Although the communication interface 130 according to an embodiment of the present disclosure may be implemented as a wireless communication module, this is only one embodiment, and it may also be implemented as a wired communication module (e.g., LAN).

The memory 140 may store an operating system (OS) for controlling the overall operation of the components of the electronic device 100, as well as instructions or data related to those components. To this end, the memory 140 may be implemented as a nonvolatile memory (e.g., a hard disk, a solid state drive (SSD), or a flash memory), a volatile memory, or the like.

For example, the memory 140 may store instructions that, when executed, cause the processor 170 to obtain type information and location information for an object included in an image once the image is acquired from the camera 110. The memory 140 may also store a neural network model for recognizing objects. In particular, the neural network model may be executed by an existing general-purpose processor (e.g., a CPU) or by a separate AI-dedicated processor (e.g., a GPU or an NPU). The memory 140 may further store data for an application through which the user can request that a new class be added to the neural network model.

The display 150 may display various screens. For example, the display 150 may display an application execution screen through which the user can enter a request for a new class. The display 150 may also display an object requested by the user, or a prompt or notification generated by the electronic device 100. The display 150 may be implemented as a touch screen, in which case the processor 170 may obtain the user's touch input through the display 150.

The speaker 160 may be a component that outputs not only various audio data received from outside but also various notification sounds and voice messages. The electronic device 100 may include an audio output device such as the speaker 160, or it may include an output component such as an audio output terminal. In particular, the speaker 160 may provide responses to the user's voice and operation results in voice form.

The processor 170 may control the overall operation of the electronic device 100. For example, the processor 170 may obtain a neural network model trained to detect an object corresponding to at least one preset class. The obtained neural network model may be a model trained to obtain type information for an object included in an image. The neural network model may include a feature extraction module that extracts a feature value for an object, and a classification value acquisition module that obtains a classification value for the object based on the feature value obtained from the feature extraction module.

The processor 170 may obtain a user command for detecting a first object corresponding to a first class. The processor 170 may input an image of the first object into the trained neural network model to obtain a first feature value for the first object, and may compare the first feature value with the weight vectors of the trained neural network model to determine whether the first object corresponds to the at least one class.
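One way to picture this comparison is the sketch below. It assumes, purely for illustration, that each existing class is represented by a single weight vector, that similarity is measured with cosine similarity, and that a fixed threshold decides membership; the disclosure does not specify the comparison metric or any threshold. The names `cosine`, `best_matching_class`, and `threshold` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)


def best_matching_class(feature, class_weights, threshold=0.8):
    """Return the label whose weight vector is most similar to the feature,
    or None when no existing class reaches the (assumed) threshold."""
    best_label, best_sim = None, threshold
    for label, weight in class_weights.items():
        sim = cosine(feature, weight)
        if sim >= best_sim:
            best_label, best_sim = label, sim
    return best_label
```

A result of `None` would correspond to the case where the first object does not belong to any existing class, so a new class would be needed.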

If the first object does not correspond to the at least one class, the processor 170 may obtain a new neural network model based on the neural network model and information on the first object. Specifically, the processor 170 may acquire an image of the first object and input it to the feature extraction module to obtain a first feature value for the first object. The processor 170 may then obtain a new classification value acquisition module based on the first feature value and the existing classification value acquisition module. Here, the classification value acquisition module may include a weight vector made up of a plurality of column vectors. The processor 170 may generate a first column vector based on the average of the first feature values and add the first column vector to the weight vector as a new column vector, thereby obtaining the new classification value acquisition module. The processor 170 may regularize the new classification value acquisition module based on a predefined regularization function.
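The column-vector extension described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the predefined regularization function is not given in the text, so simple L2 normalization of the new column is used here as an assumed stand-in. The names `mean_vector`, `l2_normalize`, and `add_class` are hypothetical.

```python
import math


def mean_vector(features):
    """Element-wise average of the feature values extracted from the samples."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def l2_normalize(vector):
    """Scale a vector to unit length (assumed stand-in for the regularizer)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return list(vector) if norm == 0 else [x / norm for x in vector]


def add_class(weight_columns, features):
    """Return the classifier columns extended by one column for the new class.
    The existing columns are left untouched."""
    new_column = l2_normalize(mean_vector(features))
    return weight_columns + [new_column]
```

Because only one column is appended, the previously trained columns do not need to be retrained.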

The processor 170 may customize the neural network model. The processor 170 may receive a user request for a new class and determine whether the requested class is in fact new and should be added to the neural network model. If the class is determined to be new, the processor 170 may obtain at least one sample representing the new class. Here, the processor 170 may obtain at least one of an image, an audio file, an audio clip, a video, and a video frame as the sample.

The processor 170 may obtain at least one feature extracted from the at least one sample by a neural network model that includes the feature extraction module and the base region of the classification value acquisition module. Here, the processor 170 may transmit the user request and the at least one sample, via the communication interface 130, to an external server that holds the neural network model, and may receive the features of the at least one sample from the external server. The processor 170 may store the at least one extracted feature as the representative of the new class. In doing so, the processor 170 may store the weight vector of the classification value acquisition module corresponding to the new class in the memory 140.

The processor 170 may obtain at least one keyword related to the new class. The processor 170 may determine whether the at least one keyword matches one of a plurality of predefined keywords in the base region of the classification value acquisition module of the neural network model. If the at least one keyword matches one of the predefined keywords, the processor 170 may identify the class corresponding to the matched keyword. The processor 170 may control the display 150 or the speaker 160 to output an example sample corresponding to the identified class together with a suggestion to assign the at least one keyword to the identified class. If the user confirms that the at least one keyword should be assigned to the identified class, the processor 170 may assign the at least one keyword to that class. If, instead, a user input rejecting the assignment is obtained, the processor 170 may add the new class to the neural network model.
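The keyword check and the user's choice can be sketched as a small decision function. This is an assumption-laden illustration: exact, case-insensitive string matching stands in for whatever matching the device performs, and the user's confirmation is modeled as a callback. The names `resolve_new_class_request`, `predefined`, and `confirm` are hypothetical.

```python
def resolve_new_class_request(keywords, predefined, confirm):
    """Decide what to do with a request for a new class.

    keywords:   keywords supplied by the user for the requested class
    predefined: dict mapping a lowercase keyword to an existing class label
    confirm:    callable(label) -> bool, True if the user accepts the
                suggestion to attach the keywords to the existing class

    Returns ("assign", label) or ("add_new", None).
    """
    for keyword in keywords:
        label = predefined.get(keyword.strip().lower())
        if label is not None:
            # A matching class exists: suggest it and follow the user's answer.
            if confirm(label):
                return ("assign", label)
            return ("add_new", None)
    # No keyword matched any existing class.
    return ("add_new", None)
```

For example, a request with the keyword "German Shepherd" would lead to a suggestion to reuse an existing "dog" class, and the new class is added only if the user declines.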

Meanwhile, the electronic device 100 according to the present disclosure may be effective in protecting the user's privacy. This is because the new class is stored on the electronic device 100 rather than being stored in a cloud that other users can use or access, or added to the base part of the classifier. However, the user may wish to share a user-defined new class with the user's other devices (e.g., from a smartphone to a laptop, virtual assistant, robot butler, or smart refrigerator). Accordingly, the processor 170 may share the new class stored in the local region of the classification value acquisition module with an external device. The processor 170 may also share the new class stored in the local region with an external server that holds the base region of the classification value acquisition module. This sharing may be performed automatically. For example, when the neural network model is used as part of a camera application and the model is updated on the user's smartphone, the model may be automatically shared with the user's other devices running the same camera application. Sharing may thus form part of software application synchronization across multiple devices.
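Sharing only the local region could look like the sketch below. The data layout and transport are assumptions made for illustration: the local region is modeled as a dictionary from class label to keywords and a weight vector, and JSON is chosen as the exchange format. The disclosure does not specify either. The names `export_local_region` and `import_local_region` are hypothetical.

```python
import json


def export_local_region(local_region):
    """Serialize the user-defined classes (not the base region) for sharing."""
    return json.dumps(local_region, sort_keys=True)


def import_local_region(payload, existing=None):
    """Merge shared classes into another device's local region.
    Classes in the payload replace local classes with the same label."""
    merged = dict(existing or {})
    merged.update(json.loads(payload))
    return merged
```

Because the payload contains only the user's own classes, the base region on each device stays unchanged.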

In particular, the artificial intelligence functions according to the present disclosure are operated through the processor 170 and the memory 140. The processor 170 may consist of one or more processors. The one or more processors may be a general-purpose processor such as a CPU, an AP, or a DSP (Digital Signal Processor), a graphics-dedicated processor such as a GPU or a VPU (Vision Processing Unit), or an AI-dedicated processor such as an NPU. The one or more processors control the processing of input data according to a predefined operation rule or an artificial intelligence model stored in the memory 140. Alternatively, when the one or more processors are AI-dedicated processors, they may be designed with a hardware structure specialized for processing a specific artificial intelligence model.

The predefined operation rule or artificial intelligence model is characterized by being created through learning. Here, being created through learning means that a basic artificial intelligence model is trained on a large amount of training data by a learning algorithm, so that a predefined operation rule or artificial intelligence model configured to perform a desired characteristic (or purpose) is produced. Such learning may be carried out on the device on which the artificial intelligence according to the present disclosure runs, or through a separate server and/or system. Examples of learning algorithms include supervised learning, unsupervised learning, semi-supervised learning, generative adversarial networks, and reinforcement learning, but are not limited to these.

The artificial intelligence model may consist of a plurality of neural network layers. Each of the layers has a plurality of weight values and performs its neural network operation by combining the output of the previous layer with those weights. The weights of the layers may be optimized as a result of training the artificial intelligence model. For example, the weights may be updated so that a loss value or cost value obtained from the artificial intelligence model during training is reduced or minimized. The artificial neural network may include a deep neural network (DNN), for example a CNN (Convolutional Neural Network), an RNN (Recurrent Neural Network), an RBM (Restricted Boltzmann Machine), a DBN (Deep Belief Network), a BRDNN (Bidirectional Recurrent Deep Neural Network), or a deep Q-network, but is not limited to these examples.

Meanwhile, the electronic device 100 according to the present disclosure allows the neural network model to be customized efficiently in terms of time, resources, and cost while preserving the model's accuracy. This is achieved by extending the classification value acquisition module of the neural network model locally on the electronic device 100. In other words, the electronic device 100 does not modify the previously trained classification value acquisition module as a whole. This means that the neural network model does not need to be retrained from scratch and can be updated quickly. It also means that expensive cloud computing is not required to update or customize the neural network model.

Meanwhile, the electronic device 100 may obtain the neural network model from an external server.

FIG. 5B is a diagram illustrating a method of obtaining a new neural network model according to an embodiment of the present disclosure. The electronic device 100 may obtain, from an external server 500, a neural network model 180 that includes a feature extraction module 181 and a classification value acquisition module 182. The classification value acquisition module 182 may include a base region 182-1. The electronic device 100 may derive a new classification value acquisition module from the classification value acquisition module 182 obtained from the external server 500. Specifically, the electronic device 100 may average the feature values for objects corresponding to a new class and store the result in a local region 182-2. The electronic device 100 can thereby detect objects corresponding to the new class. The operation of obtaining feature values for objects corresponding to the new class may instead be performed by the external server 500. In this case, the electronic device 100 obtains information on an object corresponding to the new class from the user, transmits it to the external server 500, and receives the information on that object extracted by the external server 500.
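The split between a base region and a local region can be sketched as a small class. This is an illustration under assumptions, not the disclosed implementation: each class is represented by one weight vector, classification uses a plain dot product, and the averaged feature is stored as is. The class name `Classifier` and its methods are hypothetical.

```python
class Classifier:
    """Classifier with a fixed base region and a user-extensible local region."""

    def __init__(self, base):
        # Base region: label -> weight vector as received from the server.
        # It is copied once and never modified on the device.
        self._base = dict(base)
        # Local region: label -> weight vector for user-defined classes.
        self.local = {}

    def add_local_class(self, label, features):
        """Average the feature values of the samples and store the result
        in the local region as the weight vector of the new class."""
        n = len(features)
        self.local[label] = [sum(column) / n for column in zip(*features)]

    def classify(self, feature):
        """Return the label with the highest dot-product score over both regions."""
        candidates = {**self._base, **self.local}
        return max(
            candidates,
            key=lambda label: sum(w * x for w, x in zip(candidates[label], feature)),
        )
```

A raw dot product favors weight vectors with a larger norm, so a practical version would likely normalize or regularize the stored vectors; that step is omitted here to keep the sketch short.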

FIG. 6A is a diagram illustrating a method of customizing a neural network model using image samples according to an embodiment of the present disclosure. In this embodiment, the user wants to find photos of the user's own dog on the electronic device 100, which may be a smartphone. However, the electronic device 100 may store images of many other dogs besides the user's dog. The image gallery application of the electronic device 100 can locate all photos containing a dog using text-based keyword search, but it may be unable to locate the images containing the user's dog specifically, because no keyword or class corresponds to the user's dog.

In step S600, the electronic device 100 may obtain a user command selecting the image gallery application installed on the electronic device 100. The user may then open the settings of the image gallery application. The electronic device 100 may display a screen for "Add new search category", and the user may add a new search category on that screen. In response, the electronic device 100 may perform an operation to customize the neural network model. The electronic device 100 may prompt the user to enter a request for a new class and to enter keywords for the new category. In this case, in step S602, the user may enter the keywords "German Shepherd", "my dog", and "Laika" (the dog's name) for the new category.

In step S604, the user may capture an image of the dog with the camera or add photos of the dog from the image gallery to the application. The electronic device 100 may then store the new category entered by the user. Thereafter, when the user enters the keyword "my dog" in the search function of the image gallery application, the electronic device 100 may display the images of the user's dog (S606).

The user may also use the settings of the image gallery application to delete a category that is no longer needed by deleting the keywords associated with that category. As a result, the classifier weight vector associated with those keywords may be deleted from the local part of the classifier.
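Deleting a category in this way can be sketched as removing entries from the local part only. The layout of the local part (a dictionary from class label to keywords and a weight vector) is an assumption made for illustration, and `delete_category` is a hypothetical name.

```python
def delete_category(local_region, keyword):
    """Remove every user-defined class associated with the keyword from the
    local part of the classifier and return the labels that were removed."""
    removed = [
        label
        for label, entry in local_region.items()
        if keyword in entry["keywords"]
    ]
    for label in removed:
        del local_region[label]  # drops the weight vector with its keywords
    return removed
```

Since only the local part is touched, the classes in the base part remain available after the deletion.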

FIG. 6B is a diagram illustrating customization of a neural network model using video frame samples according to an embodiment of the present disclosure. In this embodiment, the user may dislike the default palm gesture (as in S610) used on the user's smartphone to trigger the camera when taking a selfie, and may wish to register a new gesture for "take selfie". In step S612, the user opens the camera settings and selects "Set new selfie gesture". The user then starts recording a user-defined gesture of moving the head from left to right (S614). In step S616, the user may confirm the selection. The user can then use the new gesture to trigger the selfie capture operation (S618).

As described above, it may be desirable to register a new class in the local part of the classifier of the machine learning model only when the base part of the classifier (or the local part, if one already exists) does not already contain a similar or substantially identical class. If a similar or substantially identical class already exists in the classifier, the user may be notified of its existence and offered a suggestion to link the keywords to the existing class. The user may accept or reject the suggestion; if the user rejects it, the process of adding the user-provided class to the model continues.

FIG. 7 is a flowchart illustrating a method of controlling an electronic device according to an embodiment of the present disclosure.

The electronic device 100 may obtain a user command for a first class (S710). The user may make this request in any suitable way, for example through an installed application that allows the user to interact with the neural network model. The application may be, for example, an application associated with the camera of the electronic device 100, or an application used to collate images and videos captured by the camera.

The electronic device 100 may determine whether the first class is a new class that should be added to the neural network model already stored on the electronic device 100 (S720). This determination may be made to avoid duplicate classes in the model, which could make the model operate inefficiently. An example of a method for determining whether the first class requested by the user is a new class is described below with reference to FIG. 8.

If the first class is determined to be a new class in step S720, the electronic device 100 may obtain at least one sample representing the first class (S730). The at least one sample may be one or more of an image, an audio file, an audio clip, a video, and a video frame. Typically, the at least one sample is a set of images that all show the same object (or feature) to be used to define the first class. For example, if the user wants the neural network model to identify the user's dog in images and videos, the user may provide one or more photos of the dog as input samples representing the first class. When multiple samples are obtained, they may all be of the same type or file format (e.g., images) or of different types (e.g., images and videos). In other words, the user may provide both photos and videos of the dog as input samples.

A single sample representing the first class may be sufficient to customize the neural network model. However, as with other machine learning techniques, more samples generally yield better results. If the obtained samples are of poor quality or are insufficient to define the first class and add it to the neural network model, the electronic device 100 may output a message asking the user to provide more samples.

In some cases, the user request in step S710 may include a sample representing the new class (i.e., the first class). In that case, in step S730 the electronic device 100 may simply use the sample already received. In other cases, the user request in step S710 may not include any sample. In that case, in step S730 the electronic device 100 may output a guide message prompting the user to provide a sample. Alternatively, the sample may be received in step S720, in which case the electronic device 100 may use that sample in step S730.

The electronic device 100 may obtain a feature value for the obtained sample (S740). Here, the electronic device 100 may obtain the feature value by inputting the obtained sample into the feature extraction module included in the neural network model. All or part of the neural network model may be implemented in the electronic device 100 or in a remote server/cloud server.

The electronic device 100 may store the obtained feature value in a local area of the classification value acquisition module (S750). In this way, the electronic device 100 may obtain a neural network model capable of recognizing an object corresponding to the first class.
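Steps S740 and S750 can be illustrated with a minimal Python sketch. It assumes, following the wording of the claims, that the new class weight is generated from the average of the sample feature values and then normalized. The L2 normalization, the dictionary-based storage, and all names (`LocalAreaClassifier`, `add_class`, `scores`, the class labels) are hypothetical choices made for illustration; the disclosure only refers to a "predefined normalization function" and does not prescribe this implementation.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumed normalization function)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def mean_vector(vectors):
    """Element-wise average of equally sized feature vectors."""
    if not vectors:
        raise ValueError("at least one feature vector is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all feature vectors must have the same length")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


class LocalAreaClassifier:
    """Hypothetical classifier with a fixed base area and an extensible local area."""

    def __init__(self, base_weights):
        self.base = dict(base_weights)  # class name -> weight vector (pre-trained)
        self.local = {}                 # class name -> weight vector (user-added)

    def add_class(self, name, sample_features):
        """S750: store the averaged, normalized feature value as a new weight vector."""
        if name in self.base or name in self.local:
            raise ValueError(f"class {name!r} already exists")
        self.local[name] = l2_normalize(mean_vector(sample_features))

    def scores(self, feature):
        """Dot product of the normalized feature with every class weight vector."""
        f = l2_normalize(feature)
        weights = {**self.base, **self.local}
        return {n: sum(a * b for a, b in zip(w, f)) for n, w in weights.items()}
```

In this sketch the base area is never modified when a class is added, so the pre-trained classes remain available alongside the user-defined one.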

FIG. 8 is a flowchart of an example of checking whether a user-requested class already exists in the neural network model.

The electronic device 100 may receive a user request for a new class (S800) and receive at least one keyword associated with the new class (S802). The electronic device 100 may determine whether the received at least one keyword matches a predefined keyword (S804). In this determination, if a local area of the classifier already exists, the electronic device 100 may match the received keyword against the keywords associated with both the base area and the local area of the classifier.

If the received keyword matches a predefined keyword, the electronic device 100 may identify the class corresponding to the matched keyword (S806). The electronic device 100 may then output a suggestion to assign the received keyword to the identified existing class (S808). Here, the electronic device 100 may also output an example sample corresponding to the identified class in order to explain why the new class is similar or identical to the identified existing class.

The electronic device 100 may determine whether the user has accepted the suggestion (S810), based on the user's response. If it is determined that the user has accepted the suggestion, the electronic device 100 may assign the keyword to the identified class (S812). If, on the other hand, it is determined that the user has not accepted the suggestion, the electronic device 100 may perform an operation for adding a new class to the neural network model, and this operation may continue at step S730 of FIG. 7.

If the keyword received in step S804 does not match any of the predefined keywords, the electronic device 100 may receive at least one sample representing the requested class (S814). The electronic device 100 may then obtain a feature value of the received sample and determine whether the obtained feature value matches an existing class (S818). The electronic device 100 may make this determination by computing the dot product of the feature vector extracted from the received sample with each weight vector of the classifier.
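The dot-product check of step S818 can be sketched as follows. This is a minimal illustration only: the threshold value, the function names, and the rule of taking the highest-scoring class are assumptions, since the disclosure does not state how the dot products are turned into a match decision.

```python
def dot(a, b):
    """Dot product of two equally sized vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def match_existing_class(feature, class_weights, threshold=0.8):
    """Return the best-matching existing class, or None if no class matches.

    feature:       feature vector extracted from the received sample
    class_weights: mapping of class name -> classifier weight vector
    threshold:     hypothetical minimum score required to count as a match
    """
    best_name, best_score = None, float("-inf")
    for name, weight in class_weights.items():
        score = dot(feature, weight)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None
```

A return value of `None` corresponds to the case in which the feature value does not match any existing class.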

If it is determined that the feature value of the sample matches an existing class, the electronic device 100 may output a suggestion to assign the received keyword to the identified class (S808). If, on the other hand, it is determined that the feature value of the sample does not match any existing class, the electronic device 100 may perform an operation for adding a new class to the neural network model, and this operation may continue at step S730 of FIG. 7.
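The overall decision flow of FIG. 8 (steps S804 to S818) can be summarized in a short Python sketch. The callbacks `find_class_by_feature` and `user_accepts`, the return values, and the dictionary used to hold keywords are hypothetical stand-ins for the sample-matching step and the user interaction; the sketch also assumes that a suggestion reached through the sample path is followed by the same acceptance check (S810) as one reached through the keyword path.

```python
def resolve_class_request(keyword, keyword_to_class, find_class_by_feature, user_accepts):
    """Decide whether a requested keyword maps to an existing class.

    keyword_to_class:      mapping of known keyword -> existing class name
    find_class_by_feature: callable returning an existing class name or None
                           (stands in for S814 and S818)
    user_accepts:          callable taking a class name and returning True/False
                           (stands in for S808 and S810)
    Returns ("assigned", class_name) or ("add_new_class", None).
    """
    # S804/S806: try to match the keyword against predefined keywords.
    existing = keyword_to_class.get(keyword)

    # S814/S818: no keyword match, so fall back to matching sample features.
    if existing is None:
        existing = find_class_by_feature()

    # Nothing matched: continue with adding a new class (step S730 of FIG. 7).
    if existing is None:
        return ("add_new_class", None)

    # S808/S810: suggest the existing class and check the user's response.
    if user_accepts(existing):
        keyword_to_class[keyword] = existing  # S812: assign keyword to the class
        return ("assigned", existing)

    return ("add_new_class", None)
```

For example, a request for "puppy" whose samples match an existing "dog" class would, if the user accepts the suggestion, add "puppy" as a further keyword for "dog" rather than create a new class.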

Meanwhile, the various embodiments described above may be implemented in a recording medium readable by a computer or a similar device, using software, hardware, or a combination thereof. In some cases, the embodiments described herein may be implemented by the processor itself. In a software implementation, embodiments such as the procedures and functions described herein may be implemented as separate software modules. Each of the software modules may perform one or more of the functions and operations described herein.

Meanwhile, computer instructions for performing the processing operations according to the various embodiments of the present disclosure described above may be stored in a non-transitory computer-readable medium. When executed by a processor, the computer instructions stored in such a non-transitory computer-readable medium may cause a specific device to perform the processing operations according to the various embodiments described above.

A non-transitory computer-readable medium refers to a medium that stores data semi-permanently and can be read by a device, as opposed to a medium that stores data for a short moment, such as a register, a cache, or a memory. Specific examples of non-transitory computer-readable media include a CD, a DVD, a hard disk, a Blu-ray disc, a USB device, a memory card, and a ROM.

Meanwhile, a device-readable storage medium may be provided in the form of a non-transitory storage medium. Here, "non-transitory storage medium" means only that the medium is a tangible device and does not include a signal (e.g., an electromagnetic wave); the term does not distinguish between data being stored semi-permanently in the storage medium and data being stored temporarily. For example, a "non-transitory storage medium" may include a buffer in which data is temporarily stored.

According to an embodiment, the method according to the various embodiments disclosed in this document may be provided as part of a computer program product. The computer program product may be traded between a seller and a buyer as a commodity. The computer program product may be distributed in the form of a device-readable storage medium (e.g., a compact disc read only memory (CD-ROM)), or may be distributed online (e.g., downloaded or uploaded) through an application store (e.g., Play Store™) or directly between two user devices (e.g., smartphones). In the case of online distribution, at least part of the computer program product (e.g., a downloadable app) may be at least temporarily stored in, or temporarily generated in, a device-readable storage medium such as the memory of a manufacturer's server, an application store's server, or a relay server.

Although preferred embodiments of the present disclosure have been illustrated and described above, the present disclosure is not limited to the specific embodiments described. Various modifications may of course be made by those of ordinary skill in the art to which the disclosure pertains without departing from the gist of the disclosure as claimed in the claims, and such modifications should not be understood separately from the technical idea or perspective of the present disclosure.

Claims

A method of controlling an electronic device, the method comprising:
obtaining a neural network model trained to detect an object corresponding to at least one class;
obtaining a user command for detecting a first object corresponding to a first class; and
if the first object does not correspond to the at least one class, obtaining a new neural network model based on the neural network model and information on the first object.

The method of claim 1,
wherein the trained neural network model comprises:
a feature extraction module that extracts a feature value of an object; and
a classification value acquisition module that obtains a classification value for the object based on the feature value obtained from the feature extraction module.

The method of claim 2,
wherein obtaining the user command comprises obtaining an image of the first object, and
wherein obtaining the new neural network model comprises:
inputting the obtained image of the first object into the feature extraction module to obtain a first feature value of the first object; and
obtaining a new classification value acquisition module based on the first feature value and the classification value acquisition module.

The method of claim 3,
wherein the classification value acquisition module includes a weight vector including a plurality of column vectors, and
wherein obtaining the new classification value acquisition module comprises:
generating a first column vector based on an average value of the first feature value; and
obtaining the new classification value acquisition module by adding the first column vector as a new column vector of the weight vector.

The method of claim 4,
further comprising normalizing the obtained new classification value acquisition module based on a predefined normalization function.

The method of claim 1,
wherein the trained neural network model is trained based on a loss function that includes a predefined regularization function for preventing overfitting.

The method of claim 1,
further comprising, when the user command is obtained, determining whether the first object corresponds to the at least one class,
wherein the determining comprises:
obtaining an image of the first object;
inputting the image of the first object into the trained neural network model to obtain a first feature value of the first object; and
comparing the first feature value with the weight vector of the trained neural network model to determine whether the first object corresponds to the at least one class.

An electronic device comprising:
a memory including at least one instruction; and
a processor,
wherein the processor is configured to:
obtain a neural network model trained to detect an object corresponding to at least one class;
obtain a user command for detecting a first object corresponding to a first class; and
if the first object does not correspond to the at least one class, obtain a new neural network model based on the neural network model and information on the first object.

The electronic device of claim 8,
wherein the trained neural network model comprises:
a feature extraction module that extracts a feature value of an object; and
a classification value acquisition module that obtains a classification value for the object based on the feature value obtained from the feature extraction module.

The electronic device of claim 9,
wherein the processor is configured to:
obtain an image of the first object;
input the obtained image of the first object into the feature extraction module to obtain a first feature value of the first object; and
obtain a new classification value acquisition module based on the first feature value and the classification value acquisition module.

The electronic device of claim 10,
wherein the classification value acquisition module includes a weight vector including a plurality of column vectors, and
wherein the processor is configured to:
generate a first column vector based on an average value of the first feature value; and
obtain the new classification value acquisition module by adding the first column vector as a new column vector of the weight vector.

The electronic device of claim 11,
wherein the processor is configured to normalize the obtained new classification value acquisition module based on a predefined normalization function.

The electronic device of claim 8,
wherein the trained neural network model is trained based on a loss function that includes a predefined regularization function for preventing overfitting.

The electronic device, wherein the processor is configured to:
obtain an image of the first object;
input the image of the first object into the trained neural network model to obtain a first feature value of the first object; and
compare the first feature value with the weight vector of the trained neural network model to determine whether the first object corresponds to the at least one class.

A computer-readable recording medium on which a program for performing the method according to any one of claims 1 to 14 is recorded.