KR102603424B1

KR102603424B1 - Method, Apparatus, and Computer-readable Medium for Determining Image Classification using a Neural Network Model

Info

Publication number: KR102603424B1
Application number: KR1020220077453A
Authority: KR
Inventors: 김상태; 곽찬웅
Original assignee: 주식회사 노타
Priority date: 2022-06-24
Filing date: 2022-06-24
Publication date: 2023-11-20

Abstract

The present disclosure relates to a method, an apparatus, and a computer-readable medium for determining an image event classification using a neural network model. A method according to an embodiment of the present disclosure may input an image as input data to a first neural network model, obtain an event classification result for the image as output data of the first neural network model, and determine a new event classification for the image based on the event classification result and reliability information of a landmark point matching the event classification result.

Description

{Method, Apparatus, and Computer-readable Medium for Determining Image Classification using a Neural Network Model}

The present invention relates to a method, an apparatus, and a computer-readable medium for determining an image event classification using a neural network model.

A neural network model is a computer system that implements human-level intelligence, in which a machine learns and makes decisions on its own.

A neural network model consists of machine learning (deep learning) technology, which uses algorithms that classify and learn the features of input data on their own, and element technologies that use machine learning algorithms to mimic functions of the human brain such as cognition and judgment.

The element technologies may include, for example, at least one of: linguistic understanding technology that recognizes human language and characters; visual understanding technology that recognizes objects as human vision does; inference/prediction technology that evaluates information to logically infer and predict; knowledge representation technology that processes human experience information into knowledge data; and motion control technology that controls the autonomous driving of a vehicle and the movement of a robot.

Visual understanding is a technology that recognizes and processes objects as human vision does, and includes object recognition, object tracking, image search, person recognition, scene understanding, spatial understanding, and image enhancement.

Inference/prediction is a technology that evaluates information to logically infer and predict, and includes knowledge/probability-based inference, optimization prediction, preference-based planning, and recommendation.

Recently, electronic devices that classify events in images using such artificial intelligence technology have been developed, and research is needed to improve the accuracy of event classification.

The background art described above is technical information that the inventors possessed for deriving the present invention or acquired in the course of deriving it, and is not necessarily known art disclosed to the general public before the filing of the present application.

Korean Registered Patent Publication No. 10-2005150 (2019.07.23)

An object of the present invention is to provide a method, an apparatus, and a computer-readable medium for determining an image event classification using a neural network model. The problems to be solved by the present invention are not limited to the one mentioned above; other objects and advantages of the present invention that are not mentioned can be understood from the following description and will be understood more clearly from the embodiments of the present invention. It will also be appreciated that the objects and advantages of the present invention can be realized by the means set forth in the claims and combinations thereof.

As a technical means for achieving the above technical object, a first aspect of the present disclosure may provide a method of determining an image event classification using a neural network model, the method including: inputting an image as input data to a first neural network model and obtaining an event classification result for the image as output data of the first neural network model; and determining a new event classification for the image based on the event classification result and reliability information of a landmark point matching the event classification result, wherein the new event classification is associated with the event classification result.

A second aspect of the present disclosure may provide an apparatus for determining an image event classification using a neural network model, the apparatus including: a memory storing at least one program; and at least one processor that runs a neural network model by executing the at least one program, wherein the at least one processor inputs an image as input data to a first neural network model, obtains an event classification result for the image as output data of the first neural network model, and determines a new event classification for the image based on the event classification result and reliability information of a landmark point matching the event classification result, and wherein the new event classification is associated with the event classification result.

A third aspect of the present disclosure may provide a computer-readable recording medium on which a program for executing the method according to the first aspect on a computer is recorded.

In addition, other methods and other systems for implementing the present invention, and a computer-readable recording medium storing a computer program for executing the method, may be further provided.

Other aspects, features, and advantages besides those described above will become apparent from the following drawings, claims, and detailed description of the invention.

According to the above means for solving the problem, the present disclosure can overcome the limitations of an event classification model and classify events in an image with higher accuracy.

According to the above means for solving the problem, the present disclosure determines the image event classification by combining information from an event classification model and a landmark detection model, thereby overcoming the limitations of the event classification model and classifying events in an image with high accuracy.

FIG. 1 is a block diagram of a system according to an embodiment.
FIGS. 2A and 2B are exemplary diagrams illustrating a method of detecting a face area in an image according to an embodiment.
FIG. 3 is an exemplary diagram illustrating a method of detecting landmarks in a face area according to an embodiment.
FIG. 4 is an exemplary diagram illustrating a method of classifying events in an image according to an embodiment.
FIG. 5 is a diagram illustrating examples of event classification results for images according to an embodiment.
FIGS. 6A and 6B are exemplary diagrams illustrating a method of calculating reliability information of landmark points according to an embodiment.
FIGS. 7A and 7B are exemplary diagrams illustrating a method of determining a new event classification for an image according to an embodiment.
FIGS. 8A and 8B are exemplary diagrams illustrating a method of calculating the reliability of landmark points according to an embodiment.
FIG. 9 is a flowchart illustrating a method of determining an image event classification using a neural network model according to an embodiment.
FIG. 10 is a block diagram of an image event classification determination device according to an embodiment.

The advantages and features of the present invention, and the methods of achieving them, will become clear with reference to the embodiments described in detail together with the accompanying drawings. However, the present invention is not limited to the embodiments presented below; it may be implemented in various different forms and should be understood to include all modifications, equivalents, and substitutes falling within the spirit and technical scope of the present invention. The embodiments presented below are provided so that the disclosure of the present invention is complete and so that the scope of the invention is fully conveyed to those of ordinary skill in the art to which the present invention pertains. In describing the present invention, detailed descriptions of related known technologies are omitted where they could obscure the gist of the present invention.

The terms used in this application are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "include" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Some embodiments of the present disclosure may be represented by functional block configurations and various processing steps. Some or all of these functional blocks may be implemented by any number of hardware and/or software configurations that perform specific functions. For example, the functional blocks of the present disclosure may be implemented by one or more microprocessors or by circuit configurations for a given function. The functional blocks of the present disclosure may also be implemented in various programming or scripting languages, and may be implemented as algorithms running on one or more processors. In addition, the present disclosure may employ conventional techniques for electronic environment setup, signal processing, and/or data processing. Terms such as "mechanism", "element", "means", and "configuration" may be used broadly and are not limited to mechanical and physical configurations.

In addition, the connecting lines or connecting members between components shown in the drawings merely illustrate functional connections and/or physical or circuit connections. In an actual device, connections between components may be represented by various replaceable or additional functional connections, physical connections, or circuit connections.

Hereinafter, the present disclosure will be described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram of a system according to an embodiment.

The system 100 may include a face detection model 110, a landmark detection model 121, an event classification model 122, and a post-processing module 130.

The face detection model 110 may acquire an image and detect a face area from the acquired image.

The landmark detection model 121 may receive, from the face detection model 110, the face area detected in the image. The landmark detection model 121 may detect landmarks within the face area.

The event classification model 122 may classify events in the acquired image. When the face detection model 110 detects a face area in an image, the event classification model 122 may classify the event of the image in which the face area was detected.

The post-processing module 130 may perform post-processing using the landmarks in the face area obtained from the landmark detection model 121, or the event classification result of the image obtained from the event classification model 122. For example, the post-processing module 130 may determine a head pose using the landmarks, or may perform voting using the landmarks and the event classification result.

The system 100 may provide information to a user of the system 100 through the result output from the post-processing module 130. For example, the system 100 may be a driving monitoring system, in which case the system 100 may provide driver state information by processing an image of the driver using the face detection model 110, the landmark detection model 121, the event classification model 122, and the post-processing module 130. For example, the driver state information may include distraction, drowsiness (drowsy), smoking (smoke), and phone use (phone).
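
As a non-limiting illustration, the data flow of FIG. 1 may be sketched as follows. The function name `run_pipeline` and the four callables passed to it are hypothetical placeholders for the models 110, 121, 122 and the module 130; the disclosure does not specify their interfaces.

```python
from typing import Any, Callable, Optional


def run_pipeline(
    image: Any,
    detect_face: Callable[[Any], Optional[Any]],      # stands in for model 110
    detect_landmarks: Callable[[Any], Any],           # stands in for model 121
    classify_event: Callable[[Any], Any],             # stands in for model 122
    post_process: Callable[[Any, Any], Any],          # stands in for module 130
) -> Optional[Any]:
    """Run the four stages in the order described for system 100."""
    face_area = detect_face(image)
    if face_area is None:
        # No face area detected, so neither downstream model is run.
        return None
    # The landmark model receives the detected face area.
    landmarks = detect_landmarks(face_area)
    # The event model classifies the image in which a face area was detected.
    event = classify_event(image)
    # Post-processing combines both outputs (e.g. head pose or voting).
    return post_process(landmarks, event)
```

Passing the stages as callables keeps the sketch independent of any particular model implementation.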

FIGS. 2A and 2B are exemplary diagrams illustrating a method of detecting a face area in an image according to an embodiment.

The face detection model may acquire an image 200 in which a face area is to be detected. The face detection model may be implemented as a software module, a hardware module, or various combinations thereof to detect a face area from the image 200.

Various algorithms may be used to detect the face area in the image 200. For example, the face detection model may be implemented as a neural network model using the AdaBoost algorithm, but the algorithms that may be used to implement the neural network model are not limited thereto.

Referring to FIG. 2A, a sliding window 210 may be used to detect a face area in the image 200. The face detection model may scan the image 200 using the sliding window 210 and determine whether each scanned area of the image 200 corresponds to a face area or a background area.
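
As a non-limiting illustration, the sliding-window scan of FIG. 2A may be sketched as follows. The window size, the stride, and the `is_face` predicate are assumptions made for illustration; the disclosure does not specify them.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Window = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def sliding_windows(height: int, width: int, window: int, stride: int) -> Iterator[Window]:
    """Yield every window position that fits entirely inside the image."""
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            yield top, left, top + window, left + window


def scan_for_faces(
    image: Sequence[Sequence[float]],
    window: int,
    stride: int,
    is_face: Callable[[List[Sequence[float]]], bool],  # hypothetical face/background classifier
) -> List[Window]:
    """Return the windows that the classifier labels as face rather than background."""
    height, width = len(image), len(image[0])
    hits = []
    for top, left, bottom, right in sliding_windows(height, width, window, stride):
        patch = [row[left:right] for row in image[top:bottom]]
        if is_face(patch):
            hits.append((top, left, bottom, right))
    return hits
```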

Referring to FIG. 2B, the face detection model may, through training, detect a given area in the image 200 as the face area 220.

FIG. 3 is an exemplary diagram illustrating a method of detecting landmarks in a face area according to an embodiment.

The landmark detection model may be implemented as a software module, a hardware module, or various combinations thereof to detect landmarks 310 in the face area 300. The landmark detection model may receive the face area 300 detected by the face detection model and detect the landmarks 310 in the face area 300.

A landmark 310 is a feature point indicating a position that represents a feature within the face area 300. For example, the landmarks 310 may represent feature points corresponding to the eyes, nose, mouth, eyebrows, and other features that characterize a person's face. At least one landmark 310 may be set for each feature, and a plurality of landmarks 310 may be set for a single feature.

The landmark detection model may be implemented as a neural network model that uses training on previously acquired face areas 300 and a statistical algorithm (for example, an active appearance model (AAM) or the supervised descent method (SDM)), but the algorithms that may be used to implement the neural network model are not limited thereto.

Taking the right eyebrow among the features of the face area 300 in FIG. 3 as an example, nine landmarks 310 may be set for the right eyebrow.

Meanwhile, the number of landmarks 310 corresponding to each feature of the face area 300 may be set to various values depending on the application in which the landmarks 310 are used.

FIG. 4 is an exemplary diagram illustrating a method of classifying events in an image according to an embodiment.

The event classification model may be implemented as a software module, a hardware module, or various combinations thereof to classify events in an image. When the face detection model detects a face area in a given image, the event classification model may classify the event of the image in which the face area was detected.

The event classification model may classify which of a set of preset events the image corresponds to, based on the face area included in the image and objects around the face area.

The event classification model may be implemented as a convolutional neural network (CNN) model, but the algorithms that may be used to implement the neural network model are not limited thereto.

Specifically, the event classification model may be implemented as an architecture having a plurality of layers including an input image, feature maps, and an output. In the event classification model, a convolution operation is performed between the input image and a filter called a kernel, and feature maps are output as a result. The output feature maps then serve as input feature maps on which a convolution operation with a kernel is performed again, producing new feature maps. As a result of repeating this convolution operation, an event result based on the features of the input image may finally be output through the event classification model. Through this process, the event classification model may classify the event of a given image.
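
As a non-limiting illustration, the repeated convolution described above may be sketched in plain Python as follows. The sketch covers only a single-channel "valid" convolution (computed as cross-correlation, as is customary in CNN libraries) applied layer after layer; activation functions, pooling, multiple channels, and the final classification layer of an actual CNN are omitted, and the function names are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def conv2d_valid(feature_map: Sequence[Sequence[float]], kernel: Sequence[Sequence[float]]) -> Matrix:
    """Slide the kernel over the feature map and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(feature_map) - kh + 1
    out_w = len(feature_map[0]) - kw + 1
    return [
        [
            sum(
                feature_map[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def apply_conv_layers(image: Sequence[Sequence[float]], kernels: Sequence[Sequence[Sequence[float]]]) -> Matrix:
    """Feed each output feature map back in as the next layer's input feature map."""
    feature_map = [list(row) for row in image]
    for kernel in kernels:
        feature_map = conv2d_valid(feature_map, kernel)
    return feature_map
```

Each layer shrinks the feature map by the kernel size minus one in each dimension, which is why a 3x3 input passed through two 2x2 kernels ends as a single value.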

Referring to FIG. 4, the event classification model may classify the event of an image as normal (410), phone (420), smoke (430), mask (440), and the like. phone (420) is an event indicating that the person in the image is using a mobile phone, smoke (430) is an event indicating that the person in the image is smoking, mask (440) is an event indicating that the person in the image is wearing a mask, and normal (410) may be an event representing every case other than the above events. However, the types of events that the event classification model can classify are not limited to these examples.

FIG. 5 is a diagram illustrating examples of event classification results for images according to an embodiment.

A device for determining an image event classification using a neural network model (hereinafter, the image event classification determination device) may input an image as input data to a first neural network model and obtain an event classification result for the image as output data of the first neural network model.

The first neural network model may be the event classification model described with reference to FIG. 4, and the event classification result for the image may be any one of the image events of FIG. 4.

According to an embodiment, the event classification result for an image may include reliability (confidence) information about the event classification result. The higher the reliability of the image event classification result, the lower the uncertainty; conversely, the lower the reliability, the greater the uncertainty.

Referring to FIG. 5, the images 510 in the group with clear image event classification results were classified as a mask event and a normal event, respectively, with high reliabilities of 97.3% and 93.1%. In contrast, the images 530 in the group with unclear image event classification results were classified as mask events, but with somewhat lower reliabilities of 71.1% and 63.7%.
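
As a non-limiting illustration, one common way to obtain a class label together with a confidence value is to apply a softmax to the raw scores of the classifier and take the largest probability. The disclosure does not state how the reliability of the event classification result is computed, so the softmax, the function names, and the example scores below are assumptions.

```python
import math
from typing import Dict, Tuple


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Convert raw per-class scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the maximum for numerical stability
    exps = {label: math.exp(value - peak) for label, value in scores.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


def classify_with_confidence(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the most probable event and its probability as the confidence."""
    probabilities = softmax(scores)
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


# Hypothetical raw scores for the four events of FIG. 4.
label, confidence = classify_with_confidence(
    {"normal": 0.0, "phone": 0.0, "smoke": 0.0, "mask": 3.0}
)
```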

Meanwhile, even a well-trained event classification model cannot be 100% reliable in its event classification results. For example, referring to the images 530 in the group with unclear image event classification results, the person in the image is wearing a mask, but on the chin rather than over the mouth, so the reliability of the mask event classification is low. As further examples, the person in an image may be judged to be using a mobile phone but be holding it close to the mouth rather than the ear, so that the reliability of the phone event classification is low; or an image may be classified as a normal event while the person in it is covering the eyes with a hand, so that the reliability of the event classification is low.

To overcome these limitations of the event classification model and classify events in images with higher accuracy, the present disclosure describes a method of determining a new event classification for an image based on the image event classification result and reliability information of landmark points matching the image event classification result. The method is described in more detail below.

FIGS. 6A and 6B are exemplary diagrams illustrating a method of calculating reliability information of landmark points according to an embodiment.

The image event classification determination device may input an image as input data to a second neural network model, detect a plurality of landmark points in the image, and calculate reliability information for each of the plurality of landmark points.

In an embodiment, the image input to the second neural network model may be a related image (hereinafter, the second image) associated with the image input to the first neural network model (hereinafter, the first image). The second image may be a modified version of the first image. Specifically, the second image may be the first image after resizing, cropping, or channel conversion. For example, compared with the first image, the second image may be an image whose margin is widened by one pixel on each side, an image whose RGB channel order is changed, or an image whose size is changed by resizing.
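
As a non-limiting illustration, the three kinds of modification mentioned above may be sketched in plain Python as follows. An image is represented here as a list of rows of `(r, g, b)` tuples; this representation, the function names, and the use of nearest-neighbor interpolation are assumptions made for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def crop_with_margin(image: Image, box: Box, margin: int = 1) -> Image:
    """Crop the box after widening it by `margin` pixels, clamped to the image."""
    top, left, bottom, right = box
    height, width = len(image), len(image[0])
    t, l = max(0, top - margin), max(0, left - margin)
    b, r = min(height, bottom + margin), min(width, right + margin)
    return [row[l:r] for row in image[t:b]]


def reverse_channels(image: Image) -> Image:
    """Change the channel order, here RGB to BGR."""
    return [[pixel[::-1] for pixel in row] for row in image]


def resize_nearest(image: Image, new_height: int, new_width: int) -> Image:
    """Resize by picking the nearest source pixel for each target pixel."""
    height, width = len(image), len(image[0])
    return [
        [image[i * height // new_height][j * width // new_width] for j in range(new_width)]
        for i in range(new_height)
    ]
```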

제2 뉴럴 네트워크 모델은 도 3에서 설명한 랜드마크 검출 모델일 수 있고, 복수의 랜드마크 포인트들은, 랜드마크 검출 모델을 통해 검출되는 얼굴 영역내 랜드마크일 수 있다. 복수의 랜드마크 포인트들은 사람의 얼굴을 대표하는 특징에 대응하는 특징점을 나타낼 수 있으며, 각 특징에 대해 복수의 랜드마크 포인트들이 설정될 수 있다.The second neural network model may be the landmark detection model described in FIG. 3, and the plurality of landmark points may be landmarks within the face area detected through the landmark detection model. A plurality of landmark points may represent feature points corresponding to features representing a person's face, and a plurality of landmark points may be set for each feature.

복수의 랜드마크 포인트들 각각에 대한 신뢰도 정보는, 랜드마크 포인트의 위치에 대한 정확도를 의미할 수 있다. 신뢰도 정보는 복수의 랜드마크 각각의 포인트별로 회귀 분석(regression) 과정에서 계산된 점수의 분포에 따라 결정될 수 있다. 복수의 랜드마크 포인트들 각각에 대해 회귀 분석 과정을 통하여 신뢰도 정보를 산출하는 방법은 이하 도 8a 및 8b와 함께 설명하기로 한다.Reliability information for each of a plurality of landmark points may mean accuracy of the location of the landmark point. Reliability information may be determined according to the distribution of scores calculated in a regression process for each point of a plurality of landmarks. A method of calculating reliability information through a regression analysis process for each of a plurality of landmark points will be described below with reference to FIGS. 8A and 8B.

Referring to FIG. 6A, the image event classification determination device inputs the original image 610 into the second neural network model, detects both eyes, the nose, and both ends of the mouth as the plurality of landmark points 631, and calculates reliability information 651 for each of the plurality of landmark points 631.

According to one embodiment, the reliability information of the landmark point may be the reliability information of a target landmark point that is determined, from among the plurality of landmark points, based on the event classification result for the image.

Based on the image event classification result obtained through the first neural network model, the image event classification determination device may determine some of the plurality of landmark points detected through the second neural network model as target landmarks.

For example, based on a mask event classification result, the image event classification determination device may determine landmark points corresponding to the mouth or chin as target landmarks. As another example, the device may determine landmark points corresponding to the ear as target landmarks based on a phone event classification result, and landmark points corresponding to the mouth as target landmarks based on a smoke event classification result.

Accordingly, the image event classification determination device according to the present disclosure can organically combine information from the event classification model and the landmark detection model. Because the new event classification for the image is determined by combining the event classification result with the reliability information of landmark points that are organically connected to that result, the association between the event classification result and the new event classification is maintained.

According to another embodiment, the image event classification determination device may divide the plurality of landmark points into one or more areas, select at least one of the areas based on the event classification result for the image, and determine the landmark points included in the selected area as target landmark points.

As described above with reference to FIG. 3, a plurality of landmarks 310 may be set for each feature of the face area 300, and the number may vary depending on the application in which the landmarks 310 are used. The image event classification determination device may group the landmarks corresponding to one feature and thereby define an area for each feature. Taking the face area 300 of FIG. 3 as an example, the nine landmarks 310 corresponding to the right eyebrow may form one area. Likewise, the nine landmarks 310 corresponding to the nose and the 17 landmarks 310 corresponding to the mouth may each form one area.

The image event classification determination device may select an area from among the divided areas based on the event classification result for the image, and may determine the landmark points included in the selected area as target landmark points. Which of the one or more areas is selected for a given image event classification result may follow a predetermined setting or may be learned. For example, when a mask event classification result is obtained, the mouth and chin areas may be selected, and when a phone event classification result is obtained, the ear area may be selected.
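
The grouping and selection described above can be sketched as follows. This is a minimal illustration that assumes each landmark carries a region label and that the event-to-area mapping is a fixed table (the disclosure also allows it to be learned). The names `EVENT_TO_REGIONS`, `group_by_region`, and `select_target_landmarks` are hypothetical, and the coordinates and reliability values are made up.

```python
# Preset mapping from event class to face areas, following the examples in
# the text (mask -> mouth and chin, phone -> ear, smoke -> mouth).
EVENT_TO_REGIONS = {
    "mask": ("mouth", "chin"),
    "phone": ("ear",),
    "smoke": ("mouth",),
}

def group_by_region(landmarks):
    """Group landmark points by the facial feature they belong to."""
    regions = {}
    for point in landmarks:
        regions.setdefault(point["region"], []).append(point)
    return regions

def select_target_landmarks(landmarks, event):
    """Return the landmark points in the areas selected for `event`."""
    regions = group_by_region(landmarks)
    targets = []
    for name in EVENT_TO_REGIONS.get(event, ()):
        targets.extend(regions.get(name, []))
    return targets

# Made-up detections for illustration only.
landmarks = [
    {"region": "eye", "xy": (49, 67), "reliability": 0.97},
    {"region": "mouth", "xy": (60, 110), "reliability": 0.41},
    {"region": "mouth", "xy": (80, 110), "reliability": 0.38},
    {"region": "chin", "xy": (70, 140), "reliability": 0.88},
]
targets = select_target_landmarks(landmarks, "mask")
```

With these inputs, `targets` holds the two mouth points followed by the chin point; an event with no entry in the table yields an empty list.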

Referring to FIG. 6B, based on the mask event classification result 620, the image event classification determination device determines the target landmark points 632 included in the mouth area from among the plurality of landmark points 631, and calculates the reliability information 652 of the target landmark points.

FIGS. 7A and 7B are exemplary diagrams for explaining a method of determining a new event classification for an image according to an embodiment.

The image event classification determination device may determine a new event classification for the image based on the event classification result for the image and the reliability information of the landmark points matching the classification result. The new event classification for the image may be related to the event classification result for the image.

To classify image events with high accuracy, the present disclosure may determine a new event classification for an image based on the reliability information of the landmark points matching the image event classification result.

Since the new event classification for the image is determined based on the event classification result for the image, it may be related to that result. For example, the new event classification may be a new event that the event classification model could not classify for the image. In the present disclosure, the image event classification is determined by combining the image event classification result from the event classification model with the reliability information of the landmark points from the landmark detection model, which overcomes the limitations of the event classification model and allows the events in an image to be classified with high accuracy.

According to one embodiment, the new event classification for an image may be a sub-category of the event classification result for the image. For example, for the hand event classification, sub-categories such as eye rubbing or covering the mouth may be determined as the new event classification, and for the mask event classification, a sub-category such as a chin mask (a mask pulled down under the chin) may be determined as the new event classification.

Referring to FIG. 7A, the image event classification determination device obtained a mask event with a reliability of 63.7% as the image event classification result 710. Based on the reliability information 730 of the landmark points matching the image event classification result 710, the device determined the chin-mask event as the new event classification 750 for the image.

According to one embodiment, the image event classification determination device may obtain occlusion information of a landmark using the reliability information of the landmark points matching the image event classification result. The device may determine a new event classification for the image based on the event classification result for the image, the reliability information of the landmark points matching the classification result, and the occlusion information of the landmark.

For example, when the reliability information of a landmark point is greater than or equal to a predetermined value, the image event classification determination device may obtain occlusion information indicating that the corresponding landmark is not occluded. As another example, when the reliability information of a landmark point is less than or equal to the predetermined value, the device may obtain occlusion information indicating that the landmark is occluded. The predetermined value may be set to 50%, 60%, or the like, depending on the application.
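
The thresholding described above can be sketched in a few lines. The text assigns a value exactly equal to the threshold to both cases, so this sketch makes an assumption: it treats a landmark as occluded only when its reliability is strictly below the threshold. The function name `is_occluded` and the example values are hypothetical.

```python
def is_occluded(reliability, threshold=0.5):
    """Return True when the landmark is treated as occluded.

    Assumption: equality with the threshold counts as not occluded.
    """
    return reliability < threshold

# Made-up reliabilities for two target landmarks.
mouth_occluded = is_occluded(0.41)                # threshold 50%
eye_occluded = is_occluded(0.55, threshold=0.6)   # threshold 60%
boundary_case = is_occluded(0.5)                  # exactly at the threshold
```

The threshold is application-dependent, so it is passed as a parameter rather than fixed.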

Specifically, the image event classification determination device may determine a new event classification for the image in response to the value of the event classification result for the image being greater than or equal to a first result value and the reliability information of the landmark point being greater than or equal to a first reliability value. In addition, the device may determine a new event classification for the image in response to the value of the event classification result being greater than or equal to a second result value and the reliability information of the landmark point being less than a second reliability value. The value of the event classification result may be reliability information about the classification result.

For example, consider a case in which the image event classification determination device obtains information A, that the reliability of the hand event classification result is 0.6 or higher, and information B, that the reliability of the landmark points in the eye area, which are the target landmark points, is less than 0.6. From information A, the device can learn that a human hand appears in the image, and from information B, it can obtain occlusion information indicating that the eye landmarks in the face area of the image are occluded. By combining the information that a human hand appears in the image with the occlusion information that the eye landmarks are occluded, the device can determine eye rubbing as the new event classification.

Referring to FIG. 7B, the image event classification determination device obtained, as the image event classification results 710, a mask event classification result 711 with a reliability of 97.3% and a mask event classification result 712 with a reliability of 63.7%. The device may obtain landmark occlusion information using the reliability information 730 of the landmark points matching each of the image event classification results 710. For example, the device may use the reliability information 731 of the first landmark point to obtain occlusion information 732 indicating that the mouth is covered, and may use the reliability information 733 of the second landmark point to obtain occlusion information 734 indicating that the mouth is not covered.

Continuing to refer to FIG. 7B, the image event classification determination device may determine a new event classification 751 for the image, indicating that a mask is being worn, based on the mask event classification result 711 with a reliability of 97.3%, the reliability information 731 of the first landmark point, and the occlusion information 732 indicating that the mouth is covered. Likewise, the device may determine a new event classification 752 for the image, indicating that a chin mask is being worn, based on the mask event classification result 712 with a reliability of 63.7%, the reliability information 733 of the second landmark point, and the occlusion information 734 indicating that the mouth is not covered.
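
The combination of an event classification result with landmark occlusion can be written as a small rule table. The sketch below is only one possible reading: it assumes a single minimum event reliability and a single occlusion threshold (both 0.6, borrowed from the hand example), whereas the disclosure allows separate first and second values. The names `REFINEMENT_RULES` and `decide_new_event`, the output labels, and the landmark reliabilities 0.30 and 0.90 are hypothetical; only the event reliabilities 97.3% and 63.7% come from FIG. 7B.

```python
# (event, target area, occluded?) -> new event classification.
REFINEMENT_RULES = {
    ("mask", "mouth", True): "wearing_mask",
    ("mask", "mouth", False): "mask_under_chin",
    ("hand", "eye", True): "eye_rubbing",
    ("hand", "mouth", True): "mouth_covering",
}

def decide_new_event(event, region, event_reliability, landmark_reliability,
                     min_event_reliability=0.6, occlusion_threshold=0.6):
    """Return the new event classification, or None if no rule applies."""
    if event_reliability < min_event_reliability:
        return None  # base classification too uncertain to refine
    occluded = landmark_reliability < occlusion_threshold
    return REFINEMENT_RULES.get((event, region, occluded))

# Mask event at 97.3% with an occluded mouth landmark (assumed 0.30).
result_751 = decide_new_event("mask", "mouth", 0.973, 0.30)
# Mask event at 63.7% with a visible mouth landmark (assumed 0.90).
result_752 = decide_new_event("mask", "mouth", 0.637, 0.90)
# Hand event with an occluded eye landmark, as in the eye-rubbing example.
result_hand = decide_new_event("hand", "eye", 0.70, 0.40)
```

Keeping the rules in a table makes it straightforward to add further sub-categories without changing the decision function.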

FIGS. 8A and 8B are exemplary diagrams for explaining a method of calculating the reliability of a landmark point according to an embodiment.

The image event classification determination device may calculate the reliability information of a landmark point through a regression process for each of the plurality of landmark points.

According to one embodiment, the image event classification determination device may calculate the reliability information of a landmark point through a regression process in which it determines a plurality of coordinates for each of the plurality of landmark points, calculates a reliability for each of the coordinates, and takes the highest of those reliabilities as the reliability information for that landmark point.

FIG. 8A shows an example in which the reliability information of a landmark point corresponding to the eye area is calculated. Six coordinates, and a reliability for each, may be calculated for the landmark point corresponding to the eye area. The reliability at the coordinates (49, 67) is the highest, at 97%. The image event classification determination device may take this highest reliability of 97%, from among the six coordinates and their reliabilities, as the reliability information for the landmark point corresponding to the eye area. The device may calculate reliability information for each landmark point corresponding to the other face areas through the regression process according to this embodiment.
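
The selection among candidate coordinates can be sketched as follows. Only the coordinates (49, 67) and the 97% reliability come from FIG. 8A; the other five candidates and the name `best_candidate` are made up for illustration.

```python
def best_candidate(candidates):
    """Return the (coordinates, reliability) pair with the highest reliability."""
    return max(candidates, key=lambda candidate: candidate[1])

# Six candidate coordinates for one eye landmark; five of them are invented.
eye_candidates = [
    ((47, 66), 0.62),
    ((48, 67), 0.81),
    ((49, 67), 0.97),
    ((50, 68), 0.77),
    ((51, 66), 0.55),
    ((49, 69), 0.70),
]
eye_xy, eye_reliability = best_candidate(eye_candidates)
```

Here `eye_xy` is (49, 67) and `eye_reliability` is 0.97, which becomes the reliability information of that landmark point.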

According to another embodiment, the image event classification determination device may calculate the reliability information of a landmark point through a regression process in which it finds a heatmap distribution for each of the plurality of landmark points, computes the central coordinate of the heatmap distribution, and determines the reliability information for that landmark point according to the degree of spread of the heatmap distribution around the central coordinate.

FIG. 8B shows an example in which the reliability information of a landmark point corresponding to the eye area is calculated. The image event classification determination device may find a heatmap distribution for the landmark point corresponding to the eye area and compute the central coordinate of the heatmap distribution as (49, 67). Based on the degree of spread of the heatmap distribution around (49, 67), the device may determine a reliability of 97% as the reliability information for the landmark point corresponding to the eye area. The device may calculate reliability information for each landmark point corresponding to the other face areas through the regression process according to this embodiment.
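
A heatmap-based reliability can be sketched as below. The disclosure does not state how the degree of spread is converted into a reliability value, so this sketch makes two assumptions: the central coordinate is the intensity-weighted centroid, and the reliability is 1 / (1 + variance), where the variance is the weighted mean squared distance from the centroid. The name `heatmap_reliability` and both toy heatmaps are hypothetical.

```python
def heatmap_reliability(heatmap):
    """Return ((cx, cy), reliability) for a 2D heatmap given as nested lists.

    Assumption: reliability = 1 / (1 + variance), so a sharply peaked
    heatmap scores close to 1 and a diffuse one scores lower.
    """
    total = sum(sum(row) for row in heatmap)
    if total <= 0:
        return None, 0.0
    cx = sum(x * v for row in heatmap for x, v in enumerate(row)) / total
    cy = sum(y * v for y, row in enumerate(heatmap) for v in row) / total
    variance = sum(
        v * ((x - cx) ** 2 + (y - cy) ** 2)
        for y, row in enumerate(heatmap)
        for x, v in enumerate(row)
    ) / total
    return (cx, cy), 1.0 / (1.0 + variance)

peaked = [[0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 0.0]]
diffuse = [[1.0, 1.0, 1.0],
           [1.0, 1.0, 1.0],
           [1.0, 1.0, 1.0]]

peaked_center, peaked_reliability = heatmap_reliability(peaked)
diffuse_center, diffuse_reliability = heatmap_reliability(diffuse)
```

Both heatmaps share the same central coordinate, but the diffuse one receives a lower reliability, which is the behaviour that an occluded landmark would be expected to show under these assumptions.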

FIG. 9 is a flowchart illustrating a method of determining image event classification using a neural network model according to an embodiment.

Referring to FIG. 9, in step 910, the image event classification determination device may input an image as input data to a first neural network model and obtain an event classification result for the image as output data of the first neural network model.

According to one embodiment, the event classification result for an image may include reliability information about the classification result. A higher reliability of the image event classification result means lower uncertainty, and conversely, a lower reliability means greater uncertainty.

In step 920, the image event classification determination device may determine a new event classification for the image based on the event classification result for the image and the reliability information of the landmark points matching the classification result. The new event classification for the image may be related to the event classification result for the image.

According to one embodiment, the image event classification determination device may input an image as input data to a second neural network model, detect a plurality of landmark points in the image, and calculate reliability information for each of the plurality of landmark points.

According to one embodiment, the image event classification determination device may calculate the reliability information of a landmark point through a regression process for each of the plurality of landmark points.
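
The overall flow of FIG. 9 can be sketched with stand-in models. Everything below is an assumption made for illustration: the two models are plain callables that return fixed values, the refinement rule covers only the mask case, and the names `classify_image_event`, `stub_event_model`, `stub_landmark_model`, and `refine_mask` do not appear in the disclosure.

```python
def classify_image_event(image, event_model, landmark_model, refine):
    """Run step 910, the landmark model, and step 920 in sequence."""
    event, event_reliability = event_model(image)      # step 910
    landmark_reliabilities = landmark_model(image)     # second model
    return refine(event, event_reliability, landmark_reliabilities)  # step 920

def stub_event_model(image):
    """Stand-in for the first neural network model."""
    return "mask", 0.637

def stub_landmark_model(image):
    """Stand-in for the second model: reliability per face area (made up)."""
    return {"mouth": 0.90, "eye": 0.95}

def refine_mask(event, event_reliability, landmark_reliabilities, threshold=0.6):
    """Refine a mask event using the mouth landmark reliability."""
    if event != "mask" or event_reliability < threshold:
        return event
    if landmark_reliabilities["mouth"] >= threshold:
        return "mask_under_chin"  # mouth still visible
    return "wearing_mask"         # mouth occluded

new_event = classify_image_event(
    image=None,  # the stubs ignore the image
    event_model=stub_event_model,
    landmark_model=stub_landmark_model,
    refine=refine_mask,
)
```

Passing the models and the refinement rule as arguments keeps the flow independent of any particular network or rule set.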

FIG. 10 is a block diagram of an image event classification determination device according to an embodiment.

Referring to FIG. 10, the image event classification determination device 1000 may include a communication unit 1010, a processor 1020, and a DB 1030. Only the components related to the embodiment are shown in the image event classification determination device 1000 of FIG. 10. Accordingly, those skilled in the art will understand that other general-purpose components may be included in addition to the components shown in FIG. 10.

The communication unit 1010 may include one or more components that enable wired or wireless communication with an external server or external device. For example, the communication unit 1010 may include at least one of a short-range communication unit (not shown), a mobile communication unit (not shown), and a broadcast receiver (not shown).

The DB 1030 is hardware that stores various data processed within the image event classification determination device 1000, and may store programs for the processing and control performed by the processor 1020.

The DB 1030 may include random access memory (RAM) such as dynamic random access memory (DRAM) or static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), CD-ROM, Blu-ray or other optical disc storage, a hard disk drive (HDD), a solid state drive (SSD), or flash memory.

The processor 1020 controls the overall operation of the image event classification determination device 1000. For example, by executing programs stored in the DB 1030, the processor 1020 may control an input unit (not shown), a display (not shown), the communication unit 1010, the DB 1030, and the like. By executing the programs stored in the DB 1030, the processor 1020 may control the operation of the image event classification determination device 1000.

The processor 1020 may control at least some of the operations of the image event classification determination device 1000 described above with reference to FIGS. 1 to 9. The image event classification determination device 1000 may be the same as the system 100 of FIG. 1, or may be implemented as a device that performs some of the operations of the system 100.

The processor 1020 may be implemented using at least one of application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, micro-controllers, microprocessors, and other electrical units for performing functions.

In one embodiment, the image event classification determination device 1000 may be a mobile electronic device. For example, the image event classification determination device 1000 may be implemented as a smartphone, tablet PC, PC, smart TV, personal digital assistant (PDA), laptop, media player, navigation device, camera-equipped device, or other mobile electronic device. The image event classification determination device 1000 may also be implemented as a wearable device with communication and data processing functions, such as a watch, glasses, a hair band, or a ring.

In another embodiment, the image event classification determination device 1000 may be an electronic device embedded in a vehicle. For example, the image event classification determination device 1000 may be an electronic device installed in a vehicle through tuning after the production process.

In yet another embodiment, the image event classification determination device 1000 may be a server located outside the vehicle. The server may be implemented as a computer device, or a plurality of computer devices, that communicates over a network to provide commands, code, files, content, services, and the like. The server may receive the data needed to classify image events from devices mounted on the vehicle and classify the image events based on the received data.

In yet another embodiment, the process performed by the image event classification determination device 1000 may be performed by at least some of a mobile electronic device, an electronic device embedded in a vehicle, and a server located outside the vehicle.

Embodiments according to the present invention may be implemented in the form of a computer program that can be executed through various components on a computer, and such a computer program may be recorded on a computer-readable medium. The medium may include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, and flash memory.

The computer program may be specially designed and configured for the present invention, or may be known and available to those skilled in the art of computer software. Examples of computer programs include not only machine code, such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

According to one embodiment, methods according to various embodiments of the present disclosure may be provided as part of a computer program product. A computer program product is a commodity that can be traded between a seller and a buyer. The computer program product may be distributed in the form of a machine-readable storage medium (e.g., compact disc read only memory (CD-ROM)), or may be distributed online (e.g., downloaded or uploaded) through an application store (e.g., Play Store™) or directly between two user devices. In the case of online distribution, at least a portion of the computer program product may be at least temporarily stored in, or temporarily created in, a machine-readable storage medium such as the memory of a manufacturer's server, an application store's server, or a relay server.

Unless an order is explicitly stated for the steps constituting the method according to the present invention, or there is a statement to the contrary, the steps may be performed in any suitable order. The present invention is not necessarily limited to the order in which the steps are described. The use of any examples or exemplary terms (e.g., "etc.") in the present invention is merely to describe the invention in detail, and the scope of the invention is not limited by those examples or exemplary terms unless limited by the claims. In addition, those skilled in the art will appreciate that various modifications, combinations, and changes may be made according to design conditions and factors within the scope of the appended claims or their equivalents.

Therefore, the spirit of the present invention should not be limited to the embodiments described above, and not only the claims set out below but also all scopes equivalent to, or equivalently modified from, those claims fall within the scope of the spirit of the present invention.

100: System
110: Face detection model
121: Landmark detection model
122: Event classification model
130: Post-processing module
200: Image
210: Sliding window
220: Face area
300: Face area
310: Landmark
410: normal
420: phone
430: smoke
440: mask
510: Images in a group with clear image event classification results
530: Images in a group where the image event classification result is unclear
610: Original image
631: Multiple landmark points
651: Reliability information
620: Image event classification results
632: Target landmark point
652: Reliability information of target landmark point
710: Image event classification results
711: mask event classification result
712: mask event classification result
730: Reliability information of landmark points
731: Reliability information of the first landmark point
732: Occlusion information that the mouth is covered
733: Reliability information of the second landmark point
734: Occlusion information that the mouth is not covered
750: New event classification
751: New event classification: wearing a mask
752: New event classification: wearing a chin mask (mask pulled down to the chin)
1000: Image event classification decision device
1010: Communication unit
1020: Processor
1030: DB

Claims

A method of determining an image event classification using a neural network model, the method comprising:
inputting an image as input data to a first neural network model, and classifying an event for the image and obtaining a reliability of the event classification as output data of the first neural network model;
in response to the obtained event classification being a first event, obtaining reliability information of a landmark point matching the first event, and obtaining occlusion information of the landmark point based on the reliability information, wherein the occlusion information includes first occlusion information obtained based on high reliability information of the landmark point matching the first event and second occlusion information obtained based on low reliability information of the landmark point matching the first event; and
determining the event for the image as one of the first event or a second event associated with the first event by combining the event classification, the reliability of the event classification, and the occlusion information of the landmark point,
wherein the determining comprises:
when the reliability with which the first neural network model classified the event for the image as the first event is greater than or equal to a predetermined value, changing the event for the image to the second event by further considering the second occlusion information of the landmark point matching the first event; or
when the reliability of being classified as the first event is less than the predetermined value, changing the event for the image to the second event by further considering the first occlusion information of the landmark point matching the first event, and
wherein the second event is a new event classification that the first neural network model could not classify for the image.
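
The determining step of the claim above can be read as a small decision rule. The following Python sketch is one possible literal reading, not the patented implementation. The function names, the placeholder event labels, and all three threshold values are assumptions made for illustration. It also assumes that "further considering" a kind of occlusion information means the event is changed only when that kind of occlusion information is present.

```python
from typing import Optional

FIRST_EVENT = "first_event"    # placeholder label (assumption)
SECOND_EVENT = "second_event"  # placeholder label (assumption)


def occlusion_info(landmark_reliability: float,
                   high: float = 0.7, low: float = 0.3) -> Optional[str]:
    """Derive occlusion information from landmark reliability.

    The cut-offs `high` and `low` are hypothetical; the claim only says
    that first/second occlusion information comes from high/low reliability.
    """
    if landmark_reliability >= high:
        return "first"   # first occlusion information (high reliability)
    if landmark_reliability <= low:
        return "second"  # second occlusion information (low reliability)
    return None          # neither kind is available


def determine_event(event: str, event_reliability: float,
                    landmark_reliability: float,
                    threshold: float = 0.5) -> str:
    """Combine the classification, its reliability, and occlusion information."""
    if event != FIRST_EVENT:
        return event  # the rule only applies to the first event
    info = occlusion_info(landmark_reliability)
    # Reliable classification plus second occlusion information.
    if event_reliability >= threshold and info == "second":
        return SECOND_EVENT
    # Unreliable classification plus first occlusion information.
    if event_reliability < threshold and info == "first":
        return SECOND_EVENT
    return FIRST_EVENT
```

Under these assumptions, `determine_event("first_event", 0.9, 0.1)` yields the second event, while `determine_event("first_event", 0.9, 0.9)` keeps the first event.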

delete

The method of claim 1,
wherein the reliability information of the landmark point is
calculated by inputting a related image related to the image as input data to a second neural network model, detecting a plurality of landmark points in the related image, and calculating reliability information for each of the plurality of landmark points.

The method of claim 4,
wherein the reliability information of the landmark point is
reliability information of a target landmark point determined from among the plurality of landmark points based on the event classification result.

The method of claim 5,
wherein the target landmark point is
determined by dividing the plurality of landmark points into one or more areas, selecting at least one area from among the areas based on the event classification result, and taking the landmark point included in the selected area.
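
The area-based selection of target landmark points can be sketched as follows. This is a minimal illustration only. The area names, the landmark indices, and the mapping from event labels to areas are all hypothetical; the source states only that areas are selected based on the event classification result.

```python
from typing import Dict, List

# Hypothetical grouping of landmark indices into facial areas.
AREAS: Dict[str, List[int]] = {
    "eyes": [0, 1],
    "nose": [2],
    "mouth": [3, 4],
}

# Hypothetical mapping from an event label to the areas relevant to it.
EVENT_TO_AREAS: Dict[str, List[str]] = {
    "mask": ["mouth"],
    "smoke": ["mouth"],
    "phone": ["eyes"],
}


def select_target_landmarks(event: str,
                            reliabilities: List[float]) -> Dict[int, float]:
    """Return {landmark index: reliability} for the areas tied to `event`."""
    targets: Dict[int, float] = {}
    for area in EVENT_TO_AREAS.get(event, []):
        for index in AREAS[area]:
            targets[index] = reliabilities[index]
    return targets
```

With these assumed tables, a "mask" result selects only the mouth landmarks, so later steps would look at the reliability of landmarks 3 and 4 and ignore the rest.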

The method of claim 4,
wherein the reliability information of the landmark point is
reliability information calculated through a regression analysis process for each of the plurality of landmark points.

The method of claim 7,
wherein the regression analysis process is
a process of determining a plurality of coordinates for each of the plurality of landmark points, calculating a reliability for each of the plurality of coordinates, and determining the highest of the reliabilities of the plurality of coordinates as the reliability information calculated for each of the plurality of landmark points.
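
Taking the highest reliability among several candidate coordinates can be sketched as below. The data layout (a list of coordinate and reliability pairs per landmark) and the function name are assumptions. Returning the winning coordinate alongside the reliability is an extra convenience that the claim does not require.

```python
from typing import List, Tuple

Candidate = Tuple[Tuple[float, float], float]  # ((x, y), reliability)


def best_candidate(candidates: List[Candidate]) -> Candidate:
    """Pick the candidate coordinate with the highest reliability."""
    if not candidates:
        raise ValueError("at least one candidate coordinate is required")
    return max(candidates, key=lambda c: c[1])


def landmark_reliabilities(per_landmark: List[List[Candidate]]) -> List[float]:
    """Reliability information for each landmark: the maximum over its candidates."""
    return [best_candidate(cands)[1] for cands in per_landmark]
```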

The method of claim 7,
wherein the regression analysis process is
a process of obtaining a heat map distribution for each of the plurality of landmark points, calculating the coordinates of the center of the heat map distribution, and determining the reliability information calculated for each of the plurality of landmark points according to the degree of spread of the heat map distribution about the coordinates of the center.
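
A heat-map-based reliability could be computed as in the sketch below. It assumes the center is the intensity-weighted mean position and the spread is the weighted mean squared distance from that center. The mapping `1 / (1 + spread)` from spread to reliability is a hypothetical choice; the source says only that reliability depends on the degree of spread.

```python
from typing import List, Tuple


def heatmap_reliability(heatmap: List[List[float]]) -> Tuple[Tuple[float, float], float]:
    """Return ((center_x, center_y), reliability) for one landmark heat map."""
    total = sum(sum(row) for row in heatmap)
    if total <= 0:
        raise ValueError("heat map must contain positive mass")
    # Intensity-weighted mean position (assumed definition of the center).
    cx = sum(x * v for row in heatmap for x, v in enumerate(row)) / total
    cy = sum(y * v for y, row in enumerate(heatmap) for v in row) / total
    # Weighted mean squared distance from the center (assumed spread measure).
    spread = sum(((x - cx) ** 2 + (y - cy) ** 2) * v
                 for y, row in enumerate(heatmap)
                 for x, v in enumerate(row)) / total
    # Hypothetical mapping: a tighter distribution gives reliability closer to 1.
    return (cx, cy), 1.0 / (1.0 + spread)
```

A sharply peaked heat map gives a reliability near 1 under this mapping, while a flat heat map, as might be expected for an occluded landmark, gives a lower value.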

A device for determining an image event classification using a neural network model, the device comprising:
a memory in which at least one program is stored; and
at least one processor that runs a neural network model by executing the at least one program,
wherein the at least one processor is configured to:
input an image as input data to a first neural network model, and classify an event for the image and obtain a reliability of the event classification as output data of the first neural network model;
in response to the obtained event classification being a first event, obtain reliability information of a landmark point matching the first event, and obtain occlusion information of the landmark point based on the reliability information, wherein the occlusion information includes first occlusion information obtained based on high reliability information of the landmark point matching the first event and second occlusion information obtained based on low reliability information of the landmark point matching the first event; and
determine the event for the image as one of the first event or a second event associated with the first event by combining the event classification, the reliability of the event classification, and the occlusion information of the landmark point,
wherein the processor is configured to:
when the reliability with which the first neural network model classified the event for the image as the first event is greater than or equal to a predetermined value, change the event for the image to the second event by further considering the second occlusion information of the landmark point matching the first event; or
when the reliability of being classified as the first event is less than the predetermined value, change the event for the image to the second event by further considering the first occlusion information of the landmark point matching the first event, and
wherein the second event is a new event classification that the first neural network model could not classify for the image.

A computer-readable recording medium that records a program for executing the method of claim 1 on a computer.