KR20210056050A

KR20210056050A - Apparatus and method of identifying multi class objects

Info

Publication number: KR20210056050A
Application number: KR1020190142653A
Authority: KR
Inventors: 임영철; 강민성
Original assignee: 재단법인대구경북과학기술원
Priority date: 2019-11-08
Filing date: 2019-11-08
Publication date: 2021-05-18

Abstract

Disclosed are a device and a method for identifying a multi-class object. The method includes the steps of: inputting an image to a trained artificial intelligence neural network; performing subclass non-maximum suppression (NMS); and performing group class NMS.

Description

Device and method for identifying multi-class objects {APPARATUS AND METHOD OF IDENTIFYING MULTI CLASS OBJECTS}

The present disclosure relates to a multi-class object identification apparatus and method, and more particularly, to a multi-class object identification apparatus and method that identify an object included in an image using a trained artificial intelligence neural network.

Recent advances in deep learning have greatly increased the reliability of multi-class object detection, which is now used as a core technology in various industrial fields. Early object detection methods such as VJ (Viola-Jones) and HOG (Histogram of Oriented Gradients) were mostly used to detect a single kind of object, such as a person or a vehicle, rather than for multi-class object detection. In contrast, deep learning-based object detection methods can simultaneously detect various multi-class objects such as people, vehicles, two-wheeled vehicles, traffic lights, and traffic signs with almost no increase in complexity.

As shown in FIG. 1, when an input image is fed to a pre-trained object detection network, a typical deep learning-based object detection technique generates many bounding boxes around the objects to be detected. It then removes the other boxes, leaving only one bounding box, based on a non-maximum suppression (NMS) algorithm. When images are input continuously, as in a video, such a technique links the instances of each object from frame to frame through modules such as tracking and data association, starting from the frame after the one in which the objects were detected, and can thereby estimate the position and motion of those objects.
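The conventional NMS step described above can be sketched as follows. This is a minimal illustration of the widely used greedy, IoU-based form of NMS, not code from the disclosure; the function names `iou` and `nms` and the default threshold of 0.5 are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed corner format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if it
    does not overlap an already kept box by more than iou_threshold.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

With two heavily overlapping boxes and one distant box, only the higher-scoring box of the overlapping pair and the distant box survive.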

Existing object detection technology can also distinguish sub-classes within a specific object class, for example sedans, vans, and SUVs within a vehicle class. Its classification accuracy is high for classes with clearly distinguishable features, such as people and vehicles. However, when distinguishing classes with similar features, such as sedans, SUVs, and vans, object recognition errors become more likely.

That is, existing object detection technology misclassifies the object class, as shown in FIG. 1B, or recognizes a single object as two or more different classes because several class boxes are generated on the object, as shown in FIG. 1C. Because of recognition errors arising in the object detection step, in the multi-object tracking step it fails to initialize tracking, tracks multiple misrecognized objects, and activates tracking of false detections.

As shown in FIG. 1D, when a misrecognition occurs during track initialization, existing object detection technology determines that a new object has been detected, and therefore fails to initialize and activate tracking. In addition, as shown in FIGS. 1E and 1F, it detects the same object as different objects over a plurality of consecutive frames and associates those detections with one another, so that tracking is initialized and activated for two different objects. In other words, existing multi-object tracking technology judges that tracking has been activated for a falsely detected object generated in the object detection step (the green box, corresponding to a van), so the detection error is propagated by the object tracking module.

Accordingly, there is a need for a technology that can accurately distinguish classes with similar features and prevent false detections and detection errors.

The present disclosure is intended to solve the above-described problems by providing a multi-class object identification apparatus and method capable of accurate object identification and object tracking.

According to an embodiment of the present disclosure for achieving the above object, a multi-class object identification method includes: inputting an image to a trained artificial intelligence neural network; performing subclass non-maximum suppression (NMS), which identifies, based on the trained artificial intelligence neural network, at least one subclass corresponding to an object included in the image and generates at least one subclass bounding box per identified subclass; and performing group class NMS, which calculates a probability for each subclass, identifies the group class that contains the identified at least one subclass based on group classes in which a plurality of subclasses, each corresponding to one of a plurality of objects of the same kind, are grouped, removes the generated at least one subclass bounding box, and generates one group class bounding box based on the identified group class.

The multi-class object identification method may further include recognizing, as the object, the subclass with the highest probability among the calculated subclass probabilities.

The multi-class object identification method may further include displaying the generated group class bounding box and the subclass with the highest probability.

The image may be a video including a plurality of frames, and the performing of the subclass NMS and the performing of the group class NMS may be carried out for each of the plurality of frames. The multi-class object identification method may further include activating object tracking when the group class bounding boxes generated in the respective frames are the same group class bounding box and are generated consecutively at least a preset number of times.
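The activation condition above can be sketched as a simple per-candidate counter. This is only an illustration under stated assumptions: the class name `TrackActivator`, the parameter `min_hits`, and the use of a group class label (or `None` when no detection was associated in a frame) are hypothetical, and the data association that decides which detection belongs to which candidate is assumed to happen elsewhere.

```python
from typing import Optional


class TrackActivator:
    """Counts consecutive frames in which a candidate track was matched to a
    group class bounding box of the same group class (hypothetical helper)."""

    def __init__(self, min_hits: int = 3) -> None:
        self.min_hits = min_hits          # assumed "preset number of times"
        self.group: Optional[str] = None  # group class seen in the current run
        self.hits = 0                     # length of the current run

    def update(self, group: Optional[str]) -> bool:
        """Feed the group class matched in the current frame (None if there
        was no match). Returns True once tracking should be active."""
        if group is not None and group == self.group:
            self.hits += 1
        else:
            # A miss or a different group class restarts the run.
            self.group = group
            self.hits = 1 if group is not None else 0
        return self.hits >= self.min_hits
```

Because the counter is keyed on the group class, a detection whose subclass label flickers between, say, sedan and SUV still accumulates consecutive hits as long as the group class stays the same.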

The multi-class object identification method may further include an object tracking step of cumulatively averaging the probability of each subclass calculated from a preset number of frames, and recognizing and tracking, as the object, the subclass with the highest cumulative average probability.

According to an embodiment of the present disclosure for achieving the above object, a multi-class object identification apparatus includes an input unit that receives an image and a processor that includes a trained artificial intelligence neural network. The processor performs a subclass NMS operation, which identifies, based on the trained artificial intelligence neural network, at least one subclass corresponding to an object included in the image and generates at least one subclass bounding box per identified subclass. The processor also performs a group class NMS operation, which calculates a probability for each subclass, identifies the group class that contains the identified at least one subclass based on group classes in which a plurality of subclasses, each corresponding to one of a plurality of objects of the same kind, are grouped, removes the generated at least one subclass bounding box, and generates one group class bounding box based on the identified group class.

The processor may recognize, as the object, the subclass with the highest probability among the calculated subclass probabilities.

The multi-class object identification apparatus may further include a display that displays the generated group class bounding box and the subclass with the highest probability.

The image may be a video including a plurality of frames. The processor may perform the subclass NMS operation and the group class NMS operation for each of the plurality of frames, and may activate object tracking when the group class bounding boxes generated in the respective frames are the same group class bounding box and are generated consecutively at least a preset number of times.

The processor may cumulatively average the probability of each subclass calculated from a preset number of frames, and may recognize and track, as the object, the subclass with the highest cumulative average probability.

As described above, the multi-class object identification apparatus and method can accurately distinguish objects with similar features.

The multi-class object identification apparatus and method can also improve the accuracy and reliability of object tracking.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

FIG. 1A is a diagram illustrating a detection result of existing object detection technology.
FIG. 1B is a diagram illustrating an example of a recognition error in existing object detection technology.
FIG. 1C is a diagram illustrating an example of an error in which existing object detection technology recognizes a single object as a plurality of objects.
FIG. 1D is a diagram illustrating an example of a tracking deactivation error in existing object detection technology.
FIG. 1E is a diagram illustrating an example of an error in which existing object detection technology detects multiple objects for a single object.
FIG. 1F is a diagram illustrating an example of propagation of a detection error in existing object detection technology.
FIG. 2A is a block diagram of a multi-class object identification apparatus according to an embodiment of the present disclosure.
FIG. 2B is a block diagram of a multi-class object identification apparatus according to another embodiment of the present disclosure.
FIG. 3A is a diagram illustrating a first embodiment of a group class according to the present disclosure.
FIG. 3B is a diagram illustrating a second embodiment of a group class according to the present disclosure.
FIG. 3C is a diagram illustrating a third embodiment of a group class according to the present disclosure.
FIG. 4 is a diagram illustrating a multi-class object detection process according to an embodiment of the present disclosure.
FIG. 5A is an embodiment illustrating an object detection process using a probability distribution.
FIG. 5B is an embodiment illustrating an object tracking process using a probability distribution.
FIG. 6 is a flowchart of a multi-class object identification method according to an embodiment of the present disclosure.

Hereinafter, various embodiments are described in more detail with reference to the accompanying drawings. The embodiments described in this specification may be modified in various ways. Specific embodiments may be depicted in the drawings and described in detail in the detailed description. However, the specific embodiments disclosed in the accompanying drawings are intended only to make the various embodiments easier to understand. Therefore, the technical idea is not limited by the specific embodiments disclosed in the accompanying drawings, and should be understood to include all equivalents or substitutes falling within the spirit and technical scope of the invention.

Terms including ordinal numbers, such as first and second, may be used to describe various components, but the components are not limited by these terms. The terms are used only to distinguish one component from another.

In this specification, terms such as "comprise" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof. When a component is said to be "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but it should be understood that other components may exist in between. In contrast, when a component is said to be "directly connected" or "directly coupled" to another component, it should be understood that no other component exists in between.

Meanwhile, a "module" or "unit" for a component used in this specification performs at least one function or operation. A "module" or "unit" may perform the function or operation by hardware, software, or a combination of hardware and software. In addition, a plurality of "modules" or "units", other than a "module" or "unit" that must run on specific hardware or is executed by at least one processor, may be integrated into at least one module. Singular expressions include plural expressions unless the context clearly indicates otherwise.

In describing the present invention, the order of the steps should be understood as non-limiting unless a preceding step must logically and temporally be performed before a subsequent step. That is, except in such cases, even if a process described as a later step is performed before a process described as an earlier step, the essence of the invention is not affected, and the scope of rights should likewise be defined regardless of the order of the steps. In this specification, "A or B" is defined to mean not only selectively indicating either A or B but also including both A and B. In addition, the term "comprising" in this specification also covers further including elements other than those listed.

This specification describes only the components essential to the description of the present invention and does not mention components unrelated to its essence. It should not be interpreted in an exclusive sense as including only the mentioned components, but in a non-exclusive sense that may include other components as well.

In addition, in describing the present invention, where a detailed description of a related known function or configuration is judged likely to obscure the gist of the present invention unnecessarily, that description is abbreviated or omitted. Each embodiment may be implemented or operated independently, or the embodiments may be implemented or operated in combination.

FIG. 2A is a block diagram of a multi-class object identification apparatus according to an embodiment of the present disclosure, and FIG. 2B is a block diagram of a multi-class object identification apparatus according to another embodiment of the present disclosure. The apparatus is described with reference to FIGS. 2A and 2B.

Referring to FIG. 2A, the multi-class object identification apparatus 100 may include an input unit 110 and a processor 120.

The input unit 110 receives an image. The received image may be a still image or a video including a plurality of frames. Any component capable of receiving an image may serve as the input unit 110. For example, the input unit 110 may be implemented as a camera, which captures an image and inputs the captured image to the multi-class object identification apparatus 100. Alternatively, the input unit 110 may be implemented as a communication module, which receives a captured image from an external device over a wired or wireless connection and inputs it to the multi-class object identification apparatus 100. Alternatively, the input unit 110 may be implemented as a storage medium, which inputs a stored image to the multi-class object identification apparatus 100.

The processor 120 may perform a subclass non-maximum suppression (NMS) operation and a group class NMS operation based on the trained artificial intelligence neural network. The subclass NMS operation identifies the subclass corresponding to an object included in the image and generates a subclass bounding box per identified subclass. One or more subclasses may be identified.

The group class NMS operation calculates a probability for each subclass, identifies the group class that contains the plurality of subclasses, removes the subclass bounding boxes, and generates one group class bounding box based on the identified group class.

Group classes and subclasses are described in detail below.

The processor 120 may recognize, as the object, the subclass with the highest probability among the calculated subclass probabilities. For example, when the processor 120 identifies a single subclass, it may recognize that subclass as the object. When the processor 120 identifies three subclasses (a, b, c), it may calculate a matching probability for each. If the matching probability is 0.3 for subclass a, 0.2 for subclass b, and 0.5 for subclass c, the processor 120 may recognize subclass c as the object.
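The selection rule in the example above amounts to taking the subclass with the largest probability. A minimal sketch, assuming the subclass probabilities are available as a dictionary (the function name `recognize_subclass` is hypothetical):

```python
from typing import Dict


def recognize_subclass(subclass_probs: Dict[str, float]) -> str:
    """Return the subclass with the highest matching probability."""
    if not subclass_probs:
        raise ValueError("at least one subclass probability is required")
    return max(subclass_probs, key=subclass_probs.get)


# Matching probabilities from the example in the text.
example = {"a": 0.3, "b": 0.2, "c": 0.5}
recognized = recognize_subclass(example)  # subclass c
```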

The multi-class object identification apparatus may further include a display. FIG. 2B shows a block diagram of a multi-class object identification apparatus that includes a display 130. The display 130 may display the generated group class bounding box and the subclass with the highest probability.

As described above, the image input to the multi-class object identification apparatus may be a video. By cumulatively averaging the subclass probabilities over the frames of the input video, the apparatus can accurately distinguish and identify objects and improve the accuracy and reliability of object tracking.

As the accuracy of deep learning-based object detection improves, demand is growing for technology that can detect various object classes at a finer granularity. For example, users want technology that distinguishes sedans, SUVs, vans, and so on, rather than technology that simply detects vehicles.

Rather than detecting an object from the independent detection result of each frame alone, the present disclosure detects the object by cumulatively averaging the probability distribution of the tracked object over the frames. Group classes and subclasses are described below, followed by the method of identifying and tracking objects using the cumulative average probability.
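The cumulative averaging described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the class name `SubclassAverager`, the `window` parameter standing in for the preset number of frames, and the choice to count a subclass that is absent from a frame as probability 0 are all assumptions.

```python
from collections import deque
from typing import Deque, Dict


class SubclassAverager:
    """Averages per-frame subclass probabilities of one tracked object over
    the most recent `window` frames (hypothetical helper)."""

    def __init__(self, window: int = 5) -> None:
        self.history: Deque[Dict[str, float]] = deque(maxlen=window)

    def update(self, subclass_probs: Dict[str, float]) -> None:
        """Record the subclass probabilities estimated for the current frame."""
        self.history.append(dict(subclass_probs))

    def averages(self) -> Dict[str, float]:
        """Mean probability of each subclass over the stored frames.
        A subclass missing from a frame contributes 0 (an assumption)."""
        n = len(self.history)
        subclasses = {s for frame in self.history for s in frame}
        return {s: sum(f.get(s, 0.0) for f in self.history) / n
                for s in subclasses}

    def best(self) -> str:
        """Subclass with the highest average; needs at least one frame."""
        avg = self.averages()
        return max(avg, key=avg.get)
```

If one frame momentarily favors the wrong subclass, the averaged decision can stay on the subclass that dominates across the window, which is the smoothing effect the text relies on.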

FIGS. 3A to 3C are diagrams illustrating various embodiments of a group class according to the present disclosure. The classes are described with reference to FIGS. 3A to 3C.

A subclass denotes an individual kind of object. A group class denotes a group of objects of the same kind or of similar shape. As shown in FIGS. 3A to 3C, sedans, vans, SUVs, and the like can be grouped into a vehicle group; motorcycles, bicycles, and the like can be grouped into a two-wheeled vehicle group; and adults, children, and the like can be grouped into a person group. That is, sedan, van, and SUV are each subclasses, and the vehicle group may be a group class. Motorcycle and bicycle are each subclasses, and the two-wheeled vehicle group may be a group class. Adult and child are each subclasses, and the person group may be a group class. These are only examples: each kind of object may be defined as a subclass, and a group of objects of the same kind or of similar shape may be defined as a group class.
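The grouping in the example above can be written as a small lookup table. The identifiers below (`GROUP_CLASSES`, `SUBCLASS_TO_GROUP`, `group_of`, and the label strings) are hypothetical names chosen for this sketch; only the grouping itself comes from the text.

```python
from typing import Dict, List

# Group class -> subclasses, following the three examples in the text.
GROUP_CLASSES: Dict[str, List[str]] = {
    "vehicle": ["sedan", "van", "suv"],
    "two_wheeler": ["motorcycle", "bicycle"],
    "person": ["adult", "child"],
}

# Inverted table: subclass -> group class.
SUBCLASS_TO_GROUP: Dict[str, str] = {
    subclass: group
    for group, subclasses in GROUP_CLASSES.items()
    for subclass in subclasses
}


def group_of(subclass: str) -> str:
    """Look up the group class that contains the given subclass."""
    return SUBCLASS_TO_GROUP[subclass]
```

Keeping the inverse map precomputed makes the group lookup a single dictionary access per detection.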

FIG. 4 is a diagram illustrating a multi-class object detection process according to an embodiment of the present disclosure.

FIG. 4 shows the subclass NMS process and the group class NMS process. When the detection network for detecting objects in an image is trained, training is performed per subclass, and at inference time bounding boxes and class probabilities are likewise estimated per subclass.

When the image 11 is input to the pre-trained detection network, many class bounding boxes are generated around the object in the output map. An NMS algorithm may be used to remove the many class bounding boxes around the object. In the present disclosure, a two-stage NMS process may be performed to minimize detection errors caused by ambiguity in distinguishing subclasses.

The subclass NMS step obtains object detection results per subclass. In this step, bounding boxes of several subclasses may be generated for the same object in the image 11, and a group class NMS process may be performed to remove these multiple subclass bounding boxes.

The bounding box finally estimated in the group class NMS step is represented by the group class, and probability distribution information over the subclasses may be calculated. The probability calculated for each subclass in the group class NMS step is the probability that each subclass holds after the subclass NMS has been performed.

That is, when the image 11 is input, at least one bounding box may be generated around an object in the image 11 based on the trained artificial intelligence neural network. When the subclass NMS process is performed, at least one subclass corresponding to the object included in the image is identified, and a subclass bounding box may be generated per identified subclass. When the group class NMS process is then performed, the group class is identified based on the generated subclass bounding boxes, and one group class bounding box may be generated based on the identified group class. At this point the previously generated subclass bounding boxes are removed, and the probability of each subclass may be calculated from the probability that subclass held when the subclass NMS was performed.

As an embodiment, as shown in FIG. 4, when subclass NMS and group class NMS are performed on a vehicle included in the image 11, the multi-class object identification device generates a group class bounding box corresponding to the vehicle group, and may calculate a subclass probability of 0.52 for sedan, 0.97 for SUV, and 0.68 for van. The multi-class object identification device may then recognize the object as an SUV, the subclass with the highest probability.
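The selection in the FIG. 4 example amounts to taking the maximum over the subclass probabilities. A small Python sketch, in which the function name `recognize` and the dictionary layout are illustrative assumptions:

```python
def recognize(subclass_probs):
    """Return the subclass with the highest probability."""
    return max(subclass_probs, key=subclass_probs.get)


# Subclass probabilities quoted for the vehicle in FIG. 4.
fig4_probs = {"sedan": 0.52, "SUV": 0.97, "van": 0.68}
print(recognize(fig4_probs))  # SUV
```

Note that the three values sum to more than 1, so in this example they behave as per-subclass confidence scores rather than a normalized distribution.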

FIG. 5A shows an embodiment of an object detection process using a probability distribution, and FIG. 5B shows an embodiment of an object tracking process using a probability distribution. The following description refers to FIGS. 5A and 5B.

As described above, the image input to the multi-class object identification device may be a video including a plurality of frames. The multi-class object identification device may track an object included in the video. The device associates and tracks objects in units of group classes, and may determine the subclass based on the cumulative average subclass probability up to the current frame. The process by which the device tracks an object can be divided into an object detection step and an object tracking step. The object detection step may include performing the above-described subclass NMS and group class NMS for each frame.

That is, as illustrated in FIG. 5A, in the object detection step, a probability may be calculated for each subclass of a detected object, and the group class may be identified to generate one group class bounding box.

In addition, as shown in FIG. 5B, when an object of the same group is detected and associated at least a preset number of times, the multi-class object identification device may initialize object tracking and activate it. As an embodiment, the preset number of times in FIG. 5B is 3, and object tracking may be activated in the third frame, frame k+2. The multi-class object identification device may identify the subclass and recognize the object based on the cumulative average of the subclass probabilities calculated in the object detection step. Referring to FIG. 5B, the cumulative average subclass probability of van is the highest up to frame k+2, so the device may recognize the tracked object as a van. At frame k+3, however, the cumulative average probability of SUV becomes the highest, so the device may recognize the tracked object as an SUV.
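The activation rule and the cumulative-average decision can be sketched in Python as below. This is an illustration under assumptions rather than the disclosed implementation: the class name `GroupTrack` and its members are hypothetical, frame-to-frame association is omitted, a subclass missing from a frame is treated as contributing probability 0, and the per-frame values are invented for the example because the numbers in FIG. 5B are not reproduced in the text.

```python
from collections import defaultdict


class GroupTrack:
    """Track of one group-class object with cumulative subclass averages."""

    def __init__(self, group, min_hits=3):
        self.group = group
        self.min_hits = min_hits              # preset number of associated detections
        self.hits = 0                         # frames associated so far
        self.prob_sums = defaultdict(float)   # running sum per subclass

    def update(self, subclass_probs):
        """Add the subclass probabilities detected in one associated frame."""
        self.hits += 1
        for sub, p in subclass_probs.items():
            self.prob_sums[sub] += p

    @property
    def active(self):
        return self.hits >= self.min_hits

    def mean_probs(self):
        return {sub: total / self.hits for sub, total in self.prob_sums.items()}

    def recognized_subclass(self):
        """Subclass with the highest cumulative average, once tracking is active."""
        if not self.active:
            return None
        means = self.mean_probs()
        return max(means, key=means.get)


# Hypothetical per-frame probabilities for frames k .. k+3 (not taken from FIG. 5B).
frames = [
    {"sedan": 0.5, "SUV": 0.6, "van": 0.9},
    {"sedan": 0.4, "SUV": 0.7, "van": 0.8},
    {"sedan": 0.5, "SUV": 0.8, "van": 0.85},
    {"sedan": 0.3, "SUV": 0.99, "van": 0.1},
]
track = GroupTrack("vehicle", min_hits=3)
history = []
for probs in frames:
    track.update(probs)
    history.append(track.recognized_subclass())
print(history)  # [None, None, 'van', 'SUV']
```

With these made-up values the track stays inactive for the first two frames, reports van when tracking activates at the third frame, and switches to SUV at the fourth, mirroring the behaviour described for FIG. 5B.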

As described above, the present disclosure recognizes the tracked object using the cumulative average probability, so the longer the object is tracked, the more frames contribute to the average. Because the tracked object is recognized based on this cumulative average, both the reliability and the object classification accuracy may be improved.

As described above, existing object identification and tracking techniques separate similarly shaped classes by a hard decision in the detection step, so the same object may be erroneously assigned a different class in each frame. This class mismatch across consecutive frames may make tracking initialization difficult, and the detection of multiple subclasses for the same object may cause problems such as tracking being activated on false detections.

In the present disclosure, objects that are difficult to distinguish in the object detection step are organized into group classes, and objects are detected in units of group classes. A detected group class carries probability distribution information over its subclasses, and during tracking this information is accumulated and averaged over every frame to estimate the subclass. Accordingly, the longer an object is tracked, the more reliable the cumulative average probability distribution over the subclasses becomes, and the higher the object classification accuracy may be.
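The accumulation and averaging described above can be written compactly as follows. The notation is an assumption introduced here for illustration: $p_c(i)$ denotes the probability of subclass $c$ in the $i$-th tracked frame, $n$ the number of frames tracked so far, and $\hat{c}(n)$ the estimated subclass.

```latex
\bar{p}_c(n) = \frac{1}{n}\sum_{i=1}^{n} p_c(i)
             = \bar{p}_c(n-1) + \frac{p_c(n) - \bar{p}_c(n-1)}{n},
\qquad
\hat{c}(n) = \arg\max_{c}\, \bar{p}_c(n)
```

One way to read the reliability claim, under the additional assumption that the per-frame probabilities fluctuate independently with variance $\sigma^2$ around a fixed mean, is that the variance of $\bar{p}_c(n)$ is $\sigma^2/n$, so the estimate steadies as $n$ grows. The description itself does not state this model.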

When the present disclosure is used in a field where objects of similar shape, such as vehicle types, must be distinguished and tracked, the accuracy and reliability of multi-object tracking may be improved. In particular, the reliability of the object classification increases as tracking continues, which can improve classification accuracy for objects that are ambiguous to distinguish. For example, the present disclosure may be used in fields that detect and track various objects, such as CCTV, automobiles, robots, and games.

Various embodiments of the multi-class object identification device have been described so far. A multi-class object identification method is described below.

FIG. 6 is a flowchart of a multi-class object identification method according to an embodiment of the present disclosure.

Referring to FIG. 6, the multi-class object identification device includes a learned artificial intelligence neural network and inputs an image to the learned artificial intelligence neural network (S610). The input image may be a still image or a video including a plurality of frames.

The multi-class object identification device performs a subclass NMS process (S620). That is, the device identifies the subclasses corresponding to an object included in the image based on the learned artificial intelligence neural network, and generates a subclass bounding box for each identified subclass.

The multi-class object identification device performs a group class NMS process (S630). That is, the device calculates a probability for each subclass and identifies a group class. The device then removes the generated subclass bounding boxes and generates one group class bounding box based on the identified group class.

The multi-class object identification device may recognize the object as the subclass having the highest of the calculated subclass probabilities, and may display the generated group class bounding box together with that subclass.

The multi-class object identification method according to the various embodiments described above may be provided as a computer program product. The computer program product may include the software program itself or a non-transitory computer readable medium in which the software program is stored.

A non-transitory readable medium is a medium that stores data semi-permanently and can be read by a device, as opposed to a medium that stores data only for a short moment, such as a register, a cache, or a memory. Specifically, the various applications or programs described above may be stored in and provided on a non-transitory readable medium such as a CD, DVD, hard disk, Blu-ray disc, USB device, memory card, or ROM.

In addition, although preferred embodiments of the present invention have been illustrated and described above, the present invention is not limited to the specific embodiments described. Various modifications may be made by those of ordinary skill in the technical field to which the invention belongs without departing from the gist of the invention as claimed in the claims, and such modifications should not be understood separately from the technical spirit or scope of the present invention.

100, 100a: multi-class object identification device 110: input unit
120: processor 130: display

Claims

A multi-class object identification method comprising:
inputting an image to a learned artificial intelligence neural network;
performing subclass non-maximum suppression (NMS) of identifying at least one subclass corresponding to an object included in the image based on the learned artificial intelligence neural network, and generating at least one subclass bounding box in units of the identified at least one subclass; and
performing group class NMS of calculating a probability for each subclass, identifying a group class including the identified at least one subclass based on group classes in which a plurality of subclasses corresponding to each of a plurality of homogeneous objects are grouped, removing the generated at least one subclass bounding box, and generating one group class bounding box based on the identified group class.

The multi-class object identification method of claim 1, further comprising:
recognizing the subclass having the highest probability among the calculated probabilities for each subclass as the object.

The multi-class object identification method of claim 2, further comprising:
displaying the generated group class bounding box and the subclass having the highest probability.

The multi-class object identification method of claim 1,
wherein the image is a video including a plurality of frames,
wherein the performing of the subclass NMS and the performing of the group class NMS are performed for each of the plurality of frames,
the method further comprising activating object tracking when the same group class bounding box is generated consecutively in the plurality of frames at least a preset number of times.

The multi-class object identification method of claim 4, further comprising:
an object tracking step of cumulatively averaging the probabilities of each subclass calculated from a preset number of frames, and recognizing and tracking the subclass having the highest cumulative average probability as the object.

A multi-class object identification device comprising:
an input unit for receiving an image; and
a processor including a learned artificial intelligence neural network,
wherein the processor:
performs a subclass NMS operation of identifying at least one subclass corresponding to an object included in the image based on the learned artificial intelligence neural network, and generating at least one subclass bounding box in units of the identified at least one subclass, and
performs a group class NMS operation of calculating a probability for each subclass, identifying a group class including the identified at least one subclass based on group classes in which a plurality of subclasses corresponding to each of a plurality of homogeneous objects are grouped, removing the generated at least one subclass bounding box, and generating one group class bounding box based on the identified group class.

The multi-class object identification device of claim 6,
wherein the processor
recognizes the subclass having the highest probability among the calculated probabilities for each subclass as the object.

The multi-class object identification device of claim 7,
further comprising a display for displaying the generated group class bounding box and the subclass having the highest probability.

The multi-class object identification device of claim 6,
wherein the image is a video including a plurality of frames,
wherein the processor
performs the subclass NMS operation and the group class NMS operation for each of the plurality of frames, and activates object tracking when the same group class bounding box is generated consecutively in the plurality of frames at least a preset number of times.

The multi-class object identification device of claim 9,
wherein the processor
cumulatively averages the probabilities of each subclass calculated from a preset number of frames, and recognizes and tracks the subclass having the highest cumulative average probability as the object.