KR102444929B1

KR102444929B1 - Method for processing image of object for identifying animal and apparatus thereof

Info

Publication number: KR102444929B1
Application number: KR1020210083754A
Authority: KR
Inventors: 박대현; 임준호
Original assignee: 주식회사 펫나우
Priority date: 2021-06-28
Filing date: 2021-06-28
Publication date: 2022-09-21

Abstract

The present invention provides a method for effectively detecting an object for identifying a companion animal while decreasing computational complexity, and an apparatus thereof. According to the present invention, the method for processing an image of an object for identifying a companion animal comprises: a step of acquiring an image in which the companion animal is included; a step of generating characteristic area candidates for determining a species of the companion animal in the image; a step of setting a first characteristic area with a determined location and size based on a reliability value of each of the characteristic area candidates; a step of setting a second characteristic area including the object for identifying the companion animal in the first characteristic area; and a step of acquiring the image of the object in the second characteristic area. According to the present invention, the apparatus generates a broader final characteristic area by considering the reliability value of each of the plurality of characteristic area candidates in a process of determining a characteristic area for determining the species of the companion animal to enable more accurate detection by detecting the object for identifying the companion animal in the final characteristic area.

Description

METHOD FOR PROCESSING IMAGE OF OBJECT FOR IDENTIFYING ANIMAL AND APPARATUS THEREOF

The present invention relates to a method and apparatus for processing an image of an object for identifying a companion animal, and more particularly to an image processing method and apparatus capable of effectively detecting an object that can identify a companion animal, such as a dog's nose print.

In modern society, demand is growing for companion animals that live with people and provide emotional support. Accordingly, there is an increasing need to store and manage information on companion animals in a database, for example for their health management. Managing companion animals requires identification information comparable to a human fingerprint, and the object used for identification may be defined for each kind of companion animal. For example, because the nose print (the pattern of nose wrinkles) differs from dog to dog, the nose print can be used as identification information for each dog.

As shown in FIG. 1(a), a nose print is registered, much like a human fingerprint or face, by photographing the companion animal's face including the nose (S110) and storing and registering the image containing the nose print in a database (S120). As shown in FIG. 1(b), a nose print is looked up by photographing the companion animal's nose print (S130), searching for a nose print that matches the photographed one together with its related information (S140), and outputting the matching information (S150). Registering and looking up nose prints as in FIG. 1 makes it possible to identify each companion animal and manage its information. The nose print information is stored in a database and may be used as data for AI-based learning or recognition.

However, there are several problems in acquiring and storing the nose prints of companion animals.

First, a photograph may be difficult to recognize depending on the shooting angle, focus, distance, size, environment, and so on. There have been attempts to apply human face recognition technology to nose print recognition, but whereas ample data has been accumulated for human faces, sufficient nose print data has not been secured for companion animals, so the recognition rate is low. Specifically, AI-based recognition requires training data processed into a form a machine can learn from, and because not enough such data has been accumulated for companion animal nose prints, nose print recognition is difficult.

In addition, nose print recognition requires an image with sharp nose wrinkles, but unlike humans, companion animals do not know how to hold still for a moment, so a sharp image of the nose wrinkles is not easy to obtain. For example, a dog constantly moves its face and flicks its tongue, which makes it very difficult to acquire a nose print image of the desired quality. Referring to FIG. 2, images in which the nose wrinkles are sharply captured, like those at the bottom, are required, but most images actually taken fail to capture the nose wrinkles sharply because of shaking and similar causes, like those at the top. To solve this problem, methods of photographing the dog with its nose forcibly held in place have been considered, but they are regarded as inappropriate because they subject the companion animal to coercion.

The present invention provides an image processing method and apparatus capable of effectively detecting an object for identifying a companion animal while reducing computational complexity.

The problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the following description.

A method for processing an image of an object for identifying a companion animal according to the present invention includes: acquiring an image in which the companion animal is included; generating, in the image, feature region candidates for determining the species of the companion animal; setting a first feature region whose location and size are determined based on the confidence value of each of the feature region candidates; setting, in the first feature region, a second feature region including the object for identifying the companion animal; and acquiring an image of the object in the second feature region.

According to the present invention, generating the feature region candidates may include: hierarchically generating a plurality of feature images using an artificial neural network; applying predefined bounding regions to each of the feature images and calculating, for each bounding region, a probability value that a companion animal of a specific species is located there; and generating the feature region candidates in consideration of the probability values.
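As a rough illustration of applying predefined bounding regions to feature images of different resolutions, the sketch below places the same set of boxes at the center of every cell of a feature-image grid. The grid layout, the function name `anchor_boxes`, and the example box sizes are assumptions in the style of common single-shot detectors; this section does not specify how the bounding regions are laid out, and the probability calculation by the neural network is not shown.

```python
def anchor_boxes(image_w, image_h, grid_w, grid_h, sizes):
    """Return (cx, cy, w, h) boxes in image coordinates.

    One box per entry of `sizes` is centered on each cell of a
    grid_w x grid_h feature image (hypothetical layout).
    """
    cell_w = image_w / grid_w
    cell_h = image_h / grid_h
    boxes = []
    for row in range(grid_h):
        for col in range(grid_w):
            cx = (col + 0.5) * cell_w   # cell center, x
            cy = (row + 0.5) * cell_h   # cell center, y
            for w, h in sizes:
                boxes.append((cx, cy, w, h))
    return boxes


# Coarser feature images (smaller grids) yield fewer, more widely spaced boxes.
sizes = [(50, 50), (30, 60)]
coarse = anchor_boxes(100, 100, 2, 2, sizes)
fine = anchor_boxes(100, 100, 4, 4, sizes)
```

Each returned box would then be scored by the network with a per-species probability, which is the input to the candidate selection.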

According to the present invention, generating the feature region candidates may include: selecting, in the feature image, a first bounding region having the highest probability of corresponding to a specific animal species; and, for the remaining bounding regions of the feature image other than the selected first bounding region, calculating in order of probability value the degree of overlap with the first bounding region, and including in the feature region candidates of the feature image each bounding region whose degree of overlap is greater than a reference degree of overlap.

According to the present invention, the degree of overlap may be defined as the ratio of the intersection area to the union area (intersection over union) of the first bounding region and each of the remaining bounding regions.
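The overlap measure and the candidate selection described in the two preceding paragraphs can be sketched as follows. This is a minimal sketch, not the patented implementation: the `Region` type, the function names, and the default reference overlap of 0.5 are illustrative assumptions, and the sketch also assumes that the first bounding region itself is kept among the candidates.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """Axis-aligned bounding region: top-left corner, size, probability."""
    x: float
    y: float
    w: float
    h: float
    p: float


def iou(a: Region, b: Region) -> float:
    """Intersection area divided by union area; 0.0 if there is no overlap."""
    inter_w = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    inter_h = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    union = a.w * a.h + b.w * b.h - inter
    return inter / union


def select_candidates(regions: list[Region], ref_overlap: float = 0.5) -> list[Region]:
    """Keep the most probable region plus every region overlapping it
    by more than ref_overlap, visited in descending probability order."""
    if not regions:
        return []
    ranked = sorted(regions, key=lambda r: r.p, reverse=True)
    first = ranked[0]
    candidates = [first]  # assumption: the first bounding region is a candidate too
    for region in ranked[1:]:
        if iou(first, region) > ref_overlap:
            candidates.append(region)
    return candidates


a = Region(0, 0, 10, 10, 0.9)
b = Region(1, 1, 10, 10, 0.8)    # overlaps a heavily
c = Region(50, 50, 10, 10, 0.7)  # far away from a
kept = select_candidates([c, a, b])
```

Note that, unlike conventional non-maximum suppression, which discards boxes that overlap the best one, this selection retains them so that they can contribute to the final region.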

According to the present invention, the center point at which the first feature region is located may be determined by a confidence-weighted sum of the center points of the feature region candidates.

According to the present invention, the width of the first feature region may be determined by a confidence-weighted sum of the widths of the feature region candidates, and the height of the first feature region may be determined by a confidence-weighted sum of the heights of the feature region candidates.
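The confidence-weighted determination of the center, width, and height can be sketched as below. The function name `fuse_candidates` and the tuple layout are hypothetical, and dividing by the total confidence (so that the weights sum to 1) is an assumption; this section only states that a weighted sum of confidence values is used.

```python
def fuse_candidates(candidates):
    """Combine candidates (cx, cy, w, h, confidence) into one region.

    Each output value is the confidence-weighted sum of the corresponding
    candidate values, normalized by the total confidence (assumption).
    """
    total = sum(p for _, _, _, _, p in candidates)
    if total <= 0:
        raise ValueError("candidates need a positive total confidence")
    cx = sum(p * x for x, _, _, _, p in candidates) / total
    cy = sum(p * y for _, y, _, _, p in candidates) / total
    w = sum(p * cw for _, _, cw, _, p in candidates) / total
    h = sum(p * ch for _, _, _, ch, p in candidates) / total
    return cx, cy, w, h


# A confident small box and a less confident large box:
fused = fuse_candidates([(0.0, 0.0, 2.0, 2.0, 0.75),
                         (4.0, 4.0, 6.0, 6.0, 0.25)])
```

In this example the fused region is pulled toward the more confident candidate while still reflecting the larger, less confident one.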

An apparatus for processing an image of an object for identifying a companion animal according to the present invention includes a camera that generates an image in which the companion animal is included, and a processor that processes the image provided by the camera to generate an image of the object for identifying the companion animal. The processor may generate, in the image, feature region candidates for determining the species of the companion animal, set a first feature region whose location and size are determined based on the confidence value of each of the feature region candidates, set, in the first feature region, a second feature region including the object for identifying the companion animal, and acquire an image of the object in the second feature region.

According to the present invention, the processor may hierarchically generate a plurality of feature images using an artificial neural network, apply predefined bounding regions to each of the feature images to calculate, for each bounding region, a probability value that a companion animal of a specific species is located there, and generate the feature region candidates in consideration of the probability values.

According to the present invention, the processor may select, in the feature image, a first bounding region having the highest probability of corresponding to a specific animal species and, for the remaining bounding regions of the feature image other than the selected first bounding region, calculate in order of probability value the degree of overlap with the first bounding region and include in the feature region candidates of the feature image each bounding region whose degree of overlap is greater than a reference degree of overlap.

According to the present invention, the center point at which the first feature region is located may be determined by a confidence-weighted sum of the center points of the feature region candidates.

According to the present invention, in determining the feature region used to determine the species of the companion animal, the confidence value of each of a plurality of feature region candidates is considered so that a wider final feature region is generated; the object for identifying the companion animal is then detected within this final feature region, which enables more accurate detection.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

FIG. 1 shows a schematic procedure for AI-based management of companion animals.
FIG. 2 shows examples of images that are not suitable for learning or identification and images that are suitable for learning or identification in AI-based management of companion animals.
FIG. 3 shows a procedure for AI-based management of companion animal nose prints to which the suitability determination of an object image for learning or identification according to the present invention is applied.
FIG. 4 shows a procedure for detecting an object for identifying a companion animal in the companion animal management system according to the present invention.
FIG. 5 shows an example of a UI (User Interface) screen for detecting an identification object of a companion animal to which the present invention is applied.
FIG. 6 shows a process of detecting an object for identifying a companion animal according to the present invention.
FIG. 7 shows a process for setting a feature region according to the present invention.
FIG. 8 shows a process of deriving a feature region for determining the species of a companion animal according to the present invention.
FIG. 9 is a flowchart of a method for processing an image of an object for identifying a companion animal.
FIG. 10 is a block diagram of an apparatus for processing an image of an object for identifying a companion animal.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can readily practice them. The present invention may be embodied in many different forms and is not limited to the embodiments described herein.

To describe the present invention clearly, parts irrelevant to the description are omitted, and the same reference numerals are assigned to the same or similar elements throughout the specification.

In addition, in the various embodiments, elements having the same configuration are given the same reference numerals and described only in a representative embodiment; for the other embodiments, only the configurations that differ from the representative embodiment are described.

Throughout the specification, when a part is said to be "connected (or coupled)" to another part, this includes not only the case where it is "directly connected (or coupled)" but also the case where it is "indirectly connected (or coupled)" with another member in between. In addition, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless expressly so defined in the present application.

This document focuses on extracting identification information from the pattern of a dog's nose wrinkles (the nose print), but in the present invention the scope of companion animals is not limited to dogs, and the feature used as identification information is not limited to the nose print; various physical features of companion animals may be used.

As described above, nose print images of companion animals suitable for AI-based learning or identification are scarce, and such images are likely to be of low quality, so nose print images need to be stored in the database selectively for AI-based learning or identification.

FIG. 3 shows a procedure for AI-based management of companion animal nose prints to which the suitability determination of a nose print image for learning or identification according to the present invention is applied. In the present invention, after the nose print of a companion animal is photographed, it is first determined whether the captured nose print image is suitable as data for AI-based learning or identification; if it is determined to be suitable, it is transmitted to and stored on a server for AI-based learning or recognition and later used as data for learning or identification.

As shown in FIG. 3, the nose print management procedure according to the present invention broadly includes a nose print acquisition procedure and a nose print recognition procedure.

According to the present invention, when a companion animal's nose print is newly registered, an image containing the companion animal is captured and a nose print image is extracted from the animal's face region; in particular, it is first determined whether the nose print image is suitable for identifying the companion animal or for learning. If the captured image is determined to be suitable for identification or learning, it is transmitted to the server (artificial neural network) and stored in the database.

When the identification information of a companion animal is looked up by nose print, an image containing the companion animal is likewise captured and a nose print image is extracted from the animal's face region, and it is first determined whether the nose print image is suitable for identifying the companion animal or for learning. If the captured image is determined to be suitable for identification or learning, it is transmitted to the server, and the identification information of the companion animal is extracted by matching against the previously stored nose print images.

In the nose print registration procedure, as shown in FIG. 3(a), a companion animal is photographed (S305); a face region (hereinafter described as the first feature region) is first detected in the captured image of the companion animal (S310); the region occupied by the nose within the face region (hereinafter described as the second feature region) is detected and a nose print image is output after a quality check of whether the captured image is suitable for learning or identification (S315); and the output image is transmitted to the server constituting the artificial neural network, where it is stored and registered (S320).

In the nose print lookup procedure, as shown in FIG. 3(b), a companion animal is photographed (S330); a face region is detected in the image of the companion animal (S335); and the region occupied by the nose within the face region is detected and a nose print image is output after a quality check of whether the captured image is suitable for learning or identification (S315), which is similar to the registration procedure. The subsequent steps are to compare the output nose print image with the previously stored and learned nose print images to search for matching information (S345) and to output the search result (S350).

FIG. 4 shows a procedure for detecting an object corresponding to the nose of a companion animal in the companion animal nose print management system according to the present invention.

Referring to FIG. 4, an initial image is first generated by photographing a companion animal (S405), and a face region is detected in the initial image (S410). The nose region is then detected within the face region in consideration of the species of the companion animal (S415). The face region is detected first and the nose region second because such cascaded detection lowers computational complexity and improves detection accuracy compared with detecting the nose region while considering all species at once. A quality check is then performed to determine whether the image of the detected nose region is suitable for later nose print identification or learning (S420); if the quality check finds the image suitable, the image is transmitted to the server, where it may be used to identify the nose print or stored for future learning or identification (S425).

In addition, according to the present invention, the camera may be controlled to focus on the detected nose region so that the image of the object for identifying the companion animal, such as a dog's nose wrinkles (nose print), is not captured blurred (S430). Keeping the camera focused on the nose region prevents the image quality from degrading because the nose is out of focus.

FIG. 5 shows an example of a UI (User Interface) screen for acquiring a nose print image of a companion animal to which the present invention is applied. FIG. 5 shows the case of acquiring the nose print of a dog among various companion animals.

Referring to FIG. 5, the species of the companion animal is identified in the image being captured to determine whether the companion animal currently being photographed is a dog. If it is not a dog, a message such as "Cannot find a dog" is output as shown in FIG. 5; if it is a dog, the procedure for acquiring the dog's nose print is performed. To determine whether the companion animal being photographed is a dog, the face region of the companion animal in the image is extracted first, and the image contained in the face region is compared with previously learned data to determine the species of the companion animal.

Next, the region corresponding to the dog's nose is set on the dog's face, and capture continues with the focus on that region. As shown in FIG. 5, even when the dog keeps moving, capture can proceed with the focus on the nose while the nose is tracked. For each captured image, it is determined whether the dog's nose print image is suitable for identifying the companion animal or for learning, and the degree of suitability may be output as shown in FIG. 5. The degree of suitability may be calculated as a numerical value, and a graphic may be displayed that moves toward 'BAD' as the suitability value decreases and toward 'GOOD' as it increases. Once enough nose print images of the dog have been acquired, capture ends, and the identification information may be stored in the database together with the dog's nose print images, or the dog's identification information may be output.
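The BAD-to-GOOD indicator described above could be rendered along the following lines. This is only a text-mode sketch: the function name `suitability_gauge`, the bar width, and the assumption that the suitability value is normalized to the range 0 to 1 are all illustrative, and how the value itself is computed is not covered here.

```python
def suitability_gauge(score: float, width: int = 20) -> str:
    """Render a marker that sits nearer 'BAD' for low scores and
    nearer 'GOOD' for high scores. `score` is assumed to lie in [0, 1];
    values outside that range are clamped."""
    clamped = max(0.0, min(1.0, score))
    position = round(clamped * (width - 1))
    bar = "".join("#" if i == position else "-" for i in range(width))
    return f"BAD [{bar}] GOOD"


low = suitability_gauge(0.0, width=5)
high = suitability_gauge(1.0, width=5)
```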

In the present invention, the face region of the companion animal is detected first, and the nose region is then detected within the face region. This reduces computational complexity and makes object detection easier. While an image is being captured, objects other than the one to be detected, or unnecessary or erroneous information, may be included in the image. Therefore, the present invention first determines whether the desired object (the nose of the companion animal) is present in the image being captured.

In addition, identifying a companion animal's nose print requires an image with at least a certain resolution, but the higher the resolution, the greater the amount of computation needed to process the image. Moreover, as the number of companion animal types grows, the computational difficulty for the AI increases further because the learning method differs for each type. In particular, because similar kinds of animals have similar shapes (for example, a dog's nose resembles a wolf's nose), classifying the nose together with the animal type for similar animals can be computationally very difficult.

Accordingly, the present invention uses a cascaded object detection method to reduce this computational complexity. For example, while a companion animal is being photographed, the face region of the companion animal is detected and the type of companion animal is identified first, and the nose region is then detected based on the detected face region and the identified type. That is, the type of companion animal is first identified at a low resolution, where computational complexity is relatively low, and an object detection method chosen according to that type is then applied to detect the nose region within the face region while high resolution is maintained. In this way, the present invention can detect the nose region of a companion animal effectively while keeping computational complexity relatively low.
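The two-stage idea can be illustrated with the following minimal sketch. Everything in it is hypothetical: images are plain lists of pixel rows, downscaling is simple subsampling, and `face_detector` and `nose_detectors` are placeholder callables standing in for trained models. It only shows the control flow: a cheap low-resolution pass finds the face and species, and a species-specific detector then runs on the full-resolution face crop.

```python
from typing import Callable, Optional

Image = list[list[int]]          # grayscale image as rows of pixel values
Box = tuple[int, int, int, int]  # (x, y, w, h) in pixels


def downscale(img: Image, stride: int) -> Image:
    """Subsample every `stride`-th pixel in both directions."""
    return [row[::stride] for row in img[::stride]]


def crop(img: Image, box: Box) -> Image:
    x, y, w, h = box
    return [row[x:x + w] for row in img[y:y + h]]


def cascaded_detect(
    img: Image,
    face_detector: Callable[[Image], Optional[tuple[str, Box]]],
    nose_detectors: dict[str, Callable[[Image], Optional[Box]]],
    stride: int = 4,
) -> Optional[tuple[str, Box]]:
    """Return (species, nose box in full-resolution coordinates) or None."""
    # Stage 1: face region and species on the low-resolution image.
    found = face_detector(downscale(img, stride))
    if found is None:
        return None
    species, (x, y, w, h) = found
    detector = nose_detectors.get(species)
    if detector is None:
        return None
    # Stage 2: species-specific nose detection on the full-resolution crop.
    face_box = (x * stride, y * stride, w * stride, h * stride)
    nose = detector(crop(img, face_box))
    if nose is None:
        return None
    nx, ny, nw, nh = nose
    return species, (face_box[0] + nx, face_box[1] + ny, nw, nh)


# Stub detectors standing in for trained models.
frame = [[0] * 16 for _ in range(16)]
result = cascaded_detect(
    frame,
    face_detector=lambda small: ("dog", (1, 1, 2, 2)),
    nose_detectors={"dog": lambda face: (2, 3, 4, 4)},
)
```

Only the second stage touches full-resolution pixels, and only inside the face crop, which is where the saving in computation comes from.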

FIG. 6 illustrates the overall image processing procedure for identifying a companion animal according to the present invention. As shown in FIG. 6, the method of processing an input image according to the present invention includes: a step of receiving an input image from a camera (S605); a first pre-processing step of generating a primary processed image by resizing the input image (S610); a first feature region detection step of detecting the position and the species of the animal from the processed image generated in the first pre-processing step (S615); a first post-processing step of extracting a first feature value of the animal image from the result of the first feature region detection step (S620); a step of determining, according to the species of the companion animal, a detector for detecting an object (e.g., the nose) for identifying the companion animal in the image processed in the first post-processing step (S625); a second pre-processing step of resizing the image for the image processing used to identify the companion animal (S630); at least one second feature region detection step, each corresponding to an animal species that can be detected in the first feature region detection step (S635); and a second post-processing step of extracting a second feature value of the animal image for each second feature region detection step (S640).

A method for processing an image of an object for identifying a companion animal according to the present invention includes: acquiring an image that includes the companion animal; generating, in the image, feature region candidates for determining the species of the companion animal; setting a first feature region whose position and size are determined based on the confidence value (reliability value) of each feature region candidate; setting, within the first feature region, a second feature region that includes the object for identifying the companion animal; and acquiring an image of the object in the second feature region. The image processing method is described below with a focus on recognizing a dog's nose print.

First pre-processing step

The step of applying the first pre-processing to the original image (S610) converts the image into a form suitable for object detection by adjusting the size, aspect ratio, orientation, and so on of the original image.

With advances in camera technology, most input images consist of millions to tens of millions of pixels, and processing such large images directly is undesirable. For object detection to work efficiently, the input image must be pre-processed into a form suitable for processing. Mathematically, this pre-processing is a coordinate system transformation.

It is evident that an arbitrary processed image can be generated by mapping any four points in the input image to the four vertices of the processed image and applying an arbitrary coordinate transformation. However, if an arbitrary nonlinear transformation function is used, an inverse transformation must still be available so that the feature region in the input image can be recovered from the bounding box returned by the feature region detector. For example, an affine transformation, which linearly maps four points of the input image to the four vertices of the processed image, is preferable because its inverse is easy to obtain.

As one example of choosing the four points in the input image, the four vertices of the input image may be used as they are. Alternatively, margins may be added to the input image, or part of the input image may be cropped, so that the width and height are scaled by the same ratio. Various interpolation methods may also be applied to reduce the size of the input image.
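The margin-based option above can be sketched as a uniform scale plus padding, together with the inverse mapping needed later to bring detector boxes back into input-image coordinates. This is a minimal sketch under stated assumptions: the names `Letterbox`, `make_letterbox`, `forward`, and `inverse` are hypothetical, the padding is assumed to be split evenly on both sides, and the 300x300 target size is only an example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Letterbox:
    scale: float  # uniform scale applied to both axes
    pad_x: float  # left margin, in processed-image pixels
    pad_y: float  # top margin, in processed-image pixels


def make_letterbox(src_w, src_h, dst_w, dst_h):
    """Fit the source image inside the target while keeping its aspect ratio."""
    scale = min(dst_w / src_w, dst_h / src_h)
    pad_x = (dst_w - src_w * scale) / 2.0
    pad_y = (dst_h - src_h * scale) / 2.0
    return Letterbox(scale, pad_x, pad_y)


def forward(t, x, y):
    """Map a point from input-image coordinates to processed-image coordinates."""
    return x * t.scale + t.pad_x, y * t.scale + t.pad_y


def inverse(t, x, y):
    """Map a point from processed-image coordinates back to the input image."""
    return (x - t.pad_x) / t.scale, (y - t.pad_y) / t.scale


if __name__ == "__main__":
    t = make_letterbox(4000, 3000, 300, 300)
    print(t)                       # scale 0.075, no horizontal margin, 37.5 px vertical margin
    print(inverse(t, 150.0, 150.0))  # center of the processed image maps to the input center
```

Because the mapping is a scale and a translation, its inverse is available in closed form, which is the property the text asks of the pre-processing transform.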

First feature region detection step

This step first detects, in the pre-processed image, the region where the companion animal is present and the species of that animal. Its purpose is to set a first feature region that can be used in the second feature region detection step described later, and to improve the final feature detection performance by selecting a second feature region detector optimized for the species of each companion animal.

A person of ordinary skill in the art could readily combine any object detection and classification method with this step. However, since methods based on artificial neural networks are known to outperform conventional methods, a neural-network-based feature detection technique is preferably used where possible. For example, as shown in FIG. 7, a feature detector of the SSD (Single-Shot Multibox Detection) type, an algorithm that detects objects of various sizes in a single image, may be used as the artificial neural network.

From the input image normalized by the pre-processor described above, the artificial neural network hierarchically generates a first through an n-th feature image. That is, a step of hierarchically generating a plurality of feature images from the image using the artificial neural network is performed. How the feature image is extracted at each layer may be learned automatically in the training phase of the artificial neural network.

The hierarchical feature images extracted in this way are combined with the list of predefined boxes (prior boxes) corresponding to each layer to produce a list of bounding boxes, object types, and confidence values (probability values). That is, predefined bounding regions (bounding boxes) are applied to each feature image, and the probability that a companion animal of a specific species is located in each bounding region is calculated. This computation may also be learned automatically in the training phase of the artificial neural network. The result is returned, for example, in the format shown in Table 1 below. The number of species the neural network can distinguish is fixed when the network is designed, and a "background" class is implicitly defined for the case in which no object is present.

These result boxes pass through an NMS (Non-Maximum Suppression) step, which merges overlapping result boxes, and are finally returned as the detection results for the objects present in the image. NMS is the process of deriving a final feature region from a plurality of feature region candidates; the candidates may be generated by the procedure of FIG. 7, taking into account probability values such as those in Table 1.

This process is described in detail as follows.

1. Perform the following procedure for each species, excluding the background.

A. From the bounding box list, exclude every box whose probability of being that species is lower than a given threshold. If no boxes remain, terminate with no result.

B. From the bounding box list, designate the box with the highest probability of being that species as the first box (first bounding region) and remove it from the bounding box list.

C. For the remaining boxes in the bounding box list, perform the following in descending order of probability.

i. Compute the degree of overlap with the first box. For example, the intersection-over-union (IoU) area ratio may be used.

ii. If the overlap value is higher than a given threshold, the box overlaps the first box. Merge it with the first box.

D. Add the first box to the result box list.

E. If boxes remain in the bounding box list, repeat from step C for the remaining boxes.
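The per-species procedure above can be sketched as follows. This is a minimal illustration and not the patent's implementation: the function names, the corner-format boxes `(x1, y1, x2, y2)`, the default thresholds, and the optional `merge` callback are all assumptions. The sketch also assumes that each repetition selects a new highest-probability first box among the boxes that remain.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_for_species(boxes, scores, score_thr=0.5, iou_thr=0.5, merge=None):
    """Greedy NMS for one species.

    merge is an optional, hypothetical callback
    merge(first_box, first_score, box, score) -> new first_box.
    With merge=None an overlapping box is simply dropped (Hard NMS).
    """
    # Step A: discard low-probability boxes, then sort by descending probability.
    remaining = sorted(
        ((s, b) for s, b in zip(scores, boxes) if s >= score_thr),
        key=lambda item: item[0],
        reverse=True,
    )
    results = []
    while remaining:
        # Step B: the most probable remaining box becomes the first box.
        first_score, first_box = remaining.pop(0)
        kept = []
        # Step C: compare every other box with the (possibly merged) first box.
        for score, box in remaining:
            if iou(first_box, box) > iou_thr:
                if merge is not None:
                    first_box = merge(first_box, first_score, box, score)
            else:
                kept.append((score, box))
        remaining = kept
        # Step D: record the first box as a result.
        results.append((first_score, first_box))
    return results


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (0, 0, 5, 5)]
    scores = [0.9, 0.8, 0.7, 0.2]
    print(nms_for_species(boxes, scores))
```

Passing a `merge` callback is one way to swap the merging rule without changing the loop, which mirrors how the text later contrasts different merging methods.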

For two boxes A and B, the intersection-over-union area ratio, for example, can be computed efficiently as in Equation 1 below.
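For reference, the standard definition of the intersection-over-union ratio for two boxes A and B is given below, where |·| denotes area. Equation 1 itself does not appear in this text, so its exact form in the patent may differ from this common one.

```latex
\mathrm{IoU}(A, B) \;=\; \frac{|A \cap B|}{|A \cup B|}
                   \;=\; \frac{|A \cap B|}{|A| + |B| - |A \cap B|}
```

The second form avoids computing the union directly: only the two box areas and the area of the overlapping rectangle are needed.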

That is, according to the present invention, generating the feature region candidates may include: selecting, in a feature image, the first bounding region (first box) with the highest probability of corresponding to a specific animal species; and, for the remaining bounding regions in the feature image other than the selected first bounding region (first box), calculating the intersection-over-union area ratio (IoU) with the first bounding region in order of probability value and including in the feature region candidates of the feature image each bounding region whose IoU is greater than a reference area ratio.

The merging of a box that overlaps the first box in the above procedure can be carried out as follows. For example, the boxes may be merged by keeping the first box as it is and deleting the second box from the bounding box list (Hard NMS). Alternatively, the first box is kept as it is, the second box's probability of being the specific species is reduced by multiplying it by a weight in the range (0, 1), and the second box is deleted from the bounding box list only if the attenuated value falls below a given threshold (Soft NMS).
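The difference between the two merging rules can be illustrated with a small sketch. It assumes that the overlap of each candidate with the first box has already been computed; the function name `suppress`, the constant `decay` weight, and the default thresholds are hypothetical choices for illustration only.

```python
def suppress(candidates, iou_thr=0.5, mode="hard", decay=0.5, score_thr=0.3):
    """Apply Hard or Soft NMS to boxes that were compared with the first box.

    candidates: list of (score, overlap) pairs, where overlap is the IoU
    between that box and the first box.
    Returns the pairs that stay in the bounding box list.
    """
    survivors = []
    for score, overlap in candidates:
        if overlap <= iou_thr:
            # Not overlapping the first box: left untouched.
            survivors.append((score, overlap))
        elif mode == "soft":
            # Soft NMS: attenuate the score, delete only if it falls below the threshold.
            decayed = score * decay
            if decayed >= score_thr:
                survivors.append((decayed, overlap))
        # Hard NMS: an overlapping box is deleted outright.
    return survivors


if __name__ == "__main__":
    cands = [(0.8, 0.7), (0.5, 0.7), (0.6, 0.1)]
    print(suppress(cands, mode="hard"))
    print(suppress(cands, mode="soft"))
```

In both modes the first box itself is unchanged, which is the point of contrast with the merging method proposed in the embodiment.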

As an embodiment proposed by the present invention, a new method (Expansion NMS) may be used in which the first box and the second box, both of which are candidates for the first feature region, are merged according to their probability values, as in Equation 2 below.

Here, p¹ and p² are the probability values of the first box and the second box (both candidates for the first feature region), and C_(x,y)¹, C_(x,y)², and C_(x,y)ⁿ are the (x, y) coordinates of the center points of the first box, the second box, and the merged box, respectively. Likewise, W¹, W², and Wⁿ are the widths of the first box, the second box, and the merged box, and H¹, H², and Hⁿ are their heights. The probability value of the first box may be used as the probability value of the merged box. The first feature region derived by the Expansion NMS according to the present invention is thus determined in consideration of the confidence value that a specific species is located in each feature region candidate.

That is, the center point of the first feature region (C_(x,y)ⁿ) may be determined, as in Equation 2, by a weighted sum of the center points of the feature region candidates (C_(x,y)¹, C_(x,y)²), weighted by their confidence values (p¹, p²).

Similarly, the width of the first feature region (Wⁿ) may be determined, as in Equation 2, by a weighted sum of the widths of the feature region candidates (W¹, W²), weighted by their confidence values (p¹, p²), and the height of the first feature region (Hⁿ) may be determined by a weighted sum of the heights of the feature region candidates (H¹, H²), weighted by the same confidence values.
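A confidence-weighted merge of two boxes can be sketched as below. Equation 2 itself does not appear in this text, so the normalization is an assumption: this sketch divides each weight by p¹ + p², which makes every merged quantity a weighted average, and the patent's actual scaling may differ. In particular, under this normalization the merged width and height lie between those of the two candidates. The names `Box` and `expansion_merge` are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    cx: float  # center x
    cy: float  # center y
    w: float   # width
    h: float   # height
    p: float   # confidence (probability) value


def expansion_merge(b1: Box, b2: Box) -> Box:
    """Merge two candidates with confidence weights.

    ASSUMPTION: weights are normalized by p1 + p2 (a weighted average).
    The source only states that a confidence-weighted sum is used.
    """
    total = b1.p + b2.p
    k1, k2 = b1.p / total, b2.p / total
    return Box(
        cx=k1 * b1.cx + k2 * b2.cx,
        cy=k1 * b1.cy + k2 * b2.cy,
        w=k1 * b1.w + k2 * b2.w,
        h=k1 * b1.h + k2 * b2.h,
        p=b1.p,  # the merged box reuses the first box's probability value
    )


if __name__ == "__main__":
    first = Box(cx=10.0, cy=10.0, w=20.0, h=20.0, p=0.75)
    second = Box(cx=14.0, cy=10.0, w=28.0, h=20.0, p=0.25)
    print(expansion_merge(first, second))
```

The more confident candidate pulls the merged center and size toward itself, while a wider low-confidence candidate still enlarges the result relative to the first box alone.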

By generating a new box according to this embodiment, a box with a larger width and height is obtained than with the existing Hard NMS or Soft NMS methods. In this embodiment, the preliminary detection that feeds the multi-stage detector can be configured to add a certain margin, and using the Expansion NMS according to the present invention allows this margin to be determined adaptively.

FIG. 8 shows an example of detecting the feature region of a companion animal with the Expansion NMS according to the present invention, in comparison with the existing NMS. FIG. 8(a) shows a plurality of feature region candidates generated from the original image, FIG. 8(b) shows an example of the first feature region derived by the existing NMS, and FIG. 8(c) shows an example of the first feature region derived by applying the Expansion NMS according to the present invention. As shown in FIG. 8(b), the existing NMS (Hard NMS, Soft NMS) selects the single box (feature region candidate) with the highest confidence among the plurality of boxes (feature region candidates), so a region needed to acquire the nose print, such as the nose region, may fall outside the selected box in the subsequent second feature region detection.

Accordingly, the present invention applies a confidence-weighted average to the plurality of boxes (feature region candidates) to set a single box with a large width and height as the first feature region (the face region of the companion animal), as in FIG. 8(c), and can then detect the second feature region (the nose region) for identifying the companion animal within the first feature region. Setting a more expanded first feature region in this way reduces the occurrence of errors in which the second feature region is not detected in the subsequent step.

Finally, the feature region in the original image is obtained by applying, to the one or more bounding boxes determined in this way, the inverse of the transformation used in the pre-processing step. Depending on the configuration, a certain amount of margin may be added to the feature region in the original image so that the second detection step described later performs well.

First post-processing step

A first feature value may be generated by performing an additional post-processing step on each feature region of the input image obtained in the first feature region setting step (S720) described above. For example, to obtain brightness information (a first feature value) for the first feature region of the input image, an operation such as Equation 3 below may be performed.

Here, L is the luma value according to the BT.601 standard, and V is the value (brightness) component defined in the HSV color space. M and N are the width and the height of the target feature region.
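A brightness feature of this kind can be sketched as follows. Equation 3 does not appear in this text, so the sketch assumes it averages per-pixel values over the M x N region. The BT.601 luma weights (0.299, 0.587, 0.114) and the HSV value V = max(R, G, B) are standard definitions; the function name and the use of nested lists of 8-bit RGB tuples are hypothetical.

```python
def mean_luma_and_value(pixels):
    """Average BT.601 luma and HSV value over a region.

    pixels: N rows, each a list of M (r, g, b) tuples with channels in 0..255.
    Returns (mean_luma, mean_value), both on the same 0..255 scale.
    ASSUMPTION: the brightness feature is a plain mean over the region.
    """
    n = len(pixels)
    m = len(pixels[0]) if n else 0
    if n == 0 or m == 0:
        raise ValueError("region must contain at least one pixel")
    luma_sum = 0.0
    value_sum = 0.0
    for row in pixels:
        for r, g, b in row:
            luma_sum += 0.299 * r + 0.587 * g + 0.114 * b  # BT.601 luma
            value_sum += max(r, g, b)                        # HSV value component
    return luma_sum / (m * n), value_sum / (m * n)


if __name__ == "__main__":
    region = [[(255, 0, 0), (0, 0, 0)],
              [(255, 255, 255), (0, 0, 0)]]
    print(mean_luma_and_value(region))
```

An application could compare either mean with a lower and an upper bound to reject regions that are too dark or overexposed before the second detection stage runs.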

The additionally generated first feature value can be used to predict whether the first feature region obtained in the first feature region detection step (S615) is suitable for use in the application with which this patent is combined. The additionally generated first feature value must of course be designed appropriately for the application. If the condition on the first feature value defined by the application is not satisfied, the system may be configured to optionally skip the second feature region setting and object detection steps described later.

Second feature region detection step

The purpose of this step is to extract, from the region where the animal is present, the specific feature regions required by the application. Take, for example, an application that detects the positions of the eyes, nose, mouth, and ears in an animal's face region: the first feature region detection step first determines the animal's face region and its species, and the second feature region detection step then detects the positions of the eyes, nose, mouth, and ears according to the species.

The second feature region detection step may consist of a plurality of mutually independent feature region detectors, each specialized for one animal species. For example, if the first feature region detection step can distinguish dogs, cats, and hamsters, it is preferable to provide three second feature region detectors, each designed specifically for dogs, cats, or hamsters. This reduces the kinds of features each individual detector must learn and therefore the training complexity, and it also allows the neural network to be trained with a smaller amount of training data.

Because the second feature region detectors are independent of one another, a person of ordinary skill can readily build each detector separately. Each feature region detector is preferably configured individually to suit the feature information to be detected for its species. Alternatively, to reduce the complexity of the system, some or all of the second feature region detectors may share a feature region detector of the same structure, with the system adapted to each species by swapping the trained parameter values. Going further, a feature region detector with the same structure as that of the first feature region detection step may be used as the second feature region detector, with only the trained parameter values and the NMS method replaced, which reduces system complexity even more.

For the one or more feature regions set through the first feature region detection step and the first post-processing step, the species information detected in the first feature region detection step is used to decide which second feature region detector to use, and the second feature region detection step is performed with the selected detector.
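Selecting the second-stage detector from the detected species can be sketched as a simple lookup. Everything named here is hypothetical: the `detectors` mapping, the region dictionaries with `"species"` and `"box"` keys, and the stand-in detector callables. Regions whose species has no registered detector are skipped in this sketch.

```python
def run_second_stage(first_regions, detectors):
    """Dispatch each first feature region to the detector for its species.

    first_regions: list of dicts with keys "species" and "box".
    detectors: dict mapping a species name to a callable taking a box.
    Returns a list of (region, detection) pairs.
    """
    results = []
    for region in first_regions:
        detector = detectors.get(region["species"])
        if detector is None:
            continue  # no specialized detector for this species
        results.append((region, detector(region["box"])))
    return results


if __name__ == "__main__":
    # Stand-in detectors; real ones would run a trained network on the crop.
    detectors = {
        "dog": lambda box: ("dog_nose", box),
        "cat": lambda box: ("cat_nose", box),
    }
    regions = [
        {"species": "dog", "box": (10, 10, 200, 200)},
        {"species": "hamster", "box": (0, 0, 50, 50)},
    ]
    for region, detection in run_second_stage(regions, detectors):
        print(region["species"], detection)
```

If the detectors share one network structure, the mapping could instead hold per-species parameter sets that are loaded into a single detector, in line with the shared-structure option described above.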

First, pre-processing is performed. The coordinate transformation used here must again be invertible. Because the second pre-processing must convert the first feature region detected in the input image into the input image of the second feature region detector, the four points needed to design the transformation function are preferably defined as the four vertices of the first feature region.

Because the second feature region obtained by the second feature region detector is expressed relative to the first feature region, the first feature region must be taken into account when computing the position of the second feature region within the full input image.
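Mapping a second-stage box back into the full input image can be sketched as below. The sketch assumes the first feature region is an axis-aligned rectangle whose four vertices were stretched onto the four corners of the detector input, so the inverse is a per-axis scale plus an offset. The function name and the corner-format boxes are hypothetical.

```python
def to_full_image(box, first_region, det_w, det_h):
    """Convert a box from detector-input coordinates to input-image coordinates.

    box: (x1, y1, x2, y2) in the second detector's input image.
    first_region: (x1, y1, x2, y2) of the first feature region in the input image.
    det_w, det_h: size of the second detector's input image.
    ASSUMPTION: the crop was resized independently along each axis.
    """
    fx1, fy1, fx2, fy2 = first_region
    sx = (fx2 - fx1) / det_w  # input-image pixels per detector pixel, x axis
    sy = (fy2 - fy1) / det_h  # input-image pixels per detector pixel, y axis
    x1, y1, x2, y2 = box
    return (fx1 + x1 * sx, fy1 + y1 * sy, fx1 + x2 * sx, fy1 + y2 * sy)


if __name__ == "__main__":
    face = (1000, 500, 1600, 1100)        # first feature region in the input image
    nose_in_crop = (100, 150, 200, 250)   # detection in a 300x300 detector input
    print(to_full_image(nose_in_crop, face, 300, 300))
```

The high-resolution nose image can then be cropped from the original input using the returned coordinates rather than from the downscaled detector input.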

As in the first post-processing step, a second feature value may be generated by performing an additional post-processing step on the second feature region obtained by the second feature region detector. For example, a Sobel filter may be applied to measure the sharpness of the image, or information such as the posture of the animal may be derived from which feature regions were detected and from their relative positions.
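A Sobel-based sharpness score can be sketched in plain Python as below. The choice of the mean gradient magnitude over interior pixels as the score is an assumption; the source only states that a Sobel filter may be applied. The function name and the nested-list grayscale input are hypothetical.

```python
import math


def sobel_sharpness(gray):
    """Mean Sobel gradient magnitude over the interior pixels of a grayscale image.

    gray: list of rows, each a list of intensity values.
    Returns 0.0 for images smaller than 3x3.
    """
    h = len(gray)
    w = len(gray[0]) if h else 0
    total = 0.0
    count = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column, weights 1-2-1.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]) \
               - (gray[y - 1][x - 1] + 2 * gray[y][x - 1] + gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row, weights 1-2-1.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]) \
               - (gray[y - 1][x - 1] + 2 * gray[y - 1][x] + gray[y - 1][x + 1])
            total += math.hypot(gx, gy)
            count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    flat = [[128] * 5 for _ in range(5)]
    edge = [[0, 0, 255, 255, 255] for _ in range(5)]
    print(sobel_sharpness(flat), sobel_sharpness(edge))
```

A blurred nose image produces weaker gradients than a sharp one, so a threshold on this score is one possible way to reject out-of-focus captures.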

The additionally generated second feature value can be used to predict whether the feature region obtained in the second object detection step is suitable for use in the application with which this patent is combined. The additionally generated second feature value must of course be designed appropriately for the application. If the condition on the second feature value defined by the application is not satisfied, the system is preferably designed so that data suited to the application is acquired, for example by excluding not only the second detection region but also the first detection region from the detection results.

System extension

The present invention has been described using an example system and configuration with two detection stages, in which the first feature position detection step detects the position and species of the animal and the detector used in the second feature position detection step is selected according to that result.

This cascade configuration can easily be extended to a multi-layer cascade. For example, an application may be configured so that the first feature position detection step detects the whole body of the animal, the second feature position detection step detects the positions of the animal's face and limbs, and the third feature position detection step detects the positions of the eyes, nose, mouth, and ears within the face.
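A multi-layer cascade of this kind can be sketched as a recursive walk over a list of stage functions. This is an illustrative structure only: the name `run_cascade`, the stage callables, and the string stand-ins for image regions are hypothetical, and the sketch simplifies by applying every next-level stage to every detected sub-region.

```python
def run_cascade(region, stages, level=0):
    """Run a multi-layer cascade starting from one region.

    stages: list of callables; stages[k](region) returns a list of
    (label, sub_region) pairs detected inside that region.
    Returns a flat list of (level, label, sub_region) in depth-first order.
    """
    if level == len(stages):
        return []
    found = []
    for label, sub_region in stages[level](region):
        found.append((level, label, sub_region))
        # Each detected sub-region becomes the search area for the next stage.
        found.extend(run_cascade(sub_region, stages, level + 1))
    return found


if __name__ == "__main__":
    # Stand-in stages that tag a string instead of cropping an image.
    stages = [
        lambda r: [("body", r + "/body")],
        lambda r: [("face", r + "/face")],
        lambda r: [("nose", r + "/nose"), ("eye", r + "/eye")],
    ]
    for item in run_cascade("image", stages):
        print(item)
```

Adding or removing an entry in `stages` changes the depth of the cascade without touching the traversal, which is one way to trade total run time against the granularity of the detected features.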

With such a multi-layer cascade, a system that acquires feature positions at several levels at once can be designed easily. When choosing the number of layers in a multi-layer cascade, an optimal hierarchy can be designed only by considering the hierarchical domain of the feature positions to be acquired, the running time and complexity of the whole system, and the resources needed to build each individual feature region detector.

FIG. 9 is a flowchart of a method for processing an image of an object for identifying a companion animal.

A method for processing an image of an object for identifying a companion animal according to the present invention includes: acquiring an image that includes the companion animal (S910); generating, in the image, feature region candidates for determining the species of the companion animal (S920); setting a first feature region whose position and size are determined based on the confidence value of each feature region candidate (S930); setting, within the first feature region, a second feature region that includes the object for identifying the companion animal (S940); and acquiring an image of the object in the second feature region (S950).

According to the present invention, generating the feature region candidates may include: hierarchically generating feature images using an artificial neural network; applying predefined bounding regions to each feature image and calculating the probability that a companion animal of a specific species is located in each bounding region; and generating the feature region candidates in consideration of the probability values.

From the input image normalized by the pre-processor, the artificial neural network hierarchically generates a first through an n-th feature image, and how the feature image is extracted at each layer may be learned automatically in the training phase of the artificial neural network.

The extracted hierarchical feature images are combined with the list of predefined bounding regions (bounding boxes) corresponding to each layer, producing, as in Table 1, a list of the probabilities that a specific animal type is located in each bounding region. A region for which the animal type cannot be determined may be defined as "background".

The Expansion NMS according to the present invention is then applied to generate, in each feature image, candidates for the first feature region (feature region candidates) used to identify the species, such as the face of the companion animal. Each feature region candidate may be derived using the per-species probability values obtained earlier.

According to the present invention, generating the feature region candidates may include: selecting, from the feature image, a first boundary region having the highest probability of corresponding to a specific animal species; and, for the remaining boundary regions in the feature image other than the selected first boundary region, calculating a degree of overlap with the first boundary region in order of probability value and including each boundary region whose degree of overlap is greater than a reference degree of overlap in the feature region candidates of the feature image. To evaluate the degree of overlap, for example, the ratio of the intersection area to the union area of the two boundary regions may be used.

That is, feature region candidates such as those shown in FIG. 8(a) may be generated through the following procedure.

i. Calculate the degree of overlap with the first box. For example, the intersection-over-union (IoU) area ratio may be used.

ii. If the overlap value is higher than a specific threshold, the box overlaps the first box and is merged with the first box.

For two boxes A and B, the intersection-to-union area ratio, for example, can be calculated effectively as in Equation 1 described above.
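The overlap test and candidate collection described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the box layout `(x_min, y_min, x_max, y_max, probability)`, the function names `iou` and `collect_candidates`, and the default threshold of 0.5 are all assumptions made for the example, and the IoU is written in its standard form, which may differ in notation from Equation 1.

```python
from typing import List, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max, probability).
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union area ratio of two axis-aligned boxes."""
    # Width and height of the intersection rectangle (zero if disjoint).
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def collect_candidates(boxes: List[Box], iou_threshold: float = 0.5) -> List[Box]:
    """Keep the most probable box and every box overlapping it above the threshold.

    The threshold value is an assumed example, not a value from the source.
    """
    if not boxes:
        return []
    # Visit boxes in descending order of probability.
    ordered = sorted(boxes, key=lambda box: box[4], reverse=True)
    first = ordered[0]
    return [first] + [box for box in ordered[1:] if iou(first, box) > iou_threshold]
```

Unlike conventional NMS, which discards the overlapping boxes, this sketch retains them so that they can later be merged into the first feature region.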

As described with reference to FIG. 8, the first feature region (e.g., the face region of the companion animal) may be derived from the feature region candidates obtained according to the present invention, based on the reliability value of each feature region candidate.

According to the present invention, the center point (C_(x,y)ⁿ) at which the first feature region is located may be determined, as in Equation 2, by a weighted sum of the center points (C_(x,y)¹, C_(x,y)²) of the feature region candidates, weighted by their reliability values (p¹, p²).

According to the present invention, the width (Wⁿ) of the first feature region may be determined, as in Equation 2, by a weighted sum of the widths (W¹, W²) of the feature region candidates, weighted by their reliability values (p¹, p²), and the height (Hⁿ) of the first feature region may be determined by a weighted sum of the heights (H¹, H²) of the feature region candidates, weighted by the same reliability values (p¹, p²).

The present invention applies a reliability-weighted average to a plurality of boxes (feature region candidates) to set a single box having a large width and height as the first feature region (the face region of the companion animal), and the second feature region (nose region) for identifying the companion animal can then be detected within the first feature region. Setting a more extended first feature region in this way reduces the occurrence of errors in which the second feature region fails to be detected in the subsequent step.
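The reliability-weighted merge of candidates can be illustrated with a short sketch. Equation 2 is not reproduced in this passage, so the code assumes the weights are the reliability values normalized by their sum (a weighted average, as the paragraph above describes). The candidate layout `(cx, cy, w, h, p)` and the name `merge_candidates` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical candidate layout: (center_x, center_y, width, height, reliability).
Candidate = Tuple[float, float, float, float, float]


def merge_candidates(candidates: List[Candidate]) -> Tuple[float, float, float, float]:
    """Merge candidates into one box using a reliability-weighted average.

    Assumes normalization by the sum of reliability values; the exact form
    of Equation 2 may differ.
    """
    total = sum(c[4] for c in candidates)
    if total <= 0:
        raise ValueError("at least one candidate with positive reliability is required")
    # Each output component is the weighted average of the same component
    # across all candidates.
    cx = sum(c[0] * c[4] for c in candidates) / total
    cy = sum(c[1] * c[4] for c in candidates) / total
    w = sum(c[2] * c[4] for c in candidates) / total
    h = sum(c[3] * c[4] for c in candidates) / total
    return cx, cy, w, h
```

For example, merging `(10, 10, 20, 20, 0.75)` with `(14, 10, 28, 20, 0.25)` pulls the result toward the more reliable candidate while still accounting for the second.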

Thereafter, detection of the second feature region (e.g., the nose region) for identifying the companion animal is performed within the first feature region (e.g., the face region of a dog). This step takes into account the species of the companion animal determined in the preceding step. When the companion animal is a dog, object detection optimized for identifying dogs may be performed. The optimized object detection may differ for each type of animal.

Setting the second feature region includes setting the second feature region (e.g., the nose region) based on the probability that the object for identifying the companion animal (e.g., the nose) is located in the first feature region (e.g., the face region), according to the species of the companion animal.

In addition, post-processing may be performed to check whether the image of the identification object of the companion animal detected as the second feature region is suitable for subsequent learning or identification. As a result of the post-processing, a second feature value indicating the suitability of the image may be derived. When the second feature value is greater than a reference value, the image including the second feature region is transmitted to the server; when the second feature value is smaller than the reference value, the image including the second feature region is discarded.
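The send-or-discard decision can be sketched as a simple gate. The function name `route_object_image`, the `send` callback, and the use of raw bytes for the image are assumptions for illustration only. The text does not state what happens when the second feature value equals the reference value; this sketch treats that case as a discard.

```python
from typing import Callable


def route_object_image(
    image: bytes,
    fitness: float,
    reference: float,
    send: Callable[[bytes], None],
) -> bool:
    """Send the image if its fitness exceeds the reference value, else drop it.

    Returns True when the image was sent. Equality is treated as a discard,
    which is an assumption not specified in the source.
    """
    if fitness > reference:
        send(image)
        return True
    return False
```

Passing the transmission step in as a callback keeps the decision logic independent of how the image actually reaches the server.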

FIG. 10 is a block diagram of an apparatus for processing an image of an object for identification of a companion animal.

The camera 1010 may include an optical module such as a lens and a CCD (charge-coupled device) or CMOS (complementary metal-oxide semiconductor) sensor that generates an image signal from incoming light, and may generate image data through image capture and provide it to the processor 1020.

The processor 1020 controls each module of the apparatus 1000 and performs the operations required for image processing. The processor 1020 may be composed of a plurality of microprocessors (processing circuits) according to function. As described above, the processor 1020 may detect the object (e.g., the nose) for identifying the companion animal (e.g., a dog) and determine the validity of the image of that object.

The communication module 1030 may transmit data to or receive data from an external entity through a wired or wireless network. In particular, the communication module 1030 may exchange data for artificial-intelligence-based processing by communicating with a server for learning or identification.

Additionally, the apparatus 1000 may include various modules depending on its use, including a memory 1040 that stores image data and information required for image processing, and a display 1050 that outputs a screen to the user.

An apparatus for processing an image of an object for identification of a companion animal according to the present invention includes a camera 1010 that generates an image including the companion animal, and a processor 1020 that processes the image provided from the camera 1010 to generate an image of the object for identifying the companion animal. The processor 1020 may generate feature region candidates for determining the species of the companion animal in the image, set a first feature region whose position and size are determined based on the reliability value of each of the feature region candidates, set a second feature region including the object for identifying the companion animal within the first feature region, and acquire the image of the object from the second feature region.

According to the present invention, the processor 1020 may hierarchically generate a plurality of feature images from the image using an artificial neural network, apply predefined boundary regions to each of the feature images to calculate, for each boundary region, a probability value that a companion animal of a specific species is located there, and generate the feature region candidates in consideration of the probability values.

According to the present invention, the processor 1020 may select, from the feature image, a first boundary region having the highest probability of corresponding to a specific animal species and, for the remaining boundary regions in the feature image other than the selected boundary region, calculate a degree of overlap with the first boundary region in order of probability value and include each boundary region whose degree of overlap is greater than a reference degree of overlap in the feature region candidates of the feature image. The ratio of the intersection area to the union area of the two boundary regions may be used to calculate the degree of overlap.

According to the present invention, the center point at which the first feature region is located may be determined by a weighted sum of the center points of the feature region candidates, weighted by their reliability values.

According to the present invention, the width of the first feature region may be determined by a weighted sum of the widths of the feature region candidates, weighted by their reliability values, and the height of the first feature region may be determined by a weighted sum of the heights of the feature region candidates, weighted by their reliability values.

According to the present invention, the processor 1020 may detect the changed position of the object in the next image, determine whether the image of the object at its changed position in the next image is suitable for artificial-intelligence-based learning or identification, and control the camera 1010 to perform the next capture with the focus set to the changed position.

According to the present invention, the processor 1020 may set a first feature region for determining the species of the companion animal in the image, and set a second feature region including the object for identifying the companion animal within the first feature region.

According to the present invention, the processor 1020 may determine whether the image of the object for identifying the companion animal is suitable for artificial-intelligence-based learning or identification.

According to the present invention, the processor 1020 may determine whether the quality of the image of the object satisfies a reference condition, transmit the image of the object to the server when the quality satisfies the reference condition, and, when the quality does not satisfy the reference condition, discard the image of the object and control the camera 1010 to capture the next image.

The present embodiment and the drawings attached to this specification merely illustrate clearly a part of the technical idea included in the present invention, and it will be apparent that all modifications and specific embodiments that a person skilled in the art can readily infer within the scope of the technical idea contained in the specification and drawings fall within the scope of the present invention.

Therefore, the spirit of the present invention should not be limited to the described embodiments, and not only the claims set forth below but also everything that is equivalent to, or an equivalent modification of, those claims falls within the scope of the spirit of the present invention.

Claims

1. A method for processing an image of an object for identification of a companion animal, the method comprising:
acquiring an image including the companion animal;
generating feature region candidates for determining the species of the companion animal from the image;
generating a first feature region whose position and size are determined based on a weighted sum using reliability values, each indicating that a specific animal species is located in a respective one of the feature region candidates;
setting, based on the species of the companion animal, a second feature region including an object for identifying the companion animal within the first feature region; and
acquiring an image of the object in the second feature region,
wherein generating the first feature region comprises:
calculating the reliability value that a specific animal species is located in each of the feature region candidates through an operation on hierarchical feature images generated by an artificial neural network; and
generating the first feature region by merging the feature region candidates, each weighted according to its reliability value.

2. The method of claim 1,
wherein generating the feature region candidates comprises:
generating a plurality of feature images hierarchically using an artificial neural network;
calculating a probability value that a companion animal of a specific species is located in each of the boundary regions by applying predefined boundary regions to each of the feature images; and
generating the feature region candidates in consideration of the probability value.

3. The method of claim 2,
wherein generating the feature region candidates comprises:
selecting, from the feature image, a first boundary region having the highest probability of corresponding to a specific animal species; and
for boundary regions other than the selected first boundary region in the feature image, calculating a degree of overlap with the first boundary region in order of the probability values, and including each boundary region whose degree of overlap is greater than a reference degree of overlap in the feature region candidates of the feature image.

4. The method of claim 3,
wherein the degree of overlap is defined as the ratio of the intersection area to the union area of the first boundary region and each of the remaining boundary regions.

5. The method of claim 1,
wherein the center point at which the first feature region is located is determined by a weighted sum of the center points of the feature region candidates, weighted by their reliability values.

6. The method of claim 1,
wherein the width of the first feature region is determined by a weighted sum of the widths of the feature region candidates, weighted by their reliability values, and
the height of the first feature region is determined by a weighted sum of the heights of the feature region candidates, weighted by their reliability values.

7. An apparatus for processing an image of an object for identification of a companion animal, the apparatus comprising:
a camera that generates an image including the companion animal; and
a processor that processes the image provided from the camera to generate an image of an object for identification of the companion animal,
wherein the processor is configured to:
generate feature region candidates for determining the species of the companion animal from the image;
set a first feature region whose position and size are determined based on a weighted sum using reliability values, each indicating that a specific animal species is located in a respective one of the feature region candidates;
set, based on the species of the companion animal, a second feature region including an object for identifying the companion animal within the first feature region; and
acquire an image of the object in the second feature region,
wherein, to set the first feature region, the processor is configured to:
calculate the reliability value that a specific animal species is located in each of the feature region candidates through an operation on hierarchical feature images generated by an artificial neural network; and
generate the first feature region by merging the feature region candidates, each weighted according to its reliability value.

8. The apparatus of claim 7,
wherein the processor is configured to:
generate a plurality of feature images hierarchically using an artificial neural network;
calculate a probability value that a companion animal of a specific species is located in each of the boundary regions by applying predefined boundary regions to each of the feature images; and
generate the feature region candidates in consideration of the probability value.

9. The apparatus of claim 8,
wherein the processor is configured to:
select, from the feature image, a first boundary region having the highest probability of corresponding to a specific animal species; and
for boundary regions other than the selected first boundary region in the feature image, calculate a degree of overlap with the first boundary region in order of the probability values, and include each boundary region whose degree of overlap is greater than a reference degree of overlap in the feature region candidates of the feature image.

10. The apparatus of claim 9,
wherein the degree of overlap is defined as the ratio of the intersection area to the union area of the first boundary region and each of the remaining boundary regions.

11. The apparatus of claim 7,
wherein the center point at which the first feature region is located is determined by a weighted sum of the center points of the feature region candidates, weighted by their reliability values.

12. The apparatus of claim 7,
wherein the width of the first feature region is determined by a weighted sum of the widths of the feature region candidates, weighted by their reliability values, and
the height of the first feature region is determined by a weighted sum of the heights of the feature region candidates, weighted by their reliability values.