KR102414153B1

KR102414153B1 - Method and apparatus for augmenting learning data for object recognition

Info

Publication number: KR102414153B1
Application number: KR1020220000272A
Authority: KR
Inventors: 허지성; 박지훈
Original assignee: 국방과학연구소
Priority date: 2022-01-03
Filing date: 2022-01-03
Publication date: 2022-06-28

Abstract

A method for augmenting learning data for object recognition according to the present invention may include: collecting object images, labels, and object masks, each object mask being a binary image containing an object region; calculating the pairwise similarity of the collected object masks; selecting, based on the result of the similarity calculation, a specific original object to be replaced with an augmented object from among the original objects in the object image; selecting an augmented object based on its similarity to the selected specific original object; deleting the region of the selected specific original object and overwriting the deleted region with the corresponding image region of the selected augmented object; and restoring (inpainting) the erased residual background region within the region of the specific original object.

Description

Method and device for augmenting learning data for object recognition

The present invention relates to a technique for augmenting learning data for object recognition and, more particularly, to a method and apparatus for augmenting learning data for object recognition in which the region of a specific object in a specific image (binary image) is replaced with another object (augmented object) similar to that object.

In general, training an AI-based algorithm requires a large amount of data, and securing such a large amount of learning data takes considerable cost and time.

A common approach is to increase the generalization performance of AI algorithms by transforming the learning data.

Existing methods augment object recognition learning data by erasing a specific object region (Cutout), mixing it with other object regions (MixUp, CutMix), or resizing an object region and pasting it onto another image (Copy-Paste).

These existing augmentation methods are known to be effective when the objects are highly diverse, but they are less effective for problems that require distinguishing among several similar but different kinds of objects (for example, military vehicle identification).

Korean Patent Registration No. 10-1779782 (announcement date: 2017. 10. 11.)
Korean Patent Laid-Open Publication No. 10-2018-0069588 (publication date: 2018. 06. 25.)

An object of the present invention is to provide a method and apparatus for augmenting learning data for object recognition in which the region of a specific object in a specific image (binary image) is replaced with another object (augmented object) similar to that object.

Another object of the present invention is to provide a method and apparatus for augmenting learning data for object recognition capable of augmenting object recognition learning data with new data.

The problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned will be clearly understood from the following description by those of ordinary skill in the art to which the present invention pertains.

According to one aspect, the present invention may provide a method for augmenting learning data for object recognition, the method including: collecting object images, labels, and object masks, each object mask being a binary image containing an object region; calculating the pairwise similarity of the collected object masks; selecting, based on the result of the similarity calculation, a specific original object to be replaced with an augmented object from among the original objects in the object image; selecting an augmented object based on its similarity to the selected specific original object; deleting the region of the selected specific original object and overwriting the deleted region with the corresponding image region of the selected augmented object; and restoring (inpainting) the erased residual background region within the region of the specific original object.

The similarity may be calculated based on a distance metric between a pair of object masks.

The similarity may be calculated based on a similarity metric between a pair of object masks.

The similarity may be calculated by measuring the similarity of the feature vectors of a pair of object masks obtained through an artificial neural network model.

In the selecting of the specific original object, an original object whose object region has a width/height greater than or equal to, or less than or equal to, a preset specific size may be selected as the specific original object.

In the selecting of the augmented object, objects of the same class may be excluded so that only an original object of a different class is selected as the augmented object.

In the selecting of the augmented object, the augmented object may be selected according to a preset specific criterion applied to the calculated similarity.

According to another aspect, the present invention may provide a computer-readable recording medium storing a computer program, wherein the computer program, when executed by a processor, performs: collecting object images, labels, and object masks, each object mask being a binary image containing an object region; calculating the pairwise similarity of the collected object masks; selecting, based on the result of the similarity calculation, a specific original object to be replaced with an augmented object from among the original objects in the object image; selecting an augmented object based on its similarity to the selected specific original object; deleting the region of the selected specific original object and overwriting the deleted region with the corresponding image region of the selected augmented object; and restoring (inpainting) the erased residual background region within the region of the specific original object.

According to yet another aspect, the present invention may provide a computer program stored in a computer-readable recording medium, wherein the computer program includes instructions that, when executed by a processor, cause the processor to perform operations including: collecting object images, labels, and object masks, each object mask being a binary image containing an object region; calculating the pairwise similarity of the collected object masks; selecting, based on the result of the similarity calculation, a specific original object to be replaced with an augmented object from among the original objects in the object image; selecting an augmented object based on its similarity to the selected specific original object; deleting the region of the selected specific original object and overwriting the deleted region with the corresponding image region of the selected augmented object; and restoring (inpainting) the erased residual background region within the region of the specific original object.

According to yet another aspect, the present invention may provide an apparatus for augmenting learning data for object recognition, the apparatus including: a mask collection unit that collects object images, labels, and object masks, each object mask being a binary image containing an object region; a similarity calculation unit that calculates the pairwise similarity of the collected object masks; a target object selection unit that selects, based on the result of the similarity calculation, a specific original object to be replaced with an augmented object from among the original objects in the object image; an augmented object selection unit that selects an augmented object based on its similarity to the selected specific original object; an object replacement unit that deletes the region of the selected specific original object and overwrites the deleted region with the corresponding image region of the selected augmented object; and a region restoration unit that restores (inpaints) the erased residual background region within the region of the specific original object.

The similarity calculation unit may calculate the similarity based on a distance metric between a pair of object masks.

The similarity calculation unit may calculate the similarity based on a similarity metric between a pair of object masks.

The similarity calculation unit may calculate the similarity by measuring the similarity of the feature vectors of a pair of object masks obtained through an artificial neural network model.

The target object selection unit may select, as the specific original object, an original object whose object region has a width/height greater than or equal to, or less than or equal to, a preset specific size.

The augmented object selection unit may exclude objects of the same class and select only an original object of a different class as the augmented object.

The augmented object selection unit may select the augmented object according to a preset specific criterion applied to the calculated similarity.

According to an embodiment of the present invention, data can be augmented by replacing the region of a specific object in a specific image (binary image) with another object (augmented object) similar to that object, which can increase the learning effect of an AI object recognition algorithm.

According to an embodiment of the present invention, augmented data can be generated in addition to the existing dataset by swapping the regions of many similar objects in the learning data with one another, which can further increase the learning effect of an AI object recognition algorithm.

FIG. 1 is a block diagram of an apparatus for augmenting learning data for object recognition according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating the main process of augmenting learning data for object recognition according to an embodiment of the present invention.
FIG. 3 is an exemplary diagram schematically showing the process of creating a two-dimensional matrix M for the similarity calculation.
FIG. 4 is an exemplary diagram schematically showing the process of obtaining an intersection matrix from the two-dimensional matrix M.
FIG. 5 is an exemplary diagram schematically showing the process of obtaining a union matrix from the intersection matrix.
FIG. 6 is an exemplary diagram schematically showing the process of creating an IoU matrix using the intersection matrix and the union matrix.
FIG. 7 is a photograph and exemplary diagram schematically showing the calculation of the similarity of two different object regions using an encoder network.
FIG. 8 is a photograph exemplarily showing the augmentation of an object recognition dataset using a selected augmented object (augmented image) after the original object region has been deleted.

The advantages and features of the present invention, and the methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various forms; these embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the present invention pertains are fully informed of the scope of the invention, which is defined only by the claims.

In describing the embodiments of the present invention, detailed descriptions of well-known functions or configurations are omitted unless they are actually necessary. The terms used below are defined in consideration of their functions in the embodiments of the present invention and may vary according to the intention or custom of users and operators. Their definitions should therefore be based on the content of this specification as a whole.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings.

This embodiment proposes a method of augmenting data by replacing the region of a specific object in a specific image (binary image) with another object similar to that object.

The object region of an image is covered with another object, and the remaining part of the original object region is then painted to resemble the background (inpainted), producing a new augmented image.

So that the erased object region and the newly overlaid image region are similar, the similarity between object regions in the learning data is measured with a specific metric, and substitution may be permitted only when this metric is at or above a specific value (for example, mIoU 0.8).

The similarity may be obtained from a distance metric between masks (for example, the Manhattan distance), from a similarity metric between masks (for example, Intersection over Union), or by measuring the similarity of the feature vectors of two object images obtained through an artificial neural network model.
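The first two options can be illustrated on a pair of small binary masks. The following is a minimal sketch, assuming NumPy and same-sized masks; the function names `manhattan_distance` and `iou` are illustrative and do not come from the patent text.

```python
import numpy as np

def manhattan_distance(mask_a: np.ndarray, mask_b: np.ndarray) -> int:
    """L1 distance between two same-sized binary masks (count of differing pixels)."""
    a = mask_a.astype(np.int64)
    b = mask_b.astype(np.int64)
    return int(np.abs(a - b).sum())

def iou(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Intersection over Union of two same-sized binary masks."""
    a = mask_a.astype(bool)
    b = mask_b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0  # two empty masks: treated as no overlap in this sketch
    return float(np.logical_and(a, b).sum() / union)

mask_a = np.array([[1, 1, 0],
                   [1, 1, 0],
                   [0, 0, 0]])
mask_b = np.array([[0, 1, 1],
                   [0, 1, 1],
                   [0, 0, 0]])

print(manhattan_distance(mask_a, mask_b))  # smaller means more similar
print(iou(mask_a, mask_b))                 # larger means more similar
```

Note the opposite orientation of the two measures: a distance decreases as masks become more alike, while IoU increases, so any threshold has to be applied in the matching direction.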

FIG. 1 is a block diagram of an apparatus for augmenting learning data for object recognition according to an embodiment of the present invention.

Referring to FIG. 1, the apparatus for augmenting learning data for object recognition according to this embodiment may include a mask collection unit 102, a similarity calculation unit 104, a target object selection unit 106, an augmented object selection unit 108, an object replacement unit 110, a region restoration unit 112, and an augmented object DB 114.

The mask collection unit 102 may collect object images, labels (object class, horizontal and vertical extent, etc.), and object masks, each object mask being a binary image containing an object region.

The similarity calculation unit 104 may calculate the similarity of each pair of object masks (object pair) collected through the mask collection unit 102.

The similarity calculation unit 104 may calculate the similarity based on a distance metric between a pair of object masks, based on a similarity metric between a pair of object masks, or by measuring the similarity of the feature vectors of a pair of object masks obtained through an artificial neural network model.

Similarity must be calculated for every pair of collected object masks, and IoU (Intersection over Union) can be used as one way of doing so. Here, IoU is a metric that quantifies the degree of overlap between two masks. IoU may be calculated as shown in the following figure.
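The figure referred to is not reproduced in this text. As a reminder, the standard definition of IoU for two masks can be written as follows, where the symbols $X$ and $Y$ (the sets of foreground pixels of the two masks) are notation introduced here for illustration.

```latex
\mathrm{IoU}(X, Y) = \frac{|X \cap Y|}{|X \cup Y|}
                   = \frac{|X \cap Y|}{|X| + |Y| - |X \cap Y|}
% Worked example: |X| = 4, |Y| = 4, |X \cap Y| = 2
% gives IoU = 2 / (4 + 4 - 2) = 1/3.
```

The second form matters for what follows: the union never has to be counted directly, because it can be derived from the two mask areas and the intersection.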

This embodiment presents a method for performing the IoU comparison efficiently. When the number of object regions is $N$, the IoU value must be calculated for every pair of objects, so the number of IoU calculations is $O(N^2)$. Because this is a considerable amount of computation, it is carried out efficiently as follows.
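For reference, the straightforward baseline that the efficient method is meant to replace can be sketched as a nested loop over all pairs. This is an illustrative NumPy version written for this text, not code from the patent; `pairwise_iou_naive` is a hypothetical name.

```python
import numpy as np

def pairwise_iou_naive(masks: np.ndarray) -> np.ndarray:
    """Compute the (N, N) IoU matrix of N same-sized binary masks with two loops.

    masks has shape (N, H, W); the loop body runs N * N times.
    """
    n = masks.shape[0]
    flat = masks.reshape(n, -1).astype(bool)
    result = np.zeros((n, n), dtype=np.float64)
    for i in range(n):
        for j in range(n):
            inter = np.logical_and(flat[i], flat[j]).sum()
            union = np.logical_or(flat[i], flat[j]).sum()
            result[i, j] = inter / union if union > 0 else 0.0
    return result

masks = np.array([
    [[1, 1], [0, 0]],
    [[1, 0], [1, 0]],
    [[1, 1], [1, 1]],
])
naive = pairwise_iou_naive(masks)
print(naive)
```

Each iteration touches every pixel of two masks, so the total work grows with both the square of the number of masks and the mask size.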

FIG. 3 is an exemplary diagram schematically showing the process of creating the two-dimensional matrix M used for the similarity calculation. The procedure is as follows.

1. Load the masks of all objects in the dataset.

2. Resize all masks to the same size, then flatten each into one dimension.

3. Stack the $N$ flattened masks to form a two-dimensional matrix $M$ of shape $N \times D$, where $D$ is the length of each flattened mask.
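The three steps above can be sketched as follows. This is a minimal illustration assuming NumPy, masks already loaded as 2-D 0/1 arrays, and a simple nearest-neighbour resize; the helper names `resize_nearest` and `build_mask_matrix` are hypothetical, and the default size of 224 is taken from the 224x224 masks mentioned in the timing comparison.

```python
import numpy as np

def resize_nearest(mask: np.ndarray, size: int) -> np.ndarray:
    """Nearest-neighbour resize of a 2-D mask to (size, size)."""
    h, w = mask.shape
    rows = np.arange(size) * h // size   # source row for each output row
    cols = np.arange(size) * w // size   # source column for each output column
    return mask[np.ix_(rows, cols)]

def build_mask_matrix(masks, size: int = 224) -> np.ndarray:
    """Resize every mask, flatten it, and stack the results into an (N, size*size) matrix."""
    flat = [resize_nearest(np.asarray(m), size).reshape(-1) for m in masks]
    return np.stack(flat).astype(np.float32)

# Two masks of different original sizes (step 1 would load these from the dataset).
masks = [np.ones((3, 5), dtype=np.uint8), np.zeros((8, 2), dtype=np.uint8)]
M = build_mask_matrix(masks, size=4)
print(M.shape)
```

Resizing to a common size is what makes the masks comparable pixel by pixel; it also means the resulting similarity reflects shape rather than absolute size.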

FIG. 4 is an exemplary diagram schematically showing the process of obtaining an intersection matrix from the two-dimensional matrix M.

Referring to FIG. 4, computing $M M^{\top}$ yields a matrix of size $N \times N$, denoted here by $I$. The entry $I_{ij}$ is the size of the intersection region of the $i$-th mask and the $j$-th mask.
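A small numeric check makes the matrix product concrete. In this sketch (NumPy assumed, variable names chosen here for illustration), each row of `M` is one flattened binary mask; because the entries are 0 or 1, the dot product of two rows counts the pixels set in both masks.

```python
import numpy as np

# Three flattened binary masks of length 4 (one mask per row).
M = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 1],
], dtype=np.float32)

# Entry (i, j) is the number of pixels where mask i and mask j are both 1.
intersection = M @ M.T
print(intersection)
```

The diagonal entries are each mask's dot product with itself, which is simply that mask's area.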

FIG. 5 is an exemplary diagram schematically showing the process of obtaining a union matrix from the intersection matrix.

Referring to FIG. 5, the diagonal entry $I_{ii}$ of the intersection matrix $I = M M^{\top}$ is the size of the $i$-th mask region, so extracting the diagonal vector $\mathrm{diag}(I)$ and repeating it in the horizontal and vertical directions gives sum matrix $A$ and sum matrix $B$. The union matrix $U$ is then obtained by computing $U = A + B - I$. The entry $U_{ij}$ is the size of the union region of the $i$-th mask and the $j$-th mask.
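The repeat-and-subtract step can be sketched as below, assuming NumPy and starting from a small intersection matrix whose diagonal holds the mask areas. The orientation of `A` (areas repeated along rows) versus `B` (areas repeated along columns) is a choice made for this illustration, and subtracting the intersection matrix follows from the identity that a union equals the two areas minus their overlap.

```python
import numpy as np

# Intersection matrix of three masks; the diagonal holds each mask's area.
intersection = np.array([
    [2, 1, 2],
    [1, 2, 2],
    [2, 2, 4],
], dtype=np.float32)

areas = np.diag(intersection)                      # [2, 2, 4]
n = len(areas)
A = np.repeat(areas[:, None], n, axis=1)           # A[i, j] = area of mask i
B = np.repeat(areas[None, :], n, axis=0)           # B[i, j] = area of mask j
U = A + B - intersection                           # U[i, j] = size of the union
print(U)
```

With broadcasting, `areas[:, None] + areas[None, :]` gives the same sum without materializing `A` and `B`; the explicit repeat is shown only to mirror the description.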

FIG. 6 is an exemplary diagram schematically showing the process of creating an IoU matrix using the intersection matrix and the union matrix.

Referring to FIG. 6, dividing the intersection matrix element-wise by the union matrix $U$ produces the IoU matrix $R$, where $R_{ij}$ is the IoU value of the $i$-th and $j$-th masks.
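Putting the pieces together, the whole IoU matrix can be computed without any explicit loop. The following is a sketch under the assumption that the masks are already stacked as rows of a 0/1 matrix; it runs on the CPU with NumPy, whereas the timing reported in this document was measured on a GPU, and `pairwise_iou_matrix` is a name chosen here.

```python
import numpy as np

def pairwise_iou_matrix(M: np.ndarray) -> np.ndarray:
    """Return the (N, N) IoU matrix for N flattened binary masks stacked as rows of M."""
    M = M.astype(np.float32)
    inter = M @ M.T                                     # pairwise intersection sizes
    areas = np.diag(inter)                              # individual mask areas
    union = areas[:, None] + areas[None, :] - inter     # pairwise union sizes
    # Element-wise division; pairs with an empty union are left at 0.
    return np.divide(inter, union, out=np.zeros_like(inter), where=union > 0)

M = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 1],
])
R = pairwise_iou_matrix(M)
print(R)
```

The speed-up comes from replacing many small per-pair operations with one large matrix multiplication, which is the kind of workload that numerical libraries and GPUs handle efficiently.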

The algorithm built through the series of steps described above can run hundreds of thousands of times faster than an IoU computation using an ordinary nested for loop, as shown in the table below.

GPU (proposed algorithm)    CPU (nested for loop)
0.00213 seconds             506.11142 seconds

(Measured with one Titan XP GPU; IoU matrix computation time compared for 1,920 object masks of size 224x224; a 237,394-fold difference.)

FIG. 7 is a photograph and exemplary diagram schematically showing the calculation of the similarity of two different object regions using an encoder network.

Referring to FIG. 7, the similarity may be calculated from embedding vectors produced by a neural network encoder.

That is, an image region is represented as a vector in an embedding space by the neural network encoder, and the distance between such vectors (for example, the Euclidean distance) is calculated and used as the similarity.
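The distance computation itself can be sketched independently of any particular network. In the example below, `toy_encoder` is a hypothetical stand-in (a fixed random projection) used only so that the code runs; a trained encoder would take its place in practice, and the source does not specify which architecture is used.

```python
import numpy as np

def toy_encoder(images: np.ndarray, dim: int = 8, seed: int = 0) -> np.ndarray:
    """Hypothetical stand-in for a trained encoder: a fixed random projection to `dim` values."""
    rng = np.random.default_rng(seed)
    flat = images.reshape(images.shape[0], -1).astype(np.float64)
    weights = rng.standard_normal((flat.shape[1], dim)) / np.sqrt(flat.shape[1])
    return np.tanh(flat @ weights)

def pairwise_euclidean(embeddings: np.ndarray) -> np.ndarray:
    """Return the (N, N) matrix of Euclidean distances between embedding vectors."""
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

rng = np.random.default_rng(1)
crops = rng.random((4, 16, 16, 3))        # four illustrative object crops
embeddings = toy_encoder(crops)
distances = pairwise_euclidean(embeddings)
print(distances.shape)
```

Because this is a distance, smaller values indicate more similar objects, so a selection rule based on it would accept pairs below a threshold rather than above one.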

Based on the result of the similarity calculation, the target object selection unit 106 may select, from among the original objects in the object image, a specific original object (the original image to be augmented) to be replaced with an augmented object.

The target object selection unit 106 may select, as the specific original object, an original object whose object region has a width/height greater than or equal to, or less than or equal to, a preset specific size (for example, 200 pixels).
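The size criterion can be expressed as a simple filter. This sketch assumes that each object carries its region width and height in pixels and that both dimensions must satisfy the condition; the source does not say whether both or either is required, and the `ObjectRecord` type and `select_targets` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ObjectRecord:
    """Hypothetical record for one labelled object region."""
    object_id: int
    class_name: str
    width: int
    height: int

def select_targets(objects: List[ObjectRecord], size: int = 200,
                   mode: str = "at_least") -> List[ObjectRecord]:
    """Keep objects whose width and height are both at least (or at most) `size` pixels."""
    if mode == "at_least":
        return [o for o in objects if o.width >= size and o.height >= size]
    if mode == "at_most":
        return [o for o in objects if o.width <= size and o.height <= size]
    raise ValueError("mode must be 'at_least' or 'at_most'")

objects = [
    ObjectRecord(0, "class_a", 250, 300),
    ObjectRecord(1, "class_b", 150, 400),
    ObjectRecord(2, "class_c", 120, 90),
]
large = select_targets(objects, size=200, mode="at_least")
small = select_targets(objects, size=200, mode="at_most")
print([o.object_id for o in large], [o.object_id for o in small])
```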

The augmented object selection unit 108 may select an augmented object (augmented image) based on its similarity to the specific original object selected by the target object selection unit 106.

The augmented object selection unit 108 may exclude objects of the same class and select only an original object of a different class as the augmented object, or may select the augmented object using a preset specific criterion applied to the calculated similarity (for example, IoU of 0.8 or higher). For this purpose, data (image data) for a large number of augmented objects may be stored in the augmented object DB 114.
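One way to apply these criteria is sketched below, assuming a precomputed IoU matrix and a class label per object. The source presents the class rule and the similarity threshold as alternatives; this illustration applies both together and then picks one candidate at random, which are choices made here rather than details given in the text. The function name and class labels are hypothetical.

```python
import numpy as np

def select_augmented(target_idx, iou_matrix, classes, threshold=0.8, rng=None):
    """Pick one object of a different class whose IoU with the target meets the threshold.

    Returns the chosen index, or None when no object qualifies.
    """
    if rng is None:
        rng = np.random.default_rng()
    classes = np.asarray(classes)
    eligible = (iou_matrix[target_idx] >= threshold) & (classes != classes[target_idx])
    candidates = np.flatnonzero(eligible)
    if candidates.size == 0:
        return None
    return int(rng.choice(candidates))

iou_matrix = np.array([
    [1.00, 0.95, 0.85, 0.40],
    [0.95, 1.00, 0.82, 0.30],
    [0.85, 0.82, 1.00, 0.50],
    [0.40, 0.30, 0.50, 1.00],
])
classes = ["type_a", "type_a", "type_b", "type_c"]  # illustrative labels
chosen = select_augmented(0, iou_matrix, classes, rng=np.random.default_rng(0))
print(chosen)
```

Excluding the target's own class also excludes the target itself, whose IoU with itself is always 1.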

The object replacement unit 110 may delete the region of the specific original object selected by the target object selection unit 106 and overwrite (replace) the deleted region with the corresponding image region of the augmented object selected by the augmented object selection unit 108. That is, the object replacement unit 110 resizes the augmented object to a size similar to that of the original object and then overwrites the region with the augmented object image.
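The delete, resize, and overwrite sequence can be sketched with array operations. This example assumes NumPy, an image of shape (H, W, 3), a full-image binary mask for the original object, and a bounding box given as (top, left, height, width); the function names and the use of nearest-neighbour resizing are illustrative choices, not details from the source.

```python
import numpy as np

def resize_nearest(arr: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbour resize of a 2-D mask or 3-D image to (out_h, out_w)."""
    rows = np.arange(out_h) * arr.shape[0] // out_h
    cols = np.arange(out_w) * arr.shape[1] // out_w
    return arr[rows][:, cols]

def replace_object(image, orig_mask, bbox, aug_crop, aug_mask):
    """Erase the original object, paste the resized augmented object into its bounding box.

    Returns the new image and a mask of erased pixels that the pasted object did not cover.
    """
    top, left, h, w = bbox
    out = image.copy()
    erased = orig_mask.astype(bool)
    out[erased] = 0                                        # delete the original object
    crop = resize_nearest(aug_crop, h, w)
    crop_mask = resize_nearest(aug_mask, h, w).astype(bool)
    window = out[top:top + h, left:left + w]               # view into `out`
    window[crop_mask] = crop[crop_mask]                    # overwrite with the new object
    pasted = np.zeros(erased.shape, dtype=bool)
    pasted[top:top + h, left:left + w] = crop_mask
    residual = erased & ~pasted                            # still needs inpainting
    return out, residual

image = np.full((6, 6, 3), 100, dtype=np.uint8)
orig_mask = np.zeros((6, 6), dtype=np.uint8)
orig_mask[1:5, 1:5] = 1
aug_crop = np.full((2, 2, 3), 200, dtype=np.uint8)
aug_mask = np.array([[1, 0], [1, 1]], dtype=np.uint8)
new_image, residual = replace_object(image, orig_mask, (1, 1, 4, 4), aug_crop, aug_mask)
print(int(residual.sum()))
```

The returned residual mask marks exactly the pixels that were erased but not covered again, which is the region the subsequent restoration step has to fill.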

FIG. 8 is a photograph exemplarily showing the augmentation of an object recognition dataset using a selected augmented object (augmented image) after the original object region has been deleted.

Referring to FIG. 8, the region restoration unit 112 may restore (inpaint) the erased residual background region within the region of the specific original object.

That is, the region restoration unit 112 may use an inpainting algorithm to paint the region left over after the original object (original image) is deleted so that it blends naturally with the background.

The inpainting algorithm may be a conventional method such as Fast Marching or Navier-Stokes, and a neural-network-based inpainting algorithm may also be used.
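To show where inpainting fits, the sketch below fills the residual pixels by repeatedly averaging their four neighbours. This is a deliberately simple diffusion fill written for this text; it is not the Fast Marching or Navier-Stokes method named above. In practice a library routine such as OpenCV's `cv2.inpaint` (with the `INPAINT_TELEA` or `INPAINT_NS` flag) would be a more likely choice, though the source does not name a specific implementation.

```python
import numpy as np

def diffuse_inpaint(image: np.ndarray, hole: np.ndarray, iterations: int = 200) -> np.ndarray:
    """Fill `hole` pixels of an (H, W, 3) image by iterated 4-neighbour averaging."""
    out = image.astype(np.float64).copy()
    hole = hole.astype(bool)
    out[hole] = out[~hole].mean(axis=0)          # start from the mean background colour
    for _ in range(iterations):
        padded = np.pad(out, ((1, 1), (1, 1), (0, 0)), mode="edge")
        neighbour_mean = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                          padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
        out[hole] = neighbour_mean[hole]         # only hole pixels are updated
    return out.astype(image.dtype)

# Horizontal gradient background with one erased pixel in the middle.
row = (np.arange(5) * 10).astype(np.uint8)
image = np.tile(row[None, :, None], (5, 1, 3))
hole = np.zeros((5, 5), dtype=bool)
hole[2, 2] = True
image[hole] = 0
filled = diffuse_inpaint(image, hole)
print(filled[2, 2])
```

Known pixels are never modified, so the fill only propagates surrounding background values inward.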

The finished augmented image can then be used in the training dataset of a deep learning model.

FIG. 2 is a flowchart illustrating the main process of augmenting learning data for object recognition according to an embodiment of the present invention.

Referring to FIG. 2, the mask collection unit 102 collects object images, labels (object class, horizontal and vertical extent, etc.), and object masks, each object mask being a binary image containing an object region (step 202).

The similarity calculation unit 104 calculates the similarity of each pair of object masks (object pair) collected through the mask collection unit 102 (step 204).

To do so, the similarity calculation unit 104 may calculate the similarity based on a distance metric between a pair of object masks, based on a similarity metric between a pair of object masks, or by measuring the similarity of the feature vectors of a pair of object masks obtained through an artificial neural network model.

Based on the result of the similarity calculation, the target object selection unit 106 selects, from among the original objects in the object image, a specific original object (the original image to be augmented) to be replaced with an augmented object (step 206).

For example, the target object selection unit 106 may select, as the specific original object, an original object whose object region has a width/height greater than or equal to, or less than or equal to, a preset specific size (for example, 200 pixels).

증강 물체 선정부(108)에서는 대상 물체 선정부(106)에 의해 선정된 특정 원본 물체와의 유사도를 기준으로 증강 물체(증강 이미지)를 선정하는데(단계 208), 예컨대 동일한 클래스를 제외하고 다른 클래스를 갖는 원본 물체만을 증강 물체를 선정하거나 혹은 계산된 유사도에 대해 기 설정된 특정 기준(예컨대, IoU 0.8 이상 등)을 활용하여 증강 물체를 선정할 수 있다.The augmented object selector 108 selects an augmented object (augmented image) based on the similarity with the specific original object selected by the target object selector 106 (step 208), for example, a class other than the same class. The augmented object may be selected only from the original object having .

물체 치환부(110)에서는 대상 물체 선정부(106)를 통해 선정된 특정 원본 물체의 영역을 삭제하고, 그 삭제 영역에 증강 물체 선정부(108)를 통해 선정된 증강 물체의 대응 이미지 영역을 덮어씌운다(물체 치환)(단계 210). 이때, 물체 치환부(110)에서는 증강 물체를 원본 물체와 유사한 크기로 조절(resize)한 뒤 증강 물체 이미지 영역을 덮어쓴다.The object replacement unit 110 deletes the area of the specific original object selected through the target object selection unit 106 and covers the deleted area with the corresponding image area of the augmented object selected through the augmented object selection unit 108 Cover (substitute object) (step 210). At this time, the object replacement unit 110 overwrites the augmented object image area after resizing the augmented object to a size similar to that of the original object.

The region restoration unit 112 inpaints the residual background region that was erased within the region of the specific original object (step 212). An inpainting algorithm may be used to fill the area left over after the original object (original image) is deleted so that it blends naturally with the background.

Here, the inpainting algorithm may be a conventional method such as fast marching or Navier-Stokes, and an inpainting algorithm based on a neural network may also be used.
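The fast marching and Navier-Stokes methods named above are available in OpenCV as `cv2.inpaint` with the `cv2.INPAINT_TELEA` and `cv2.INPAINT_NS` flags. To stay dependency-free, the sketch below instead uses a much simpler stand-in: it fills the hole layer by layer with the mean of the already known 4-neighbors. It is not an implementation of either named method and only illustrates the idea of propagating background values into the erased region; `fill_hole` is a hypothetical name.

```python
def fill_hole(image, hole):
    """Fill hole pixels layer by layer with the mean of known 4-neighbors."""
    h, w = len(image), len(image[0])
    out = [[float(v) for v in row] for row in image]
    unknown = {(y, x) for y in range(h) for x in range(w) if hole[y][x]}
    while unknown:
        updates = {}
        for y, x in unknown:
            known = [out[ny][nx]
                     for ny, nx in ((y - 1, x), (y + 1, x),
                                    (y, x - 1), (y, x + 1))
                     if 0 <= ny < h and 0 <= nx < w
                     and (ny, nx) not in unknown]
            if known:
                updates[(y, x)] = sum(known) / len(known)
        if not updates:
            break  # the whole image is a hole; nothing to propagate from
        for (y, x), value in updates.items():
            out[y][x] = value
            unknown.discard((y, x))
    return out


# Toy image with one erased pixel in the center.
image = [[10, 8, 10], [10, 0, 10], [10, 12, 10]]
hole = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
restored = fill_hole(image, hole)
print(restored[1][1])  # mean of 8, 12, 10, and 10
```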

Meanwhile, combinations of the blocks in the accompanying block diagram and the steps in the flowchart may be performed by computer program instructions. Because these computer program instructions may be loaded onto a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing equipment, the instructions executed by that processor create means for performing the functions described in each block of the block diagram or each step of the flowchart.

These computer program instructions may also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment to implement a function in a particular manner, so that the instructions stored in the computer-usable or computer-readable memory can produce an article of manufacture containing instruction means that perform the function described in each block of the block diagram or each step of the flowchart.

The computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps is performed on that equipment to produce a computer-executed process, and the instructions that run on the computer or other programmable data processing equipment provide steps for executing the functions described in each block of the block diagram and each step of the flowchart.

In addition, each block or each step may represent a module, a segment, or a portion of code that includes one or more executable instructions for executing the specified logical function(s). It should also be noted that, in some alternative embodiments, the functions mentioned in the blocks or steps may occur out of order. For example, two blocks or steps shown in succession may in fact be performed substantially simultaneously, or they may sometimes be performed in reverse order, depending on the functions involved.

The above description merely illustrates the technical idea of the present invention, and a person of ordinary skill in the art to which the present invention pertains will readily understand that various substitutions, modifications, and changes are possible without departing from the essential characteristics of the present invention. That is, the embodiments disclosed herein are intended to explain, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments.

Accordingly, the scope of protection of the present invention should be construed according to the claims below, and all technical ideas within a scope equivalent thereto should be construed as falling within the scope of rights of the present invention.

102: mask collection unit
104: similarity calculator
106: target object selection unit
108: augmented object selection unit
110: object replacement unit
112: region restoration unit
114: augmented object DB

Claims

A method of augmenting training data for object recognition, the method comprising:
collecting an object mask, which is a binary image including an object image, a label, and an object region;
calculating the pairwise similarity of the collected object masks, either using the IoU (Intersection over Union) metric or by measuring the similarity of feature vectors of the object mask pairs obtained through an artificial neural network model;
selecting, based on the similarity calculation result, a specific original object to be replaced with an augmented object from among the original objects in the object image;
selecting an augmented object based on its similarity to the selected specific original object;
deleting the region of the selected specific original object and overwriting the deleted region with the corresponding image region of the selected augmented object; and
inpainting the erased residual background region within the region of the specific original object.

(Deleted)

The method of claim 1,
wherein the step of selecting the specific original object comprises
selecting, as the specific original object, an original object whose region has a horizontal and vertical size greater than or equal to a preset size.

The method of claim 1,
wherein the step of selecting the augmented object comprises
selecting, as the augmented object, only an original object whose class differs from that of the specific original object.

The method of claim 1,
wherein the step of selecting the augmented object comprises
selecting the augmented object based on a preset criterion applied to the calculated similarity.

A computer-readable recording medium storing a computer program,
wherein the computer program, when executed by a processor, includes instructions for causing the processor to perform operations comprising:
collecting an object mask, which is a binary image including an object image, a label, and an object region;
calculating the pairwise similarity of the collected object masks, either using the IoU (Intersection over Union) metric or by measuring the similarity of feature vectors of the object mask pairs obtained through an artificial neural network model;
selecting, based on the similarity calculation result, a specific original object to be replaced with an augmented object from among the original objects in the object image;
selecting an augmented object based on its similarity to the selected specific original object;
deleting the region of the selected specific original object and overwriting the deleted region with the corresponding image region of the selected augmented object; and
inpainting the erased residual background region within the region of the specific original object.

A computer program stored in a computer-readable recording medium,
wherein the computer program, when executed by a processor, includes instructions for causing the processor to perform operations comprising:
collecting an object mask, which is a binary image including an object image, a label, and an object region;
calculating the pairwise similarity of the collected object masks, either using the IoU (Intersection over Union) metric or by measuring the similarity of feature vectors of the object mask pairs obtained through an artificial neural network model;
selecting, based on the similarity calculation result, a specific original object to be replaced with an augmented object from among the original objects in the object image;
selecting an augmented object based on its similarity to the selected specific original object;
deleting the region of the selected specific original object and overwriting the deleted region with the corresponding image region of the selected augmented object; and
inpainting the erased residual background region within the region of the specific original object.

A training data augmentation device for object recognition, the device comprising:
a mask collection unit that collects an object mask, which is a binary image including an object image, a label, and an object region;
a similarity calculator that calculates the pairwise similarity of the collected object masks, either using the IoU (Intersection over Union) metric or by measuring the similarity of feature vectors of the object mask pairs obtained through an artificial neural network model;
a target object selection unit that selects, based on the similarity calculation result, a specific original object to be replaced with an augmented object from among the original objects in the object image;
an augmented object selection unit that selects an augmented object based on its similarity to the selected specific original object;
an object replacement unit that deletes the region of the selected specific original object and overwrites the deleted region with the corresponding image region of the selected augmented object; and
a region restoration unit that inpaints the erased residual background region within the region of the specific original object.

(Deleted)

The device of claim 10,
wherein the target object selection unit
selects, as the specific original object, an original object whose region has a horizontal and vertical size greater than or equal to a preset size.

The device of claim 10,
wherein the augmented object selection unit
selects, as the augmented object, only an original object whose class differs from that of the specific original object.

The device of claim 10,
wherein the augmented object selection unit
selects the augmented object based on a preset criterion applied to the calculated similarity.