KR20230038960A - Large-scale category object detection and recognition method for inventory management of autonomous unmanned stores

Info

Publication number: KR20230038960A
Application number: KR1020210121701A
Authority: KR
Inventors: 이필규
Original assignee: 인하대학교 산학협력단
Priority date: 2021-09-13
Filing date: 2021-09-13
Publication date: 2023-03-21
Also published as: KR102574044B1

Abstract

The present invention relates to a large-scale category object detection and recognition method for inventory management of an autonomous unmanned store. The method overcomes a weakness of object detection deep learning models that, because of the diverse classes and features in their training data sets, perform poorly in large-scale category environments. To train a robust object detection deep learning model, it uses a multi-branch tree structure with active semi-supervised learning, yielding a model that can adapt to environments with different data characteristics. The invention is applicable to online vision artificial intelligence (AI) deep learning platform technology that can adaptively extend and update advanced image recognition functions, and to smart stores that combine AI POS and inventory management. It can also support commercialization of behavior recognition technology in the service robot and smart home industries, which need to prepare quickly for changing times.

Description

Large-scale category object detection and recognition method for inventory management of autonomous unmanned stores

The present invention relates to a method for detecting and recognizing large-scale category objects for inventory management of autonomous unmanned stores, and more particularly, to a method for detecting and recognizing such objects using hierarchical meta-learning for deep learning (MLDL: Meta learning for Deep learning).

Many recent image data sets contain diverse data classes and features so that general-purpose features can be extracted.

However, because of this diversity of classes and features, an object detection deep learning model trained on such a data set performs poorly in a large-scale category object recognition environment.

Deep learning object detection has advanced greatly since AlexNet, but it remains a challenging problem under heterogeneous and sparse data distributions. Many researchers have devised various methods for handling large-scale categories in order to improve object detection accuracy.

Prior work has also focused on sub-class variation, but most sub-classes were built only from similarity information within the sub-class.

However, because the feature information of each object differs slightly even within the same class, large-scale category detection and recognition is a very difficult problem.

Meanwhile, led by Amazon Go, demand has recently grown for in-store AI video analysis solutions based on advanced vision AI deep learning recognition. AI-based inventory, pickup, and payment management improve store efficiency, and rising labor costs, driven by contactless operation under COVID-19 and the 52-hour work week, are accelerating the adoption of (semi-)unmanned stores that run on AI inventory and POS systems with no staff or minimal staff.

In Korea, about 600 hybrid convenience stores, which are semi-unmanned stores, are currently in operation. However, shelf inventory in convenience stores, which must recognize and manage more than 800 categories, still depends on manual work.

Overseas, Amazon Go, Amazon Grocery, and then Amazon Fresh have been introduced, but because of limited recognition rates across large-scale categories they remain at the stage of barcode-centered smart carts.

Consequently, as with fully autonomous vehicles, many technical barriers remain before stores can ultimately become fully unmanned.

Reference 1: Korean Patent Registration No. 10-0988754
Reference 2: Korean Patent Registration No. 10-2114808

Accordingly, an object of the present invention is to provide a method that detects and recognizes large-scale category objects using hierarchical meta-learning for deep learning (MLDL: Meta learning for Deep learning). The method is intended to overcome the performance degradation that object detection deep learning models suffer in large-scale category environments because of the diverse data classes and features of their training data sets, and to train a robust object detection deep learning model for inventory management of autonomous unmanned stores.

A further object of the present invention is to provide a large-scale category object detection and recognition method for inventory management of an autonomous unmanned store that, by using this structure, can adapt to autonomous unmanned store environments whose data characteristics are complex, as with large-scale categories.

To solve this technical problem, the present invention provides a large-scale category object detection and recognition method for inventory management of an autonomous unmanned store, the method comprising: a first step in which an inventory robot collects multi-image data scanned by N cameras (N >= 1); a second step of collecting store meta information associated with the image data; a third step of deriving meta factors by performing meta-learning on the store meta information; and a fourth step of generating a hierarchical deep learning model based on an image data archive, a metadata archive, and the meta factors.

The second step collects the store meta information from one or more input devices.

The meta factors are factors that affect detection and recognition performance, and include one or more of: characteristics of the collected image data, reliability of distributed image characteristics, bin-based characteristics, image-instance-based characteristics, image collection time, the location of the inventory robot, and the lighting conditions.
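As an illustration only, the meta factors listed above could be carried alongside each captured image as a small record. The sketch below is an assumption about representation: the class name `MetaFactor` and all field names are hypothetical, and only some of the listed factors are shown.

```python
from dataclasses import dataclass, asdict
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class MetaFactor:
    """Hypothetical record of meta factors attached to one captured image."""
    captured_at: datetime                  # image collection time
    robot_position: Tuple[float, float]    # location of the inventory robot
    lighting: str                          # lighting conditions, e.g. "dim"
    feature_reliability: Optional[float] = None  # reliability of image characteristics

    def present_factors(self) -> List[str]:
        # Names of the factors that were actually recorded for this image.
        return [name for name, value in asdict(self).items() if value is not None]


mf = MetaFactor(datetime(2021, 9, 13, 9, 0), (1.0, 2.0), "dim")
print(mf.present_factors())
```

Optional fields let a store that lacks a given input device still produce a valid record, which matches the "one or more of" wording.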

The fourth step constructs an initial model from a small amount of labeled data and gradually optimizes performance through an augmented ASSL learning process.

The method further comprises, after the fourth step, a fifth step of inputting the multiple images scanned by the N cameras of the inventory robot into a hierarchical deep learning detection recognizer to derive detection and recognition results.

In the fifth step, the hierarchical deep learning detection recognizer receives feedback from the MLDL model generator and detects and recognizes large-scale category objects using the meta factor information of the inventory robot and the store.

The method further comprises, after the fifth step: a sixth step of updating the image data archive and the metadata archive using the derived meta factors and the detection and recognition results of the hierarchical deep learning detection recognizer; and a seventh step of updating the hierarchical deep learning model based on the updated image data archive, the updated metadata archive, and the meta factors.

The fifth step continuously derives detection and recognition results by inputting the multiple images scanned by the N cameras of the inventory robot into the hierarchical deep learning detection recognizer.

The fifth step includes applying a time-window technique to improve detection and recognition of multiple product objects on a shelf.

The time-window technique comprises: applying a time window to extract optimized multi-object tracking segments (partial tracking paths, or tracklets, computed in batches) before the final tracking path is generated from the object detection results of each image frame; and, because many variables at the inventory robot's operating site affect object identification, applying WNMS (Weighted Non-Maximum Suppression), a variant of the NMS (Non-Maximum Suppression) algorithm, to ensure accurate multi-object identification within the extracted tracking segments.
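The specification does not define how WNMS weights boxes or how large a time window is. The sketch below shows one common weighted-NMS variant as an assumption: boxes that overlap the highest-scoring box are merged into a score-weighted average rather than discarded. The function names, the IoU threshold of 0.5, and the fixed-size batching in `time_windows` are all hypothetical choices.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float, float]  # x1, y1, x2, y2, score


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def weighted_nms(boxes: Sequence[Box], iou_thr: float = 0.5) -> List[Box]:
    """Merge overlapping boxes by score-weighted averaging (assumed WNMS variant)."""
    remaining = sorted(boxes, key=lambda b: b[4], reverse=True)
    merged: List[Box] = []
    while remaining:
        top, rest = remaining[0], remaining[1:]
        cluster = [top] + [b for b in rest if iou(top, b) >= iou_thr]
        remaining = [b for b in rest if iou(top, b) < iou_thr]
        total = sum(b[4] for b in cluster)
        if total > 0:
            x1, y1, x2, y2 = (sum(b[i] * b[4] for b in cluster) / total for i in range(4))
        else:  # all scores are zero: fall back to the top box itself
            x1, y1, x2, y2 = top[:4]
        merged.append((x1, y1, x2, y2, top[4]))
    return merged


def time_windows(frames: Sequence, size: int) -> List[list]:
    """Split a frame sequence into consecutive batches of at most `size` frames."""
    return [list(frames[i:i + size]) for i in range(0, len(frames), size)]
```

In a tracklet pipeline of this kind, detections from all frames in one window would be pooled and passed through `weighted_nms`, so that repeated sightings of the same product reinforce a single box.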

According to the present invention, when detecting and recognizing large-scale category objects for inventory management of autonomous unmanned stores, the weakness of object detection deep learning models caused by the diverse data classes and features of their training data sets can be overcome. To train a robust object detection deep learning model, a model that adapts to environments with different data characteristics can be obtained through a multi-branch tree structure using active semi-supervised learning.

The present invention is also applicable to online vision AI deep learning platform technology that can adaptively extend and update advanced deep learning image recognition functions, and to smart stores that combine AI POS and inventory management.

In particular, by securing practical vision AI technology through online deep learning, the present invention can drive full-scale commercialization of smart store technology. Technically, the offline deep learning systems in common use today lose detection ability sharply at sites they were not trained on, which limits their ability to perceive and analyze real environments where change and diversity exist; the present invention overcomes this limitation. Likewise, mainstream supervised deep learning can recognize objects only in limited situations determined by the kind of data it was trained on, which hinders general-purpose use; the present invention can overcome this as well.

Furthermore, the present invention can commercialize behavior recognition technology in the service robot and smart home industries and related fields, which need to prepare quickly for changing times, and thereby stimulate those industries. The hierarchical online deep learning platform realized through the present invention can, once successfully commercialized for serving robots, be extended to service robots and consumer robots, providing a foundation for stimulating the robot market.

FIG. 1 is a diagram illustrating a hierarchical MLDL detection and recognition platform for inventory management of an autonomous unmanned store according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining an HFM construction method according to an embodiment of the present invention.
FIG. 3 is a diagram for explaining a hierarchical MLDL detection and recognition process that handles a new class, or an existing class having ambiguous features, according to an embodiment of the present invention.
FIG. 4 is a diagram for explaining image object detection using a conventional CNN and CNN-based image object detection with the time-window technique applied.
FIG. 5 is a diagram for explaining an ASSL training method according to an embodiment of the present invention.
FIG. 6 is a diagram for explaining an augmented learning method of a CNN algorithm according to an embodiment of the present invention.
FIG. 7 is a diagram for explaining the H/W and F/W devices of an autonomous unmanned store system for deploying a hierarchical MLDL augmented vision AI solution according to an embodiment of the present invention.

Hereinafter, the features of the large-scale category object detection and recognition method for inventory management of an autonomous unmanned store according to the present invention will be understood from the embodiments described in detail with reference to the accompanying drawings.

Before that, the terms and words used in this specification and the claims should not be construed as limited to their ordinary or dictionary meanings. They should be interpreted with meanings and concepts consistent with the technical idea of the present invention, based on the principle that an inventor may appropriately define the concepts of terms in order to describe the invention in the best way.

Therefore, the embodiments described in this specification and the configurations shown in the drawings are merely the most preferred embodiment of the present invention and do not represent all of its technical ideas, so it should be understood that various equivalents and modifications that could replace them may exist at the time of filing.

Automatic shelf inventory management is a core foundation for operating an autonomous unmanned store, and it requires image-based detection and recognition of large-scale product categories. A convenience store, the typical target for unmanned operation, must manage an inventory of 700 to 800 items, and a grocery store up to about 3,000 items.

In particular, products are irregular in shape and shelf images are highly uncertain depending on how products are displayed, so the low detection and recognition rate for shelf product images is a major obstacle to commercializing autonomous unmanned stores.

That is, conventional object detection and recognition technology has a low recognition rate and is very difficult to use in real environments, especially for objects spanning a large number of classes.

Meanwhile, the growth of the Internet has produced a wide variety of big data, and advances in semiconductors and parallel processing have greatly increased computing power. Building on these advances in data and hardware, highly complex neural network models that mimic the structure of the human brain have emerged and various deep learning techniques are now in wide use; applying them to object detection has greatly raised overall recognition rates. However, most techniques currently used for object detection are based on supervised learning algorithms that require a very large amount of data up front, and they have the critical drawback that performance drops sharply when applied to object groups with large-scale categories (hundreds or more).

Accordingly, the present invention applies an advanced hierarchical MLDL technique that combines three elements: ASSL (Active Semi-Supervised Learning), which can quickly restore performance degraded by changes in external factors such as the environment in large-scale category object detection and recognition applications; HFM (Hierarchical Feature Modeling), which can respond to environmental changes; and meta-learning.

In particular, the present invention proposes inventory technology based on hierarchical meta-learning for deep learning (MLDL: Meta learning for Deep learning) as a platform for automatically managing the inventory of autonomous unmanned stores.

The present invention is a deep learning method that detects and recognizes large-scale category product objects in the shelf inventory of an autonomous unmanned store by applying a hierarchical MLDL technique, which combines a CNN (Convolutional Neural Network) deep learning algorithm, HFM (Hierarchical Feature Modeling), and an SCE (Soft Class Ensemble) algorithm and applies an ML technique to make use of context information. Because a conventional convolutional neural network is trained on a flat, one-dimensional category structure, its recognition performance drops sharply in an environment it was not trained on or when a previously unseen object appears. To prevent this, the method includes modeling the data distribution of the objects to be recognized based on HFM, and applying hierarchical CNN-based augmented vision AI technology.

The hierarchical meta-learning for deep learning (MLDL) is a method that combines the data collection and processing pipeline with the operating environment, such as the time and location of the store. To generate an optimal vision AI model that is robust to the large-scale categories of unmanned-store product objects and to environmental changes, it includes: constructing an initial model from a relatively small amount of labeled data, unlike conventional deep learning-based vision AI that depends on a large batch of labeled data, and gradually optimizing performance through an augmented ASSL learning process; generating an MLDL concept so that the quality of the vision AI model is maintained continuously as the inventory robot's operating environment (time, location, and so on) changes; performing hierarchical deep learning based on the generated concept; receiving feedback on errors in the detection and recognition results; performing meta-learning based on the generated concept; receiving feedback on errors in the learned meta information; and upgrading the MLDL concept generator using the feedback information.
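The specification does not give the ASSL selection rule. A minimal sketch of one typical active semi-supervised split is shown below, under the assumption that each unlabeled sample already has a predicted label and a confidence: confident predictions become pseudo-labels (the semi-supervised part), uncertain ones are queued for human labeling (the active part), and the rest are held back. The function name `assl_split` and both thresholds are hypothetical.

```python
from typing import List, Sequence, Tuple

Prediction = Tuple[str, str, float]  # sample id, predicted label, confidence


def assl_split(
    predictions: Sequence[Prediction],
    accept: float = 0.9,   # assumed threshold for trusting a pseudo-label
    query: float = 0.5,    # assumed threshold below which a human is asked
) -> Tuple[List[Tuple[str, str]], List[str], List[str]]:
    """Split unlabeled predictions into pseudo-labels, label queries, and held samples."""
    pseudo: List[Tuple[str, str]] = []
    ask: List[str] = []
    hold: List[str] = []
    for sample_id, label, confidence in predictions:
        if confidence >= accept:
            pseudo.append((sample_id, label))   # add to training set as-is
        elif confidence < query:
            ask.append(sample_id)               # send to an annotator
        else:
            hold.append(sample_id)              # revisit after the next model update
    return pseudo, ask, hold


preds = [("a", "cola", 0.95), ("b", "chips", 0.20), ("c", "water", 0.70)]
pseudo, ask, hold = assl_split(preds)
print(pseudo, ask, hold)
```

Starting from a small labeled set and repeating this split after each retraining round is one way the "gradual optimization" described above could proceed.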

The hierarchical MLDL-based large-scale category object detection and recognition platform integrates the data collection, processing, and learning environments. The platform supports an operating environment in which an optimal vision AI model robust to environmental changes can be generated and verified. Unlike conventional deep learning-based vision AI, which depends on a large batch of costly labeled data, it requires a relatively low initial setup cost through the augmented vision AI package; it optimizes performance gradually, or through crowdsourcing, via the augmented ASSL learning process; it maintains the quality of the vision AI model continuously by reflecting changes in the operating environment; and it can optimize vision AI-based inventory robots and systems.

Referring to FIG. 1, the hierarchical MLDL detection and recognition platform of the present invention fuses meta factor information with detection and recognition, rather than performing detection and recognition alone, in order to secure recognition performance for shelf inventory management. In this platform, the MLDL model generator (10) generates ML and DL models based on the data archive (11) and the meta factor archive (12). The inventory robot inputs the multiple images scanned by its N cameras into the hierarchical deep learning detection recognizer (20) to derive detection and recognition results, which are passed to the MLDL model generator (10). Store meta information is collected by one or more input devices, and meta-learning (30) is performed on it. The detection and recognition results from the hierarchical deep learning detection recognizer (20) and the meta factor results from meta-learning (30) are passed to the MLDL model generator (10), which generates (or updates) the ML and DL models based on them. The hierarchical deep learning detection recognizer (20) in turn receives feedback from the MLDL model generator (10) and performs detection and recognition using the meta factor information of the inventory robot and the store. Because this approach fuses meta factor information rather than relying on hierarchical detection and recognition alone, it can secure recognition performance for shelf inventory management.

Here, a meta factor is any factor that directly or indirectly affects detection and recognition performance, such as characteristics of the collected images, reliability of distributed image characteristics, bin-based characteristics, image-instance-based characteristics, image collection time, the location of the inventory robot, and the lighting conditions.

The large-scale category object detection and recognition method for inventory management of an autonomous unmanned store according to the present invention comprises: a first step in which an inventory robot collects multi-image data scanned by N cameras (N >= 1); a second step of collecting store meta information associated with the image data; a third step of deriving meta factors by performing meta-learning on the store meta information; and a fourth step of generating a hierarchical deep learning model (ML and DL models) based on the image data archive, the metadata archive, and the meta factors.

More specifically, to secure recognition performance for shelf inventory management by fusing meta factor information rather than performing detection and recognition alone, the method comprises: generating, in the MLDL model generator (10), ML and DL models based on the data archive (11) and the meta factor archive (12); inputting the multiple images scanned by the N cameras of the inventory robot into the hierarchical deep learning detection recognizer (20) to derive detection and recognition results; passing the derived detection and recognition results to the MLDL model generator (10); performing meta-learning (30) using store meta information collected by one or more input devices, and inputting it to the hierarchical deep learning detection recognizer (20) to derive detection and recognition results; passing the meta factor results derived by the meta-learning (30) to the MLDL model generator (10); generating (or updating), by the MLDL model generator (10), the ML and DL models based on the meta factors and the detection and recognition results; and performing detection and recognition in the hierarchical deep learning detection recognizer (20), which receives feedback from the MLDL model generator (10), using the meta factor information of the inventory robot and the store.
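The data flow among the model generator (10), the archives (11, 12), the detection recognizer (20), and meta-learning (30) can be sketched as one update cycle. This is a schematic under stated assumptions, not the patented implementation: `ModelGenerator`, `run_cycle`, and the `detect` and `learn_meta` callables are hypothetical stand-ins, and the "model" is reduced to a version counter.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ModelGenerator:
    """Stand-in for the MLDL model generator (10) and its two archives."""
    data_archive: List[Any] = field(default_factory=list)             # archive (11)
    meta_archive: List[Dict[str, Any]] = field(default_factory=list)  # archive (12)
    version: int = 0  # placeholder for the current ML/DL model

    def update(self, detections: List[Any], meta_factors: Dict[str, Any]) -> int:
        # Store the new results, then "regenerate" the model.
        self.data_archive.extend(detections)
        self.meta_archive.append(meta_factors)
        self.version += 1
        return self.version


def run_cycle(
    generator: ModelGenerator,
    detect: Callable[[List[Any], Dict[str, Any], int], List[Any]],
    learn_meta: Callable[[Dict[str, Any]], Dict[str, Any]],
    images: List[Any],
    store_meta: Dict[str, Any],
) -> List[Any]:
    """One pass: meta-learning (30), detection (20), then model update (10)."""
    meta_factors = learn_meta(store_meta)
    # The recognizer sees the current model version, i.e. the generator's feedback.
    detections = detect(images, meta_factors, generator.version)
    generator.update(detections, meta_factors)
    return detections
```

Calling `run_cycle` repeatedly with fresh scans models the continuous operation described above, with each cycle's results feeding the next cycle's model.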

Unlike conventional deep learning-based vision AI, which depends on a large batch of labeled data, the large-scale category object detection and recognition method of the present invention constructs an initial model from a relatively small amount of labeled data and gradually optimizes performance through the augmented ASSL learning process.

Meanwhile, to prevent the sharp drop in recognition performance that occurs in an environment not previously learned or when a previously unseen object appears, and to respond to environmental changes, the present invention models the data distribution of the objects to be recognized based on HFM.

Referring to FIG. 2, the method of constructing the HFM includes: constructing the HFM using hierarchical classes obtained by soft clustering, in which similar categories are grouped from a dataset having category labels; expressing category information hierarchically in order to build an HFM with a hierarchical structure; applying a hierarchical soft clustering method to the class clusters; modeling the HFM as divided into an H-level of top-level parent nodes, an M-level of mid-layer nodes for actual objects, and an L-level of fine-grained classification nodes; optimizing the configuration of the HFM, which, given training data and a semantic object space, is defined by Super-category nodes at the top, Augmented-category nodes in the middle, and Sub-category nodes at the bottom; an algorithm step in which each Super-category node is trained on a subset of the entire training data and outputs a plurality of Augmented-categories; and an algorithm step in which each Augmented-category node is trained on a subset of its Super-category's training data and outputs a plurality of Sub-categories. The baseline HFM is a structure that performs object detection and recognition separately in the Super-, Augmented-, and Sub-object spaces, and includes an optimization technique applied after the service system data has been analyzed.
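The three-level structure described above can be pictured as a small tree. The following is a minimal sketch, not the implementation of the invention: the class `HFMNode`, the functions `build_hfm` and `leaves_under`, and the example category names are all hypothetical, and the label paths are assumed to come from a soft clustering step that is not shown. Because the clustering is soft, the same Sub-category may appear under more than one parent.

```python
from dataclasses import dataclass, field


@dataclass
class HFMNode:
    name: str
    level: str  # "root", "super", "augmented", or "sub" (assumed level names)
    children: dict = field(default_factory=dict)

    def child(self, name, level):
        # Return the named child, creating it on first use.
        if name not in self.children:
            self.children[name] = HFMNode(name, level)
        return self.children[name]


def build_hfm(label_paths):
    """Build a tree from (super, augmented, sub) label paths.

    A sub-category may be listed under several parents, which mirrors
    soft clustering where one class can belong to more than one cluster.
    """
    root = HFMNode("root", "root")
    for sup, aug, sub in label_paths:
        root.child(sup, "super").child(aug, "augmented").child(sub, "sub")
    return root


def leaves_under(node):
    # Collect the names of all leaf nodes below the given node.
    if not node.children:
        return [node.name]
    names = []
    for child in node.children.values():
        names.extend(leaves_under(child))
    return names


paths = [
    ("beverage", "soda", "cola_355ml"),
    ("beverage", "soda", "cola_500ml"),
    ("beverage", "juice", "orange_1l"),
    ("beverage", "juice", "cola_355ml"),  # soft membership: second parent
    ("snack", "chips", "potato_chips"),
]
hfm = build_hfm(paths)
print(sorted(hfm.children))
print(leaves_under(hfm.children["beverage"].children["soda"]))
```

Each node would own a detector trained only on the data subset of its subtree; that training step is outside the scope of this sketch.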

Referring to FIG. 3, the HFM method includes: distinguishing new classes from existing classes with ambiguous features by using a hierarchical CNN, built on the baseline HFM, instead of a flat CNN over the target object classes; applying, at each HFM node, a hierarchical CNN structure capable of augmented learning in the changed environment; dividing learning into an external open-set class learning scheme, in which previously unseen classes are added, and an internal open-set class learning scheme, which compensates for classes that already exist but have a low recognition rate because the environment has changed; and, to integrate the internal and external open-set schemes, applying a hierarchical deep learning method to the object detector/recognizer and applying the augmented vision AI learning method to each open-set class.
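As an illustration of how a hierarchical classifier differs from a flat one, the sketch below walks down the hierarchy one node at a time and stops as soon as a node is not confident enough, flagging the input as an open-set candidate at that level. This is only an assumed decision rule: the function `classify_hierarchically`, the `min_conf` threshold, and the stand-in lambda classifiers are hypothetical, and the sketch does not attempt to separate internal from external open-set cases.

```python
def classify_hierarchically(features, node_classifiers, root="root", min_conf=0.5):
    """Descend the hierarchy, returning (path, status).

    node_classifiers maps a node name to a callable that returns
    (child_label, confidence) for the given features.
    """
    path, current = [], root
    while current in node_classifiers:
        label, conf = node_classifiers[current](features)
        if conf < min_conf:
            # Not confident at this level: stop and flag as open-set.
            return path, "open-set"
        path.append(label)
        current = label
    return path, "known"


# Stand-in classifiers; a real system would run a CNN at each node.
classifiers = {
    "root": lambda f: ("beverage", 0.9),
    "beverage": lambda f: ("soda", 0.8),
    "soda": lambda f: ("cola_355ml", f),  # confidence taken from the input
}

print(classify_hierarchically(0.7, classifiers))
print(classify_hierarchically(0.2, classifiers))
```

Stopping at the deepest confident level keeps a usable coarse label (here, "soda") even when the fine-grained product is new or looks different under changed conditions.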

In addition, the present invention applies a time-window technique to improve detection and recognition of multiple product objects on the shelves of an autonomous unmanned store. Referring to FIG. 4, object detection yields identification results for regions of interest in a single image frame. The technique includes: to guarantee detection performance on video, where image frames are input sequentially for multi-object tracking, applying the time-window technique to extract optimized multi-object tracking segments (tracklets, i.e., partial tracking paths, processed in batch units) before the final tracking path is generated from the per-frame object detection results; because many variables at the inventory robot's operating site affect object identification, applying WNMS (Weighted Non-Maximum Suppression), a variant of the NMS (Non-Maximum Suppression) algorithm, to ensure accurate multi-object identification within the extracted tracking segments; applying the NMS algorithm as post-processing of multi-object identification in an unstable environment to determine the best identification result among multiple candidates within an image frame; in the WNMS algorithm, clustering similar multi-object identification results and weighting and summing them according to their identification scores; and, because the identification result is highly sensitive to the clustering method, optimally tuning the clustering threshold to environmental changes and simultaneously referring to and comparing the identification results of adjacent image frames in the image sequence, thereby increasing the effectiveness of WNMS and securing stable multi-object identification.
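The weighting-and-summing step of WNMS can be sketched as follows. This is a common formulation written under stated assumptions, not necessarily the exact variant used in the invention: boxes are `(x1, y1, x2, y2)` tuples, clustering is by IoU against the highest-scoring remaining box, the merged box is the score-weighted mean of the cluster, and the merged score is the cluster maximum. The function names and the `iou_threshold` default are hypothetical. `wnms_over_window` pools detections from adjacent frames before merging, which assumes the frames are already aligned (for example, a static view over a short window).

```python
def iou(a, b):
    # Intersection over union of two (x1, y1, x2, y2) boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def weighted_nms(boxes, scores, iou_threshold=0.5):
    """Merge overlapping boxes into score-weighted averages."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    used, merged = set(), []
    for i in order:
        if i in used:
            continue
        # Cluster: the current best box plus every unused box overlapping it.
        cluster = [i] + [
            j for j in order
            if j != i and j not in used and iou(boxes[i], boxes[j]) >= iou_threshold
        ]
        used.update(cluster)
        total = sum(scores[j] for j in cluster)
        box = tuple(
            sum(scores[j] * boxes[j][k] for j in cluster) / total for k in range(4)
        )
        merged.append((box, scores[i]))  # keep the cluster's top score
    return merged


def wnms_over_window(frames, iou_threshold=0.5):
    # frames: list of (boxes, scores) for adjacent, already aligned frames.
    all_boxes = [b for boxes, _ in frames for b in boxes]
    all_scores = [s for _, scores in frames for s in scores]
    return weighted_nms(all_boxes, all_scores, iou_threshold)


result = weighted_nms(
    [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)], [0.9, 0.6, 0.8]
)
print(result)
```

Unlike plain NMS, which discards every box except the top-scoring one in a cluster, the weighted form lets lower-scoring boxes still pull the final coordinates, which is why the clustering threshold has such a strong effect on the result.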

In addition, the soft class ensemble (SCE) algorithm optimized for the target system includes: applying the hierarchy-based detection-result SCE technique so that the object detection results of each level of the hierarchy can be integrated into an ensemble, which is needed to obtain the final object detection result from the HFM structure; training deep convolutional neural networks such that, under the HFM, the top-level node, each Super-category, and each Augmented-category learn on their own respective category spaces; extracting a confidence score (CS), in which the deep convolutional neural network receives a region of interest (RoI) of an image as input and extracts features for that RoI, and the features are passed through a detector to produce the CS; converting the CS results into a probability space, because the CS obtained from a detector is a relative rather than an absolute value and the data spaces in which the features were learned differ, so CS values cannot be compared or combined directly, whereas conversion places CS values obtained from different data spaces on a common absolute scale where they can be combined; given an RoI, defining a probability function that projects the output of the linear classifier trained for a Super-category into the probability space; determining the parameters of the probability function by logistic regression, and defining the recognition score for final Super-category detection using the probability function; obtaining the corresponding scores for the Augmented-categories and Sub-categories in the same way; extracting an overall confidence score (OCS) for the RoI through the hierarchy-based detection-result ensemble (HCE); and optimizing the HFM of the target system through an algorithm that uses this function.
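The conversion of raw confidence scores into a probability space, with two parameters fitted by logistic regression, resembles what is commonly called Platt scaling. The sketch below is written under that assumption. The sigmoid form, the parameter names `a` and `b`, the plain gradient-descent fit, and in particular the product rule used for the overall confidence score are illustrative choices; the source does not give the formulas in recoverable form. All function names are hypothetical.

```python
import math


def to_probability(score, a, b):
    # Sigmoid mapping of a raw classifier score into [0, 1].
    # Equivalent to Platt's 1 / (1 + exp(A*f + B)) with A = -a, B = -b.
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit a and b by logistic regression using batch gradient descent."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = to_probability(s, a, b) - y
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def overall_confidence(p_super, p_augmented, p_sub):
    # Assumed combination rule: product of calibrated probabilities
    # along one Super -> Augmented -> Sub path.
    return p_super * p_augmented * p_sub


# Toy calibration data: raw scores and whether the detection was correct.
raw_scores = [-2.0, -1.0, -0.5, 0.3, 0.5, 1.0, 2.0]
correct = [0, 0, 1, 0, 1, 1, 1]
a, b = fit_platt(raw_scores, correct)

p_super = to_probability(1.5, a, b)
print(a > 0, 0.0 < p_super < 1.0)
print(overall_confidence(0.9, 0.8, 0.5))
```

One calibrator would be fitted per node, so that a score of, say, 1.5 from a Super-category detector and 1.5 from a Sub-category detector map to probabilities that can be compared on the same scale.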

Finally, referring to FIG. 5, the present invention includes applying the above-described CNN learning technique using ASSL, the image data distribution modeling technique using the HFM, and the classifier fusion technique using SCE, together with a hierarchical deep learning-based MLDL technique that robustly recognizes multiple objects even in real-time environments containing various kinds of noise.

The hierarchical CNN algorithm method for large-scale category detection and recognition of the present invention includes: applying an ASSL (Active Semi-Supervised Learning) algorithm that can learn and adapt quickly for the hierarchical CNN algorithm; applying it to a dynamic HFM to determine a hierarchical deep learning structure suited to the usage environment of the target service system; and a high-level visual recognition technique combined with the augmented ASSL learning method to recognize open-set objects that arise in the field at the target service system.

Here, referring to FIG. 6, the augmented learning method departs from the generally used Forward Batch Learning and instead combines Forward Learning with Rollback Learning to resolve uncertainty in the object class classification problem. It is preferable to apply to the target service system, and optimize for it, an augmented ASSL technique that includes: estimating new labels by defining a loss measurement function, since the true labels of an unlabeled dataset cannot be known in advance; an algorithm step of selecting an appropriate boundary for the current model by considering both the worst and best cases; an algorithm step of defining an objective function that measures the expected error and optimizes learning efficiency, and selecting the data that minimize this objective function; extending the calculation of the objective function in the ASSL algorithm to take all unlabeled data into account; and, because retraining an existing deep learning model would examine all unlabeled data and all possible labels and therefore requires heavy computation, applying a method of fast augmented learning in units of small bins to minimize that computation.
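A rough picture of bin-wise learning with rollback is sketched below. It rests on several assumptions that the text above does not spell out: confident predictions in a bin are pseudo-labeled, uncertain ones are sent to a human for labeling, and a bin's update is kept only if a validation score does not get worse (otherwise the previous model is retained, which stands in here for Rollback Learning). The function names, the thresholds `high` and `low`, and the toy "model" are hypothetical.

```python
def split_bin(predictions, high=0.9, low=0.6):
    """Split one bin into pseudo-labeled samples and samples to query.

    predictions: list of (sample_id, {label: probability}).
    """
    pseudo, query = [], []
    for sample_id, dist in predictions:
        label, p = max(dist.items(), key=lambda kv: kv[1])
        if p >= high:
            pseudo.append((sample_id, label))  # semi-supervised part
        elif p < low:
            query.append(sample_id)            # active-learning part
    return pseudo, query


def forward_with_rollback(model, bins, train_step, evaluate):
    """Train bin by bin; keep an update only if it does not hurt."""
    best = evaluate(model)
    for data_bin in bins:
        candidate = train_step(model, data_bin)  # forward learning
        score = evaluate(candidate)
        if score >= best:
            model, best = candidate, score
        # else: roll back by keeping the previous model
    return model, best


preds = [
    ("a", {"cola": 0.95, "juice": 0.05}),
    ("b", {"cola": 0.50, "juice": 0.50}),
    ("c", {"cola": 0.70, "juice": 0.30}),
]
print(split_bin(preds))

# Toy stand-ins: the "model" is a number and the ideal value is 10.
final_model, final_score = forward_with_rollback(
    0, [3, 4, 5, 3],
    train_step=lambda m, b: m + b,
    evaluate=lambda m: -abs(m - 10),
)
print(final_model, final_score)
```

Working in small bins means each candidate update is cheap to evaluate and cheap to discard, in contrast to retraining over the full unlabeled pool.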

To manage the shelves of the autonomous unmanned store of the present invention, the hardware (H/W) and firmware (F/W) of the inventory robot and system must be built. Referring to FIG. 7, the present invention builds a hierarchical MLDL platform using the inventory robot (40), the smart shelf (50), and a cloud server system.

Of course, the present invention also includes a hierarchical MLDL-based large-scale category product object detection and recognition apparatus for shelf inventory management, which uses an N-camera inventory robot and system to manage the shelves of the autonomous unmanned store.

In addition, for shelf management of the autonomous unmanned store of the present invention, it is preferable to complement the artificial intelligence POS (point of sale) (60), the secure speed gate (70), and the smart shelf (50), and to support construction of a cloud-based MLDL technology testbed operating environment and application testing.

The apparatus described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. The devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other processing device capable of executing and responding to instructions.

Such a processing device may run an operating system (OS) and one or more software applications that execute on the operating system.

The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. Although a single processing device is sometimes described for ease of understanding, a person of ordinary skill in the art will appreciate that the processing device may include a plurality of processing elements and/or multiple types of processing elements.

For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. The software and/or data may be embodied in any type of machine, component, physical device, virtual equipment, computer storage medium, or device in order to be interpreted by the processing device or to provide instructions or data to the processing device.

The software may also be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.

The method according to an embodiment of the present invention may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine code, such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter or the like.

Although the embodiments have been described above with reference to a limited set of embodiments and drawings, a person of ordinary skill in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or components of the described system, structure, device, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims set out below.

10: MLDL model generator
11: Data archive
12: Meta-factor archive
20: Hierarchical deep learning detection recognizer
30: Meta-learning
40: Inventory robot
50: Smart shelf
60: AI POS
70: Secure speed gate

Claims

A large-scale category object detection and recognition method for inventory management of an autonomous unmanned store, the method comprising:
a first step in which an inventory robot collects multi-image data scanned with N cameras (N >= 1);
a second step of collecting store meta-information associated with the image data;
a third step of deriving a meta-factor by performing meta-learning on the store meta-information; and
a fourth step of generating a hierarchical deep learning model based on an image data archive, a metadata archive, and the meta-factor.

The method of claim 1,
wherein the second step is a step of collecting the store meta-information from one or more input devices.

The method of claim 1,
wherein the meta-factor is a factor that affects detection and recognition performance and is one or more of: characteristics of the collected image data, reliability of distributed image characteristics, bin-based characteristics, image-instance-based characteristics, image collection time, inventory robot location, and lighting conditions.

The method of claim 1,
wherein the fourth step is a step of building an initial model from a small amount of labeled data and incrementally optimizing performance through an augmented ASSL learning process.

The method of claim 1,
further comprising, after the fourth step, a fifth step of deriving a detection and recognition result by inputting the multiple images scanned by the N cameras of the inventory robot into a hierarchical deep learning detection recognizer.

The method of claim 5,
wherein, in the fifth step, the hierarchical deep learning detection recognizer receives feedback from an MLDL model generator and performs detection and recognition of large-scale category objects using the meta-factor information of the inventory robot and the store.

The method of claim 5, further comprising, after the fifth step:
a sixth step of updating the image data archive and the metadata archive using the derived meta-factor and the detection and recognition result of the hierarchical deep learning detection recognizer; and
a seventh step of updating the hierarchical deep learning model based on the updated image data archive, the updated metadata archive, and the meta-factor.

The method of claim 5,
wherein the fifth step is a step of continuously deriving detection and recognition results by inputting the multiple images scanned by the N cameras of the inventory robot into the hierarchical deep learning detection recognizer.

The method of claim 8,
wherein the fifth step comprises applying a time-window technique to improve detection and recognition of multiple product objects on a shelf.

The method of claim 9,
wherein the time-window technique comprises: applying the time-window technique to extract optimized multi-object tracking segments (partial tracking paths processed in batch units) before the final tracking path is generated from the object detection result of each image frame; and, because many variables at the inventory robot's operating site affect object identification, applying WNMS (Weighted Non-Maximum Suppression), a variant of the NMS (Non-Maximum Suppression) algorithm.