KR20240010905A

KR20240010905A - Method and apparatus for reinforcing a sensor and enhancing a perception performance

Info

Publication number: KR20240010905A
Application number: KR1020220088237A
Authority: KR
Inventors: 김정은; 한재훈; 김민수; 길현석; 최진영; 윤영환; 박경식
Original assignee: 네이버랩스 주식회사
Priority date: 2022-07-18
Filing date: 2022-07-18
Publication date: 2024-01-25
Also published as: WO2024019345A1

Abstract

The present invention relates to a method, and an apparatus therefor, for reinforcing a first sensor using a machine learning model trained on training data obtained from the first sensor and from a second sensor for reinforcing the first sensor. More specifically, it relates to a method and apparatus for: obtaining second data for reinforcing the first sensor by applying, to the machine learning model, first data obtained using only the first sensor out of the first sensor and the second sensor; and performing perception based on the first data and the second data obtained using the machine learning model. According to the present invention, a sensor can be effectively reinforced, and the perception performance of a robot can be improved through sensor reinforcement.

Description

Method and apparatus for reinforcing a sensor and enhancing perception performance {METHOD AND APPARATUS FOR REINFORCING A SENSOR AND ENHANCING A PERCEPTION PERFORMANCE}

The present invention relates to robots and, more specifically, to a method for reinforcing a robot's sensor and improving its perception performance, and an apparatus therefor.

With advances in robotics, robots are being used in a wide range of fields. In particular, robots are used not only in manufacturing but also in various other industries, for example in factory automation, unmanned stores, and self-driving cars. A robot is equipped with various sensors so that it can effectively sense and perceive its surroundings and take action.

A sensor mounted on a robot may, because of its technical limitations, fail to provide sufficient perception of the surrounding environment. To overcome these limitations, an additional sensor is commonly mounted on the robot to compensate for the limitations of the sensor installed by default.

However, factors unrelated to the technology, such as the robot's appearance or its design and manufacturing cost, frequently prevent an additional sensor from being mounted to reinforce the default sensor. In that case, a robot developed for real-world service cannot overcome the sensor's technical limitations and may be unable to perform the perception it needs for driving.

For example, some vision sensors have blind spots, and in many cases an additional sensor that would cover the blind spot cannot be mounted in the actual operating environment. Location-based virtual obstacle generation can be used to deal with a sensor's blind spot without mounting an additional sensor, but positioning errors make it difficult to overcome the sensor's technical limitations in this way.

Korean Laid-Open Patent Publication No. 10-2020-0029501

An object of the present invention is to provide a method for effectively reinforcing a sensor, and an apparatus therefor.

More specifically, an object of the present invention is to provide a method for effectively overcoming the blind spot of a sensor, and an apparatus therefor.

Another object of the present invention is to provide a method for improving the perception performance of a robot through sensor reinforcement, and an apparatus therefor.

The technical problems to be solved by the present invention are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood from this specification by a person of ordinary skill in the art to which the present invention pertains.

In a first aspect of the present invention, a method performed by an apparatus comprising a processor and a memory is provided, wherein the memory contains instructions configured to implement, when executed by the processor, a machine learning model trained on training data obtained from a first sensor and from a second sensor for reinforcing the first sensor. The method may include: obtaining second data for reinforcing the first sensor by applying, to the machine learning model, first data obtained using only the first sensor out of the first sensor and the second sensor; and performing perception based on the first data and the second data obtained using the machine learning model.

In a second aspect of the present invention, an apparatus comprising a processor and a memory is provided, wherein the memory contains instructions configured to implement, when executed by the processor, specific operations and a machine learning model trained on training data obtained from a first sensor and from a second sensor for reinforcing the first sensor. The specific operations may include: obtaining second data for reinforcing the first sensor by applying, to the machine learning model, first data obtained using only the first sensor out of the first sensor and the second sensor; and performing perception based on the first data and the second data obtained using the machine learning model.

In a third aspect of the present invention, a computer-readable storage medium is provided that stores instructions which, when executed by a processor, cause an apparatus including the processor to implement specific operations and a machine learning model trained on training data obtained from a first sensor and from a second sensor for reinforcing the first sensor. The specific operations may include: obtaining second data for reinforcing the first sensor by applying, to the machine learning model, first data obtained using only the first sensor out of the first sensor and the second sensor; and performing perception based on the first data and the second data obtained using the machine learning model.

Preferably, the first sensor may include a vision sensor, and the first data may include image data obtained from the vision sensor.

Preferably, the vision sensor may include at least one of a depth image sensor, a Red Green Blue (RGB) camera sensor, and an InfraRed (IR) camera sensor.

Preferably, the second sensor may include a sensor for covering the blind spot of the first sensor, and the second data may include information indicating whether an obstacle is present in the blind spot of the first sensor.

Preferably, the second sensor may include a Time of Flight (ToF) based sensor.

Preferably, the machine learning model may be implemented based on a Convolutional Neural Network (CNN).

Preferably, the apparatus may include a robot on which the first sensor is mounted.

Preferably, the apparatus may include a server configured to communicate, in operation, with a robot on which the first sensor is mounted.

Preferably, the server may include a cloud server.

Preferably, performing the perception may include obtaining point cloud information about the surroundings of the robot on which the first sensor is mounted, or information about the trajectory of a dynamic object.

According to the present invention, a sensor can be effectively reinforced.

More specifically, according to the present invention, the blind spot of a sensor can be effectively overcome.

In addition, according to the present invention, the perception performance of a robot can be improved through sensor reinforcement.

The effects obtainable from the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood from this specification by a person of ordinary skill in the art to which the present invention pertains.

The accompanying drawings are included as part of the detailed description to aid understanding of the present invention and, together with the detailed description, explain embodiments and technical features of the present invention.
Figure 1 illustrates the technical problem.
Figure 2 illustrates a block diagram according to the proposed method of the present invention.
Figure 3 illustrates an example in which a first sensor and a second sensor are mounted on a robot to acquire training data.
Figure 4 illustrates a flowchart of the proposed method of the present invention.
Figure 5 illustrates an apparatus to which the proposed method of the present invention can be applied.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are described in detail below with reference to the accompanying drawings.

The following embodiments are provided to aid a comprehensive understanding of the methods, apparatuses, and/or systems described in this specification. They are only examples, however, and the present invention is not limited to them.

In describing the embodiments of the present invention, a detailed description of known technology related to the present invention is omitted where it is judged that it would unnecessarily obscure the gist of the invention. The terms used below are defined in consideration of their functions in the present invention and may vary according to the intention or custom of users and operators; their definitions should therefore be based on the content of this specification as a whole. The terminology used in the detailed description serves only to describe embodiments of the present invention and must not be construed as limiting. Unless clearly used otherwise, a singular expression includes the plural meaning. In this description, expressions such as "comprising" or "having" indicate certain features, numbers, steps, operations, elements, or parts or combinations thereof, and must not be construed as excluding the presence or possibility of one or more other features, numbers, steps, operations, elements, or parts or combinations thereof beyond those described.

In addition, terms such as "first" and "second" may be used to describe various components, but the components are not limited by these terms; the terms are used only to distinguish one component from another.

Artificial Intelligence (AI)

Artificial intelligence refers to implementing artificial intelligence on a computing device based on mathematical models. Machine learning is a field of artificial intelligence that enables a computing device to solve a specific problem through learning, without explicit programming.

An artificial neural network is a model used in machine learning; it consists of artificial neurons that form a network through synaptic connections. An artificial neural network may be referred to simply as a neural network. It may include an input layer, an output layer, and optionally one or more hidden layers. Each layer contains one or more neurons, and neurons in different layers may be connected through synapses. Neurons in the input layer may take as input signals the feature vector used for learning, and neurons in the output layer may output the value of an activation function applied to the input signals received through the synapses, the synaptic weights, and the bias. Neurons in a hidden layer may take as input signals values computed from the neuron signals of the input layer or the previous hidden layer and the synaptic weights, and they provide signals to the next hidden layer or the output layer.
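The neuron computation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used by the invention: the sigmoid activation, the layer sizes, and all weight and bias values are hypothetical choices made only for the example.

```python
import math


def sigmoid(z: float) -> float:
    """Sigmoid activation (an assumed choice; the text does not name one)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Activation applied to the weighted sum of the inputs plus the bias."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


def layer_output(inputs, weight_rows, biases, activation=sigmoid):
    """One layer: each neuron has its own weight row and bias."""
    return [
        neuron_output(inputs, w, b, activation)
        for w, b in zip(weight_rows, biases)
    ]


# Hypothetical feature vector fed to the input layer.
features = [0.5, -1.0, 2.0]
# Hidden layer with two neurons, then an output layer with one neuron.
hidden = layer_output(features, [[0.1, 0.2, 0.3], [-0.4, 0.5, 0.6]], [0.0, 0.1])
output = layer_output(hidden, [[1.0, -1.0]], [0.0])
```

Each hidden neuron consumes the signals of the previous layer, and its output becomes an input signal of the next layer, which mirrors the layer-to-layer flow described in the text.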

A model parameter is a parameter determined through learning, and may include synaptic weights, neuron biases, and the like. A hyperparameter is a parameter that must be set before learning in a machine learning algorithm; hyperparameters include the learning rate, the number of iterations, the mini-batch size, and the initialization function. The goal of training an artificial neural network can be viewed as determining the model parameters that minimize a loss function. The loss function can be used as an indicator for determining the optimal model parameters during training.

Machine learning can be classified by learning method into supervised learning, unsupervised learning, and reinforcement learning. Supervised learning is a method of training an artificial neural network when labels are given for the training data, where a label is the correct answer (or result value) that the network should infer when the training data is input to it. Unsupervised learning may refer to a method of training an artificial neural network when no labels are given for the training data. Reinforcement learning may refer to a learning method in which an agent defined within an environment is trained to select, in each state, the action or sequence of actions that maximizes the cumulative reward.

Machine learning implemented with a deep neural network, that is, an artificial neural network containing multiple hidden layers, may be referred to as deep learning; deep learning is a field of machine learning.

Although this specification describes the present invention with a focus on machine learning, the present invention is not limited to implementations that use machine learning. Accordingly, in this specification the terms "learning" and "machine learning" may cover not only machine learning and deep learning but also learning based on other artificial intelligence algorithms, and "AI" or "machine learning model" may cover not only an artificial neural network model trained by machine learning or deep learning but also an artificial intelligence model trained with another artificial intelligence algorithm.

Technical problem

Figure 1 illustrates the technical problem.

Referring to Figure 1, a robot is equipped with at least one sensor 110 for sensing its surroundings, and may process (120) the data obtained from the at least one sensor 110 to perform perception of its surroundings (130). A sensor 110 installed on the robot by default may, because of its technical limitations, fail to provide sufficient perception of the surrounding environment. To overcome these limitations, an additional sensor (not shown) is commonly mounted on the robot to compensate for the limitations of the default sensor 110.

For example, a vision sensor 110 for sensing the robot's surroundings may be mounted on the robot, and the robot may process (120) the image data obtained from the vision sensor 110 to perform perception of its surroundings (130). Some vision sensors 110 have blind spots, and an additional sensor (not shown) may be mounted on the robot to cover the blind spot and thereby overcome the technical limitations of the vision sensor 110. In this example, the processing (120) may include fusing and post-processing the image data obtained from the vision sensor 110 and the data obtained from the additional sensor (not shown), and an artificial neural network model may be used for this purpose. Also in this example, the perception (130) may include obtaining environmental perception information about the robot's surroundings; the environmental perception information may include, for example, point cloud information and information about the trajectory of a dynamic object. In this specification, the blind spot of a sensor may also be referred to by other terms such as dead zone, insensitive zone, shadow zone, or non-response interval.

However, factors unrelated to the technology, such as the robot's appearance or its design and manufacturing cost, frequently prevent an additional sensor (not shown) from being mounted to reinforce the sensor 110. In that case, a robot developed for real-world service cannot overcome the technical limitations of the sensor 110 and may be unable to perform the perception it needs for driving. For example, in actual operating environments there are many cases in which an additional sensor (not shown) for covering the blind spot of the vision sensor 110 cannot be mounted on or used by the robot. Location-based virtual obstacle generation can be used to deal with the blind spot of the sensor 110 without mounting an additional sensor (not shown), but positioning errors make it difficult to overcome the technical limitations of the sensor 110 in this way.

Proposed method of the present invention

To solve the technical problem described above, the present invention proposes a method of performing enhanced perception using a machine learning model, without mounting the additional sensor that would otherwise be needed to overcome the technical limitations of the sensor installed on the robot by default. More specifically, in the proposed method an additional sensor is mounted on the robot only while training data for the machine learning model is being collected. The collected training data is used to train a machine learning model that improves the robot's vision-based perception performance. During actual operation, the additional sensor is not mounted on the robot; the data obtained with the default sensor alone is applied to the trained machine learning model so that the robot can perform enhanced perception.

To aid understanding, the proposed method is described with a focus on robots, but its application is not limited to robots. For example, the proposed method can be applied in the same or a similar way not only to various robots, such as mobile robots and stationary robots, but also to autonomous vehicles such as self-driving cars and drones. Accordingly, the term "robot" in this specification may be replaced with the term "autonomous vehicle".

For clarity, in this specification the sensor installed on the robot by default is referred to as the first sensor, and the additional sensor for overcoming the technical limitations of the default sensor (the first sensor), or for reinforcing it, is referred to as the second sensor. Likewise, data obtained using only the first sensor out of the first sensor and the second sensor is referred to as first data, and the data for reinforcing the first sensor that is obtained by applying the first data to the machine learning model according to the proposed method may be referred to as second data.

As an example, the first sensor may include a vision sensor, and the first data may include image data obtained from the vision sensor. As a more specific example, the vision sensor may include at least one of a depth image sensor, a Red Green Blue (RGB) camera sensor, and an InfraRed (IR) camera sensor. As an example, the second sensor includes a sensor for covering the blind spot of the first sensor, and the second data, being the output of the machine learning model according to the proposed method, includes information corresponding to the output of the second sensor; as a more specific example, the second data may include information indicating whether an obstacle is present in the blind spot of the first sensor. As a more specific example, the second sensor may include a Time of Flight (ToF) based sensor that measures ToF in a plurality of directions, at various resolutions, within the blind spot of the first sensor, and outputs 1 (or 0) when an obstacle is present in a given direction and 0 (or 1) when no obstacle is present in that direction. As a more specific example, the second sensor may measure ToF in at least three directions within the blind spot of the first sensor, output information indicating that an obstacle is present (e.g., 1 (or 0)) when the ToF in a particular direction is at or below a threshold, and output information indicating that no obstacle is present (e.g., 0 (or 1)) when the ToF in a particular direction is at or above the threshold.
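The per-direction thresholding described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `tof_to_obstacle_flags`, the units, and the sample readings are hypothetical, and the sketch arbitrarily treats a reading exactly equal to the threshold as an obstacle, a case the text leaves open.

```python
from typing import List, Sequence


def tof_to_obstacle_flags(
    tof_readings: Sequence[float],
    threshold: float,
    obstacle_value: int = 1,
) -> List[int]:
    """Map one ToF reading per direction to a binary obstacle flag.

    A reading at or below the threshold is flagged as an obstacle
    (assumption for the equality case). `obstacle_value` selects the
    encoding, since the text allows either 1 or 0 to mean "obstacle".
    """
    if obstacle_value not in (0, 1):
        raise ValueError("obstacle_value must be 0 or 1")
    free_value = 1 - obstacle_value
    return [
        obstacle_value if reading <= threshold else free_value
        for reading in tof_readings
    ]


# Hypothetical readings for three directions inside the blind spot.
flags = tof_to_obstacle_flags([0.2, 1.5, 0.5], threshold=0.5)
inverted = tof_to_obstacle_flags([0.2, 1.5, 0.5], threshold=0.5, obstacle_value=0)
```

Flags of this form are what a second sensor would contribute as training labels, and what the trained model would later be expected to reproduce from first-sensor data alone.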

Figure 2 illustrates a block diagram according to the proposed method of the present invention.

The robot may have the first sensor 210 installed by default, and may be fitted with the second sensor 202 for reinforcing the first sensor 210 only while training data is being collected. With both the first sensor 210 and the second sensor 202 mounted, the robot may operate in a variety of environments to collect training data, and a machine learning model may then be trained on the collected training data (204). For example, training the machine learning model on the training data obtained from the first sensor 210 and the second sensor 202 (204) may be performed on the robot, or on a server configured to communicate, in operation, with the robot. As a more specific example, the server may include a cloud server.

For example, the machine learning model according to the proposed method of the present invention may be implemented based on a convolutional neural network (CNN); however, it is not limited to a convolutional neural network and may be implemented based on various artificial neural network models. The principles, structure, and operation of convolutional neural networks are described in detail on Internet sites (e.g., Wikipedia, https://en.wikipedia.org/wiki/Convolutional_neural_network), and this specification incorporates by reference the entire description on that site and all of its references. For example, a convolutional neural network to which the proposed method can be applied may include one or more convolution layers and pooling layers before a fully connected layer and a loss layer. To train the convolutional neural network, the training data obtained from the first sensor 210 may be used as the input to the one or more convolution layers, and the training data obtained from the second sensor 202 may be used as the true data label of the loss layer. More specifically, training the convolutional neural network on the training data obtained from the first sensor 210 and the second sensor 202 may include updating the neural network weights (for example, by back propagation) so as to minimize a loss function (or loss layer) computed from the output obtained by feeding the first-sensor training data into the network, with the second-sensor training data set as the true data label.
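
The supervised setup described above (first-sensor data as input, second-sensor data as the true label, weights updated to reduce a loss) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: a single sigmoid layer stands in for the convolutional neural network so that the example runs with the Python standard library alone, and the toy feature vectors, the three-direction labels, the binary cross-entropy loss, the hyperparameters, and all function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Forward pass: one obstacle probability per blind-spot direction."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def bce_loss(pred, label):
    """Mean binary cross-entropy between predictions and second-sensor labels."""
    eps = 1e-9
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for p, y in zip(pred, label)) / len(label)

def mean_loss(weights, bias, samples):
    return sum(bce_loss(predict(weights, bias, x), y)
               for x, y in samples) / len(samples)

def train(samples, n_in, n_out, epochs=300, lr=0.5, seed=0):
    """Stochastic gradient descent on the loss (hypothetical hyperparameters)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    bias = [0.0] * n_out
    for _ in range(epochs):
        for x, y in samples:
            pred = predict(weights, bias, x)
            for k in range(n_out):
                # Gradient of cross-entropy w.r.t. the pre-sigmoid value is
                # (p - y); the constant 1/n_out of the mean is absorbed in lr.
                grad = pred[k] - y[k]
                for i in range(n_in):
                    weights[k][i] -= lr * grad * x[i]
                bias[k] -= lr * grad
    return weights, bias

# Toy pairs: (features from the first sensor, flags from the second sensor).
samples = [
    ([0.9, 0.1, 0.1], [1, 0, 0]),
    ([0.1, 0.9, 0.1], [0, 1, 0]),
    ([0.1, 0.1, 0.9], [0, 0, 1]),
    ([0.9, 0.9, 0.1], [1, 1, 0]),
    ([0.1, 0.1, 0.1], [0, 0, 0]),
]

zero_weights = [[0.0] * 3 for _ in range(3)]
loss_before = mean_loss(zero_weights, [0.0] * 3, samples)
weights, bias = train(samples, n_in=3, n_out=3)
loss_after = mean_loss(weights, bias, samples)
```

In this sketch the second-sensor flags appear only inside `train`; `predict` needs nothing but first-sensor features, which mirrors the way the second sensor is used for training and not at run time.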

The process 206, which comprises acquiring training data from the second sensor 202 and training the machine learning model on it (204), is performed only when the machine learning model is trained and is not performed while the robot is in actual operation. More specifically, in actual operation the second sensor 202 is not mounted on the robot (or, if the second sensor 202 is mounted, the data obtained from it is not used). Instead, the first data acquired with only the first sensor 210 mounted on the robot is fed to the machine learning model 212, which has been trained on the training data obtained from the first sensor 210 and the second sensor 202, to obtain second data 214 for reinforcing the first sensor 210.

The trained machine learning model 212 may be deployed or implemented on the robot, or it may be deployed or implemented on a server (e.g., a cloud server). If the machine learning model 212 is deployed or implemented on the robot, the robot may feed the first data, acquired with only the first sensor 210 out of the first sensor 210 and the second sensor 202, to the machine learning model 212 to obtain the second data 214 for reinforcing the first sensor 210, and may process (220) the first data and the second data 214 to perform perception (230) of the robot's surrounding environment. If the machine learning model 212 is deployed or implemented on a server (e.g., a cloud server), the robot transmits the first data, acquired with only the first sensor 210 out of the first sensor 210 and the second sensor 202, to the server; the server may feed the received first data to the machine learning model 212 to obtain the second data 214 for reinforcing the first sensor 210, and may process (220) the first data and the second data 214 to perform perception (230) of the robot's surrounding environment.

Processing 220 may include fusing the first data obtained from the first sensor 210 with the second data 214 obtained using the machine learning model 212, and post-processing the result. An additional machine learning model may be used for the post-processing. Perception 230 may include acquiring environmental perception information about the robot's surrounding environment; for example, the environmental perception information may include point cloud information, information about the trajectories of dynamic objects, and the like.
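
One way to picture the fusion in processing 220 is to merge both inputs into a single point list. The sketch below is only an illustration under assumptions that do not come from the specification: the first data is taken to be a small depth image back-projected with a pinhole camera model, the second data is taken to be one obstacle flag per direction, and the intrinsics (`fx`, `fy`, `cx`, `cy`), the bearings, and the placeholder range assigned to blind-spot obstacles are all hypothetical.

```python
import math

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (metres, 0 = no return) with a pinhole model."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z > 0:
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points

def blind_spot_points(flags, bearings_deg, assumed_range_m):
    """Turn per-direction obstacle flags into placeholder points.

    Assumption: a flag of 1 means an obstacle, placed at a fixed range
    along the bearing of that direction in the camera's x-z plane.
    """
    points = []
    for flag, bearing in zip(flags, bearings_deg):
        if flag == 1:
            angle = math.radians(bearing)
            points.append((assumed_range_m * math.sin(angle), 0.0,
                           assumed_range_m * math.cos(angle)))
    return points

def fuse(depth, flags, fx, fy, cx, cy, bearings_deg, assumed_range_m):
    """Concatenate points seen by the first sensor with inferred blind-spot points."""
    return (depth_to_points(depth, fx, fy, cx, cy)
            + blind_spot_points(flags, bearings_deg, assumed_range_m))

depth_image = [[0.0, 2.0],
               [1.0, 0.0]]
cloud = fuse(depth_image, flags=[0, 1, 0],
             fx=1.0, fy=1.0, cx=0.0, cy=0.0,
             bearings_deg=[-30.0, 0.0, 30.0], assumed_range_m=0.2)
```

Any post-processing, such as filtering or a further learned model, would then operate on the fused list; that stage is left out here.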

The robot may determine and/or perform an action based on the result of perception 230. If the machine learning model 212 is deployed or implemented on a server (e.g., a cloud server), the robot may receive the result of perception 230 from the server and determine and/or perform an action, or may receive a command regarding an action and carry it out.

FIG. 3 illustrates an example in which the first sensor 210 and the second sensor 202 are mounted on a robot to acquire training data.

Referring to FIG. 3, the first sensor 210 is mounted on the robot by default, and the first sensor 210 may have a blind spot 302 owing to its sensor characteristics. In the example of FIG. 3, the blind spot 302 is shown as occurring in the region close to the first sensor 210, but the proposed method of the present invention applies in the same or a similar manner when a blind spot occurs in another region. Likewise, although the blind spot 302 is given as an example of a technical limitation of the first sensor 210, the proposed method applies in the same or a similar manner when the first sensor 210 has other technical limitations.

In the example of FIG. 3, the second sensor 202 is additionally mounted on the robot to reinforce the blind spot 302 of the first sensor 210, and training data for training (204) the machine learning model is acquired. For example, the second sensor 202 may be configured with a resolution of three so as to detect whether an obstacle is present in each of three directions, but the proposed method of the present invention applies in the same or a similar manner with a higher or lower resolution. The second sensor 202 may operate on a time-of-flight (ToF) basis: when the ToF is at or below a certain threshold, it may output information indicating that an obstacle is present in the corresponding direction, and when the ToF is at or above the threshold, it may output information indicating that no obstacle is present in that direction. For example, it may output a value of 1 (or 0) when an obstacle is present and a value of 0 (or 1) when no obstacle is present.
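
The thresholding described above can be sketched in a few lines. The function name, the nanosecond readings, and the threshold value are hypothetical; because the text assigns a reading exactly at the threshold to both cases, this sketch assumes that such a reading counts as an obstacle.

```python
def obstacle_flags(tof_values, threshold, obstacle_value=1):
    """Map one ToF reading per direction to a binary obstacle flag.

    Assumption: a reading exactly equal to the threshold counts as an obstacle.
    obstacle_value selects the encoding: 1 (obstacle = 1) or 0 (obstacle = 0).
    """
    if obstacle_value not in (0, 1):
        raise ValueError("obstacle_value must be 0 or 1")
    free_value = 1 - obstacle_value
    return [obstacle_value if tof <= threshold else free_value
            for tof in tof_values]

# Three directions, readings in nanoseconds (hypothetical values).
readings_ns = [2.0, 10.0, 4.0]
flags = obstacle_flags(readings_ns, threshold=5.0)
inverted = obstacle_flags(readings_ns, threshold=5.0, obstacle_value=0)
```

The `obstacle_value` parameter covers the "1 (or 0)" alternative: the same readings yield the complementary encoding when it is set to 0.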

As described with reference to FIG. 2, training 204 based on the training data obtained from the first sensor 210 and the second sensor 202 may be performed on the robot or on a server (e.g., a cloud server). In addition, the machine learning model 212 for which training 204 has been completed may be deployed or implemented on the robot or on a server (e.g., a cloud server).

The second sensor 202 is mounted on the robot only when training data is acquired and may not be mounted during actual operation of the robot. Alternatively, even if the second sensor 202 is mounted during actual operation, it may not be used as an input to the trained machine learning model 212. As described with reference to FIG. 2, instead of using the second sensor 202 (or data obtained from the second sensor 202), or without mounting the second sensor 202 on the robot at all, the proposed method of the present invention feeds data acquired with only the first sensor 210, out of the first sensor 210 and the second sensor 202, to the machine learning model 212 as input to obtain the second data 214 for reinforcing the first sensor 210.

FIG. 4 illustrates a flowchart of the proposed method of the present invention.

In the example of FIG. 4, it is assumed that training 204 of the machine learning model 212 has been completed on training data acquired with the first sensor 210 and the second sensor 202 mounted on the robot, and that the model is deployed or implemented on the device 500. For example, the machine learning model 212 may be operatively implemented on the device 500 in software, hardware, or a combination thereof; to this end, the memory 504 included in the device 500 may contain instructions configured to implement, when executed by the processor 502 included in the device 500, the machine learning model 212 trained on training data obtained from the first sensor 210 and from the second sensor 202 for reinforcing the first sensor 210. As described above, as a non-limiting example, the machine learning model 212 may be implemented based on a convolutional neural network (CNN).

Referring to FIG. 4, the device 500 may feed the first data, acquired with only the first sensor 210 out of the first sensor 210 and the second sensor 202, to the machine learning model 212 to obtain the second data 214 for reinforcing the first sensor 210 (S402). As described above, as a non-limiting example, the first sensor 210 may include a vision sensor, and the first data may include image data obtained from the vision sensor. As a more specific example, the vision sensor may include at least one of a depth image sensor, an RGB (red green blue) camera sensor, and an infrared (IR) camera sensor. As a non-limiting example, the second sensor 202 may include a sensor for reinforcing the blind spot of the first sensor 210, and the second data may include information indicating whether an obstacle is present in the blind spot of the first sensor 210. As a more specific example, the second sensor 202 may include a time-of-flight (ToF) based sensor, which measures ToF in a plurality of directions, at various resolutions, in the blind spot of the first sensor 210 and outputs 1 (or 0) when an obstacle is present in a given direction and 0 (or 1) when no obstacle is present in that direction. As a non-limiting example, the second sensor 202 has a resolution of at least three and measures ToF in at least three directions in the blind spot of the first sensor 210; when the ToF in a particular direction is at or below a threshold, it outputs information indicating that an obstacle is present (e.g., 1 (or 0)), and when the ToF in a particular direction is at or above the threshold, it outputs information indicating that no obstacle is present (e.g., 0 (or 1)).

The device 500 may perform perception 230 based on the second data 214 obtained using the machine learning model 212 and the first data acquired with only the first sensor 210 (S404). More specifically, the device 500 may process (220) the first data and the second data 214 to perform perception (230) of the robot's surrounding environment. As described above, processing 220 may include fusing the first data obtained from the first sensor 210 with the second data 214 obtained using the machine learning model 212, and post-processing the result. An additional machine learning model may be used for the post-processing. Perception 230 may include acquiring environmental perception information about the robot's surrounding environment; for example, the environmental perception information may include point cloud information, information about the trajectories of dynamic objects, and the like.
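
The two steps of FIG. 4 can be sketched as a single run-time cycle in which the second sensor is never read. Everything named here is hypothetical: `toy_model` is only a placeholder for the trained machine learning model 212, the first data is assumed to be a small depth image in metres, and the fields of the perception result are chosen for illustration.

```python
from typing import Callable, Dict, List, Sequence

Image = Sequence[Sequence[float]]
Model = Callable[[Image], List[int]]

def obtain_second_data(model: Model, first_data: Image) -> List[int]:
    """S402: infer blind-spot obstacle flags from first-sensor data only."""
    return model(first_data)

def perceive(first_data: Image, second_data: List[int]) -> Dict[str, object]:
    """S404: combine both inputs into a simple perception result."""
    nearest = min((d for row in first_data for d in row if d > 0), default=None)
    return {
        "nearest_visible_m": nearest,
        "blind_spot_flags": list(second_data),
        "blind_spot_obstacle": any(flag == 1 for flag in second_data),
    }

def run_cycle(model: Model, first_data: Image) -> Dict[str, object]:
    second_data = obtain_second_data(model, first_data)  # S402
    return perceive(first_data, second_data)             # S404

def toy_model(image: Image) -> List[int]:
    """Placeholder for model 212: flag a direction when the bottom-row
    depth in that column is a valid reading below 0.5 m."""
    return [1 if 0 < d < 0.5 else 0 for d in image[-1]]

depth_image = [[2.0, 3.0, 0.0],
               [0.4, 1.5, 0.0]]
result = run_cycle(toy_model, depth_image)
```

Because `run_cycle` takes the model as a parameter, the same two steps could run on the robot or on a server that receives the first data; only the caller changes.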

The example of FIG. 4 may be performed on a robot equipped with the first sensor 210, or on a server configured to communicate operatively with a robot equipped with the first sensor 210. For example, the server may include a cloud server.

As described above, the robot may determine and/or perform an action based on the result of perception 230. Alternatively (when the machine learning model 212 is deployed or implemented on a server, e.g., a cloud server), the device 500 may determine the robot's action based on the result of perception 230 and transmit a command regarding the action to the robot.
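
For the server-side case, the exchange of an action command might look like the sketch below. The specification does not define a message format or a decision policy, so the JSON layout, the `action_command` type, the stop/proceed rule, and the function names are all assumptions made for illustration.

```python
import json

def decide_action(perception):
    """Hypothetical policy: stop if any blind-spot flag reports an obstacle."""
    flags = perception["blind_spot_flags"]
    return "stop" if any(flag == 1 for flag in flags) else "proceed"

def build_command(perception):
    """Server side: serialize the chosen action as a JSON message."""
    return json.dumps({"type": "action_command",
                       "action": decide_action(perception)})

def handle_command(message):
    """Robot side: parse the message and return the action to carry out."""
    payload = json.loads(message)
    if payload.get("type") != "action_command":
        raise ValueError("unexpected message type")
    return payload["action"]

message = build_command({"blind_spot_flags": [0, 1, 0]})
action = handle_command(message)
```

When the model runs on the robot instead, `decide_action` could be called directly on the local perception result and no message would be needed.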

Device to which the proposed method of the present invention can be applied

FIG. 5 illustrates a device 500 to which the proposed method of the present invention can be applied.

Referring to FIG. 5, the device 500 may include a robot, or a server configured to communicate operatively with a robot, and may be configured to implement the proposed method of the present invention.

For example, the device 500 to which the proposed method of the present invention can be applied may include network devices such as repeaters, hubs, bridges, switches, routers, and gateways; computer devices such as desktop computers and workstations; mobile terminals such as smartphones; portable devices such as laptop computers; home appliances such as digital TVs; and means of transportation such as automobiles. As another example, the device 500 may be included as part of an application-specific integrated circuit (ASIC) implemented in the form of a system on chip (SoC).

The memory 504 may store programs for the processing and control performed by the processor 502, and may store data and information used in the present invention, control information needed for processing data and information according to the present invention, temporary data generated while processing data and information, and the like. The memory 504 may be implemented as a storage device such as ROM (read-only memory), RAM (random access memory), EPROM (erasable programmable read-only memory), EEPROM (electrically erasable programmable read-only memory), flash memory, SRAM (static RAM), an HDD (hard disk drive), or an SSD (solid state drive).

The processor 502 controls the operation of each module in the device 500. In particular, the processor 502 may perform various control functions for carrying out the proposed method of the present invention. The processor 502 may also be referred to as a controller, microcontroller, microprocessor, microcomputer, or the like. The proposed method may be implemented in hardware, firmware, software, or a combination thereof. When the present invention is implemented in hardware, an ASIC (application-specific integrated circuit), DSP (digital signal processor), DSPD (digital signal processing device), PLD (programmable logic device), FPGA (field-programmable gate array), or the like configured to carry out the present invention may be provided in the processor 502. When the proposed method is implemented in firmware or software, the firmware or software may include instructions relating to modules, procedures, functions, or the like that perform the functions or operations needed to implement the proposed method; the instructions may be stored in the memory 504, or in a computer-readable recording medium (not shown) separate from the memory 504, and may be configured so that the device 500 implements the proposed method when they are executed by the processor 502.

In addition, the device 500 may include a network interface module (NIM) 506. The network interface module 506 is operatively connected to the processor 502, and the processor 502 may control the network interface module 506 to transmit or receive wireless/wired signals carrying information and/or data, signals, messages, and the like over a wireless/wired network. The network interface module 506 supports various communication standards, such as the IEEE 802 family, 3GPP LTE(-A), and 3GPP 5G, and can transmit and receive control information and/or data signals in accordance with the relevant standard. The network interface module 506 may be implemented outside the device 500 if necessary.

The embodiments described above combine the components and features of the present invention in particular forms. Each component or feature should be regarded as optional unless explicitly stated otherwise. Each component or feature may be practiced without being combined with other components or features. It is also possible to form an embodiment of the present invention by combining some of the components and/or features. The order of the operations described in the embodiments of the present invention may be changed. Some components or features of one embodiment may be included in another embodiment, or may be replaced with the corresponding components or features of another embodiment. It is evident that claims that do not explicitly cite one another may be combined to form an embodiment, or may be included as new claims by amendment after filing.

The present invention can be applied to devices, such as robots and servers, that operate in various systems.

110: vision sensor
120, 220: processing
130, 230: perception
202: second sensor 204: training
206: training based on training data obtained from the first sensor and the second sensor
210: first sensor 212: machine learning model 214: second data
302: blind spot of the first sensor
500: device
502: processor
504: memory
506: network interface module

Claims

A method performed by a device comprising a processor and a memory,
wherein the memory contains instructions configured to implement, when executed by the processor, a machine learning model trained on training data obtained from a first sensor and from a second sensor for reinforcing the first sensor, the method comprising:
obtaining second data for reinforcing the first sensor by feeding first data, acquired with only the first sensor out of the first sensor and the second sensor, to the machine learning model; and
performing perception based on the first data and the second data obtained using the machine learning model.

The method of claim 1,
wherein the first sensor includes a vision sensor, and the first data includes image data obtained from the vision sensor.

The method of claim 2,
wherein the vision sensor includes at least one of a depth image sensor, an RGB (red green blue) camera sensor, and an infrared (IR) camera sensor.

The method of claim 1,
wherein the second sensor includes a sensor for reinforcing a blind spot of the first sensor, and the second data includes information indicating whether an obstacle is present in the blind spot of the first sensor.

The method of claim 4,
wherein the second sensor includes a time-of-flight (ToF) based sensor.

The method of claim 1,
wherein the machine learning model is implemented based on a convolutional neural network (CNN).

The method of claim 1,
wherein the device includes a robot equipped with the first sensor.

The method of claim 1,
wherein the device includes a server configured to communicate operatively with a robot equipped with the first sensor.

The method of claim 8,
wherein the server includes a cloud server.

The method of claim 1,
wherein performing the perception comprises:
acquiring point cloud information about the surrounding environment of the robot equipped with the first sensor, or information about the trajectory of a dynamic object.

A device comprising a processor and a memory,
wherein the memory contains instructions configured to implement, when executed by the processor, specific operations and a machine learning model trained on training data obtained from a first sensor and from a second sensor for reinforcing the first sensor, the specific operations comprising:
obtaining second data for reinforcing the first sensor by feeding first data, acquired with only the first sensor out of the first sensor and the second sensor, to the machine learning model; and
performing perception based on the first data and the second data obtained using the machine learning model.

A computer-readable storage medium storing instructions configured to cause, when executed by a processor, a device including the processor to implement specific operations and a machine learning model trained on training data obtained from a first sensor and from a second sensor for reinforcing the first sensor, the specific operations comprising:
obtaining second data for reinforcing the first sensor by feeding first data, acquired with only the first sensor out of the first sensor and the second sensor, to the machine learning model; and
performing perception based on the first data and the second data obtained using the machine learning model.