KR20190041840A

KR20190041840A - Controlling mobile robot based on asynchronous target classification

Info

Publication number: KR20190041840A
Application number: KR1020170133570A
Authority: KR
Inventors: 심영우; 류길현; 연승호; 석상옥
Original assignee: 네이버랩스 주식회사
Priority date: 2017-10-13
Filing date: 2017-10-13
Publication date: 2019-04-23
Also published as: KR101974448B1

Abstract

Provided is a technology for controlling a mobile robot based on asynchronous target classification. According to embodiments of the present invention, an asynchronous deep classification network (ADCN), which unlike an end-to-end robot software architecture introduces deep learning as one component of the mobile robot's overall software, makes it possible to effectively integrate large-scale deep neural networks into a mobile robot requiring high-bandwidth control.

Description

CONTROLLING MOBILE ROBOT BASED ON ASYNCHRONOUS TARGET CLASSIFICATION

The following description relates to a technology for controlling a mobile robot based on asynchronous target classification, and more particularly to a target classification method that uses an Asynchronous Deep Classification Network (ADCN) to classify targets asynchronously while still using deep learning, a computer device that performs the target classification method, a computer program stored in a computer-readable recording medium to cause a computer, in combination with the computer, to execute the target classification method, and the recording medium itself.

With the breakthrough advances in deep learning, some prior art has suggested the possibility of applying deep learning to the field of mobile robots. For example, Paper 1 below proposes a mobile robot architecture with end-to-end motion planning for obstacle avoidance and path finding using a deep neural network, and Paper 2 below proposes end-to-end visual navigation using a deep neural network together with a training methodology for that network.

- Paper 1: M. Pfeiffer, M. Schaeuble, J. Nieto, R. Siegwart, and C. Cadena, "From perception to decision: A data-driven approach to end-to-end motion planning for autonomous ground robots," in IEEE International Conference on Robotics and Automation (ICRA), 2017.

- Paper 2: Y. Zhu, R. Mottaghi, E. Kolve, J. J. Lim, A. Gupta, L. Fei-Fei, and A. Farhadi, "Target-driven visual navigation in indoor scenes using deep reinforcement learning," in IEEE International Conference on Robotics and Automation (ICRA), 2017.

Although these techniques are novel, they show the following limitations from the perspective of traditional robotics.

- Low compatibility with previously developed robot software and hardware components.

- Difficulty running on small mobile robots because of the large computing power required.

- Difficulty applying the application-specific training methods of reinforcement learning to other robot platforms.

With regard to a robot's visual perception, deep learning-based visual recognition offers superior recognition performance compared with conventional feature-based recognition. Recently, deep learning-based visual recognition has also been extended to other areas of conventional computer vision, such as image segmentation. Considering performance and scalability, a deep learning-based approach is very useful for a variety of recognition tasks, including the visual perception tasks of mobile robots.

However, unlike in other applications, the limited computing power of a mobile robot is a major obstacle to using a deep learning-based approach in its visual recognition system. Most high-performance algorithms used for deep image recognition in the prior art require a large amount of computation because of their highly stacked network structures. Moreover, given the high-end hardware that such computation demands, embedding these prior-art networks in a small mobile robot greatly reduces the throughput of the robot's visual recognition system.

This throughput problem degrades system performance even further when recently studied end-to-end models are applied to mobile robots. Because an end-to-end model couples the mobile robot's visual recognition system and its control system in a single network, the markedly low bandwidth of the visual recognition computation impairs the overall throughput of the system.

Two approaches to this problem are conceivable: using an external server equipped with a high-end GPU (Graphics Processing Unit) for the visual recognition network, or reducing and limiting the number of layers of the neural network to raise its throughput. In the former case, however, the latency that arises between the external server and the mobile robot as many images are transmitted to the server can significantly affect the throughput of the robot as a whole. For example, because the GPU of the external server must receive the entire image captured by the mobile robot's camera in order to feed the algorithm's input, the communication overhead can be critical. In the latter case, the pruned network structure reduces the recognition performance of the network. For example, structurally optimized neural networks, whose throughput is increased by reducing and limiting the number of layers, are known to have lower recognition accuracy than the heavy networks of other prior art.

In addition, there are reinforcement learning techniques that use an image as the input state of the network, and such conventional techniques allow a neural network to learn the nuances of the situation shown in the input image. Thanks to this ability to understand the complex context of an image, deep reinforcement learning can be applied to robot control systems such as path planning or robot arm manipulation. Unlike past approaches, which could use only low-dimensional data such as proximity information or the robot's velocity, recent approaches use raw camera images as the input state. For example, as it became possible to connect raw camera images to motor control signals in a single neural network, many robot control algorithms based on the end-to-end concept have been proposed, and some prior art has shown that an end-to-end visuomotor network trained in a simulation environment can be applied to real robot control with only fine-tuning on the robot platform.

Even though such end-to-end controllers have their own advantages, they generally suffer from limited compatibility, as described above. Once an end-to-end controller network has been trained for and embedded in a given robot platform, it is very difficult to apply that network to another robot with different sensors and dynamics. In addition, training a reinforcement learning network on real robot hardware requires enormous time and effort. Building a simulation environment for training the reinforcement learning network is therefore essential, but developing a realistic training simulator requires very complex processes such as 3D space reconstruction.

To effectively integrate large-scale deep neural networks into a mobile robot requiring high-bandwidth control, an Asynchronous Deep Classification Network (ADCN) is provided which, unlike an end-to-end robot software architecture, can introduce deep learning as one component of the mobile robot's overall software.

To reduce the complexity of training and improve the scalability of the algorithm, a Gaming Reinforcement Learning-Based Motion Planner (hereinafter, GRL-planner) for mobile robots is provided.

Provided is a target classification method comprising: obtaining, through image processing of an image input through a camera included in a mobile robot, a position of a target and image information on a region of interest corresponding to the target in the image; generating an abstraction map by mapping the position of the target onto a two-dimensional plane; storing the image information on the region of interest corresponding to the target in the image in association with the position of the target mapped onto the abstraction map; asynchronously classifying the target by processing the image information on the region of interest stored in the abstraction map through an Asynchronous Deep Classification Network; and updating the abstraction map with information on the target asynchronously classified through the asynchronous deep classification network.

Provided is a computer program stored in a computer-readable recording medium to cause a computer, in combination with the computer, to execute the target classification method.

Provided is a computer-readable recording medium on which a program for causing a computer to execute the target classification method is recorded.

Provided is a computer device for controlling a mobile robot, the computer device comprising: a memory storing computer-readable instructions; and at least one processor configured to execute the instructions, wherein the at least one processor obtains, through image processing of an image input through a camera included in the mobile robot, a position of a target and image information on a region of interest corresponding to the target in the image; generates an abstraction map by mapping the position of the target onto a two-dimensional plane; stores the image information on the region of interest corresponding to the target in the image in association with the position of the target mapped onto the abstraction map; asynchronously classifies the target by processing the image information on the region of interest stored in the abstraction map through an Asynchronous Deep Classification Network; and updates the abstraction map with information on the target asynchronously classified through the asynchronous deep classification network.

Provided is a mobile robot comprising: a memory storing computer-readable instructions; at least one processor configured to execute the instructions; and a camera that receives images, wherein the at least one processor obtains, through image processing of an image input through the camera, a position of a target and image information on a region of interest corresponding to the target in the image; generates an abstraction map by mapping the position of the target onto a two-dimensional plane; stores the image information on the region of interest corresponding to the target in the image in association with the position of the target mapped onto the abstraction map; asynchronously classifies the target by processing the image information on the region of interest stored in the abstraction map through an Asynchronous Deep Classification Network; and updates the abstraction map with information on the target asynchronously classified through the asynchronous deep classification network.

By using an Asynchronous Deep Classification Network (ADCN), which unlike an end-to-end robot software architecture can introduce deep learning as one component of the mobile robot's overall software, large-scale deep neural networks can be effectively integrated into a mobile robot requiring high-bandwidth control.

By using a Gaming Reinforcement Learning-Based Motion Planner (hereinafter, GRL-planner) for mobile robots, the complexity of training can be reduced and the scalability of the algorithm improved.

FIG. 1 is a schematic diagram of a high-bandwidth framework for an asynchronous deep classification network according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating an example of a process of extracting data and generating an abstraction map according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating an example of the overall process by which a reinforcement learning-based motion planner operates on a mobile robot platform according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating an example of a mobile robot according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating an example of controlling the inner radius of a nozzle according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating examples of an object recognition screen and an abstraction map screen according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating an example of a state diagram of a multi-body tracker consisting of four states according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating an example of a simulation environment according to an embodiment of the present invention.
FIG. 9 is a diagram illustrating examples of robot path planning results according to an embodiment of the present invention.
FIG. 10 is a block diagram illustrating an example of the internal configuration of a mobile robot according to an embodiment of the present invention.
FIG. 11 is a flowchart illustrating an example of a target classification method according to an embodiment of the present invention.
FIG. 12 is a flowchart illustrating an example of a mobile robot control method according to an embodiment of the present invention.

Hereinafter, embodiments will be described in detail with reference to the accompanying drawings.

The representative image generation method according to embodiments of the present invention may be performed by a computer device such as the server described later. A computer program according to an embodiment of the present invention may be installed and run on the computer device, and the computer device may perform the representative image generation method according to an embodiment of the present invention under the control of the running computer program. The computer program may be stored in a computer-readable recording medium to cause the computer, in combination with the computer device, to execute the representative image generation method.

FIG. 1 is a schematic diagram of a high-bandwidth framework for an asynchronous deep classification network according to an embodiment of the present invention. FIG. 1 shows an essential control loop 110 for a mobile robot and an Asynchronous Deep Classification Network (ADCN) 120. The asynchronous deep classification network 120 uses a heavy deep neural network classifier for high accuracy, yet can run asynchronously with respect to the essential control loop 110 so that system bandwidth is not lost.

The essential control loop 110 may consist of fast visual perception 111, an abstraction map 112, and a motion planner 113. Here, the abstraction map 112 may integrate the data flow between the asynchronous deep classification network 120 and the essential control loop 110. The motion planner 113 may receive the abstraction map 112 as its input state, that is, as classified target information.

In other words, the robot framework proposed in this embodiment may consist of the fast visual perception 111, the asynchronous deep classification network 120, and the motion planner 113, and the data flow among these three components may be established by introducing one further component, the abstraction map 112, as a data hub for these components and for additional sensors. The abstraction map 112 may buffer the asynchronous data flows of the robot architecture and provide data input in an abstract form to the motion planner 113.
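The decoupling described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the names `AbstractionMap`, `classifier_worker`, and `control_step`, the dictionary layout, and the use of a thread with a job queue are all hypothetical choices made for brevity. The fast path records detections and plans from the current map without waiting for the classifier, while the slow path writes labels back whenever it finishes.

```python
import queue
import threading


class AbstractionMap:
    """Shared data hub (hypothetical layout): target_id -> position, ROI, label."""

    def __init__(self):
        self._lock = threading.Lock()
        self._targets = {}

    def upsert_detection(self, target_id, position, roi):
        """Store or refresh a detection; return True if the target is new."""
        with self._lock:
            is_new = target_id not in self._targets
            entry = self._targets.setdefault(target_id, {"label": None})
            entry["position"] = position
            entry["roi"] = roi
            return is_new

    def set_label(self, target_id, label):
        """Write a classification result back into the map."""
        with self._lock:
            if target_id in self._targets:
                self._targets[target_id]["label"] = label

    def snapshot(self):
        """Return a copy so the planner never blocks the writers for long."""
        with self._lock:
            return {tid: dict(entry) for tid, entry in self._targets.items()}


def classifier_worker(amap, jobs, classify):
    """Slow path: classify queued ROIs and write labels back to the map."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: stop the worker
            jobs.task_done()
            return
        target_id, roi = job
        amap.set_label(target_id, classify(roi))
        jobs.task_done()


def control_step(amap, detections, jobs, plan):
    """Fast path: record detections, enqueue new ROIs, plan without waiting."""
    for target_id, position, roi in detections:
        if amap.upsert_detection(target_id, position, roi):
            jobs.put((target_id, roi))
    return plan(amap.snapshot())
```

In this sketch `classify` stands in for the heavy deep classifier and `plan` for the motion planner; a target whose label is still `None` is simply one the slow path has not reached yet.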

The fast visual perception 111 may represent various forms of simple feature-based vision processes. The output of this component, such as rough information on a target object, may be projected onto a two-dimensional plane and mapped onto the abstraction map 112 in a simplified format. At the same time, various sensor data that do not require complex preprocessing, such as data from LIDAR (light detection and ranging) or an IMU (inertial measurement unit), may be projected onto the abstraction map 112 concurrently. The asynchronous deep classification network 120 may recognize detailed information on a detected target object, such as its type or shape. By implementing a multi-body tracker to compensate for the mismatch between the slow asynchronous deep classification network 120 and the fast visual perception 111, a general format representing the attributes of the target object, unaffected by the dynamic motion of the robot, can be obtained.

For a stand-alone robot that neither receives power from outside nor communicates with a high-performance computing device (an external server), it is difficult to directly apply a deep neural network, that is, an algorithm that interprets an image to determine what a target object is. This is because a deep neural network-based classifier requires a high-performance, power-hungry GPU that a stand-alone robot cannot accommodate. The requirement could be met by placing a high-performance GPU outside the robot and transmitting images to it over a communication link, but in that case communication delays slow down the overall recognition speed.

Accordingly, as described above, this embodiment uses both the fast visual perception 111, a recognition algorithm that is fast but inaccurate, and a deep neural network, a recognition algorithm that is slow but highly accurate. To keep algorithms with different recognition speeds from hindering one another's performance, the deep neural network is arranged asynchronously as the asynchronous deep classification network 120.

The data flows of all of these algorithms may be integrated on the abstraction map 112. The abstraction map 112 may store objects, obstacles, and image classification results, and it may be kept simple so that various sensor data can be integrated easily. In addition, as already described, the abstraction map 112 may be structured in a form that can be used by the motion planner 113 to compute the robot's actions.

FIG. 2 is a diagram illustrating an example of a process of extracting data and generating an abstraction map according to an embodiment of the present invention. FIG. 2 gives an overview of the map layer that abstracts the environment. The positions of detected target objects and data from multiple sensors may be projected onto the abstraction map 112. For example, FIG. 2 shows a mobile robot 210 surveying the surrounding terrain 220 with LIDAR, measuring its own motion with an IMU, and photographing target objects in the surrounding terrain 220 with a camera. Using the information obtained from these sensors, an abstraction map 112 such as the two-dimensional map 230 shown in FIG. 2 may be generated. The information obtained through the camera may also be provided to a deep classifier 240 so that target objects can be classified more accurately in an asynchronous manner. The deep classifier 240 may be a component of the asynchronous deep classification network 120 and may be a deep neural network-based classifier trained as a deep neural network. As the classified information is reflected back into the abstraction map 112, the information on target objects produced by algorithms running asynchronously with respect to one another can be integrated on the abstraction map 112.
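As a rough illustration of how range data might be drawn onto such a two-dimensional map, the sketch below marks the grid cells hit by LIDAR returns. The function name `project_lidar_to_grid`, the robot-centered square grid, the cell size, and the angle convention are assumptions made for this example; the source does not specify the map's resolution or layout.

```python
import math


def project_lidar_to_grid(ranges, angles, size=11, resolution=0.5):
    """Mark grid cells hit by LIDAR returns; the robot sits at the center cell.

    Assumed conventions: ranges in meters, angles in radians in the robot
    frame, `resolution` meters per cell. Cells hold 0 (free or unknown) or
    1 (obstacle). Returns outside the grid are ignored.
    """
    grid = [[0] * size for _ in range(size)]
    center = size // 2
    for r, theta in zip(ranges, angles):
        col = center + round(r * math.cos(theta) / resolution)
        row = center + round(r * math.sin(theta) / resolution)
        if 0 <= row < size and 0 <= col < size:
            grid[row][col] = 1
    return grid
```

Because the projection is a few multiplications per return, it fits the description of sensor data that needs no complex preprocessing before entering the map.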

For example, the fast visual perception 111 may roughly recognize objects in the images coming from the camera by using a simple, fast object recognition algorithm such as blob detection or edge detection. The abstraction map 112 may then be generated by drawing onto a virtual plane various sensing data that require no post-processing, such as LIDAR or ultrasonic sensor data. Objects roughly recognized by the fast visual perception 111 in the camera image may be displayed on the abstraction map 112. The transformation mapping between the camera and the virtual plane may be preset and provided in advance. The abstraction map 112 may also store the image that came in through the camera in association with a position. For example, information on a region of interest (ROI, a cropped image containing the target object and its surroundings) may be stored as a template of the target object. While detected target objects are being drawn on the abstraction map 112, the deep classifier 240 may asynchronously classify the images stored in the abstraction map 112 and store the classification results back in the abstraction map 112.
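The preset camera-to-plane mapping and the ROI storage can be sketched as below. The sketch assumes the mapping is a planar homography given as a 3x3 matrix `H` and that images are nested lists of pixel values; the helper names `pixel_to_plane`, `crop_roi`, and `store_target` and the dictionary layout are hypothetical, since the source only says the transformation is provided in advance.

```python
def pixel_to_plane(H, u, v):
    """Map pixel (u, v) to plane coordinates with a 3x3 homography H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if w == 0:
        raise ValueError("pixel maps to infinity under this homography")
    return x / w, y / w


def crop_roi(image, u, v, half):
    """Crop a (2*half+1)-wide square around (u, v), clipped at the borders."""
    rows = image[max(0, v - half): v + half + 1]
    return [row[max(0, u - half): u + half + 1] for row in rows]


def store_target(amap, target_id, H, image, u, v, half=1):
    """Store the plane position and ROI template; the label is filled later."""
    amap[target_id] = {
        "position": pixel_to_plane(H, u, v),
        "roi": crop_roi(image, u, v, half),
        "label": None,  # to be written asynchronously by the deep classifier
    }
    return amap[target_id]
```

Keeping the cropped ROI next to the mapped position is what lets a slow classifier work from the map alone, without access to the live camera stream.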

The motion planner 113 may then receive the classified abstraction map 112 as input data and compute the actions of the mobile robot 210.

While the asynchronous deep classification network 120 is classifying an object, the algorithm for the fast visual perception 111 may display object information on the abstraction map 112 at a comparatively higher rate. If the mobile robot 210 moves, the asynchronous deep classification network 120 holds comparatively older information on the object than the fast visual perception algorithm does, so a mismatch regarding the target object can arise between the two algorithms. To eliminate this mismatch, a multi-body tracking algorithm (the multi-body tracker described above) may be used to track the motion of the mobile robot 210 or the target object, so that even when either of them moves, the target object is not mistaken for a different object when the images from before and after the movement are compared.
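One simple way to keep identities consistent across robot motion is sketched below: previously tracked positions are re-expressed in the robot's new frame, then matched greedily to the nearest new detection within a gate distance. This is an illustrative stand-in, not the four-state multi-body tracker of FIG. 7, whose details are not given in this section; the function names, the planar motion model `(dx, dy, dtheta)`, and the 0.5 m gate are all assumptions.

```python
import math


def to_new_frame(point, dx, dy, dtheta):
    """Express a static point from the old robot frame in the new robot frame.

    The robot is assumed to have translated by (dx, dy) and rotated by dtheta
    (radians, counter-clockwise), both measured in the old frame.
    """
    px, py = point[0] - dx, point[1] - dy
    c, s = math.cos(-dtheta), math.sin(-dtheta)
    return (c * px - s * py, s * px + c * py)


def associate(tracks, detections, motion, gate=0.5):
    """Greedy nearest-neighbor matching of new detections to existing tracks.

    tracks: dict of track_id -> (x, y) in the old robot frame.
    detections: list of (x, y) in the new robot frame.
    Returns a dict of detection index -> matched track_id, or None if no
    track lies within `gate` meters.
    """
    dx, dy, dtheta = motion
    predicted = {tid: to_new_frame(p, dx, dy, dtheta) for tid, p in tracks.items()}
    used, result = set(), {}
    for i, det in enumerate(detections):
        best, best_dist = None, gate
        for tid, pred in predicted.items():
            dist = math.dist(det, pred)
            if tid not in used and dist <= best_dist:
                best, best_dist = tid, dist
        if best is not None:
            used.add(best)
        result[i] = best
    return result
```

With such an association in place, a label that arrives late from the slow classifier can be attached to the track identifier rather than to a stale position.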

In other words, the deep neural network-based classifier in the asynchronous deep classification network 120 may include an algorithm that processes data preprocessed and extracted from the perception layer. After processing the data, the classifier may asynchronously update the layer of the abstraction map 112 with the results of the deep neural network, at a bandwidth lower than that of the essential control loop 110 of the mobile robot 210. By separating the deep learning-based classification algorithm from the essential control loop 110 of the mobile robot 210 in this way, a sufficiently high operating bandwidth can be maintained while still accommodating the throughput of deep learning-based recognition algorithms, which demand heavy computation and incur latency, particularly in mobile environments.

In addition, because sensor and visual data are extracted onto one single abstraction map 112, general-purpose control schemes and/or algorithms can be developed for a variety of mobile robots. This is because the abstraction map 112, which has a uniform data representation, buffers variations in the sensors or visual processes. The abstraction map 112 can also provide the same input to such control schemes regardless of the form of the mobile robot.

To reduce the complexity of reinforcement learning for controlling the mobile robot 210 and to improve the scalability of the algorithm, a Gaming Reinforcement Learning-Based Motion Planner (hereinafter, GRL-planner) may be used. The GRL-planner according to this embodiment focuses on three aspects: compatibility with additional sensors, ease of simulator development, and scalability to various robot platforms.

The first key to this approach is to provide the formatted abstraction map 112 to the GRL-planner as a training data source that improves system compatibility. Because the abstraction map 112 proposed above can be regarded as a general data format that integrates various types of sensor information, this approach gives the GRL-planner a common foundation regardless of the sensor combination.

The abstraction map 112, provided as the input state of the reinforcement learning network, also makes the simulation environment easier to develop. Because this two-dimensional map is highly abstracted into discrete data, a simulation environment that builds the map requires far less complexity than a simulator of a three-dimensional environment. Training the GRL-planner in a simple simulation environment can therefore be set up more easily than raw camera image-based motion planning. Moreover, this method yields a simulation environment similar to a two-dimensional video game, a setting in which most recent reinforcement learning algorithms show remarkable performance. Thanks to this similarity, the network structures of state-of-the-art reinforcement learning algorithms can be reliably reused, which facilitates development of the GRL-planner.

To build a video game-like simulation environment, the output state of the network can be designed after the way a human operates a mobile robot platform with a joystick controller. For a differential-drive mobile robot platform, four actions (forward, backward, left, right) can be designed as the output state of the network; for a holonomic mobile robot platform, ten actions (eight directions from the body, plus left and right) can be designed. In this way, the mobility of any mobile robot platform is discretized and simplified so that deep reinforcement learning can be used for path planning.

By abstracting both the input state and the output action, a widely available reinforcement learning model can be applied to a simulated abstraction map environment, so the network can be trained easily. The network output drives each robot by mapping the probability of each output state to the corresponding action of that robot.
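As a rough sketch of this output mapping, the same selection routine can serve both platform types if each platform supplies its own action table. The tables, the velocity convention `(vx, vy, wz)`, the reading of "left" and "right" as turning in place, and the choice of the most probable output are all assumptions made for illustration; the text does not specify them.

```python
import math

# Hypothetical action tables: each entry is (vx, vy, wz) in the robot body frame.
# "Left" and "right" are interpreted here as turning in place (an assumption).
DIFFERENTIAL_ACTIONS = [
    (1.0, 0.0, 0.0),    # forward
    (-1.0, 0.0, 0.0),   # backward
    (0.0, 0.0, 1.0),    # left
    (0.0, 0.0, -1.0),   # right
]

HOLONOMIC_ACTIONS = [
    (math.cos(math.radians(deg)), math.sin(math.radians(deg)), 0.0)
    for deg in range(0, 360, 45)          # eight translation directions
] + [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]   # left, right


def select_command(action_probs, action_table):
    """Return the body-frame command for the most probable network output."""
    if len(action_probs) != len(action_table):
        raise ValueError("network output size must match the platform's action table")
    best = max(range(len(action_probs)), key=lambda i: action_probs[i])
    return action_table[best]
```

Swapping the table is the only platform-specific step; the network output and the selection logic stay the same.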

FIG. 3 illustrates an example of the overall process by which the reinforcement learning-based motion planner operates on a mobile robot platform, according to an embodiment of the present invention. The abstraction map 112 and joystick control signals are used as the input state and the output action of the GRL-planner, respectively. Before being used as input data, the abstraction map 112 may be downsampled and converted to a grayscale image to reduce computational complexity. An example of this step is shown as pre-processing 310 in FIG. 3. Depending on the platform type, the actions of the mobile robot may be mapped to the joystick control signals differently.
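The pre-processing step might look like the following pure-Python sketch. It assumes the abstraction map is rendered as rows of `(R, G, B)` tuples with values 0 to 255, downsamples by block averaging, and uses the ITU-R BT.601 luma weights for the grayscale conversion; none of these choices is stated in the text.

```python
def preprocess(rgb_rows, factor):
    """Downsample by block averaging and convert to grayscale in [0, 1].

    rgb_rows: list of rows, each a list of (R, G, B) tuples with values 0..255.
    factor:   integer block size; trailing rows/columns that do not fill a
              whole block are dropped.
    """
    height = len(rgb_rows) - len(rgb_rows) % factor
    width = len(rgb_rows[0]) - len(rgb_rows[0]) % factor
    out = []
    for r0 in range(0, height, factor):
        out_row = []
        for c0 in range(0, width, factor):
            total = 0.0
            for r in range(r0, r0 + factor):
                for c in range(c0, c0 + factor):
                    red, green, blue = rgb_rows[r][c]
                    # ITU-R BT.601 luma weights (assumed conversion)
                    total += 0.299 * red + 0.587 * green + 0.114 * blue
            out_row.append(total / (factor * factor * 255.0))
        out.append(out_row)
    return out
```

For example, a factor of 3 would take a map with 0.05 m cells to 0.15 m cells.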

For example, the behavior of a typical mobile robot goes through steps such as target recognition, map building, and action. Typical robot path generation methods can only find the shortest path, so they are unsuitable when there are multiple targets, and they have difficulty coping with sudden changes to the map, such as a target recognition failure or the sudden appearance of an obstacle. In embodiments of the present invention, instead of a developer hand-crafting a countermeasure for each such problem, these problems can be addressed with an artificial neural network-based path generation algorithm (deep reinforcement learning). The concept of reinforcement learning has existed for a long time, but for lack of computing power only low-dimensional states could be used as input. Advances in deep learning have made it possible to use images as the input state, and the technique has been applied to many video games. It has been demonstrated that most simple two-dimensional video games can now be learned through reinforcement learning. In embodiments of the present invention, similarly to these two-dimensional video games, the movement of the mobile robot 210 can be controlled using existing, proven reinforcement learning techniques together with the abstraction map 112. In other words, reinforcement learning is applied by using the abstraction map 112 as the input state and abstracting the robot's movement as game control signals. Because the abstraction map 112 is used, the processing results of new sensor data or visual data can easily be modeled into the state, and because the behavior of the mobile robot 210 is abstracted as game controls, similar actions can be used regardless of the form of the mobile robot platform.

In FIG. 3, the high-level controller network 320 is an example of a network trained by deep reinforcement learning; it receives the pre-processed (310) abstraction map 112 as its input state and provides, as its output state, information about the action for the mobile robot 210. The controller abstraction 330 represents an example of the process of abstracting the action information provided by the high-level controller network 320 into game control signals. The motor controller 340 controls the motors of the mobile robot 210 so that the mobile robot 210 executes the action corresponding to the abstracted game control signal.

To evaluate the proposed methodology, a general-purpose mission for the mobile robot 210 was designed. In general, it is difficult to compare diverse modern mobile robots on the basis of qualitative criteria, because each mobile robot has its own purpose and its hardware/software platform is specialized for that purpose. Nevertheless, even though the specific purposes of different mobile robots differ, mobile robots share common goals, and their general performance can be analyzed in a qualitative manner.

For example, the performance of a mobile robot can be decomposed into functional modules such as perception, world modeling, planning, task execution, and actuation. These are the resulting functions of robot operation, independent of the computational algorithms built into the robot. By establishing a mission in which these functional modules can be observed simply and conveniently, a general evaluation framework for mobile robots can be constructed.

The mission designed for robot performance evaluation is an indoor multi-target tracking and collection task. To carry out the task, the robot must search and navigate the environment and interact with it. Specifically, the robot tracks and collects small objects of particular shapes scattered on an indoor floor. The robot and the targets are placed at random in a room (for example, 3 m × 3 m). Two types of target that differ in weight and size are provided (for example, a ball 38 mm in diameter and a cube with 16 mm edges). They can be arranged so as to observe whether the robot identifies the targets and behaves differently toward each.

FIG. 4 illustrates an example of a mobile robot according to an embodiment of the present invention. The robot shown in FIG. 4 is an indoor robot that autonomously tracks and collects multiple targets; it was developed to demonstrate the functionality of the proposed system architecture and to represent the general purposes of a mobile robot, namely recognition, navigation, and interaction. The chassis of the mobile robot is fitted with four mecanum wheels that provide holonomic motion, and a single wide-view camera (Genius, F100) is mounted on the platform for visual acquisition.

To stabilize the field of view, the platform and the chassis are joined through silicone dampers so that they are mechanically isolated, which ensures a highly stable video stream. This structure absorbs physical shocks from abrupt acceleration that would otherwise shake or damage the camera and processors. As the end-effector, a suction motor is used to demonstrate interaction between the robot and the targets.

Specifically, a 200 W BLDC suction motor is used, which can collect targets within 4 cm of the nozzle tip. The inner radius of the nozzle can be controlled according to the size and weight of the detected target. FIG. 5 illustrates an example of controlling the inner radius of the nozzle in an embodiment of the present invention; compared with the left image, the right image shows the nozzle with a reduced inner radius. Controlling the inner radius of the nozzle in this way allows collection performance to be optimized. The variable-diameter nozzle works by increasing flow velocity as the diameter decreases; an air pocket for volume control may be formed between the inner surface of the nozzle and a coaxial cylindrical latex membrane lining it. To collect objects of various sizes and weights while keeping the suction motor power constant, the robot adjusts the cross-sectional area of the nozzle to produce variable airflow pressure and velocity.
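The relationship between nozzle diameter and flow velocity can be illustrated with the continuity equation. This is a simplified sketch that assumes incompressible flow and a fixed volumetric flow rate Q; a real suction motor held at constant power will not keep Q exactly constant, so the result is an idealized upper-level intuition rather than a description of the actual nozzle. The symbols Q, A, v, and d are introduced here for illustration.

```latex
Q = A_1 v_1 = A_2 v_2, \qquad A_i = \frac{\pi d_i^{2}}{4}
\quad\Longrightarrow\quad
\frac{v_2}{v_1} = \frac{A_1}{A_2} = \left(\frac{d_1}{d_2}\right)^{2}
```

Under these assumptions, halving the inner diameter (d_2 = d_1 / 2) would quadruple the flow velocity at the nozzle.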

To implement the proposed software architecture, the robot is equipped with a small PC (Intel, NUC) and a mobile GPU processor (Nvidia, Jetson TX2), and the main software runs in a ROS environment. Seven servo motors (Dynamixel MX-28AT) and the suction motor (BLDC, 220 W) are controlled by a real-time processor (NI, myRIO) with an FPGA (Xilinx, Z-7010). The PC interfaces with a LIDAR (Slamtec, RPLidar A2), an IMU (EBIMU-9DOFV3), and a camera (Wide Cam, Genius 120° Wide) for high-level control, while the mobile GPU runs the deep neural network for high-level visual classification. All processors communicate with one another over wireless links. The specifications of the developed robot are listed in Table 1.

Table 1

Mechanical Specification                 Electrical Specification
Size [mm]               350×420×390      Battery [Wh]          240 (6S4P Li-ion)
Weight [kg]             9.6 (7.9*)       Loading               Min.     Max.
Speed [m/s]             0.76             Operating Time [h]    2.5      1
Inhalation Range [mm]   > 50             Power Load [W]        90       220

* Weight excluding battery

As mentioned above, the software architecture consists of three main components (fast visual recognition 111, the framework for the asynchronous deep classification network 120, and the GRL-planner (motion planner 113)) and the abstraction map 112, which acts as a data hub integrating the data flow among these components.

As a component of the essential control loop 110, the algorithm for fast visual recognition 111 estimates the absolute coordinates and size of each target object candidate from raw camera images. In the robot implementation, OpenCV-based color filtering and a blob detection algorithm are used to recognize the position and size of each target in the camera image. From the recognized position in the camera image, the absolute coordinates of each object are estimated using a pre-computed transformation matrix between the camera plane and the floor plane.
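One common way to realize a plane-to-plane transformation like this is a 3×3 homography applied in homogeneous coordinates. The sketch below assumes that form; the text only says "pre-computed transformation matrix", so the homography interpretation, the function name `pixel_to_floor`, and the row-major nested-list layout are assumptions.

```python
def pixel_to_floor(H, u, v):
    """Map image pixel (u, v) to floor-plane coordinates (x, y).

    H is a 3x3 homography given as row-major nested lists, assumed to have
    been calibrated in advance between the camera image and the floor.
    """
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("pixel maps to a point at infinity on the floor plane")
    return x / w, y / w  # divide out the homogeneous scale
```

Since the matrix is fixed after calibration, each detection costs only a handful of multiplications, which suits the high-rate control loop.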

FIG. 6 shows examples of an object recognition screen and an abstraction map screen in an embodiment of the present invention. FIG. 6(a) shows target objects detected in the camera view marked with green boxes, and FIG. 6(b) shows the positions of target objects and obstacles mapped onto the abstraction map 112. The asynchronous deep classification network 120 classifies each target object, updates the abstraction map 112 with the classification result, and asynchronously removes false targets from the abstraction map 112.

During fast visual recognition 111, the cropped images of the target object candidates, shown as green boxes in FIG. 6(a), may also be stored as object templates for detailed recognition by the asynchronous deep classification network 120.

The abstraction map 112 combines the position and size data of target object candidates from fast visual recognition 111 with obstacle position data from the LIDAR. In the implementation, the abstraction map is designed to describe a 5 m × 5 m space, a typical size for an indoor room, at a resolution of 0.05 m. As shown in FIG. 6(b), target object candidates and obstacles are mapped onto the abstraction map 112, and each target object candidate is tracked by a multi-body tracker implemented within the basic structure so that the asynchronous deep classification network 120 can perform detailed object recognition.
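A 5 m × 5 m area at 0.05 m resolution corresponds to a 100 × 100 grid. The following sketch rasterizes target and obstacle positions onto such a grid; the robot-centred coordinate convention, the cell codes, and the function names are assumptions for illustration only.

```python
import math

MAP_SIZE_M = 5.0       # side length of the mapped area (from the text)
RESOLUTION_M = 0.05    # cell size (from the text)
CELLS = round(MAP_SIZE_M / RESOLUTION_M)  # 100 cells per side

EMPTY, OBSTACLE, TARGET = 0, 1, 2  # hypothetical cell codes


def world_to_cell(x, y):
    """Convert map-centred metric coordinates to (row, col), or None if outside."""
    col = math.floor((x + MAP_SIZE_M / 2) / RESOLUTION_M)
    row = math.floor((y + MAP_SIZE_M / 2) / RESOLUTION_M)
    if 0 <= row < CELLS and 0 <= col < CELLS:
        return row, col
    return None


def build_map(targets, obstacles):
    """Rasterize LIDAR obstacles and visual target candidates onto one grid."""
    grid = [[EMPTY] * CELLS for _ in range(CELLS)]
    for code, points in ((OBSTACLE, obstacles), (TARGET, targets)):
        for x, y in points:
            cell = world_to_cell(x, y)
            if cell is not None:
                grid[cell[0]][cell[1]] = code  # targets are drawn last
    return grid
```

Whatever sensor produced a point, it ends up as the same kind of cell value, which is what lets the planner stay independent of the sensor combination.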

FIG. 7 illustrates an example state diagram of the multi-body tracker, which has four states, in an embodiment of the present invention. The four states are detected, tracked, lost, and inactive. When a target object is detected in a camera image, it is placed in the detected state.

(a) To sort detected objects into already-registered targets and new targets, the multi-body tracker compares targets in the detected state with the currently registered targets. After predicting the positions of the currently registered targets from the estimated camera odometry, the multi-body tracker matches targets based on the overlap of their bounding boxes in the camera view. An unmatched detection is registered as a new object in the tracked state, and a matched detection is used to update the template of the corresponding tracked object.

(b) If a tracked target object has no match, it moves to the lost state.

(c) If an object remains in the lost state for a certain period, it moves to the inactive state.

(d) Running asynchronously, the asynchronous deep classification network 120 continuously classifies the registered target objects. If an object is classified as a false target, it moves to the inactive state regardless of its current state.
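Rules (a) through (d) can be sketched as a small state machine. Several details below are assumptions not given in the text: intersection over union as the overlap measure, the 2-second `lost_timeout`, treating inactive as terminal, and allowing a lost object to return to tracked when it is matched again.

```python
from enum import Enum


class TrackState(Enum):
    DETECTED = "detected"
    TRACKED = "tracked"
    LOST = "lost"
    INACTIVE = "inactive"


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def next_state(state, matched, lost_seconds=0.0, is_false_target=False,
               lost_timeout=2.0):
    """Return the next tracker state for one object."""
    if is_false_target:                   # rule (d): overrides every state
        return TrackState.INACTIVE
    if state is TrackState.INACTIVE:      # assumption: inactive is terminal
        return state
    if state is TrackState.DETECTED:      # rule (a): new track or merged into one
        return TrackState.TRACKED
    if matched:                           # assumption: lost objects can be re-acquired
        return TrackState.TRACKED
    if state is TrackState.TRACKED:       # rule (b)
        return TrackState.LOST
    # Remaining case: LOST and still unmatched, rule (c).
    return TrackState.INACTIVE if lost_seconds >= lost_timeout else TrackState.LOST
```

A tracker would call `iou` between each predicted box and each detection, treat pairs above some threshold as matched, and then feed the result to `next_state`.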

The asynchronous deep classification network 120 classifies the target object candidates on the abstraction map 112 and removes false targets misidentified during fast visual recognition 111. For the asynchronous deep classification network 120, VGGNet-16 was used, as it showed the most stable performance among existing networks. The fully connected layers of VGGNet-16 were replaced with two fully connected layers of 1024 units each, and the network was fine-tuned for this embodiment. Among the software components, the asynchronous deep classification network 120 runs on the Jetson TX2 mobile GPU. Training was carried out in a Keras environment with a TensorFlow backend.

In addition, the GRL-planner was implemented by abstracting the mobile robot control system to resemble a video game environment. Instead of using raw camera images as the input state, the abstraction map 112 built from the position data of target objects and surrounding obstacles is used as the input state of the GRL-planner. Ten actions for controlling the developed holonomic mobile robot (eight directions from the body, plus left and right) are used as the outputs of the reinforcement learning network. For reinforcement learning, the asynchronous advantage actor-critic (A3C) algorithm, which shows state-of-the-art performance on video games, was used. The network model is implemented identically to DeepMind's A3C-LSTM model. It takes four consecutive grayscale images as the input state and consists of two convolutional layers followed by a single fully connected layer and a 256-unit LSTM layer. The first convolutional layer has 32 filters of size 8×8 with stride 4, the second has 64 filters of size 4×4 with stride 2, and the fully connected layer has 256 hidden units.
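The effect of the two convolutional layers on the spatial size of the input can be worked out with simple arithmetic. The sketch below assumes unpadded ("valid") convolutions and a square input; the input side length is not given in the text, so the 84-pixel value in the usage note is only the size commonly used for Atari-style inputs, taken here as an assumption.

```python
def conv_out(size, kernel, stride):
    """Output side length of an unpadded convolution on a square input."""
    return (size - kernel) // stride + 1


def feature_shape(input_side):
    """Spatial shape after the two convolutional layers described above."""
    s1 = conv_out(input_side, kernel=8, stride=4)   # 32 filters, 8x8, stride 4
    s2 = conv_out(s1, kernel=4, stride=2)           # 64 filters, 4x4, stride 2
    return s2, s2, 64


def flattened_size(input_side):
    """Number of values fed into the 256-unit fully connected layer."""
    h, w, c = feature_shape(input_side)
    return h * w * c
```

With an assumed 84 × 84 input, `feature_shape(84)` gives `(9, 9, 64)`, so the fully connected layer would receive 5184 values.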

FIG. 8 illustrates an example of the simulation environment in an embodiment of the present invention. The simulation environment according to this embodiment uses the ten joystick command inputs of the holonomic platform shown in FIG. 3. To lead the robot to collect balls directly in front of it, the reward is designed to be higher the closer the touched target is to the center of the red box 810 (the robot position). A simulation episode may end when the robot has collected all targets or when it fails to find a target within a certain number of time steps.
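A reward with this shape could be written as follows. The text states only that targets touched nearer the box center earn more, so the linear falloff, the square box, the zero reward outside the box, and all parameter names are hypothetical choices.

```python
import math


def collection_reward(target_xy, box_center_xy, box_half_width, max_reward=1.0):
    """Reward for touching a target, larger the nearer it is to the box center."""
    dx = target_xy[0] - box_center_xy[0]
    dy = target_xy[1] - box_center_xy[1]
    if abs(dx) > box_half_width or abs(dy) > box_half_width:
        return 0.0  # the target is outside the box, so it was not touched
    distance = math.hypot(dx, dy)
    farthest = math.hypot(box_half_width, box_half_width)  # center to corner
    return max_reward * (1.0 - distance / farthest)
```

Because the reward peaks at the center, a policy that lines the robot up with a ball before touching it collects more than one that clips the ball with a corner of the box.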

The reinforcement learning network was trained in the simulation environment of FIG. 8. To reduce computational complexity, the abstraction map 112 was downsampled to a resolution of 0.15 m. For a realistic simulation, object detection failures and object position inaccuracy were modeled in the simulator as stochastic flickering and jitter. Training was performed with the TensorFlow framework and took 6 hours to converge on six CPU cores.
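The flicker and jitter model might be simulated along these lines. The per-frame miss probability, the Gaussian position noise, and their default values are assumptions; the text does not give the distributions or magnitudes used.

```python
import random


def observe(true_targets, rng, miss_prob=0.1, jitter_std=0.05):
    """Return a noisy observation of target positions for one simulated frame.

    true_targets: list of (x, y) positions in metres.
    rng:          a random.Random instance, passed in so runs are repeatable.
    miss_prob:    chance that a target is not detected this frame (flicker).
    jitter_std:   standard deviation of position noise in metres (jitter).
    """
    observed = []
    for x, y in true_targets:
        if rng.random() < miss_prob:
            continue  # flicker: this detection is dropped for the frame
        observed.append((x + rng.gauss(0.0, jitter_std),
                         y + rng.gauss(0.0, jitter_std)))
    return observed
```

Training against such observations exposes the planner to the same kinds of dropouts and position errors the real recognition pipeline produces.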

Although conventional image recognition algorithms are faster than deep learning-based recognition algorithms, their update rate is still lower than that of the control layer. The IMU sensor was therefore used to estimate the robot's odometry and to propagate the abstraction map 112 between target object updates. These odometry updates make it possible to maximize the bandwidth of the essential control loop 110.
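Between visual updates, object positions can be carried forward by applying the inverse of the robot's own motion. The sketch below assumes the abstraction map is kept in the robot frame and that odometry supplies a planar displacement `(dx, dy)` and heading change `dtheta` expressed in the previous robot frame; these conventions are assumptions, not details from the text.

```python
import math


def propagate(points, dx, dy, dtheta):
    """Re-express robot-frame points after the robot has moved.

    points: list of (x, y) positions in the previous robot frame.
    dx, dy: robot translation measured in the previous robot frame (metres).
    dtheta: robot heading change, counter-clockwise positive (radians).
    """
    c, s = math.cos(dtheta), math.sin(dtheta)
    moved = []
    for x, y in points:
        tx, ty = x - dx, y - dy                         # undo the translation
        moved.append((c * tx + s * ty, -s * tx + c * ty))  # rotate by -dtheta
    return moved
```

Calling this at the control-loop rate keeps the map usable by the planner even when the camera pipeline has not yet delivered a new detection.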

Considering the footprint of the mobile robot, 35 cm × 39 cm, a 45 cm × 45 cm area around the robot is designated as the task execution area. When a target enters this area, the robot starts the collection action: it orients the nozzle toward the target and moves slowly toward it. The cross-section of the variable-shape nozzle is adjusted according to the object type recognized by the asynchronous deep classification network 120.

To summarize the performance of the designed robot in operation, the asynchronous deep classification network 120 and the GRL-planner are robustly embedded and integrated into the robot and enhance its capabilities as a standalone robot. By using the framework for the asynchronous deep classification network 120, the robot successfully interacted with multiple types of targets, as shown in FIG. 6. The robot also navigated to targets correctly using the GRL-planner trained in the simple simulation environment described above.

FIG. 9 shows examples of robot path planning results in an embodiment of the present invention. Green circles represent target objects, blue arrows represent the path expected from a nearest-first algorithm, and yellow lines represent the actual path of the robot produced by the GRL-planner. These results show that embodiments of the present invention outperform the conventional heuristic algorithm that always pursues the nearest target.

In particular, the proposed architecture runs the whole system at a bandwidth of 50 Hz, and the overall operating bandwidth is not limited by the deep classification processing time. To analyze the proposed architecture and robot on the basis of quantitative metrics, the processing time and power consumption of each software component of the robot were measured, as shown in Table 2.

Table 2

Component             Processor              Computation Time [ms]   Power Load [W]
Recognition           Intel i5-5600U         40                      8.9
Obstacle Detection    Intel i5-5600U         80                      2.2
Deep Classifier       TX2 Max-N (batch 4)    315                     12.8
Motion Planner        Intel i5-5600U         20                      2.2
Odometry Estimation   Intel i5-5600U         20                      -

The table was evaluated under the assumption that the robot is equipped with a high-performance GPU such as the Nvidia Titan X and runs in standalone mode without a server. By separating the essential control loop 110 from other software components such as fast visual recognition 111 and the deep classifier 240, the update rate of the hardware control layer depends only on the computation time of the motion planner 113. The asynchronous deep classification network 120 ran on the Jetson TX2 mobile GPU in maximum GPU frequency mode with a batch size of 4, classifying four targets per cycle. If, however, the robot can access the additional computing power of a local server with a high-performance GPU, its performance can be improved further. The improved processing times of the deep classifier 240 are shown in Table 3.

Table 3

                     Computation Time [ms]
Processor            batch 1     batch 4     batch 8      batch 16
Intel i5-5600U       1560        5800        11000        -
Jetson TX2 Max-P     125         360         710          1430
Jetson TX2 Max-N     110         315         590          1230
Nvidia GTX 960       30 (40*)    92 (100*)   175 (195*)   340 (365*)
Nvidia Titan X       25 (35*)    82 (91*)    163 (183*)   295 (380*)

* Computation time including communication delay

However, even when an external GPU server is accessed, if the asynchronous deep classification network 120 were included in the essential control loop 110 of the robot, its processing time would not be fast enough to operate the robot in real time. In other words, by excluding the asynchronous deep classification network 120 from the essential control loop 110 of the mobile robot and processing it asynchronously, processing times fast enough to operate the mobile robot in real time can be obtained.
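The bandwidth argument can be checked with the figures from Table 2 and Table 3. The sketch below compares the decoupled loop, whose period is the motion planner time alone, with a hypothetical loop that runs every component one after another; that strictly serial arrangement is an assumption used only to illustrate the scale of the difference.

```python
# Computation times in milliseconds, taken from Table 2.
COMPUTE_MS = {
    "recognition": 40,
    "obstacle_detection": 80,
    "deep_classifier": 315,      # Jetson TX2 Max-N, batch 4
    "motion_planner": 20,
    "odometry_estimation": 20,
}
TITAN_X_BATCH4_MS = 91           # Table 3, including communication delay


def loop_rate_hz(times_ms):
    """Update rate of a loop whose period is the sum of the given times."""
    return 1000.0 / sum(times_ms)


# Decoupled design: the control period is set by the motion planner only.
decoupled_hz = loop_rate_hz([COMPUTE_MS["motion_planner"]])

# Hypothetical serial loop with every component in the control path.
serial_hz = loop_rate_hz(COMPUTE_MS.values())

# Same serial loop, but with classification offloaded to a Titan X server.
offloaded = dict(COMPUTE_MS, deep_classifier=TITAN_X_BATCH4_MS)
serial_offloaded_hz = loop_rate_hz(offloaded.values())
```

Here `decoupled_hz` is 50 Hz, matching the stated bandwidth, while the serial variants come out at roughly 2 Hz and 4 Hz, which illustrates why even a fast external GPU does not make an in-loop classifier viable.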

FIG. 10 is a block diagram showing an example of the internal configuration of a mobile robot according to an embodiment of the present invention. As shown in FIG. 10, the mobile robot 1000 according to this embodiment may include a memory 1010, a processor 1020, an input/output interface 1030, a camera 1040, a sensor 1050, and a motor 1060. The memory 1010, the processor 1020, and the input/output interface 1030 may also be implemented as a separate computer device that is included in the mobile robot 1000 to control it. A plurality of processors 1020 may be used.

The memory 1010 is a computer-readable recording medium and may include random access memory (RAM), read-only memory (ROM), and a permanent mass storage device such as a disk drive. The ROM and the permanent mass storage device may instead be included in the mobile robot 1000 as a separate permanent storage device distinct from the memory 1010. The memory 1010 may store an operating system and at least one program code. These software components may be loaded into the memory 1010 from a computer-readable recording medium separate from the memory 1010, such as a floppy drive, a disk, a tape, a DVD/CD-ROM drive, or a memory card. In other embodiments, the software components may be loaded into the memory 1010 through a separate communication interface rather than from a computer-readable recording medium. For example, the software components may be loaded into the memory 1010 of the mobile robot 1000 by a computer program installed from files received over a network. In this case, the mobile robot 1000 may further include a separate communication interface for communicating with other computer devices over an external network.

The processor 1020 may be configured to process instructions of a computer program by performing basic arithmetic, logic, and input/output operations. The instructions may be provided to the processor 1020 by the memory 1010. For example, the processor 1020 may be configured to execute instructions received according to program code stored in a recording device such as the memory 1010.

The input/output interface 1030 may be a means for interfacing with other components such as the camera 1040, the sensor 1050, and the motor 1060. For example, as already described, the components described with reference to FIG. 5 may be connected to one another through wireless communication. Specific embodiments of the camera 1040, the sensor 1050, and the motor 1060 have already been described, so repeated description is omitted.

In other embodiments, the mobile robot 1000 may include fewer or more components than those shown in FIG. 10. However, there is no need to explicitly illustrate most conventional components. For example, depending on its intended purpose, the mobile robot 1000 may further include more specific components (for example, the nozzle for sucking up objects in the embodiment described above) for performing a specific operation driven by the motor 1060 (for example, rotating the wheels or drawing in air with the suction motor described above).

FIG. 11 is a flowchart showing an example of a target classification method according to an embodiment of the present invention. The target classification method according to this embodiment may be performed under the control of the processor 1020 of the mobile robot 1000 described above.

In step 1110, the mobile robot 1000 may obtain, through image processing of an image input through the camera 1040 included in the mobile robot 1000, the position of a target and image information for the region of interest corresponding to the target in the image.

In step 1120, the mobile robot 1000 may generate an abstraction map by mapping the position of the target onto a two-dimensional plane. The generation of the abstraction map has been described in detail above with reference to FIG. 2.

In step 1130, the mobile robot 1000 may store the image information for the region of interest corresponding to the target in the image in association with the position of the target mapped onto the abstraction map. An example of storing the image information for a region of interest in association with the position of the target has been described with reference to FIG. 6.
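Steps 1110 to 1130 can be pictured as a small data structure. The following is a minimal sketch, not the structure used in the patent: the grid discretisation, the `cell_size` value, and the names `AbstractionMap`, `TargetEntry`, and `crop_roi` are all assumptions made for illustration, and images are plain nested lists to keep the example self-contained.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]


@dataclass
class TargetEntry:
    """One target on the abstraction map."""
    position: Tuple[float, float]   # position on the 2D plane, in metres
    roi: List[List[int]]            # cropped image patch for this target
    label: Optional[str] = None     # filled in later by the classifier


@dataclass
class AbstractionMap:
    cell_size: float = 0.5          # hypothetical grid resolution in metres
    targets: Dict[Cell, TargetEntry] = field(default_factory=dict)

    def cell_of(self, x: float, y: float) -> Cell:
        """Discretise a planar position into a grid cell."""
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def store(self, x: float, y: float, roi: List[List[int]]) -> Cell:
        """Store the ROI patch keyed by the target's mapped position."""
        cell = self.cell_of(x, y)
        self.targets[cell] = TargetEntry((x, y), roi)
        return cell


def crop_roi(image: List[List[int]], box: Tuple[int, int, int, int]) -> List[List[int]]:
    """Cut the region of interest (top, left, bottom, right) out of an image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


# Usage: a 4x4 dummy image with one detected target at (1.2 m, 0.7 m).
image = [[r * 4 + c for c in range(4)] for r in range(4)]
amap = AbstractionMap()
cell = amap.store(1.2, 0.7, crop_roi(image, (1, 1, 3, 3)))
```

Keeping the patch next to the position means a classifier can be run on it later without access to the original camera frame.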

In step 1140, the mobile robot 1000 may classify the target asynchronously by processing the image information for the region of interest stored in the abstraction map through the asynchronous deep classification network (ADCN). In this case, to eliminate inconsistencies in the target information stored in the abstraction map that arise as the mobile robot 1000 or the target moves, the mobile robot 1000 may track the movement of the mobile robot 1000 or the target using a multi-body tracking algorithm.
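The text does not specify the multi-body tracking algorithm. As a stand-in, the sketch below uses greedy nearest-neighbour association with translation-only ego-motion compensation, so that a label produced later can still be attached to the right target after the robot has moved. The function names, the `gate` distance, and the decision to drop unmatched tracks are assumptions for illustration only.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def compensate_ego_motion(tracks: Dict[int, Point], robot_delta: Point) -> Dict[int, Point]:
    """Shift stored target positions into the robot's new frame (translation only)."""
    dx, dy = robot_delta
    return {tid: (x - dx, y - dy) for tid, (x, y) in tracks.items()}


def associate(tracks: Dict[int, Point], detections: List[Point], gate: float = 0.5) -> Dict[int, Point]:
    """Greedy nearest-neighbour matching that keeps target IDs stable.

    Detections within `gate` metres of an existing track inherit its ID;
    others start a new track. Unmatched old tracks are dropped here, which
    is a simplification.
    """
    updated: Dict[int, Point] = {}
    free = dict(tracks)
    next_id = max(tracks, default=-1) + 1
    for det in detections:
        best = min(free, key=lambda tid: math.dist(free[tid], det), default=None)
        if best is not None and math.dist(free[best], det) <= gate:
            updated[best] = det
            del free[best]
        else:
            updated[next_id] = det
            next_id += 1
    return updated


# Usage: the robot moved 0.5 m along x between two frames.
tracks = {0: (1.0, 0.0), 1: (2.0, 1.0)}
predicted = compensate_ego_motion(tracks, (0.5, 0.0))
tracks = associate(predicted, [(1.55, 1.0), (0.45, 0.05), (3.0, 3.0)])
```

Because IDs survive the update, a classification result that arrives several cycles late can be written to the entry with the same ID.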

In step 1150, the mobile robot 1000 may update the abstraction map with the information on the target classified asynchronously through the asynchronous deep classification network. The abstraction map may be used as the input state of a motion planner, and the motion planner may determine the motion of the mobile robot 1000. As the information on the asynchronously classified target is updated in the abstraction map, that information can be reflected in the determination of the motion of the mobile robot 1000.
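One way to picture steps 1140 and 1150 is a background worker that classifies stored patches in batches and writes labels back, while the planner only ever reads the latest snapshot. This is a minimal sketch under stated assumptions: the `AsyncClassifier` class, its method names, and the length-based stub classifier are hypothetical; only the batch size of 4 is taken from the text.

```python
import queue
import threading


class AsyncClassifier:
    """Runs a slow classifier in a background thread so callers never block on it."""

    def __init__(self, classify_batch, batch_size=4):
        self._classify_batch = classify_batch  # callable: list of ROIs -> list of labels
        self._batch_size = batch_size
        self._pending = queue.Queue()
        self._lock = threading.Lock()
        self._labels = {}                      # target id -> label
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, target_id, roi):
        """Queue a region of interest for classification; returns immediately."""
        self._pending.put((target_id, roi))

    def snapshot(self):
        """Labels available right now; this is what a motion planner would read."""
        with self._lock:
            return dict(self._labels)

    def wait_idle(self):
        """Block until every submitted ROI has been classified (for demos and tests)."""
        self._pending.join()

    def _run(self):
        while True:
            batch = [self._pending.get()]      # wait for at least one item
            while len(batch) < self._batch_size:
                try:
                    batch.append(self._pending.get_nowait())
                except queue.Empty:
                    break
            labels = self._classify_batch([roi for _, roi in batch])
            with self._lock:
                for (target_id, _), label in zip(batch, labels):
                    self._labels[target_id] = label
            for _ in batch:
                self._pending.task_done()


# Usage with a stub classifier standing in for the deep network.
def stub_classifier(rois):
    return ["big" if len(roi) > 2 else "small" for roi in rois]


clf = AsyncClassifier(stub_classifier)
for target_id, roi in enumerate([[1], [1, 2, 3], [1, 2], [1, 2, 3, 4], [0]]):
    clf.submit(target_id, roi)
clf.wait_idle()
result = clf.snapshot()
```

A control loop would call `snapshot()` each cycle and plan with whatever labels exist at that moment, treating unlabeled targets as unknown.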

In addition, the mobile robot 1000 may abstract its motion, determined as the output of the motion planner, using at least one of a plurality of predefined game control signals, and may control its motion by controlling the motor 1060 included in the mobile robot 1000 according to the at least one signal. This control of the mobile robot 1000 is described in more detail with reference to FIG. 12.
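The idea of abstracting motion as game control signals can be sketched as follows. The specific signal set (four direction keys), the differential-drive wheel mapping, and the names `GameSignal` and `signals_to_wheel_speeds` are assumptions chosen for illustration; the text does not list the actual signals.

```python
from enum import Enum
from typing import Iterable, Tuple


class GameSignal(Enum):
    """Hypothetical predefined control signals, in the style of a gamepad."""
    UP = "up"
    DOWN = "down"
    LEFT = "left"
    RIGHT = "right"


# Signal -> (left wheel, right wheel) normalised speed contribution.
WHEEL_MAP = {
    GameSignal.UP: (1.0, 1.0),
    GameSignal.DOWN: (-1.0, -1.0),
    GameSignal.LEFT: (-0.5, 0.5),
    GameSignal.RIGHT: (0.5, -0.5),
}


def signals_to_wheel_speeds(signals: Iterable[GameSignal]) -> Tuple[float, float]:
    """Combine the active signals into clamped wheel speeds in [-1, 1]."""
    active = list(signals)
    left = sum((WHEEL_MAP[s][0] for s in active), 0.0)
    right = sum((WHEEL_MAP[s][1] for s in active), 0.0)

    def clamp(value: float) -> float:
        return max(-1.0, min(1.0, value))

    return clamp(left), clamp(right)


# Usage: the planner outputs "forward while turning left".
speeds = signals_to_wheel_speeds([GameSignal.UP, GameSignal.LEFT])
```

With a small discrete signal set like this, the planner's output space stays the same regardless of the motor hardware underneath.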

FIG. 12 is a flowchart showing an example of a mobile robot control method according to an embodiment of the present invention. The mobile robot control method according to this embodiment may be performed under the control of the processor 1020 of the mobile robot 1000 described above. Steps 1240 to 1280 of FIG. 12 may also be applied to the embodiment of FIG. 11. Likewise, the process of updating the abstraction map in FIG. 11 may be applied to the embodiment of FIG. 12.

In step 1210, the mobile robot 1000 may obtain information on a target through image processing of an image input through the camera 1040 included in the mobile robot 1000.

In step 1220, the mobile robot 1000 may obtain sensing data output by at least one sensor 1050 included in the mobile robot 1000. The sensing data may include, for example, the output of a LiDAR or an inertial measurement unit (IMU).

In step 1230, the mobile robot 1000 may generate an abstraction map by mapping the information on the target and the sensing data onto a two-dimensional plane. The generation of the abstraction map has been described in detail above with reference to FIG. 2. Compared with step 1120 of FIG. 11, step 1230 describes an embodiment in which the sensing data is additionally used to generate the abstraction map.
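As an illustration of mapping sensing data onto a two-dimensional plane, the sketch below converts a planar LiDAR scan into occupied grid cells in a robot-centred frame. The scan layout (evenly spaced beams starting at `angle_min`), the `cell_size`, and the `max_range` cut-off are assumptions; the patent's own construction is the one described with reference to FIG. 2.

```python
import math
from typing import Sequence, Set, Tuple


def lidar_to_cells(
    ranges: Sequence[float],
    angle_min: float,
    angle_step: float,
    cell_size: float = 0.5,
    max_range: float = 5.0,
) -> Set[Tuple[int, int]]:
    """Project range readings onto a robot-centred 2D grid of occupied cells."""
    cells = set()
    for i, r in enumerate(ranges):
        # Skip missing or out-of-range returns (zero, inf, NaN, too far).
        if not (0.0 < r <= max_range):
            continue
        angle = angle_min + i * angle_step
        x, y = r * math.cos(angle), r * math.sin(angle)
        cells.add((math.floor(x / cell_size), math.floor(y / cell_size)))
    return cells


# Usage: three beams at 0, 90 and 180 degrees; the last one has no return.
occupied = lidar_to_cells([1.0, 2.0, float("inf")], angle_min=0.0, angle_step=math.pi / 2)
```

The resulting cells can be merged into the same grid that holds the camera-derived target positions, giving the planner one combined state.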

In step 1240, the mobile robot 1000 may determine its motion by using the generated abstraction map as the input state of the motion planner.

In step 1250, the mobile robot 1000 may construct a simulation environment based on the generated abstraction map.

In step 1260, the mobile robot 1000 may train the motion planner by reinforcement learning in the constructed simulation environment. The simulation environment may correspond to the simulation environment of a two-dimensional video game, and in step 1260 the mobile robot 1000 may train the motion planner by reinforcement learning using an algorithm for gaming reinforcement learning (GRL).
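The GRL algorithm itself is not detailed in this passage. To make the idea of training a planner in a two-dimensional, game-like simulation concrete, the sketch below substitutes plain tabular Q-learning on a small grid world. The grid layout, reward values, hyperparameters, and the names `GridSim`, `train`, and `greedy_path` are all assumptions for illustration.

```python
import random

# Four discrete moves, in the style of game controls.
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


class GridSim:
    """A tiny 2D simulation: the robot moves on a grid toward a target cell."""

    def __init__(self, size, start, goal, obstacles=()):
        self.size, self.start, self.goal = size, start, goal
        self.obstacles = set(obstacles)

    def step(self, state, action):
        dx, dy = ACTIONS[action]
        nxt = (state[0] + dx, state[1] + dy)
        inside = 0 <= nxt[0] < self.size and 0 <= nxt[1] < self.size
        if not inside or nxt in self.obstacles:
            nxt = state                      # blocked moves leave the robot in place
        done = nxt == self.goal
        return nxt, (1.0 if done else -0.01), done


def train(sim, episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {}
    names = list(ACTIONS)
    for _ in range(episodes):
        state = sim.start
        for _ in range(100):                 # cap on episode length
            if rng.random() < epsilon:
                action = rng.choice(names)
            else:
                action = max(names, key=lambda a: q.get((state, a), 0.0))
            nxt, reward, done = sim.step(state, action)
            best_next = 0.0 if done else max(q.get((nxt, a), 0.0) for a in names)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q


def greedy_path(sim, q, max_steps=50):
    """Roll out the learned policy without exploration."""
    state, path = sim.start, [sim.start]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        state, _, done = sim.step(state, action)
        path.append(state)
        if done:
            break
    return path


sim = GridSim(size=4, start=(0, 0), goal=(3, 3), obstacles={(1, 1)})
q_table = train(sim)
path = greedy_path(sim, q_table)
print(path)
```

Because the state is a compact grid rather than raw camera images, the table stays small and training runs in well under a second here.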

In step 1270, the mobile robot 1000 may abstract the determined motion of the mobile robot 1000 using at least one of a plurality of predefined game control signals.

In step 1280, the mobile robot 1000 may control its motion by controlling the motor 1060 included in the mobile robot 1000 according to the at least one signal.

As described above, according to the embodiments of the present invention, and unlike an end-to-end robot software architecture, a large-scale deep neural network can be integrated effectively into a mobile robot that requires high-bandwidth control by using an asynchronous deep classification network, which introduces deep learning as one component of the mobile robot's overall software. In addition, a motion planner based on gaming reinforcement learning for mobile robots can reduce the complexity of learning and improve the scalability of the algorithm.

The systems or devices described above may be implemented as hardware components or as a combination of hardware and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the processing device is sometimes described as a single device, but those skilled in the art will appreciate that it may include a plurality of processing elements and/or multiple types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure a processing device to operate as desired or instruct the processing device independently or collectively. Software and/or data may be embodied in any type of machine, component, physical device, virtual equipment, or computer storage medium or device, in order to be interpreted by the processing device or to provide instructions or data to it. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to an embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The medium may store a computer-executable program persistently, or store it temporarily for execution or download. The medium may also be any of various recording or storage means in the form of a single piece of hardware or several pieces combined; it is not limited to a medium directly connected to a particular computer system and may be distributed over a network. Examples of media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and ROM, RAM, flash memory, and the like, configured to store program instructions. Other examples include recording or storage media managed by app stores that distribute applications, or by sites and servers that supply or distribute various other software. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like.

Although the embodiments have been described with reference to a limited set of embodiments and drawings, those skilled in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described systems, structures, devices, circuits, and the like are combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents to the claims also fall within the scope of the claims that follow.

Claims

1. A target classification method comprising:
obtaining, through image processing of an image input through a camera included in a mobile robot, a position of a target and image information for a region of interest corresponding to the target in the image;
mapping the position of the target onto a two-dimensional plane to generate an abstraction map;
storing the image information for the region of interest corresponding to the target in the image in association with the position of the target mapped onto the abstraction map;
processing the image information for the region of interest stored in the abstraction map through an asynchronous deep classification network to classify the target asynchronously; and
updating the abstraction map with information on the target classified asynchronously through the asynchronous deep classification network.

2. The target classification method of claim 1, further comprising:
determining a motion of the mobile robot using the abstraction map as an input state of a motion planner,
wherein, as the information on the asynchronously classified target is updated in the abstraction map, the information on the asynchronously classified target is reflected in the determination of the motion of the mobile robot.

3. The target classification method of claim 2, further comprising:
abstracting the motion of the mobile robot, determined as an output of the motion planner, using at least one of a plurality of predefined game control signals; and
controlling the motion of the mobile robot by controlling a motor included in the mobile robot according to the at least one signal.

4. The target classification method of claim 2, further comprising:
constructing a simulation environment based on the generated abstraction map; and
training the motion planner by reinforcement learning through the constructed simulation environment.

5. The target classification method of claim 4,
wherein the simulation environment corresponds to a simulation environment of a two-dimensional video game, and
wherein the training of the motion planner by reinforcement learning comprises training the motion planner using an algorithm for gaming reinforcement learning (GRL).

6. The target classification method of claim 1,
wherein the classifying of the target asynchronously comprises tracking movement of the mobile robot or the target using a multi-body tracking algorithm, in order to eliminate inconsistencies in the information on the target stored in the abstraction map that arise from the movement of the mobile robot or the target.

7. A computer program stored in a computer-readable recording medium for causing a computer, in combination with the computer, to execute the method of any one of claims 1 to 6.

8. A computer-readable recording medium storing a program for causing a computer to execute the method of any one of claims 1 to 6.

9. A computer device for controlling a mobile robot, comprising:
a memory storing computer-readable instructions; and
at least one processor configured to execute the instructions,
wherein the at least one processor:
obtains, through image processing of an image input through a camera included in the mobile robot, a position of a target and image information for a region of interest corresponding to the target in the image,
maps the position of the target onto a two-dimensional plane to generate an abstraction map,
stores the image information for the region of interest corresponding to the target in the image in association with the position of the target mapped onto the abstraction map,
processes the image information for the region of interest stored in the abstraction map through an asynchronous deep classification network to classify the target asynchronously, and
updates the abstraction map with information on the target classified asynchronously through the asynchronous deep classification network.

10. The computer device of claim 9,
wherein the at least one processor determines a motion of the mobile robot using the abstraction map as an input state of a motion planner, and
wherein, as the information on the asynchronously classified target is updated in the abstraction map, the information on the asynchronously classified target is reflected in the determination of the motion of the mobile robot.

11. The computer device of claim 10,
wherein the at least one processor:
abstracts the motion of the mobile robot, determined as an output of the motion planner, using at least one of a plurality of predefined game control signals, and
controls the motion of the mobile robot by controlling a motor included in the mobile robot according to the at least one signal.

12. The computer device of claim 10,
wherein the at least one processor:
constructs a simulation environment based on the generated abstraction map, and
trains the motion planner by reinforcement learning through the constructed simulation environment.

13. The computer device of claim 12,
wherein the simulation environment corresponds to a simulation environment of a two-dimensional video game, and
wherein the at least one processor trains the motion planner by reinforcement learning using an algorithm for gaming reinforcement learning (GRL).

14. The computer device of claim 9,
wherein the at least one processor tracks movement of the mobile robot or the target using a multi-body tracking algorithm, in order to eliminate inconsistencies in the information on the target stored in the abstraction map that arise from the movement of the mobile robot or the target.

15. A mobile robot comprising:
a memory storing computer-readable instructions;
at least one processor configured to execute the instructions; and
a camera that receives an image,
wherein the at least one processor:
obtains, through image processing of the image input through the camera, a position of a target and image information for a region of interest corresponding to the target in the image,
maps the position of the target onto a two-dimensional plane to generate an abstraction map,
stores the image information for the region of interest corresponding to the target in the image in association with the position of the target mapped onto the abstraction map,
processes the image information for the region of interest stored in the abstraction map through an asynchronous deep classification network to classify the target asynchronously, and
updates the abstraction map with information on the target classified asynchronously through the asynchronous deep classification network.