KR20220036621A

KR20220036621A - Learning method of unit action deep learning model and robot control method using the same

Info

Publication number: KR20220036621A
Application number: KR1020200119049A
Authority: KR
Inventors: 이상형; 조남준
Original assignee: 한국생산기술연구원
Priority date: 2020-09-16
Filing date: 2020-09-16
Publication date: 2022-03-23
Also published as: KR102644164B1

Abstract

The present invention relates to a method that divides an operator's actions into a plurality of unit actions including approaching, aligning, grasping an object, moving an object, putting down an object, inserting, and tightening; acquires, for each unit action, a correct answer image and an additional learning image from a demonstration that includes direct teaching and observation; and trains using these images. As a result, even complex tasks can be modeled quickly and accurately.

Description

Learning method of unit action deep learning model and robot control method using the same

The present invention relates to a method of training unit action deep learning models and a robot control method using the same.

Recently, interest in building deep learning models has been growing across industry. In fields that use industrial and collaborative robots, techniques have been developed for building robots that imitate a worker's task and for optimizing a robot's model through reinforcement learning.

A task consists of several unit actions. A simple task can be modeled with a single model, but a complex task is too complex for a single model. Conventionally, there has been little recognition of the need to separate the unit actions that make up a worker's task and to model and train them individually, which has increased computation time and reduced accuracy.

For example, Korean Patent Publication No. 10-2019-0088093 relates to a learning method for a robot. It includes a step of learning, by reinforcement learning, the pose relative to a target object from the robot's current pose, based on learning results for images captured by the robot and for the robot's target pose. It then learns the target pose by reinforcement-learning an evaluation of the robot's actions from the reinforcement-learned result. However, that document does not contemplate modeling unit actions separately, nor prioritizing and planning when multiple signals are input.

As another example, Korean Patent Publication No. 10-2020-0072592 relates to a method of setting up a learning framework for a robot and a digital control device that performs the method. The robot learns initial motor skills from a demonstrator's demonstration information and imitation learning, and then improves those motor skills through reinforcement learning. However, that document likewise does not contemplate modeling unit actions separately, nor prioritizing and planning when multiple signals are input.

(Patent Document 1) Korean Patent Publication No. 10-2019-0088093

(Patent Document 2) Korean Patent Publication No. 10-2020-0072592

(Patent Document 3) Korean Registered Patent No. 10-2131097

The present invention has been devised to solve the problems described above.

The present invention is a technology for generating deep learning models from a worker's demonstration and for using robots and machines to replace or collaborate with the worker's actions.

Specifically, the present invention aims to divide a worker's task into unit actions and to generate a deep learning model for each unit action.

The present invention also aims to generate unit action deep learning models and to train them using cost functions.

The present invention also aims to arrange the generated unit action deep learning models in a goal-oriented order and to determine which unit action a newly input image corresponds to.

The present invention also aims to accurately control a robot mechanism to which the trained unit action deep learning models are applied.

To solve the above problems, one embodiment of the present invention provides a learning method that divides a worker's actions into n unit actions including approaching, aligning, grasping an object, moving an object, putting down an object, inserting, and tightening, acquires a correct answer image and an additional learning image for each unit action from a demonstration including direct teaching and observation, and trains using these images, the method comprising:
(a) when a raw image from the worker's demonstration is input, the preprocessing module 100 separating the unit actions in the raw image, extracting a correct answer image and the remaining background for each separated unit action, and generating, for each separated unit action, an image set including the correct answer image and the background;
(b) the modeling performing module 200 receiving each image set and performing deep learning on each to generate, for each unit action, first to n-th unit action deep learning models (n being a natural number of 2 or more), each including a plurality of layers;
(c) the data processing module 300 determining first to n-th cost functions for the first to n-th unit action deep learning models, respectively, and determining a correct answer vector for each of the first to n-th cost functions;
(d) an additional learning image being input to the preprocessing module 100; the preprocessing module 100 separating the unit actions in the additional learning image and transmitting the additional learning image to the one of the first to n-th unit action deep learning models that corresponds to the unit action; that unit action deep learning model extracting a feature vector from the additional learning image and transmitting the extracted feature vector to the data processing module 300; and the data processing module 300 transmitting, to the additional learning module 400, the feature vector obtained by embedding the extracted feature vector together with the correct answer vector determined in step (c); and
(e) the additional learning module 400 comparing the transmitted correct answer vector with the embedded feature vector of the additional learning image and, if the difference is less than a preset value, further training that one of the first to n-th unit action deep learning models with the embedded feature vector of the additional learning image.

In one embodiment, the method may further include, after step (e), (f) a further additional learning image being input to the preprocessing module 100 and steps (d) to (e) being repeated to further train another one of the first to n-th unit action deep learning models.

In one embodiment, the method may include: (a1) the preprocessing module 100 separating a target object and obstacles in the correct answer image according to a preset method; and (a2) the preprocessing module 100 generating a new image set by arranging the separated target object and obstacles on the background while varying their positions, orientations, and poses.

To solve the above problems, another embodiment of the present invention provides a robot control method using the first to n-th unit action deep learning models generated by the learning method described above, the control method comprising:
(g) when a real-time image is input to the preprocessing module 100, the preprocessing module 100 transmitting the real-time image to each of the first to n-th unit action deep learning models, and each of the models extracting a feature vector from the transmitted real-time image and transmitting it to the data processing module 300;
(h) the data processing module 300 embedding each of the feature vectors transmitted in step (g), computing first to n-th cost function values by substituting each embedded feature vector into the first to n-th cost functions, and transmitting each of the computed values to the remodeling module 500; and
(i) the remodeling module 500 computing a robot control value for each model by a preset method using the first to n-th cost function values transmitted from the data processing module 300, and selecting the one of the first to n-th unit action deep learning models that corresponds to the largest of the robot control values.

In one embodiment, step (i) may include: (i1) if all of the first to n-th cost function values are greater than or equal to a preset value, the remodeling module 500 computing the robot control value of the first unit action deep learning model by adding 1 to the value obtained by subtracting the first cost function value from 1 and then multiplying the result by the first cost function value; (i2) if any one of the first to n-th cost function values is less than the preset value, the remodeling module 500 computing the robot control value of the first unit action deep learning model by multiplying the value obtained by subtracting the first cost function value from 1 by the first cost function value; and (i3) the remodeling module 500 computing the robot control value of the n-th unit action deep learning model (n being a natural number of 2 or more) by adding the value obtained by subtracting the n-th cost function value from 1 to the value obtained by adding the (n-1)-th cost function value to 1, and then multiplying the sum by the n-th cost function value.

In one embodiment, the method may further include, before step (g), the preprocessing module 100 classifying an input real-time image into one or more real-time images. Steps (g) to (i) are repeated for each of the classified real-time images so that one or more of the first to n-th unit action deep learning models are selected, and, after step (i), the robot is controlled to perform the one or more unit actions corresponding to each of the selected models.

The present invention also provides a program, recorded on a storage medium, that includes the deep learning models generated by the learning method described above.

It also provides a system in which the learning method described above is performed.

It also provides a program recorded on a storage medium so that the control method described above is performed.

It also provides a robot in which the control method described above is performed.

The present invention achieves the following effects.

By dividing a worker's task into unit actions and generating a deep learning model for each unit action, the present invention can model the worker's task accurately, and can model even complex tasks quickly and accurately.

In addition, because the present invention generates unit action deep learning models, trains them using cost functions, and includes a process of optimizing the generated models, the unit action deep learning models can learn the worker's task accurately.

In addition, because the present invention can arrange the generated unit action deep learning models in a goal-oriented order and determines which unit action a newly input image corresponds to, it can determine and perform the subsequent unit actions and tasks regardless of which unit action the worker starts from.

In addition, the present invention can accurately perform the task for each of the worker's unit actions through a manipulator to which the trained deep learning models are applied.

FIG. 1 is a diagram illustrating the method according to the present invention.
FIG. 2 is a diagram illustrating how the remodeling module computes robot control values from cost function values.
FIGS. 3 to 9 are diagrams illustrating how the remodeling module computes robot control values according to the present invention.
FIG. 10 is a diagram showing a robotic assistive device to which the system and method according to the present invention can be applied.

In some cases, well-known structures and devices may be omitted, or shown in block diagram form centered on the core functions of each structure and device, to avoid obscuring the concept of the present invention.

In addition, in describing the embodiments of the present invention, detailed descriptions of well-known functions or configurations are omitted where they could unnecessarily obscure the gist of the invention. The terms used below are defined in view of their functions in the embodiments and may vary according to the intentions or customs of users and operators. Their definitions should therefore be based on the content of this specification as a whole.

The system according to the present invention includes a preprocessing module 100, a modeling performing module 200, a data processing module 300, an additional learning module 400, and a remodeling module 500.

The system according to the present invention and the method to which it is applied are described with reference to FIGS. 1 to 10.

Workers at industrial sites perform a variety of actions depending on the type of work.

A worker's actions can be divided into multiple unit actions. For example, the unit actions may include approaching, aligning, grasping an object, moving an object, putting down an object, inserting, and tightening. However, the unit actions are not limited to these.

In the present invention, training is performed for each of the worker's unit actions using correct answer images and additional learning images obtained from demonstrations that include direct teaching and observation.

The method according to the present invention is described with reference to FIG. 1.

A raw image from the worker's demonstration is input to the preprocessing module 100.

The preprocessing module 100 separates the unit actions in the raw image and extracts a correct answer image and the remaining background for each separated unit action.

The preprocessing module 100 separates the target object and obstacles in the correct answer image according to a preset method.

The preprocessing module 100 generates a new image set by arranging the separated target object and obstacles on the background while varying their positions, orientations, and poses.
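The rearrangement step above can be sketched as a simple compositing routine. This is only an illustration under stated assumptions: images are small 2D lists of integers, a pixel value of 0 is treated as transparent, orientation changes are limited to quarter turns, and the names `rotate90`, `paste`, and `augment` are hypothetical rather than taken from the patent.

```python
import random

def rotate90(patch, k):
    """Rotate a 2D list clockwise by k quarter turns."""
    for _ in range(k % 4):
        patch = [list(row) for row in zip(*patch[::-1])]
    return patch

def paste(canvas, patch, top, left):
    """Copy non-zero patch pixels onto canvas in place (0 is treated as transparent)."""
    for r, row in enumerate(patch):
        for c, value in enumerate(row):
            if value != 0:
                canvas[top + r][left + c] = value

def augment(background, target, obstacle, count, seed=0):
    """Build `count` new images by placing the obstacle and target at random poses."""
    rng = random.Random(seed)
    height, width = len(background), len(background[0])
    images = []
    for _ in range(count):
        canvas = [row[:] for row in background]  # leave the original background untouched
        for patch in (obstacle, target):  # target is pasted last so it is never hidden
            rotated = rotate90(patch, rng.randrange(4))
            top = rng.randrange(height - len(rotated) + 1)
            left = rng.randrange(width - len(rotated[0]) + 1)
            paste(canvas, rotated, top, left)
        images.append(canvas)
    return images
```

Pasting the target last is a design choice of this sketch, so that an overlapping obstacle never covers the object the model is supposed to learn.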

For each separated unit action, the preprocessing module 100 generates an image set including the correct answer image and the background.

The preprocessing module 100 transmits each generated image set to the modeling performing module 200.

The images of the worker's actions input to the preprocessing module 100 may also be divided in advance so that each contains a single unit action. In this case, the preprocessing module 100 extracts only one correct answer image and a single unit action deep learning model is generated in the subsequent process. The first to n-th unit action deep learning models can then be generated by inputting raw images to the preprocessing module 100 multiple times.

The modeling performing module 200 performs deep learning on each transmitted image set to generate the first to n-th unit action deep learning models, one for each unit action.

That is, the first to n-th unit action deep learning models are formed separately for each of the worker's unit actions.

Each unit action deep learning model includes multiple layers.

Next, how the cost function is determined for each model after the first to n-th unit action deep learning models are generated is described.

The correct answer images extracted by the preprocessing module 100 are input to the first to n-th unit action deep learning models, respectively.

Each of the first to n-th unit action deep learning models extracts the feature vector produced at an intermediate layer among its layers.
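Reading a feature vector from an intermediate layer can be illustrated with a toy fully connected network. This is a sketch only: the patent does not specify the network architecture, so the tanh layers, the weight layout, and the names `dense` and `forward_with_features` are assumptions made for illustration.

```python
import math

def dense(vector, weights, biases):
    """Fully connected layer followed by tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, vector)) + b)
        for row, b in zip(weights, biases)
    ]

def forward_with_features(pixels, layers, feature_layer):
    """Run the layers in order; return (final output, activations of feature_layer)."""
    activation = list(pixels)
    features = None
    for index, (weights, biases) in enumerate(layers):
        activation = dense(activation, weights, biases)
        if index == feature_layer:
            features = list(activation)  # copy the intermediate activations
    return activation, features
```

Here `feature_layer` is simply an index into `layers`, which mirrors the statement that the choice of intermediate layer is left open.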

The depth of the intermediate layer may be chosen arbitrarily.

The first to n-th unit action deep learning models transmit the feature vectors extracted from their respective correct answer images to the data processing module 300.

The data processing module 300 reduces the dimensionality of the feature vectors extracted from the correct answer images, as transmitted from the first to n-th unit action deep learning models, by embedding them in an embedding space.

The embedding method is not limited to any particular method. Embedding may be performed, for example, by a program such as u-map, but is not limited thereto.

The data processing module 300 determines the first to n-th cost functions for the first to n-th unit action deep learning models, respectively, and sets the vector extracted from each correct answer image and then embedded as the corresponding correct answer vector (reference vector).

Here, the embedded feature vector is a two-dimensional vector, but is not limited thereto.

Here, the cost function is a function that measures error. The embedded vector of a real-time image later input to the first to n-th unit action deep learning models is compared against the correct answer vector of the cost function, and a small difference can be used as a criterion for judging that the action was performed similarly to the worker's unit action.
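The comparison against the correct answer vector can be sketched as a distance computation. The patent does not state the form of the cost function, so Euclidean distance is an assumption here, and the names `cost` and `closest_unit_action` are hypothetical.

```python
import math

def cost(embedded, reference):
    """Euclidean distance between an embedded feature vector and the reference vector."""
    return math.dist(embedded, reference)

def closest_unit_action(embedded_by_action, reference_by_action):
    """Return the unit action whose embedded vector lies nearest to its reference vector."""
    return min(
        embedded_by_action,
        key=lambda name: cost(embedded_by_action[name], reference_by_action[name]),
    )
```

With two-dimensional embeddings, `cost((3.0, 4.0), (0.0, 0.0))` evaluates to `5.0`, and the unit action with the smallest such value is treated as the best match in this sketch.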

Next, how the additional learning module 400 further trains the first to n-th unit action deep learning models using the generated first to n-th cost functions is described.

An additional learning image is input to the preprocessing module 100.

The additional learning image is an image generated from the robot's motion. That is, it may be an image that includes the robot's angle and position while the robot performs a unit action. In this case too, the robot may be one that operates after learning from the worker's demonstration.

Alternatively, the additional learning image may be an image generated from the worker's demonstration, for example an image of the worker directly performing a unit action.

The preprocessing module 100 separates the unit actions in the additional learning image and transmits the additional learning image to the one of the first to n-th unit action deep learning models that corresponds to the unit action.

For example, the unit action that the preprocessing module 100 identifies in the additional learning image may be the third unit action, in which case the preprocessing module 100 transmits the additional learning image to the third unit action deep learning model, and the third unit action deep learning model is then further trained.

The additional learning image may contain multiple unit actions. In this case, the additional learning image is split, and each split image is transmitted to the corresponding unit action deep learning model for further training.
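The routing of split images to their corresponding models can be sketched as a grouping step. How the preprocessing module actually identifies a unit action is not described, so the `classify` callback below is a hypothetical stand-in supplied by the caller, and `route_by_unit_action` is an illustrative name.

```python
from collections import defaultdict

def route_by_unit_action(frames, classify):
    """Group frames by the unit action label returned by classify(frame)."""
    batches = defaultdict(list)
    for frame in frames:
        batches[classify(frame)].append(frame)
    return dict(batches)
```

Each resulting batch would then be passed to the unit action deep learning model that has the same label.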

That unit action deep learning model extracts a feature vector from the additional learning image and transmits the extracted feature vector to the data processing module 300.

The data processing module 300 embeds the feature vector extracted from the additional learning image.

The data processing module 300 transmits the embedded feature vector of the additional learning image, together with the correct answer vector, to the additional learning module 400.

The data processing module 300 transmits the correct answer vector and the embedded feature vector of the real-time image to the additional learning module 400.

The additional learning module 400 determines whether the difference between the embedded feature vector of the additional learning image and the correct answer vector is less than a preset value.

The additional learning module 400 further trains one of the first to n-th unit action deep learning models with the embedded feature vectors of real-time images whose difference is less than the preset value.
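The threshold check above amounts to filtering candidate samples before further training. The following is a minimal sketch that assumes the difference is measured as Euclidean distance, which the patent does not specify. The name `select_for_further_training` is hypothetical.

```python
import math

def select_for_further_training(samples, reference, threshold):
    """Keep embedded vectors whose distance to the reference vector is below threshold."""
    return [vec for vec in samples if math.dist(vec, reference) < threshold]
```

Samples that fall at or beyond the threshold are simply dropped in this sketch. The further training step itself is not shown.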

Thereafter, another additional learning image is input to the preprocessing module 100 and the above process is repeated to further train another one of the first to n-th unit action deep learning models.

In other words, each of the first to n-th unit action deep learning models can be further trained through this process.

The above has described how the first to n-th unit action deep learning models are generated and further trained.

Next, a robot control method using the first to n-th unit action deep learning models generated by the learning method is described with reference to FIGS. 2 to 8.

If the first to n-th unit action deep learning models generated for the worker's unit actions were simply learned by the robot in a fixed order, then whenever an unexpected situation arose during a task the robot would have to start the task again from the first learned unit action.

Accordingly, the present invention is configured so that transitions between unit actions can occur, generating a robot behavior generation deep learning model in which the robot behaves like a fully-connected state machine.

When a real-time image is input to the preprocessing module 100, the preprocessing module 100 transmits the real-time image to each of the first to n-th unit action deep learning models, and each model extracts a feature vector from the transmitted real-time image and transmits it to the data processing module 300.

The real-time image may be an image of a worker's demonstration or work, but is not limited thereto.

The data processing module 300 embeds each of the transmitted feature vectors, computes the first to n-th cost function values by substituting each embedded feature vector into the first to n-th cost functions, and transmits each of the computed values to the remodeling module 500.

The remodeling module 500 computes a robot control value for each model by a preset method using the first to n-th cost function values transmitted from the data processing module 300, and selects the one of the first to n-th unit action deep learning models that corresponds to the largest of the robot control values.


The remodeling module 500 receives the first to n-th cost function values and, for each of them, calculates the value obtained by subtracting that cost function value from 1.

The remodeling module 500 determines whether each of the first to n-th cost function values is greater than or equal to a preset value.

If every one of the first to n-th cost function values is greater than or equal to the preset value, the remodeling module 500 calculates the robot control value of the first unit action deep learning model by adding 1 to the value obtained by subtracting the first cost function value from 1, and then multiplying the result by the first cost function value.

If any one of the first to n-th cost function values is less than the preset value, the remodeling module 500 calculates the robot control value of the first unit action deep learning model by multiplying the value obtained by subtracting the first cost function value from 1 by the first cost function value.

Referring to FIG. 2, the robot control value of the n-th unit action deep learning model (n being a natural number of 2 or more) is calculated by adding the value obtained by subtracting the n-th cost function value from 1 to the value obtained by subtracting the (n-1)-th cost function value from 1, and then multiplying the sum by the n-th cost function value.
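For reference, the three rules above can be written compactly. The symbols are introduced here for illustration only and are not used in the specification: c_k denotes the k-th cost function value, u_k the robot control value of the k-th unit action deep learning model, and θ the preset value.

```latex
u_1 =
\begin{cases}
\bigl((1 - c_1) + 1\bigr)\, c_1, & \text{if } c_k \ge \theta \text{ for all } k = 1, \dots, n,\\[4pt]
(1 - c_1)\, c_1, & \text{if } c_k < \theta \text{ for some } k,
\end{cases}
\qquad
u_k = \bigl((1 - c_k) + (1 - c_{k-1})\bigr)\, c_k \quad (k = 2, \dots, n).
```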

In this calculation, a high cost function value in a given unit action deep learning model means that the robot should perform the corresponding unit action, which is why each sum is multiplied by the cost function value of that model.

Likewise, a low cost function value for the preceding unit action means that the next unit action should be performed. For this reason, when the robot control value of the n-th unit action deep learning model is calculated, the value obtained by subtracting the (n-1)-th cost function value from 1 is added. A low cost function value in the (n-1)-th unit action deep learning model, which indicates that performing the (n-1)-th unit action is not desirable, is thereby reflected in the n-th unit action deep learning model and increases its robot control value.

The remodeling module 500 selects, from among the first to n-th unit action deep learning models, the model with the largest calculated robot control value.
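The calculation and selection described above could be expressed, for example, as the following minimal sketch. The function names `robot_control_values` and `select_model` are hypothetical, and the handling of ties (the first model with the largest value is chosen) is an assumption, since the specification does not address ties.

```python
from typing import List, Sequence


def robot_control_values(costs: Sequence[float], preset: float) -> List[float]:
    """Control value per unit action model, following the rules described in the text."""
    if len(costs) < 2:
        raise ValueError("the description assumes at least two unit action models")
    all_above = all(c >= preset for c in costs)
    values = []
    for k, c in enumerate(costs):
        if k == 0:
            # First model: add 1 only when every cost value reaches the preset value.
            carry = 1.0 if all_above else 0.0
        else:
            # Later models: add (1 - cost value of the preceding model).
            carry = 1.0 - costs[k - 1]
        values.append(((1.0 - c) + carry) * c)
    return values


def select_model(costs: Sequence[float], preset: float) -> int:
    """Zero-based index of the model with the largest control value.

    Assumption: on a tie, the earliest model wins (built-in max behaviour).
    """
    values = robot_control_values(costs, preset)
    return max(range(len(values)), key=values.__getitem__)
```

Because the term added for each later model depends only on the immediately preceding cost function value, a completed unit action (low cost function value) raises the control value of the action that follows it.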

This process may be performed repeatedly.

That is, when a real-time image is input to the pre-processing module 100, the pre-processing module 100 classifies it into one or more real-time images, the above process is repeated for each classified image, and the selection of one or more of the first to n-th unit action deep learning models is repeated accordingly.

The robot is then controlled to perform the one or more unit actions corresponding to the selected unit action deep learning models. By repeating this process, the robot can be controlled to perform, in succession, the unit actions corresponding to the input real-time images.
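The repetition described above may be pictured with the following sketch, offered only as an illustration. All names (`run_control_cycle`, `classify`, `select_action`, `execute`) are hypothetical; the classification, selection, and robot command steps are supplied by the caller and are not specified here.

```python
from typing import Callable, Iterable, List


def run_control_cycle(
    real_time_image: object,
    classify: Callable[[object], Iterable[object]],  # hypothetical pre-processing step
    select_action: Callable[[object], str],          # hypothetical per-image selection
    execute: Callable[[str], None],                  # hypothetical robot command
) -> List[str]:
    """Select one unit action per classified image, then perform them in order."""
    # Selection is repeated for each image produced by the classification step.
    selected = [select_action(sub_image) for sub_image in classify(real_time_image)]
    # The robot is then controlled to perform the selected unit actions in succession.
    for action in selected:
        execute(action)
    return selected
```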

FIGS. 3 to 9 illustrate, by way of example, how the unit action to be performed is determined among the first to fifth unit action deep learning models when the preset value is 0.2.

In the first unit action deep learning model, the unit action is set to "approaching".

In the second unit action deep learning model, the unit action is set to "aligning".

In the third unit action deep learning model, the unit action is set to "grabbing an object".

In the fourth unit action deep learning model, the unit action is set to "moving an object".

In the fifth unit action deep learning model, the unit action is set to "putting down".

In FIG. 3, the cost function values for the input image are 0.61 for the first unit action deep learning model, 0.71 for the second, 0.81 for the third, 0.61 for the fourth, and 0.77 for the fifth.

Each of the cost function values of the first to fifth unit action deep learning models is greater than or equal to the preset value of 0.2. In this case, for the first unit action deep learning model, 1 is added to the value obtained by subtracting the first cost function value from 1, and the result is multiplied by the first cost function value. The robot control value of the first unit action deep learning model is therefore (1+0.39)*0.61=0.8479. Since this is the largest of the robot control values of the first to fifth unit actions, the remodeling module 500 can determine the first unit action, "approaching", as the unit action for the input image.
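For comparison, the control values of all five models for the FIG. 3 cost function values can be worked out from the rules described with reference to FIG. 2. Only the first value is stated in the text; the other four are computed here from those rules, with u_k denoting the robot control value of the k-th model (notation introduced for illustration).

```latex
\begin{aligned}
u_1 &= \bigl((1 - 0.61) + 1\bigr) \times 0.61 = 1.39 \times 0.61 = 0.8479\\
u_2 &= \bigl((1 - 0.71) + (1 - 0.61)\bigr) \times 0.71 = 0.68 \times 0.71 = 0.4828\\
u_3 &= \bigl((1 - 0.81) + (1 - 0.71)\bigr) \times 0.81 = 0.48 \times 0.81 = 0.3888\\
u_4 &= \bigl((1 - 0.61) + (1 - 0.81)\bigr) \times 0.61 = 0.58 \times 0.61 = 0.3538\\
u_5 &= \bigl((1 - 0.77) + (1 - 0.61)\bigr) \times 0.77 = 0.62 \times 0.77 = 0.4774
\end{aligned}
```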

In FIG. 4, the cost function values for the input image are 0.02 for the first unit action deep learning model, 0.71 for the second, 0.81 for the third, 0.61 for the fourth, and 0.77 for the fifth.

The cost function value of the first unit action deep learning model, 0.02, is less than the preset value. In this case, the robot control value of the first unit action deep learning model is calculated simply by multiplying the first cost function value by the value obtained by subtracting the first cost function value from 1.

A low first cost function value means that the robot has already performed the first unit action, that is, the robot has already approached the target object.

The robot control value of the second unit action deep learning model is calculated by adding 0.29, the value obtained by subtracting the second cost function value from 1, to 0.98, the value obtained by subtracting the first cost function value from 1, and multiplying the sum by the second cost function value, 0.71. The robot control value of the second unit action deep learning model is therefore (0.98+0.29)*0.71=0.9017. Since this is the largest of the robot control values of the first to fifth unit actions, the remodeling module 500 can determine the second unit action, "aligning", as the unit action for the input image.
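Only the first two control values change relative to the case in which all cost function values are 0.2 or more, because only the first cost function value differs. Worked out from the rules described with reference to FIG. 2 (u_k is notation introduced here for the robot control value of the k-th model; the values for the third to fifth models are computed here, not quoted from the figure):

```latex
\begin{aligned}
u_1 &= (1 - 0.02) \times 0.02 = 0.0196\\
u_2 &= \bigl((1 - 0.71) + (1 - 0.02)\bigr) \times 0.71 = 1.27 \times 0.71 = 0.9017\\
u_3 &= 0.48 \times 0.81 = 0.3888, \qquad u_4 = 0.58 \times 0.61 = 0.3538, \qquad u_5 = 0.62 \times 0.77 = 0.4774
\end{aligned}
```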

FIGS. 5 to 7 illustrate the cases in which the robot control value of the third, fourth, and fifth unit action, respectively, is the largest, so that the third, fourth, and fifth unit action is determined in each case.

Referring to FIG. 8, assume that there are two target objects in the robot's environment: one object lies on the floor, and the robot is holding the other.

In this case, for example, the cost function values of both the second unit action, "aligning", and the third unit action, "grabbing an object", are low. The question is then whether it is preferable to perform the third unit action, "grabbing an object", on the object lying on the floor, or the fourth unit action, "moving an object", on the object already being held.

In such a case, the present invention can control the robot so that it first performs the fourth unit action, "moving an object", on the object already being held, and only then performs robot actions on the other object.

In FIG. 8, the cost function values of the second and third unit action deep learning models are shown as low, at 0.04 and 0.06 respectively. Because the high value obtained by subtracting the third cost function value from 1 is added when the robot control value of the fourth unit action deep learning model is calculated, that control value is the highest among the first to fifth unit action deep learning models.

For the third unit action deep learning model, even though the high value obtained by subtracting the second cost function value from 1 is added, the sum is multiplied by the low third cost function value, so its robot control value is lower than that of the fourth unit action deep learning model.
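The comparison between the third and fourth models can be checked numerically with the sketch below. Only the second and third cost function values (0.04 and 0.06) and the preset value 0.2 come from the description of FIG. 8; the first, fourth, and fifth values (0.05, 0.80, 0.30) are assumed purely for illustration, and the function name `control_values` is hypothetical.

```python
def control_values(costs, preset):
    """Control values under the rules described with reference to FIG. 2."""
    all_above = all(c >= preset for c in costs)
    # Term added for each model: 1 (or 0) for the first, (1 - previous cost) for the rest.
    carries = [1.0 if all_above else 0.0] + [1.0 - c for c in costs[:-1]]
    return [((1.0 - c) + carry) * c for c, carry in zip(costs, carries)]


# c2 = 0.04 and c3 = 0.06 are from the text; c1, c4, c5 are ASSUMED example values.
costs = [0.05, 0.04, 0.06, 0.80, 0.30]
values = control_values(costs, preset=0.2)

actions = ["approaching", "aligning", "grabbing an object", "moving an object", "putting down"]
selected = actions[values.index(max(values))]
print(selected, [round(v, 4) for v in values])
```

Under these assumed values the fourth model receives the large term (1 - 0.06) and keeps it after multiplication by its own high cost function value, whereas the third model's sum is scaled down by 0.06.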

Referring to FIG. 9, assume that there are two target objects in the robot's environment: one object lies on the floor, and the robot is moving the other.

For example, the cost function values of both the first unit action, "approaching", and the fourth unit action, "moving an object", are low. The question in this case is whether it is preferable to perform the second unit action, "aligning", or the fifth unit action, "putting down".

In such a case, the present invention can control the robot so that it first completes the fifth unit action, "putting down", for the object already being held, and then performs robot actions on the other object.

However, this may change depending on the result of the preceding action. For example, if the "approaching" situation of the first unit action is established with greater certainty than the "moving an object" situation of the fourth unit action, the second unit action, "aligning", may be performed instead of the fifth unit action, "putting down".

Accordingly, even when images are input in succession to the first to n-th unit action deep learning models and processed by the remodeling module 500, a robot-action-generating deep learning model can be generated with which the robot behaves like a fully-connected state machine.

FIG. 10 shows an example of a robot to which the system and method according to the present invention can be applied. The invention is not limited to the robot shown: it is applicable to any field that uses industrial or collaborative robots, and its scope of use can be extended beyond the fields that currently use robots to fields in which robots have not yet been used.

The learning method and the control method according to an embodiment of the present invention described above may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium.

The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination.

The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the art of computer software.

Examples of the computer-readable recording medium include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory.

Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

In addition, the learning method according to an embodiment of the present invention described above may be performed by a system in which the learning method is carried out.

In addition, the control method according to an embodiment of the present invention described above may be performed by a robot.

The present invention has been described above with reference to the embodiments shown in the drawings so that those skilled in the art can readily understand and reproduce it. These embodiments are merely illustrative, and those skilled in the art will understand that various modifications and equivalent other embodiments are possible. The scope of protection of the present invention should therefore be determined by the claims.

100: pre-processing module
200: modeling performing module
300: data processing module
400: additional learning module
500: remodeling module

Claims

A learning method in which the actions of an operator are classified into n unit actions including approaching, aligning, grabbing an object, moving an object, putting down an object, inserting, and tightening, a correct answer image and an image for additional learning are acquired for each of the unit actions from a demonstration including direct teaching and observation, and learning is performed using the images, the learning method comprising:
(a) when a raw image from the operator's demonstration is input to the pre-processing module 100, separating, by the pre-processing module 100, the unit actions in the raw image, extracting the remaining background, and generating, for each of the separated unit actions, an image set including the correct answer image and the background;
(b) receiving, by the modeling performing module 200, the respective image sets and performing deep learning on each of them to generate, for each unit action, first to n-th unit action deep learning models (n is a natural number greater than or equal to 2) each including a plurality of layers;
(c) determining, in the data processing module 300, first to n-th cost functions of the respective first to n-th unit action deep learning models, and determining correct answer vectors of the first to n-th cost functions;
(d) when an image for additional learning is input, separating, by the pre-processing module 100, each unit action in the image for additional learning and transmitting the image for additional learning to the one of the first to n-th unit action deep learning models that corresponds to the unit action; extracting, by that unit action deep learning model, a feature vector from the image for additional learning and transmitting the extracted feature vector to the data processing module 300; and transmitting, by the data processing module 300, the embedded feature vector obtained by embedding the feature vector extracted from the image for additional learning, together with the correct answer vector determined in step (c), to the additional learning module 400; and
(e) comparing, by the additional learning module 400, the difference between the transmitted correct answer vector and the embedded feature vector of the image for additional learning and, if the difference is less than a preset value, further training the corresponding one of the first to n-th unit action deep learning models with the embedded feature vector of the image for additional learning.

The learning method according to claim 1, further comprising, after step (e):
(f) further inputting another image for additional learning to the pre-processing module 100 and repeating steps (d) to (e) using it, thereby further training another one of the first to n-th unit action deep learning models.

The learning method according to claim 1, further comprising:
(a1) separating, by the pre-processing module 100, the target object and the obstacle in the correct answer image according to a preset method; and
(a2) generating, by the pre-processing module 100, a new image set by arranging the separated target object and obstacle in the background while changing their positions, directions, and postures.

A robot control method using the first to n-th unit action deep learning models generated by the learning method according to any one of claims 1 to 3, the control method comprising:
(g) when a real-time image is input to the pre-processing module 100, transmitting, by the pre-processing module 100, the real-time image to each of the first to n-th unit action deep learning models, and extracting, by each of the first to n-th unit action deep learning models, a feature vector from the transmitted real-time image and transmitting it to the data processing module 300;
(h) embedding, by the data processing module 300, each of the feature vectors transmitted in step (g), substituting each of the embedded feature vectors into the first to n-th cost functions to calculate first to n-th cost function values, and transmitting each of the calculated first to n-th cost function values to the remodeling module 500; and
(i) calculating, by the remodeling module 500, a robot control value for each model by a preset method using the first to n-th cost function values transmitted from the data processing module 300, and selecting the one of the first to n-th unit action deep learning models that corresponds to the largest of the robot control values.

The control method according to claim 4, wherein step (i) comprises:
(i1) when all of the first to n-th cost function values are greater than or equal to a preset value, calculating, by the remodeling module 500, the robot control value of the first unit action deep learning model by adding 1 to the value obtained by subtracting the first cost function value from 1, and then multiplying the result by the first cost function value;
(i2) when any one of the first to n-th cost function values is less than the preset value, calculating, by the remodeling module 500, the robot control value of the first unit action deep learning model by multiplying the value obtained by subtracting the first cost function value from 1 by the first cost function value; and
(i3) calculating, by the remodeling module 500, the robot control value of the n-th unit action deep learning model (n is a natural number greater than or equal to 2) by adding the value obtained by subtracting the n-th cost function value from 1 to the value obtained by subtracting the (n-1)-th cost function value from 1, and then multiplying the sum by the n-th cost function value.

The control method according to claim 5, further comprising:
before step (g), when a real-time image is input to the pre-processing module 100, classifying, by the pre-processing module 100, the real-time image into one or more real-time images,
wherein steps (g) to (i) are repeated for each of the classified one or more real-time images, so that one or more of the first to n-th unit action deep learning models are selected; and
after step (i), controlling the robot to perform one or more unit actions corresponding to each of the selected first to n-th unit action deep learning models.

A program including a deep learning model generated by the learning method according to any one of claims 1 to 3, recorded on a storage medium.

A system in which the learning method according to any one of claims 1 to 3 is performed.

A program recorded in a storage medium so that the control method according to any one of claims 4 to 6 is performed.

A robot, on which the control method according to any one of claims 4 to 6 is performed.