KR20200059111A

KR20200059111A - Grasping robot, grasping method and learning method for grasp based on neural network

Info

Publication number: KR20200059111A
Application number: KR1020190015126A
Authority: KR
Inventors: 서일홍; 박영빈; 김병완
Original assignee: 한양대학교 산학협력단
Priority date: 2018-11-20
Filing date: 2019-02-08
Publication date: 2020-05-28
Also published as: KR102228525B1

Abstract

Disclosed are a deep supervised learning method for grasping, and a method and a robot for grasping a target object using the result of deep supervised learning. The grasping method of the grasping robot using a neural network includes the steps of: generating a first image in which a workspace image, obtained by photographing a workspace in which a plurality of objects are arranged, is divided into a target object area and a background area; estimating, using a trained first neural network, an approach position and an approach posture of an end effector with respect to a target object included in the target object area of the first image; and estimating, using a trained second neural network, a grasping position and a grasping posture of the end effector with respect to the target object included in a second image acquired at the position and posture of the end effector close to the target object.

Description

Grasping robot, grasping method, and learning method for grasping based on a neural network {GRASPING ROBOT, GRASPING METHOD AND LEARNING METHOD FOR GRASP BASED ON NEURAL NETWORK}

The present invention relates to a grasping method, a grasp learning method, and a grasping robot that use a neural network, and more particularly to a deep supervised learning method for grasping and to a method and a robot that grasp a target object using the result of deep supervised learning.

Various studies are under way on controlling robot motion using machine learning. Among machine learning methods, some studies control a robot's grasping motion through deep supervised learning. Deep supervised learning is a machine learning method that combines deep learning and supervised learning.

A robot can perform deep supervised learning using an artificial neural network, that is, a neural network. The robot extracts features of the target object to be grasped from an image acquired by a camera, and performs supervised learning using a label, that is, the grasping motion for the target object presented as the correct answer. A convolutional neural network (CNN) may be used to extract the features of the target object from the image.

Related prior art includes, as patent literature, Korean Patent Application Publication No. 2018-0114217 and, as non-patent literature, Moon Sung-pil, Park Young-bin, and Seo Il-hong, "Pre-grasp posture estimation method using a mixture density network and a deep convolutional neural network," Proceedings of the 2016 Summer Conference of the Institute of Electronics and Information Engineers (대한전자공학회).

An object of the present invention is to provide a deep supervised learning method for grasping, and a method and a robot for grasping a target object using the result of deep supervised learning.

According to one embodiment of the present invention for achieving the above object, there is provided a grasping method of a grasping robot using a neural network, the method including: generating a first image in which a workspace image, obtained by photographing a workspace in which a plurality of objects are arranged, is divided into a target object area and a background area; estimating, using a trained first neural network, an approach position and an approach posture of an end effector with respect to a target object included in the target object area of the first image; and estimating, using a trained second neural network, a grasping position and a grasping posture of the end effector with respect to the target object included in a second image acquired at the position and posture of the end effector close to the target object.

According to another embodiment of the present invention for achieving the above object, there is provided a neural network learning method for a grasping robot, the method including: generating a first image in which a workspace image, obtained by photographing a workspace in which a plurality of objects are arranged, is divided into a training object area and a background area; learning, using a first neural network, an approach position and an approach posture of an end effector with respect to a training object included in the training object area of the first image; and learning, using a second neural network, a grasping position and a grasping posture of the end effector with respect to the training object included in a second image acquired at the position and posture of the end effector close to the training object.

According to yet another embodiment of the present invention for achieving the above object, there is provided a grasping robot using a neural network, the robot including: a first camera that photographs a workspace in which a plurality of objects are arranged to generate a workspace image; an image processing unit that generates a first image in which the workspace image is divided into a target object area and a background area; an approach motion control unit that estimates, using a trained first neural network, an approach position and an approach posture of an end effector with respect to a target object included in the target object area of the first image; a second camera that photographs the target object at the position and posture of the end effector close to the target object to generate a second image; and a grasping motion control unit that estimates, using a trained second neural network, a grasping position and a grasping posture of the end effector with respect to the target object included in the second image.

According to the present invention, unnecessary information other than the training object can be removed from the images used for learning, so the learning efficiency for grasping can be improved.

Also, according to the present invention, learning efficiency can be improved by learning the approach motion and the grasping motion for a training object with separate neural networks, and by learning the grasping motion from an image of the training object captured close to the training object.

FIGS. 1 and 2 are diagrams for explaining the concept of a neural network learning method for a grasping robot according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating a neural network learning method for a grasping robot according to an embodiment of the present invention.
FIG. 4 is a diagram showing an example of the second image.
FIG. 5 is a flowchart illustrating a grasping method of a grasping robot using a neural network according to an embodiment of the present invention.
FIG. 6 is a diagram for explaining a neural network according to an embodiment of the present invention.
FIG. 7 is a diagram for explaining a grasping robot using a neural network according to an embodiment of the present invention.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail in the detailed description. This is not intended to limit the present invention to specific embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes that fall within its spirit and technical scope. In describing the drawings, similar reference numerals are used for similar components.

Hereinafter, embodiments according to the present invention are described in detail with reference to the accompanying drawings.

FIGS. 1 and 2 are diagrams for explaining the concept of a neural network learning method for a grasping robot according to an embodiment of the present invention.

In the learning phase of the grasping motion, the grasping robot uses training data. The training data includes grasping motion data given for each of various training objects. A grasping motion is performed by the end effector of the grasping robot, for example a robot hand, and is a concept that includes the position and posture of the end effector for grasping.

That is, based on the training data, the grasping robot learns the end effector position and posture for grasping that are given in advance for each of the various training objects.

Thereafter, in the phase of performing the grasping motion, the grasping robot estimates a grasping motion for the target object using the learning result, that is, the learned data. If the target object is similar to a particular training object used for learning, the grasping robot can estimate, as the grasping motion for the target object, a grasping motion similar to the one for that training object.

As described above, the grasping robot can learn grasping motions for various training objects using a neural network, and can estimate a grasping motion for a target object using the trained neural network. Images of the objects are used to recognize the training objects and target objects, and a convolutional neural network (CNN) may be used to extract feature values of the training objects and target objects from these images.

Because various objects may be present in the workspace for grasping, the training data image obtained by photographing the workspace may contain objects other than the training object, the black tape 110, as shown in FIG. 1. In this case the training data image contains a large amount of unnecessary information other than the training object, which can reduce learning efficiency.

Accordingly, as shown in FIG. 2, the present invention performs learning using an image focused on the training object. The image of FIG. 2 is obtained from the workspace image of FIG. 1; compared with the workspace image, everything except the training object area containing the black tape 110, which is the training object, is treated as background area.

In particular, to use an image in which the training object stands out more, the present invention may set the pixel values of the background area to a single preset pixel value, which may be 0, for example, as shown in FIG. 2. The pixel values of the training object area, in contrast, correspond to the pixel values of the training object in the workspace image.
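The background masking described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `make_first_image`, the nested-list image representation, and the boolean `object_mask` (assumed to come from some earlier segmentation step) are all hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]          # assumed RGB triple
Image = List[List[Pixel]]             # assumed row-major nested lists
Mask = List[List[bool]]               # True where the training object is


def make_first_image(workspace: Image, object_mask: Mask,
                     background_value: Pixel = (0, 0, 0)) -> Image:
    """Keep object pixels unchanged and overwrite every other pixel
    with one preset background value (0 in the FIG. 2 example)."""
    if len(workspace) != len(object_mask):
        raise ValueError("image and mask must have the same height")
    first_image: Image = []
    for image_row, mask_row in zip(workspace, object_mask):
        if len(image_row) != len(mask_row):
            raise ValueError("image and mask must have the same width")
        first_image.append([
            pixel if is_object else background_value
            for pixel, is_object in zip(image_row, mask_row)
        ])
    return first_image
```

The input image is left untouched, so the same workspace image could be masked again for a different object.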

As a result, according to the present invention, unnecessary information other than the training object can be removed from the images used for learning, so the learning efficiency for grasping can be improved.

FIG. 3 is a flowchart illustrating a neural network learning method for a grasping robot according to an embodiment of the present invention, and FIG. 4 is a diagram showing an example of the second image.

The learning method according to the present invention may be performed by a computing device that includes a processor, by a robot, or the like. In the following, a learning method performed by a learning device, which is a computing device, is described as one embodiment.

The learning device according to the present invention generates a first image in which a workspace image, obtained by photographing a workspace in which a plurality of objects are arranged, is divided into a training object area and a background area (S310). The training object area is the area that contains a predesignated training object among the various objects in the workspace image. For example, the workspace image may be as in FIG. 1, and the first image, which is focused on the training object, may be as in FIG. 2.

The learning device then uses a first neural network to learn an approach position and an approach posture of the end effector with respect to the training object included in the training object area of the first image (S320). Here, the approach position and approach posture of the end effector may correspond to joint angles of the grasping robot, or to control values for its actuators, for approaching the training object.

The learning device then uses a second neural network to learn a grasping position and a grasping posture of the end effector with respect to the training object included in a second image acquired at the position and posture of the end effector close to the training object (S330). Likewise, the grasping position and grasping posture of the end effector may correspond to joint angles of the grasping robot, or to control values for its actuators, for grasping the training object.

Thus, the learning method according to the present invention learns from an image focused on the training object, such as the first image, but does not learn the grasping motion directly from the first image. Instead, in step S320 it learns, as a preliminary motion for grasping, the motion of bringing the end effector close to the training object.

Because the workspace image captures the entire workspace, the training object in the first image is small relative to the whole first image. Feature values of the training object are therefore difficult to extract accurately, so the learning device according to the present invention first uses the first image to learn the motion of bringing the end effector close to the training object.

The second image, acquired afterward with the camera-equipped end effector close to the training object, contains the training object at a larger size than in the first image, as shown in FIG. 4. In the second image of FIG. 4, the gripper below the black tape 110, which is the training object, is the end effector of the grasping robot.

The second image is acquired by a camera located lower than the camera that acquires the workspace image. Feature values of the training object can therefore be extracted more accurately than from the first image, so in step S330 the grasping position and grasping posture of the end effector are learned using the second image.

In step S330, the learning device may estimate the grasping position and grasping posture of the end effector based on the approach position and approach posture of the end effector. That is, the learning device learns the grasping position and grasping posture of the end effector given both the training object in the second image and the approach position and approach posture of the end effector.

Thus, according to the present invention, learning efficiency can be improved by learning the approach motion and the grasping motion for a training object with separate neural networks, and by learning the grasping motion from an image of the training object captured close to the training object.

To generate the first image in step S310, the learning device may recognize objects in the workspace image and then divide the workspace image into a background area and a training object area, in which a preset training object among the recognized objects is located. In one embodiment, the learning device may generate the first image based on a semantic segmentation algorithm, which divides an image into semantic regions on a per-object basis.

Also in step S310, the learning device uses the pixel values of the region of the workspace image that corresponds to the training object area, unchanged, as the pixel values of the training object area of the first image. The background area, however, may be rendered in black, for example, so that the training object area stands out.

Depending on the embodiment, the pixel values of the background area may vary adaptively with the pixel values of the training object area. For example, if the average pixel value of the training object area is at or below a threshold, so that the training object area is dark, the pixel values of the background area are chosen to make the background relatively bright, which makes the training object area stand out more in the first image.
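A minimal sketch of this adaptive choice follows. The patent states only the rule (a dark object gets a relatively bright background); the function name `choose_background_value`, the threshold of 128, the candidate values 0 and 255, and the use of the plain channel mean as brightness are assumptions made for illustration.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # assumed 8-bit RGB triple


def choose_background_value(object_pixels: Iterable[Pixel],
                            threshold: float = 128.0,
                            dark_value: int = 0,
                            bright_value: int = 255) -> int:
    """Return a bright background for a dark object, otherwise a dark one.

    Brightness is taken as the mean over all channels of all object
    pixels, which is an assumed simplification.
    """
    total = 0
    count = 0
    for pixel in object_pixels:
        total += sum(pixel)
        count += len(pixel)
    if count == 0:
        raise ValueError("object region contains no pixels")
    mean_value = total / count
    # At or below the threshold the object counts as dark.
    return bright_value if mean_value <= threshold else dark_value
```

For the black tape of FIG. 1, a rule like this would select the bright value rather than the default black background.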

FIG. 5 is a flowchart illustrating a grasping method of a grasping robot using a neural network according to an embodiment of the present invention.

The grasping robot according to the present invention generates a first image in which a workspace image, obtained by photographing a workspace in which a plurality of objects are arranged, is divided into a target object area and a background area (S510). The grasping robot may generate the first image in the same way as in step S310.

That is, the grasping robot may recognize objects in the workspace image and generate the first image by dividing the workspace image into a background area and a target object area, in which a preset target object among the recognized objects is located. The pixel values of the target object area in the first image correspond to the pixel values of the target object in the workspace image, and the pixel values of the background area in the first image may be a single preset pixel value.

The grasping robot then uses the first neural network trained in step S320 to estimate an approach position and an approach posture of the end effector with respect to the target object included in the target object area of the first image. That is, the grasping robot estimates the joint angles of the robot, or the control values for its actuators, that bring the end effector close to the target object.

The grasping robot then uses the second neural network trained in step S330 to estimate a grasping position and a grasping posture of the end effector with respect to the target object included in a second image acquired at the position and posture of the end effector close to the target object. That is, the grasping robot estimates the joint angles of the robot, or the control values for its actuators, that allow the end effector to grasp the target object.

Here, the grasping robot may estimate the grasping position and grasping posture of the end effector based on the target object in the second image and on the approach position and approach posture of the end effector.
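The two-stage flow of FIG. 5 can be sketched as plain control flow. Everything named here is hypothetical: the `GraspPipeline` class, its callable fields, and the representation of a pose as a list of joint angles are assumptions, and the two networks are passed in as opaque functions rather than implemented.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

JointAngles = List[float]  # assumed pose representation
Image = Any                # image type left open on purpose


@dataclass
class GraspPipeline:
    capture_workspace: Callable[[], Image]               # overhead camera
    segment: Callable[[Image], Image]                    # builds the first image
    approach_net: Callable[[Image], JointAngles]         # trained first network
    capture_close: Callable[[], Image]                   # camera near the end effector
    grasp_net: Callable[[Image, JointAngles], JointAngles]  # trained second network
    move_to: Callable[[JointAngles], None]               # actuator command

    def run(self) -> JointAngles:
        workspace_image = self.capture_workspace()
        first_image = self.segment(workspace_image)      # S510
        approach_pose = self.approach_net(first_image)   # approach estimate
        self.move_to(approach_pose)                      # move close to the target
        second_image = self.capture_close()              # closer view of the target
        # The grasp estimate uses both the second image and the approach pose.
        grasp_pose = self.grasp_net(second_image, approach_pose)
        self.move_to(grasp_pose)
        return grasp_pose
```

Passing the cameras, networks, and actuator in as callables keeps the sketch independent of any particular robot or learning framework, and lets the flow be exercised with stubs.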

FIG. 6 is a diagram for explaining a neural network according to an embodiment of the present invention.

Referring to FIG. 6, the first and second images are input to a CNN, which outputs feature values (feature points) of the training object or target object. The RGB image in FIG. 6 represents the first and second images.

A vector value (Target Object) identifying the training object or target object may be input to the neural network together with the image.

As labels for each training object and target object, the approach position and posture of the end effector and the grasping position and posture of the end effector may be given to the neural network.

In particular, the approach position and approach posture of the end effector (Current Joint Angles) may be additionally input to the second neural network, which learns and estimates the grasping position and posture for the training object and target object.

The neural network according to the present invention may be a neural network based on a mixture density network. Because a single target object can be grasped in several different ways, the mixture density network makes it possible to estimate a grasping motion appropriate to how the target object is placed.
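To illustrate why a mixture density output suits a task with several valid grasps, the sketch below evaluates a Gaussian mixture over pose vectors. It is a generic textbook formulation, not the network described in the patent: the isotropic Gaussian components, the function names, and the choice of returning the mean of the highest-weight component are all assumptions.

```python
import math
from typing import List, Sequence


def mixture_nll(weights: Sequence[float], means: Sequence[Sequence[float]],
                sigmas: Sequence[float], target: Sequence[float]) -> float:
    """Negative log-likelihood of `target` under an isotropic Gaussian
    mixture; a loss of this form is commonly used to train such networks."""
    dim = len(target)
    log_terms: List[float] = []
    for weight, mean, sigma in zip(weights, means, sigmas):
        squared_dist = sum((t - m) ** 2 for t, m in zip(target, mean))
        log_norm = -0.5 * dim * math.log(2.0 * math.pi * sigma * sigma)
        log_terms.append(math.log(weight) + log_norm
                         - squared_dist / (2.0 * sigma * sigma))
    # Log-sum-exp for numerical stability.
    largest = max(log_terms)
    return -(largest + math.log(sum(math.exp(t - largest) for t in log_terms)))


def most_likely_grasp(weights: Sequence[float],
                      means: Sequence[Sequence[float]]) -> Sequence[float]:
    """Assumed heuristic: pick the mean of the highest-weight component."""
    best = max(range(len(weights)), key=lambda k: weights[k])
    return means[best]
```

Each mixture component can represent one distinct way of grasping the object, so the prediction does not have to average incompatible grasps into a single, possibly infeasible pose.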

FIG. 7 is a diagram for explaining a grasping robot using a neural network according to an embodiment of the present invention.

Referring to FIG. 7, the grasping robot according to the present invention includes a first camera 710, an image processing unit 720, an approach motion control unit 730, a second camera 740, and a grasping motion control unit 750.

The first camera 710 photographs a workspace in which a plurality of objects are arranged to generate a workspace image. The first camera 710 is positioned higher than the second camera 740, described below, so that the workspace image covers the entire workspace; for example, the workspace image may be as in FIG. 1.

The image processing unit 720 generates a first image in which the workspace image is divided into a target object area and a background area. For example, the first image may be as in FIG. 2, and the image processing unit 720 may generate the first image using a semantic segmentation algorithm.

The approach motion control unit 730 uses the trained first neural network to estimate an approach position and an approach posture of the end effector with respect to the target object included in the target object area of the first image.

The second camera 740 photographs the target object at the position and posture of the end effector close to the target object to generate a second image. For example, the second camera 740 may be located at the wrist of the end effector.

The grasping motion control unit 750 uses the trained second neural network to estimate a grasping position and a grasping posture of the end effector with respect to the target object included in the second image.

The actuator 760 of the grasping robot may be, for example, a motor that adjusts a joint angle of the grasping robot. By driving according to the approach position and approach posture of the end effector estimated by the approach motion control unit 730, it positions the end effector of the grasping robot near the target object.

The actuator 760 also drives according to the grasping position and grasping posture of the end effector estimated by the grasping motion control unit 750, so that the end effector can grasp the target object.

The technical content described above may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiments, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code, such as that produced by a compiler, but also high-level language code that a computer can execute using an interpreter or the like. A hardware device may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

As described above, the present invention has been described with reference to specific matters such as particular components, and to limited embodiments and drawings. These are provided only to aid a more general understanding of the present invention, and the present invention is not limited to the above embodiments. A person having ordinary knowledge in the field to which the present invention pertains can make various modifications and variations from this description. Accordingly, the spirit of the present invention should not be limited to the described embodiments, and not only the claims set out below but also everything that is equal or equivalent to those claims falls within the scope of the spirit of the present invention.

Claims

A gripping method of a gripping robot using a neural network, the method comprising:
generating a first image in which a workspace image, obtained by photographing a workspace in which a plurality of objects are disposed, is divided into a target object area and a background area;
estimating, using a trained first neural network, a proximity position and a proximity posture of an end effector with respect to a target object included in the target object area of the first image; and
estimating, using a trained second neural network, a gripping position and a gripping posture of the end effector with respect to the target object included in a second image obtained at the position and posture of the end effector in proximity to the target object.

The method of claim 1, wherein the generating of the first image comprises:
recognizing objects in the workspace image; and
generating the first image by dividing the workspace image into the target object area, in which a preset target object among the recognized objects is located, and the background area.

The method of claim 1, wherein a pixel value of the target object area included in the first image corresponds to the pixel value of the target object in the workspace image, and
a pixel value of the background area included in the first image is a single preset pixel value.

The method of claim 3, wherein the pixel value of the background area included in the first image is 0.
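
By way of illustration only, the first image recited above (target object pixels kept, background pixels set to one preset value such as 0) could be produced along the following lines. This is a hedged sketch rather than the claimed method itself: the function name `make_first_image` and its parameters are hypothetical, and the binary `target_mask` is assumed to come from an object recognition step that is not shown here.

```python
import numpy as np


def make_first_image(
    workspace_image: np.ndarray,
    target_mask: np.ndarray,
    background_value: int = 0,
) -> np.ndarray:
    """Keep workspace pixels inside the target object area and set every
    other pixel to a single preset background value (0 by default)."""
    if workspace_image.shape[:2] != target_mask.shape:
        raise ValueError("target_mask must match the image height and width")

    # Start from an image filled entirely with the preset background value.
    first_image = np.full_like(workspace_image, background_value)

    # Copy the original pixel values only where the mask marks the target.
    keep = target_mask.astype(bool)
    first_image[keep] = workspace_image[keep]
    return first_image
```

The function works for both grayscale (H x W) and color (H x W x C) arrays and leaves the input workspace image unmodified.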

The method of claim 1, wherein the estimating of the gripping position and the gripping posture of the end effector comprises estimating the gripping position and the gripping posture of the end effector based on the proximity position and the proximity posture of the end effector.

A neural network learning method for a gripping robot, the method comprising:
generating a first image in which a workspace image, obtained by photographing a workspace in which a plurality of objects are disposed, is divided into a training object area and a background area;
learning, using a first neural network, a proximity position and a proximity posture of an end effector with respect to a training object included in the training object area of the first image; and
learning, using a second neural network, a gripping position and a gripping posture of the end effector with respect to the training object included in a second image obtained at the position and posture of the end effector in proximity to the training object.

The method of claim 6, wherein the generating of the first image comprises:
recognizing objects in the workspace image; and
generating the first image by dividing the workspace image into the training object area, in which a preset training object among the recognized objects is located, and the background area.

The method of claim 6, wherein a pixel value of the training object area included in the first image corresponds to the pixel value of the training object in the workspace image, and
a pixel value of the background area included in the first image is a single preset pixel value.

The method of claim 6, wherein the learning of the gripping position and the gripping posture of the end effector comprises learning the gripping position and the gripping posture of the end effector with respect to the training object included in the second image and to the proximity position and the proximity posture of the end effector.

A gripping robot using a neural network, comprising:
a first camera that photographs a workspace in which a plurality of objects are disposed to generate a workspace image;
an image processing unit that generates a first image in which the workspace image is divided into a target object area and a background area;
a proximity operation control unit that estimates, using a trained first neural network, a proximity position and a proximity posture of an end effector with respect to a target object included in the target object area of the first image;
a second camera that photographs the target object at the position and posture of the end effector in proximity to the target object to generate a second image; and
a gripping motion control unit that estimates, using a trained second neural network, a gripping position and a gripping posture of the end effector with respect to the target object included in the second image.