KR102657338B1

KR102657338B1 - Learning data collection system and method

Info

Publication number: KR102657338B1
Application number: KR1020210134833A
Authority: KR
Inventors: 박순용
Original assignee: 네이버랩스 주식회사
Priority date: 2021-10-12
Filing date: 2021-10-12
Publication date: 2024-04-12
Also published as: KR20230051895A

Abstract

To solve the problems described above, a learning data collection method according to the present invention may include: acquiring, through a camera, a reference image of a target object and a marker located around the target object; detecting the marker in the image and specifying at least one marker reference point of the marker included in the image; specifying, based on the marker reference point, a marker coordinate system of the marker with respect to the camera coordinate system; projecting onto the image a grid area aligned with respect to the origin of the marker coordinate system; specifying the target object in the grid area based on movement of the grid area; and collecting learning data for the target object included in the image using the result of specifying the target object in the grid area.

Description

Learning data collection system and method {LEARNING DATA COLLECTION SYSTEM AND METHOD}

The present invention relates to a system for collecting learning data to be learned by artificial intelligence, and to a learning data collection method using the same.

The dictionary definition of artificial intelligence is a technology that implements human learning, reasoning, perception, and natural-language understanding abilities in computer programs. Artificial intelligence has advanced rapidly thanks to deep learning, which adds neural networks modeled on the human brain to machine learning.

Deep learning is a technology that enables computers to make judgments and learn as humans do, and thereby to cluster or classify objects or data. Recently it has become possible to analyze not only text data but also image data, and deep learning is now actively used in a wide variety of industrial fields.

For example, in various industrial fields such as robotics, autonomous driving, and medicine, deep-learning-based learning networks (hereinafter referred to as “deep learning networks”) are trained on learning target data and yield meaningful learning results that are put to use in each field.

As an example, in the field of robotics, for a robot to understand the task it performs, it must be able to accurately assess the situation around it or the work objects placed around it. To this end, deep-learning-based image recognition technology (for example, robot vision technology) is actively used.

Meanwhile, in artificial intelligence fields such as machine learning as well as deep learning, training on a larger amount of data increases accuracy and makes it possible to obtain higher-quality results. Collecting the data to be learned is therefore essential in the field of artificial intelligence.

In particular, a deep learning network or machine learning network based on image data can estimate the position or posture of an object appearing in the image data. For such estimation, degree-of-freedom information (position information and posture information) of the object must be secured as learning data together with the image data.

Conventionally, collecting image data and the corresponding degree-of-freedom information as learning data required manual work: the area containing the target object had to be specified in each image one by one, and the coordinates of the specified area had to be extracted. Securing learning data therefore required an enormous amount of labor.

Accordingly, there is a pressing need to improve methods for collecting learning data that includes degree-of-freedom information.

The present invention relates to a learning data collection system and method for collecting learning data for training an artificial intelligence network.

More specifically, the present invention relates to a learning data collection system and method for collecting learning data that includes degree-of-freedom information.

Furthermore, the present invention relates to a learning data collection system and method for collecting learning data for target objects in various postures.

Furthermore, the present invention relates to a learning data collection system and method that can minimize the time and labor required to collect learning data.

Furthermore, a learning data collection system according to the present invention may include a camera that captures a reference image of a target object and a marker located around the target object, a storage unit that stores the image, and a control unit that detects the marker in the image, specifies at least one marker reference point of the marker included in the image, and specifies, based on the marker reference point, a marker coordinate system of the marker with respect to the camera coordinate system.

Furthermore, the control unit may project onto the image a grid area aligned with respect to the origin of the marker coordinate system, specify the target object in the grid area based on movement of the grid area, and collect learning data for the target object included in the image using the result of specifying the target object in the grid area.

Furthermore, a program according to the present invention, which is executed by one or more processes in an electronic device and stored in a computer-readable recording medium, may include instructions for performing the steps of: acquiring, through a camera, a reference image of a target object and a marker located around the target object; detecting the marker in the image and specifying at least one marker reference point of the marker included in the image; specifying, based on the marker reference point, a marker coordinate system of the marker with respect to the camera coordinate system; projecting onto the image a grid area aligned with respect to the origin of the marker coordinate system; specifying the target object in the grid area based on movement of the grid area; and collecting learning data for the target object included in the image using the result of specifying the target object in the grid area.

As described above, the learning data collection system and method according to the present invention can secure learning data for a target object by using a grid area, defined with respect to a marker, in an image in which the marker is photographed together with the target object. In the present invention, horizontal rotation and vertical movement of the grid area make it possible to define a three-dimensional box area surrounding the target object and to secure learning data for the target object. Therefore, according to the present invention, there is no need to select the learning data extraction points of the target object in the image one by one, which greatly reduces the labor and time required to secure learning data.
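As a rough illustration of projecting a marker-aligned grid onto the image, the sketch below places grid points on the marker plane, moves them into the camera coordinate system, and projects them with a pinhole camera model. The pinhole model, the intrinsic parameters `fx`, `fy`, `cx`, `cy`, the 4x4 matrix `T_cam_marker`, and all function names are assumptions made for this example; the source does not specify a camera model or an implementation.

```python
def transform_point(T, p):
    """Apply a 4x4 rigid transform T (nested lists) to a 3D point p."""
    return tuple(sum(T[i][k] * p[k] for k in range(3)) + T[i][3] for i in range(3))


def project_point(p_cam, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point to pixel coordinates (assumed model)."""
    x, y, z = p_cam
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (fx * x / z + cx, fy * y / z + cy)


def project_grid(T_cam_marker, n_cells, cell_size, fx, fy, cx, cy):
    """Project the corner points of an n_cells x n_cells grid lying on the
    marker plane (z = 0 in the marker frame), starting at the marker origin."""
    pixels = []
    for i in range(n_cells + 1):
        for j in range(n_cells + 1):
            p_marker = (i * cell_size, j * cell_size, 0.0)
            p_cam = transform_point(T_cam_marker, p_marker)
            pixels.append(project_point(p_cam, fx, fy, cx, cy))
    return pixels


# Hypothetical setup: marker 1 m in front of the camera, axes aligned with the camera.
T_cam_marker = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]
grid_pixels = project_grid(T_cam_marker, n_cells=1, cell_size=0.1,
                           fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

In this setup the grid point at the marker origin lands on the principal point (320, 240), and each 0.1 m step along the marker axes shifts the projection by 50 pixels.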

FIG. 1 is a conceptual diagram illustrating an example in which learning data collected according to the present invention is utilized.
FIG. 2 is a conceptual diagram illustrating the learning data collection system according to the present invention.
FIG. 3 is a flowchart illustrating the learning data collection method according to the present invention.
FIGS. 4, 5a, 5b, 6, 7, 8, 9, 10, and 11 are conceptual diagrams illustrating a method of collecting learning data.
FIG. 12 is a conceptual diagram illustrating collected learning data.

Hereinafter, embodiments disclosed in this specification are described in detail with reference to the accompanying drawings. Identical or similar components are given the same reference numbers regardless of the drawing, and redundant descriptions of them are omitted. The suffixes “module” and “unit” used for components in the following description are assigned or used interchangeably only for ease of drafting the specification and do not in themselves have distinct meanings or roles. In describing the embodiments disclosed in this specification, detailed descriptions of related known technologies are omitted where they might obscure the gist of the embodiments. The accompanying drawings are intended only to aid understanding of the embodiments disclosed in this specification; they do not limit the technical idea disclosed herein, which should be understood to include all changes, equivalents, and substitutes falling within the spirit and technical scope of the present invention.

Terms containing ordinal numbers, such as first and second, may be used to describe various components, but the components are not limited by these terms. The terms are used only to distinguish one component from another.

When a component is said to be "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but it should be understood that other components may exist in between. In contrast, when a component is said to be "directly connected" or "directly coupled" to another component, it should be understood that no other components exist in between.

Singular expressions include plural expressions unless the context clearly indicates otherwise.

In this application, terms such as "comprise" or "have" are intended to indicate the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

The present invention relates to a learning data collection system and method for collecting learning data for training an artificial intelligence network, and in particular to a learning data collection method and system that can automatically collect learning data including degree-of-freedom information (or degree-of-freedom posture).

As discussed above, advances in artificial intelligence have led to image recognition technology being used in various industrial fields. In robotics in particular, artificial-intelligence-based image recognition technology (for example, deep-learning-based image recognition technology) is used to analyze and understand the work environment in which a robot operates, and the robot performs its target task on that basis.

For example, as shown in FIG. 1, when a robot (R) is given a specific task (or mission) (for example, serving), the robot (R) or a camera (not shown) placed around the robot (R) may capture an image of the working environment of the robot (R). Based on the captured image, the control unit of the robot (R) may then determine how the robot (R) should operate to perform the specific task and control the robot (R) to operate according to that determination.

In this case, the control unit of the robot (R) must recognize, in the captured image, the object (A) that is the target of the task (for example, a plate and a cup (a1, a2)), analyze the position and posture (or pose) of the object (A), and control the robot (R) so that it can perform the target task on the object.

To this end, the control unit of the robot (R) must collect various information from the captured image. For example, the robot (R) can be controlled accurately using a plurality of pieces of the following information: i) the type of the object to be worked on; ii) the size of the object; iii) the shape of the object; iv) the position of the object (for example, where on the table the plate and cup (a1, a2) are placed, as shown in FIG. 1); v) the posture of the object (for example, the posture in which the plate and cup (a1, a2) rest on the table, such as whether they are tilted, as shown in FIG. 1); and vi) the posture of the camera photographing the object.

Furthermore, the need to secure various information about objects applies not only to the robot tasks described above but also to robot driving. In an environment where humans and robots coexist, a robot must accurately understand information about its surroundings in order to drive safely, because the driving and motion of the robot can be controlled appropriately based on various information about the objects located around it.

Furthermore, the need to secure various information about objects also applies to autonomous driving technology. For autonomous driving, the device performing it (for example, a car) must accurately recognize its surroundings, and responding appropriately to the various objects located around the device requires information on the various postures of those objects.

Meanwhile, the position and posture of an object, or of the camera photographing the object, may also be expressed as “degrees of freedom,” “degree-of-freedom posture,” or “degree-of-freedom information.” For convenience of explanation, this specification uses the single term “degree-of-freedom information.”

Degree-of-freedom information can be understood as a concept that includes position information and posture information. It may include position information (or three-dimensional position information) corresponding to a three-dimensional position (x, y, z) and posture information (or three-dimensional posture information) corresponding to a three-dimensional posture (roll r, pitch θ, and yaw).
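The six values described above can be held in a small data structure and converted to a 4x4 homogeneous transform. The sketch below is illustrative only: the class name `Pose6DoF`, the use of radians, and the rotation order R = Rz(yaw) · Ry(pitch) · Rx(roll) are assumptions, since the source does not state an angle convention.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose6DoF:
    """Position (x, y, z) plus posture (roll, pitch, yaw), angles in radians."""
    x: float
    y: float
    z: float
    roll: float
    pitch: float
    yaw: float

    def to_matrix(self):
        """Return the 4x4 homogeneous transform as nested lists.

        Assumed convention: R = Rz(yaw) * Ry(pitch) * Rx(roll).
        """
        cr, sr = math.cos(self.roll), math.sin(self.roll)
        cp, sp = math.cos(self.pitch), math.sin(self.pitch)
        cy, sy = math.cos(self.yaw), math.sin(self.yaw)
        return [
            [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr, self.x],
            [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr, self.y],
            [-sp, cp * sr, cp * cr, self.z],
            [0.0, 0.0, 0.0, 1.0],
        ]


# Hypothetical example: an object 1 m ahead, turned 90 degrees about the vertical axis.
pose = Pose6DoF(x=1.0, y=2.0, z=3.0, roll=0.0, pitch=0.0, yaw=math.pi / 2)
T = pose.to_matrix()
```

Storing the pose as a matrix makes the coordinate-system relationships discussed later (inversion and chaining) simple matrix operations.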

As discussed above, for a robot (R) or other device to work accurately on a target object, it is very important to determine the degree-of-freedom information.

For example, to grasp the target objects (a1, a2), the control unit of the robot (R) must determine at what angle to control the robot arms (R1, R2) and in what posture to grip, and this is determined based on at least one of the posture and position of the target object (or of the camera photographing the object).

If the degree-of-freedom information of the object (or of the camera that photographed it) can be determined merely from recognizing the target object (for example, a1, a2) in the captured image, then both the accuracy and the efficiency of the task can be secured.

To this end, an image of an object having a specific shape (or specific posture), obtained from a captured image, can be matched with the posture information of the object and used as learning data.

Meanwhile, the posture information of an object may include at least one of: i) object-based degree-of-freedom information, indicating what position or posture the object has with respect to the object's reference coordinate system when the object has a specific shape; ii) camera-based degree-of-freedom information, indicating what position or posture the camera that photographed the object has with respect to the camera's reference coordinate system when the object has the specific shape; and iii) pixel coordinates of at least one point in the area corresponding to the object in the image in which the object was photographed.

As described above, the degree-of-freedom information described in the present invention may include at least one of object-based degree-of-freedom information and camera-based degree-of-freedom information.

Here, the degree-of-freedom information of the object may be expressed as at least one of the object's degree-of-freedom information with respect to the object's own reference coordinate system and the degree-of-freedom information of the camera that photographed (or views) the object. The camera's degree-of-freedom information may be degree-of-freedom information with respect to the camera's own reference coordinate system.

This is because the object-based reference coordinate system and the camera-based reference coordinate system have a relative positional relationship with each other.

For example, applying an inverse transformation to the object's degree-of-freedom information with respect to the object-based reference coordinate system yields the camera's degree-of-freedom information with respect to the object-based reference coordinate system.

Conversely, applying an inverse transformation to the camera's degree-of-freedom information with respect to the camera's reference coordinate system yields the object's degree-of-freedom information with respect to the camera's reference coordinate system.
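One common way to realize such an inverse transformation is to represent the relationship between two coordinate systems as a 4x4 rigid transform and invert it. The sketch below assumes that representation; the function name `invert_rigid` and the matrix names `T_cam_obj` (pose of the object in the camera frame) and `T_obj_cam` (pose of the camera in the object frame) are illustrative and do not come from the source.

```python
def invert_rigid(T):
    """Invert a 4x4 rigid transform [R | t; 0 0 0 1] given as nested lists.

    Uses the closed form inverse: rotation R^T, translation -R^T * t.
    """
    R = [row[:3] for row in T[:3]]
    t = [row[3] for row in T[:3]]
    R_t = [[R[j][i] for j in range(3)] for i in range(3)]  # transpose of R
    t_inv = [-sum(R_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [R_t[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


# Hypothetical pose of the object in the camera frame:
# rotated 90 degrees about the z-axis and offset by (1, 2, 3).
T_cam_obj = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Pose of the camera expressed in the object frame.
T_obj_cam = invert_rigid(T_cam_obj)
```

Multiplying `T_cam_obj` by `T_obj_cam` gives the identity matrix, which is a quick sanity check that the two poses describe the same relative relationship from opposite viewpoints.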

Furthermore, when the relative positional relationship between the object's reference coordinate system and the camera's reference coordinate system is defined, the camera's degree-of-freedom information with respect to the camera's reference coordinate system can be obtained from the object's degree-of-freedom information with respect to the object's reference coordinate system.

Conversely, the object's degree-of-freedom information with respect to the object's reference coordinate system can of course be obtained from the camera's degree-of-freedom information with respect to the camera's reference coordinate system.

For example, applying the relative positional relationship between the object's reference coordinate system and the camera's reference coordinate system to the object's degree-of-freedom information yields the camera's degree-of-freedom information. Conversely, applying the relative positional relationship between the camera's reference coordinate system and the object's reference coordinate system to the camera's degree-of-freedom information yields the object's degree-of-freedom information.

Here, the relative positional relationship may mean the degree to which one reference coordinate system is rotated and translated with respect to the other reference coordinate system.

Furthermore, in the present invention, as shown in FIG. 4, the object 420 may be photographed together with a marker 430 in order to collect degree-of-freedom information about the object. In this case, the marker 430 may have its own reference coordinate system, and the degree-of-freedom information of the marker 430 can be obtained, in the same manner as the relationships discussed above, by applying the relative positional relationship between the camera's reference coordinate system and the reference coordinate system of the marker 430.

Furthermore, the degree-of-freedom information of the object can of course be obtained by applying the relative positional relationship between the reference coordinate system of the marker 430 and the reference coordinate system of the object.

That is, in the present invention, degree-of-freedom information about at least one of the camera, the marker, and the object can be collected based on the mutual relative positional relationships among the camera's reference coordinate system, the marker's reference coordinate system, and the object's reference coordinate system.
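If each relative positional relationship is written as a 4x4 rigid transform, the camera-marker and marker-object relationships can be chained by matrix multiplication to obtain the camera-object relationship. The sketch below shows this under that assumption; the names `compose`, `T_cam_marker`, `T_marker_obj`, and `T_cam_obj`, and the numeric values, are hypothetical.

```python
def compose(A, B):
    """Multiply two 4x4 transforms given as nested lists (A applied after B)."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


# Hypothetical marker pose in the camera frame:
# rotated 90 degrees about the z-axis, 2 m in front of the camera.
T_cam_marker = [
    [0.0, -1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Hypothetical object pose in the marker frame:
# same orientation as the marker, 0.1 m along the marker x-axis.
T_marker_obj = [
    [1.0, 0.0, 0.0, 0.1],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Object pose in the camera frame, obtained by chaining the two relationships.
T_cam_obj = compose(T_cam_marker, T_marker_obj)
```

Here the object ends up at (0, 0.1, 2) in the camera frame: the 0.1 m offset along the marker x-axis appears along the camera y-axis because of the marker's rotation.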

Meanwhile, for a robot (R), an autonomous driving device, or the like to perform tasks accurately, an artificial intelligence algorithm (for example, a deep learning algorithm or deep learning network) trained on a vast amount of learning data is required. Accordingly, the method of collecting learning data in the present invention is described in more detail below with reference to the accompanying drawings.

FIG. 2 is a conceptual diagram illustrating the learning data collection system according to the present invention, and FIG. 3 is a flowchart illustrating the learning data collection method according to the present invention. FIGS. 4, 5a, 5b, 6, 7, 8, 9, 10, and 11 are conceptual diagrams illustrating a method of collecting learning data, and FIG. 12 is a conceptual diagram illustrating collected learning data.

Before describing the present invention, note that the “object” referred to in this specification is not limited in type and may be interpreted as any of a wide variety of things. As shown in (a) and (b) of FIG. 2, an object has a concrete form that can be visually or physically distinguished, and the term may be understood to cover not only things (211, 213, 215, 217, 221) but also people and animals.

As discussed above, achieving higher performance in robots, autonomous vehicles, and the like requires training on as much learning data as possible. Securing learning data is therefore very important, and the present invention proposes a method of securing learning data for an object based on a three-dimensional box (or box area, or bounding box) for the object included in an image. The three-dimensional box may have the shape of a cube or a rectangular parallelepiped, and its size and shape may vary with the size and shape of the object it contains.

As shown in (a) and (b) of FIG. 2, in the present invention, degree-of-freedom information related to at least one point of the three-dimensional boxes (212, 214, 216, 218, 222) can be collected as learning data by specifying, in an image of the objects, the three-dimensional box (212, 214, 216, 218, 222) surrounding each object (211, 213, 215, 217, 221). Here, the information about at least one point of the three-dimensional box may be information about a vertex (also called an apex, angular point, or corner point) where edges of the box meet. That is, the degree-of-freedom information of an object may be degree-of-freedom information about the vertices of the three-dimensional box surrounding the object.

이때, 본 발명에서는, 3차원 박스의 각 꼭지점(정육면체 또는 직육면체의 꼭지점(8개의 꼭지점))에 대한 정보를 대상물에 대한 학습 데이터로서 수집할 수 있다. At this time, in the present invention, information about each vertex of the 3D box (the eight vertices of a cube or rectangular parallelepiped) can be collected as learning data for the object.

이에, 본 발명에 따른 학습 데이터 수집 시스템에서는, 영상에 포함된 대상물을 둘러싼 3차원 박스를 특정하고, 3차원 박스에 포함된 복수의 지점들에 대한 자유도 정보를 획득함으로써, 대상물에 대한 학습 데이터를 수집하는 방법을 제안한다. Accordingly, the learning data collection system according to the present invention proposes a method of collecting learning data for an object by specifying the 3D box surrounding the object included in an image and acquiring degree-of-freedom information for a plurality of points of the 3D box.
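As a minimal sketch of the eight-vertex representation described above, the following Python snippet enumerates the corners of a cuboid. The function name `box_vertices`, the center/size parameterization, and the axis-aligned orientation are assumptions made for illustration; the specification does not prescribe how the box is parameterized.

```python
from itertools import product


def box_vertices(center, size):
    """Return the 8 corner points of an axis-aligned cuboid.

    center: (x, y, z) of the box center (hypothetical parameterization).
    size:   (width, depth, height) of the box.
    """
    cx, cy, cz = center
    hx, hy, hz = (s / 2.0 for s in size)
    # Every combination of -1/+1 along the three axes yields one corner.
    return [
        (cx + sx * hx, cy + sy * hy, cz + sz * hz)
        for sx, sy, sz in product((-1, 1), repeat=3)
    ]


vertices = box_vertices(center=(0.0, 0.0, 0.5), size=(2.0, 1.0, 1.0))
```

Each of the eight tuples is one candidate point whose degree-of-freedom information could be recorded as learning data.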

본 발명에 따른 학습 데이터 수집 시스템(100)은, 통신부(110), 저장부(120) 및 제어부(130) 중 적어도 하나를 포함할 수 있다.The learning data collection system 100 according to the present invention may include at least one of a communication unit 110, a storage unit 120, and a control unit 130.

통신부(110)는 카메라(400)로부터 촬영된 영상을 수신하기 위한 수단으로서, 통신 방법에는 특별한 제한을 두지 않는다.The communication unit 110 is a means for receiving images captured by the camera 400, and there are no particular restrictions on the communication method.

통신부(110)는 유선 또는 무선 통신 중 적어도 하나를 수행하도록 이루어질 수 있다. 통신부(110)는 통신이 가능한 다양한 대상과 통신을 수행하도록 이루어질 수 있다. The communication unit 110 may be configured to perform at least one of wired or wireless communication. The communication unit 110 may be configured to communicate with various objects capable of communication.

한편, 통신부(110)는 적어도 하나의 외부 서버와 통신하도록 이루어질 수 있다. 여기에서, 외부 서버는, 저장부(120)의 적어도 일부의 구성에 해당하는 클라우드 서버 또는 데이터베이스 중 적어도 하나를 포함할 수 있다. 한편, 외부 서버에서는, 제어부(130)의 적어도 일부의 역할을 수행하도록 구성될 수 있다. 즉, 데이터 처리 또는 데이터 연산 등의 수행은 외부 서버에서 이루어지는 것이 가능하며, 본 발명에서는 이러한 방식에 대한 특별한 제한을 두지 않는다.Meanwhile, the communication unit 110 may be configured to communicate with at least one external server. Here, the external server may include at least one of a cloud server or a database corresponding to at least a portion of the storage unit 120. Meanwhile, the external server may be configured to perform at least part of the role of the control unit 130. In other words, data processing or data computation can be performed on an external server, and the present invention does not place any special restrictions on this method.

한편, 통신부(110)는 통신하는 대상의 통신 규격에 따라 다양한 통신 방식을 지원할 수 있다. Meanwhile, the communication unit 110 can support various communication methods depending on the communication standard of the communication target.

예를 들어, 통신부(110)는, WLAN(Wireless LAN), Wi-Fi(Wireless Fidelity), Wi-Fi(Wireless Fidelity) Direct, DLNA(Digital Living Network Alliance), WiBro(Wireless Broadband), WiMAX(Worldwide Interoperability for Microwave Access), HSDPA(High Speed Downlink Packet Access), HSUPA(High Speed Uplink Packet Access), LTE(Long Term Evolution), LTE-A(Long Term Evolution-Advanced), 5G(5th Generation Mobile Telecommunication), 블루투스(Bluetooth™), RFID(Radio Frequency Identification), 적외선 통신(Infrared Data Association; IrDA), UWB(Ultra-Wideband), ZigBee, NFC(Near Field Communication), Wireless USB(Wireless Universal Serial Bus) 기술 중 적어도 하나를 이용하여, 통신을 수행하도록 이루어질 수 있다. For example, the communication unit 110 may be configured to communicate using at least one of WLAN (Wireless LAN), Wi-Fi (Wireless Fidelity), Wi-Fi Direct, DLNA (Digital Living Network Alliance), WiBro (Wireless Broadband), WiMAX (Worldwide Interoperability for Microwave Access), HSDPA (High Speed Downlink Packet Access), HSUPA (High Speed Uplink Packet Access), LTE (Long Term Evolution), LTE-A (Long Term Evolution-Advanced), 5G (5th Generation Mobile Telecommunication), Bluetooth™, RFID (Radio Frequency Identification), infrared communication (Infrared Data Association; IrDA), UWB (Ultra-Wideband), ZigBee, NFC (Near Field Communication), and Wireless USB (Wireless Universal Serial Bus).

한편, 카메라(400)는 영상을 촬영하기 위한 수단으로서, 본 발명에 따른 시스템(100) 내에 포함되거나, 또는 별도로 구비될 수 있다. 본 발명에서 카메라(400)는 “이미지 센서”라고도 명명될 수 있다.Meanwhile, the camera 400 is a means for capturing images and may be included in the system 100 according to the present invention or may be provided separately. In the present invention, the camera 400 may also be referred to as an “image sensor.”

카메라(400)는 정적인 영상 및 동적인 영상 중 적어도 하나를 촬영하도록 이루어질 수 있으며, 단수 또는 복수로 구비될 수 있다.The camera 400 may be configured to capture at least one of static images and dynamic images, and may be provided in singular or plural forms.

카메라(400)는 대상물(또는 피사체, 또는 물체)의 깊이 정보를 획득할 수 있는 3차원 깊이 카메라(3D depth camera) 또는 RGB-깊이 카메라(RGB-depth camera) 등으로 이루어질 수 있다. 카메라(400)가 3차원 깊이 카메라로 이루어진 경우, 촬영된 영상을 이루는 각 픽셀(pixel)의 깊이 값을 알 수 있으며, 이를 통하여 대상물의 깊이 정보가 획득될 수 있다. The camera 400 may be a 3D depth camera, an RGB-depth camera, or the like capable of acquiring depth information of the target object (or subject, or object). When the camera 400 is a 3D depth camera, the depth value of each pixel of the captured image is known, and the depth information of the object can be obtained from it.
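To illustrate how a per-pixel depth value can yield 3D information, the sketch below back-projects one pixel into camera coordinates under an ideal pinhole model. The function `backproject` and the intrinsic values `fx`, `fy`, `cx`, `cy` are hypothetical; the specification does not state which camera model or calibration values are used.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Convert a pixel (u, v) with a known depth into a 3D point in the camera frame.

    Assumes an ideal pinhole camera without lens distortion.
    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# Hypothetical intrinsics and a pixel 100 px to the right of the principal point.
point = backproject(u=420.0, v=240.0, depth=2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```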

이러한 카메라(400)는 도 4에 도시된 것과 같이, 카메라 좌표계{C}를 갖도록 이루어질 수 있다.This camera 400 may be configured to have a camera coordinate system {C}, as shown in FIG. 4.

한편, 본 발명에서는 적어도 3개의 기준 좌표계가 정의될 수 있으며, 제1 기준 좌표계는 카메라(400)를 기준으로 설정되는 카메라 좌표계{C}일 수 있다.Meanwhile, in the present invention, at least three reference coordinate systems may be defined, and the first reference coordinate system may be a camera coordinate system {C} set based on the camera 400.

카메라 좌표계{C}는 카메라(400)의 특정 지점을 원점으로 하는 3차원 좌표계일 수 있다.The camera coordinate system {C} may be a three-dimensional coordinate system with a specific point of the camera 400 as the origin.

나아가, 제2 기준 좌표계는 도 4에 도시된 것과 같이, 카메라(400)를 통해 촬영된 영상(410)에 포함된 마커(430)를 기준으로 설정되는 마커 좌표계{W}일 수 있다.Furthermore, the second reference coordinate system may be a marker coordinate system {W} set based on the marker 430 included in the image 410 captured through the camera 400, as shown in FIG. 4.

마커 좌표계{W}는 영상에 포함된 마커(430)의 특정 지점을 원점으로 하는 3차원 좌표계일 수 있다.The marker coordinate system {W} may be a three-dimensional coordinate system whose origin is a specific point of the marker 430 included in the image.

나아가, 제3 기준 좌표계는 도 4에 도시된 것과 같이, 카메라(400)를 통해 촬영된 영상(410)에 포함된 대상물(420)을 기준으로 설정되는 대상물(또는 물체) 좌표계{O}일 수 있다.Furthermore, the third reference coordinate system may be an object (or object) coordinate system {O} set based on the object 420 included in the image 410 captured through the camera 400, as shown in FIG. there is.

대상물 좌표계{O}는 영상에 포함된 대상물(420)의 특정 지점을 원점으로 하는 3차원 좌표계일 수 있다.The object coordinate system {O} may be a three-dimensional coordinate system whose origin is a specific point of the object 420 included in the image.

한편, 본 발명에서 카메라에 의해 촬영된 영상에 포함된 마커(430) 및 대상물(420)은, 실세계에 있는 마커 및 대상물을 피사체로 하여 촬영된 결과물일 수 있다. 즉, 영상에 포함된 마커(430) 및 대상물(420)는 실세계에 포함된 마커 및 대상물에 각각 대응되는 그래픽 객체이며, 본 명세서에서는 설명의 편의를 위하여 그래픽 객체에 대한 용어를 별도로 정의하지 않고, 마커에 대한 촬영을 통해 얻어진 영상에 포함된 그래픽 객체를 “마커”로 동일하게 명명하고, 대상물에 대한 촬영을 통해 얻어진 영상에 포함된 그래픽 객체를 “대상물”로 동일하게 명명하도록 한다.Meanwhile, in the present invention, the marker 430 and the object 420 included in the image captured by the camera may be the result of photography using markers and objects in the real world as subjects. That is, the marker 430 and the object 420 included in the image are graphic objects that respectively correspond to the marker and object included in the real world. In this specification, for convenience of explanation, the term for the graphic object is not separately defined, Graphic objects included in images obtained through filming of a marker are identically named “markers,” and graphic objects included in images obtained through filming of objects are identically named “objects.”

한편, 저장부(120)는 본 발명에 따른 다양한 정보를 저장하도록 이루어질 수 있다. 저장부(120)의 종류는 매우 다양할 수 있으며, 적어도 일부는, 외부 서버(클라우드 서버 및 데이터베이스(database: DB) 중 적어도 하나)를 의미할 수 있다. 즉, 저장부(120)와 관련된 정보가 저장되는 공간이면 충분하며, 물리적인 공간에 대한 제약은 없는 것으로 이해될 수 있다. Meanwhile, the storage unit 120 can be configured to store various information according to the present invention. The types of storage unit 120 may be very diverse, and at least some of them may refer to external servers (at least one of a cloud server and a database (DB)). In other words, it can be understood that any space where information related to the storage unit 120 is stored is sufficient, and there are no restrictions on physical space.

저장부(120)에는 i) 본 발명에 따른 데이터 수집 시스템에 의해 수집된 학습 데이터, ii) 카메라(400)를 통해 촬영된 영상, iii) 촬영된 영상과 관련된 마커, 대상물 및 카메라 중 적어도 하나와 관련된 자유도 정보, iv) 제1 기준 좌표계(또는 카메라 좌표계){C}, 제2 기준 좌표계(또는 마커 좌표계){W}, 제3 기준 좌표계(또는 대상물 좌표계){O} 각각에 대한 정보, 그리고 v) 제1 내지 제3 기준 좌표계 중 적어도 두 개 간의 상대적인 위치관계에 대한 정보 중 적어도 하나가 저장될 수 있다. The storage unit 120 may store at least one of: i) learning data collected by the data collection system according to the present invention; ii) images captured through the camera 400; iii) degree-of-freedom information related to at least one of the marker, the object, and the camera associated with a captured image; iv) information about each of the first reference coordinate system (or camera coordinate system) {C}, the second reference coordinate system (or marker coordinate system) {W}, and the third reference coordinate system (or object coordinate system) {O}; and v) information about the relative positional relationship between at least two of the first to third reference coordinate systems.

다음으로 제어부(130)는 본 발명과 관련된 학습 데이터 수집 시스템(100)의 전반적인 동작을 제어하도록 이루어질 수 있다. 제어부(130)는 인공지능 알고리즘을 처리 가능한 프로세서(processor, 또는 인공지능 프로세서)를 포함할 수 있다.Next, the control unit 130 may be configured to control the overall operation of the learning data collection system 100 related to the present invention. The control unit 130 may include a processor (or artificial intelligence processor) capable of processing artificial intelligence algorithms.

제어부(130)는 촬영된 영상을 이용하여, 도 2에서 함께 살펴본 대상물에 대한 3차원 박스를 생성(또는 특정)하도록 이루어질 수 있다.The control unit 130 may use the captured image to create (or specify) a three-dimensional box for the object viewed in FIG. 2.

나아가, 제어부(130)는 수집된 영상들 및 자유도 정보를 기반으로 학습을 수행하는 학습부를 더 포함할 수 있다. 이러한 학습부는 신경망 네트워크 구조를 가질 수 있다. Furthermore, the control unit 130 may further include a learning unit that performs learning based on the collected images and degree-of-freedom information. This learning unit may have a neural network structure.

한편, 제어부(130)는 딥러닝 알고리즘에 기반하여, 카메라(400)를 통해 촬영되는 영상에서, 카메라(400)에 의해 촬영된 대상물(300)을 인식 및 추적할 수 있다. 이러한 작업은 트래킹(tracking)이라고도 명명될 수 있다. Meanwhile, the control unit 130 can recognize and track the object 300 captured by the camera 400 in the image captured by the camera 400, based on a deep learning algorithm. This task may also be called tracking.

나아가, 본 발명에 따른 학습 데이터 수집 시스템은, 영상에 포함된 3차원 박스를 특정(생성)하기 위하여 사용자와 상호작용하는 인터페이스를 제공할 수 있다.Furthermore, the learning data collection system according to the present invention can provide an interface that interacts with the user to specify (create) a 3D box included in the image.

이러한 인터페이스는 학습 데이터 수집을 위한 프로그램 또는 애플리케이션의 형식으로 제공될 수 있다.This interface may be provided in the form of a program or application for collecting learning data.

이 경우, 본 발명에서 대상물에 대한 학습 데이터의 수집은 촬영된 영상에 대하여 3차원 박스를 설정하는 프로세스를 제공하는 프로그램을 통해 이루어질 수 있다.In this case, in the present invention, collection of learning data about an object can be accomplished through a program that provides a process for setting a 3D box for a captured image.

한편, 제어부(130)는 대상물의 학습 데이터를 수집하기 위하여, 카메라 파라미터, 제1 기준 좌표계(또는 카메라 좌표계){C}, 제2 기준 좌표계(또는 마커 좌표계){W} 및 제3 기준 좌표계(또는 대상물 좌표계){O} 중 적어도 하나에 근거한 연산을 수행할 수 있다. 이러한 연산은, 카메라 파라미터, 제1 기준 좌표계(또는 카메라 좌표계){C}, 제2 기준 좌표계(또는 마커 좌표계){W} 및 제3 기준 좌표계(또는 대상물 좌표계){O} 중 적어도 하나에 대한 정보(값)을 이용하여 PnP(Perspective-n-Point)방정식을 계산하는 것일 수 있다.Meanwhile, in order to collect learning data of the object, the control unit 130 includes camera parameters, a first reference coordinate system (or camera coordinate system) {C}, a second reference coordinate system (or marker coordinate system) {W}, and a third reference coordinate system ( Alternatively, an operation based on at least one of the object coordinate system) {O} may be performed. This operation is performed on at least one of camera parameters, a first reference coordinate system (or camera coordinate system) {C}, a second reference coordinate system (or marker coordinate system) {W}, and a third reference coordinate system (or object coordinate system) {O}. It may be calculating a PnP (Perspective-n-Point) equation using information (values).

한편, 본 발명에서는 학습 데이터 수집의 대상이 되는 대상물 주변에 마커를 위치시키고, 대상물과 함께 마커를 촬영함으로써, 마커의 자유도 정보를 기반으로, 대상물에 대한 자유도 정보를 수집할 수 있으며, 이하에서는 위에서 살펴본 본 발명에 따른 학습 데이터 수집 시스템의 구성에 기반하여, 학습 데이터를 수집하는 방법에 대하여 보다 구체적으로 살펴본다.Meanwhile, in the present invention, by placing a marker around the object that is the target of learning data collection and photographing the marker together with the object, the degree of freedom information about the object can be collected based on the degree of freedom information of the marker. In this section, we will look in more detail at the method of collecting learning data based on the configuration of the learning data collection system according to the present invention discussed above.

먼저, 본 발명에 따른 학습 데이터 수집 방법에 의하면, 카메라를 통해 대상물 및 대상물 주변에 위치한 마커에 대한 영상을 획득하는 과정이 진행될 수 있다(S310, 도 3 참조).First, according to the learning data collection method according to the present invention, a process of acquiring images of the object and markers located around the object through a camera can be performed (S310, see FIG. 3).

도 4에 도시된 것과 같이, 본 발명에서는 학습 데이터를 수집하고자 하는 대상물(420) 주변에 마커(430)를 위치시킬 수 있다. 여기에서, 마커(430)는 도시된 것과 같이, 적어도 하나의 에이프릴 태그(AprilTag)를 포함하도록 이루어질 수 있다. 에이프릴 태그는, 시각적 기준이 되는 표식으로서, 2차원 정보로 이루어질 수 있다. As shown in FIG. 4, in the present invention, a marker 430 can be placed around the object 420 for which learning data is to be collected. Here, the marker 430 may include at least one AprilTag, as illustrated. An AprilTag is a visual fiducial mark and may be composed of two-dimensional information.

본 발명에서, 마커(430)와 대상물(420) 간의 상대 위치(이격된 거리, 이격된 높이)는 특별한 한정이 없으며, 카메라(400)의 화각(FoV, Field of View)에 대상물(420)과 마커(430)가 함께 포함될 정도이면 족하다. 즉, 본 발명에서는 카메라(400)에 촬영된 영상 내에, 대상물(420)과 마커(430)가 함께 포함될 정도이면, 마커(430)가 놓여지는 위치에 대해서는 특별한 한정을 두지 않는다. In the present invention, the relative position (separation distance, separation height) between the marker 430 and the object 420 is not particularly limited; it is sufficient that the object 420 and the marker 430 both fall within the field of view (FoV) of the camera 400. That is, as long as the object 420 and the marker 430 appear together in the image captured by the camera 400, the present invention places no particular limitation on where the marker 430 is placed.

나아가, 마커(430)는 도 4에 도시된 것과 같이, 대상물(420)이 놓여진 평면과 동일한 평면에 위치할 수 있다. 한편, 마커(430)는 대상물(420)이 놓여진 평면과 서로 다른 평면에 위치할 수 있음은 물론이다. 이 경우, 마커(430)가 놓여진 평면과 대상물(420)이 놓여지는 평면은 서로 다른 높이의 평면들일 수 있다. Furthermore, the marker 430 may be located on the same plane as the plane on which the object 420 is placed, as shown in FIG. 4. The marker 430 may of course also be located on a plane different from the one on which the object 420 is placed. In that case, the plane on which the marker 430 is placed and the plane on which the object 420 is placed may be planes of different heights.

본 발명에서 대상물(420) 및 마커(430)가 함께 포함되도록 촬영된 영상은, 다양한 용어로 명명될 수 있다.In the present invention, images captured to include the object 420 and the marker 430 may be named by various terms.

예를 들어, 대상물(420)에 대해 사전에 확보된 학습 데이터가 존재하지 않는 경우라면, 카메라를 통해 촬영된 영상은 기준 영상(또는 기준 이미지)로 명명될 수 있다. 이러한 기준 영상은 대상물(420)에 대한 제1 번째 학습 데이터가 수집된 영상일 수 있다.For example, if there is no pre-secured learning data for the object 420, the image captured by the camera may be called a reference image (or reference image). This reference image may be an image from which the first learning data for the object 420 is collected.

나아가, 본 발명에서는, 카메라(400)를 통하여, 기준 영상 이후에 복수의 영상들을 촬영할 수 있으며, 이러한 영상들은 후속 영상 등으로 명명될 수 있다. 후속 영상은, 대상물(420)의 다양한 자세에 대한 학습 데이터를 수집하기 위하여 촬영되는 영상들을 의미할 수 있다. 후속 영상에는, 마커(430)가 포함되도록 촬영되거나, 마커(430)가 포함되지 않은 상태로도 촬영될 수 있다.Furthermore, in the present invention, a plurality of images can be captured after the reference image through the camera 400, and these images may be referred to as follow-up images. Follow-up images may refer to images captured to collect learning data about various postures of the object 420. The subsequent image may be captured to include the marker 430, or may be captured without the marker 430.

후속 영상에 마커(430)가 포함되지 않은 경우, 후속 영상에서의 대상물(420)에 대한 학습 데이터는, 기준 영상에서 수집된 대상물(420)의 학습 데이터를 근거로 수집될 수 있다.When the marker 430 is not included in the subsequent image, learning data for the object 420 in the subsequent image may be collected based on learning data for the object 420 collected in the reference image.

다음으로 본 발명에서는 촬영된 영상으로부터 마커를 검출하여, 영상에 포함된 마커의 적어도 하나의 마커 기준점을 특정하는 과정 및 마커 기준점을 기준으로 카메라 좌표계에 대한 마커의 마커 좌표계를 특정하는 과정이 진행될 수 있다(S320, S330). Next, in the present invention, a process of detecting the marker from the captured image and specifying at least one marker reference point of the marker included in the image, and a process of specifying the marker coordinate system of the marker with respect to the camera coordinate system based on the marker reference point, may be performed (S320, S330).

제어부(130)는 도 4에 도시된 것과 같이, 영상(410)에 포함된 마커(430)를 검출할 수 있다. 이 경우, 제어부(130)는 기 설정된 마커 검출 알고리즘(또는 프로그램)에 근거하여, 마커(430)에 포함된 적어도 하나의 마커 기준점을 특정할 수 있다. 마커 기준점은, 마커(430)에 포함된 코너들에 각각 대응되는 적어도 하나의 코너 포인트(corner point)를 의미할 수 있다.The control unit 130 can detect the marker 430 included in the image 410, as shown in FIG. 4. In this case, the control unit 130 may specify at least one marker reference point included in the marker 430 based on a preset marker detection algorithm (or program). The marker reference point may mean at least one corner point corresponding to each corner included in the marker 430.

도 4의 확대된 부분(440)에서와 같이, 제어부(130)는 촬영된 영상으로부터 마커(430)를 인식하고, 인식된 마커(430)의 코너에 각각 대응되는 적어도 하나의 마커 기준점(또는 코너 포인트, 0, 1, 2, 3, 4, 5, 6, 7)을 특정(추출)할 수 있다. As in the enlarged portion 440 of FIG. 4, the control unit 130 may recognize the marker 430 from the captured image and specify (extract) at least one marker reference point (or corner point; 0, 1, 2, 3, 4, 5, 6, 7) corresponding to each corner of the recognized marker 430.

그리고 제어부(130)는 상기 특정된 마커 기준점에 근거하여, 도 5a에 도시된 것과 같이, 마커(430)의 마커 좌표계{W}를 특정할 수 있다. 제어부(130)는 카메라 좌표계{C}에 대한 마커 좌표계{W} 자유도 정보(또는 6자유도 자세(POSE))를 측정할 수 있다.And the control unit 130 may specify the marker coordinate system {W} of the marker 430, as shown in FIG. 5A, based on the specified marker reference point. The control unit 130 may measure marker coordinate system {W} degree-of-freedom information (or 6-degree-of-freedom posture (POSE)) with respect to the camera coordinate system {C}.

마커 좌표계{W}는 특정된 마커 기준점(예를 들어, 0, 1, 2, 3, 4, 5, 6, 7) 중 어느 하나를 원점으로 3차원의 축을 포함할 수 있으며, 도 4 및 도 5a에 도시와 같이, “0”에 해당하는 마커 기준점을 원점으로 할 수 있다.The marker coordinate system {W} may include a three-dimensional axis with any one of the specified marker reference points (e.g., 0, 1, 2, 3, 4, 5, 6, 7) as the origin, and is shown in FIGS. 4 and 4 As shown in 5a, the marker reference point corresponding to “0” can be used as the origin.

한편, 제어부(130)는 카메라(400)의 카메라 좌표계{C}를 기준으로, 마커 좌표계{W}의 자유도 정보(6자유도 자세)를 추출할 수 있다. 이를 통해, 제어부(130)는 카메라 좌표계{C}에 대하여, 영상(410)에 포함된 마커(430) 또는 마커 좌표계{W}가 회전(rotation) 및 변환(translation, 병진 이동)된 정도를 추출할 수 있다.Meanwhile, the control unit 130 may extract degree-of-freedom information (six degrees-of-freedom posture) of the marker coordinate system {W} based on the camera coordinate system {C} of the camera 400. Through this, the control unit 130 extracts the degree to which the marker 430 or the marker coordinate system {W} included in the image 410 is rotated and translated with respect to the camera coordinate system {C}. can do.

제어부(130)는 촬영된 영상(410)에서, 마커 기준점에 대응되는 픽셀(pixel) 좌표를 특정하고, 상기 마커 기준점에 대응되는 픽셀(pixel) 좌표 및 카메라(400)의 카메라 좌표계{C}를 이용하여, 카메라 좌표계{C}에 대한 상기 마커 좌표계{W}의 자유도 자세를 특정할 수 있다.The control unit 130 specifies the pixel coordinates corresponding to the marker reference point in the captured image 410, and uses the pixel coordinates corresponding to the marker reference point and the camera coordinate system {C} of the camera 400. Using this, the degree-of-freedom attitude of the marker coordinate system {W} with respect to the camera coordinate system {C} can be specified.

보다 구체적으로, 제어부(130)는 카메라 좌표계{C}에 대한 마커 좌표계{W}의 자유도 정보(또는 상대적인 위치 관계)를 추출하기 위하여, 도 5b에 도시된 것과 같이, PnP 방정식을 계산할 수 있다. More specifically, the control unit 130 may solve the PnP equation, as shown in FIG. 5B, in order to extract the degree-of-freedom information (or relative positional relationship) of the marker coordinate system {W} with respect to the camera coordinate system {C}.

한편, 카메라 좌표계{C}에 대한 마커 좌표계{W}의 상대적인 위치 관계는, 마커 좌표계{W}가 카메라 좌표계{C}에 대하여, 회전(rotation) 및 변환(translation)된 정도를 의미할 수 있다. 이러한 회전 및 변환된 정도에 따른 정보는 카메라 좌표계{C}에 대한 마커 좌표계{W}의 자유도 정보(자유도 자세)에 해당할 수 있다. Meanwhile, the relative positional relationship of the marker coordinate system {W} with respect to the camera coordinate system {C} may mean the degree to which the marker coordinate system {W} is rotated and translated with respect to the camera coordinate system {C}. Information on this degree of rotation and translation may correspond to the degree-of-freedom information (degree-of-freedom pose) of the marker coordinate system {W} with respect to the camera coordinate system {C}.
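The rotation-plus-translation relationship described here can be sketched as a rigid transform that maps a point expressed in {W} into {C}. The helper names `rot_z` and `transform`, and the numeric pose values, are hypothetical and chosen only to make the arithmetic easy to follow; a real pose would generally involve rotation about all three axes.

```python
import math


def rot_z(theta):
    """3x3 rotation matrix for a rotation of theta radians about the Z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def transform(R, t, p):
    """Map point p from the marker frame {W} to the camera frame {C}: p_C = R p_W + t."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


# Hypothetical 6-DoF pose of {W} relative to {C}: 90 degree yaw plus an offset.
R_cw = rot_z(math.pi / 2)
t_cw = (0.1, 0.0, 1.5)
p_c = transform(R_cw, t_cw, (1.0, 0.0, 0.0))
```

The pair `(R_cw, t_cw)` carries three rotational and three translational degrees of freedom, which is what the text refers to as the 6-DoF pose.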

카메라 좌표계{C}에 대한 마커 좌표계{W}의 자유도 정보(자유도 자세)를 추출하기 위하여, 제어부(130)는 도 5a, 5b에 도시된 것과 같이, 촬영된 영상(410)에서 마커(430)의 적어도 하나의 마커 기준점들에 해당하는 픽셀 좌표(u, v(도 5b의 도면부호 510참조)) 및 카메라(400)의 3차원 좌표(x, y, z(도 5b의 도면부호 520 참조))를 PnP방정식에 대입하여, 카메라 좌표계{C}에 대한 마커 좌표계{W}의 자유도 정보(자유도 자세)(도 5b의 도면부호 530참조)를 추출할 수 있다. In order to extract the degree-of-freedom information (degree-of-freedom pose) of the marker coordinate system {W} with respect to the camera coordinate system {C}, the control unit 130 may, as shown in FIGS. 5A and 5B, substitute the pixel coordinates (u, v; see reference numeral 510 in FIG. 5B) corresponding to the at least one marker reference point of the marker 430 in the captured image 410, and the three-dimensional coordinates (x, y, z; see reference numeral 520 in FIG. 5B) of the camera 400, into the PnP equation, and thereby extract the degree-of-freedom information (degree-of-freedom pose) of the marker coordinate system {W} with respect to the camera coordinate system {C} (see reference numeral 530 in FIG. 5B).

한편, 이러한 PnP방정식에는, intrinsic parameter(540)가 적용되며, 이는 카메라(400)의 특성을 나타내는 카메라 고유 파라미터에 해당할 수 있다. 카메라 고유 파라미터는 저장부(120)에 저장되어 존재할 수 있다.Meanwhile, an intrinsic parameter 540 is applied to this PnP equation, which may correspond to a camera-specific parameter representing the characteristics of the camera 400. Camera-specific parameters may be stored in the storage unit 120.

한편, 도 5b에 도시된 것과 같이, PnP 방정식에 대한 파라미터에 대하여 설명하면, “S”는 스케일 상수(ex: 1), “r”은 skew parameter(일종의 왜곡 보정 상수에 해당), “t1”, “t2”, “t3”은 회전 및 변환에 대한 행렬 및 벡터를 의미할 수 있다. Meanwhile, describing the parameters of the PnP equation shown in FIG. 5B, “S” is a scale constant (e.g., 1), “r” is a skew parameter (a kind of distortion correction constant), and “t1”, “t2”, and “t3” may denote the matrices and vectors for rotation and translation.
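FIG. 5B is not reproduced in this text. As a hedged reconstruction, the standard pinhole projection relation that a PnP solver inverts can be written as below. The symbols $f_x$, $f_y$, $c_x$, $c_y$ (focal lengths and principal point), $\gamma$ (skew), and $r_{ij}$ (rotation entries) are conventional names assumed here and may not match the notation of the figure.

```latex
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
=
\underbrace{\begin{bmatrix} f_x & \gamma & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}}_{\text{intrinsic parameters}}
\underbrace{\begin{bmatrix}
r_{11} & r_{12} & r_{13} & t_1 \\
r_{21} & r_{22} & r_{23} & t_2 \\
r_{31} & r_{32} & r_{33} & t_3
\end{bmatrix}}_{\text{rotation and translation}}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
```

Here $(u, v)$ are pixel coordinates, $(x, y, z)$ are the corresponding 3D coordinates, $s$ is the scale constant, and the $3 \times 4$ matrix holds the rotation and translation that PnP solves for.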

제어부(130)는 카메라(400)의 카메라 좌표계{C}에 대한 정보 및 카메라 고유 파라미터에 대한 정보를 미리 알고 있으므로, 이를 이용하여, 영상(410)에서의 마커 좌표계{W}의 자유도 정보를 추출할 수 있다.Since the control unit 130 knows in advance information about the camera coordinate system {C} of the camera 400 and information about the camera's unique parameters, it uses this to obtain information on the degree of freedom of the marker coordinate system {W} in the image 410. It can be extracted.
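A minimal sketch of the forward projection that PnP inverts is given below: with known intrinsics and a candidate pose, each marker reference point is projected to a pixel, and a pose is good when the projections match the detected corners. The intrinsic matrix `K`, the 0.1 m tag size, and the pose values are hypothetical, and the 3D points are taken to be marker corners expressed in {W}, which is the usual PnP convention rather than something this text states.

```python
def project(K, R, t, p_w):
    """Project a 3D point p_w (marker frame) to pixel coordinates (u, v)."""
    # Rigid transform into the camera frame: p_c = R p_w + t.
    p_c = [sum(R[i][j] * p_w[j] for j in range(3)) + t[i] for i in range(3)]
    # Apply the intrinsic matrix, then divide by the third (scale) component.
    uvw = [sum(K[i][j] * p_c[j] for j in range(3)) for i in range(3)]
    return (uvw[0] / uvw[2], uvw[1] / uvw[2])


def reprojection_error(K, R, t, points_w, pixels):
    """Sum of squared pixel distances between projected and detected points."""
    err = 0.0
    for p_w, (u, v) in zip(points_w, pixels):
        pu, pv = project(K, R, t, p_w)
        err += (pu - u) ** 2 + (pv - v) ** 2
    return err


# Hypothetical intrinsics and pose: marker facing the camera, 2 m away.
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = (0.0, 0.0, 2.0)

# Four corners of a hypothetical 0.1 m tag, with corner 0 at the origin of {W}.
corners_w = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.1, 0.1, 0.0), (0.0, 0.1, 0.0)]
pixels = [project(K, R, t, p) for p in corners_w]
error = reprojection_error(K, R, t, corners_w, pixels)
```

A PnP solver searches for the `R` and `t` that minimize this error for the detected corner pixels; the sketch only evaluates the error for a given pose and does not perform that search.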

다음으로 본 발명에서는 마커 좌표계의 원점을 기준으로 정렬된 그리드 영역을 영상에 투영하는 과정이 진행될 수 있다(S340).Next, in the present invention, a process of projecting the grid area aligned based on the origin of the marker coordinate system onto the image may be performed (S340).

도 6에 도시된 것과 같이, 그리드 영역(610)은 그리드(grid)를 포함하는 평면으로 이해되어질 수 있다. 제어부(130)는 영상(410)에 그리드 영역(610)을 위치시킬 수 있다. 영상(410)에 그리드 영역(610)을 투영 또는 위치시키는 과정은, 본 발명에 따른 학습 데이터 수집 시스템에 의하여 운용되는 프로그램(소프트웨어, 애플리케이션 등)을 통하여 이루어질 수 있다. As shown in FIG. 6, the grid area 610 can be understood as a plane containing a grid. The control unit 130 may position the grid area 610 on the image 410. The process of projecting or positioning the grid area 610 onto the image 410 can be carried out through a program (software, application, etc.) operated by the learning data collection system according to the present invention.

도 6에 도시된 것과 같이, 그리드 영역(610)은 대상물(420)의 가장 아랫 부분에 해당하는 바닥면에 위치하도록 투영될 수 있다. 제어부(130)는 그리드 영역(610)에 대한 초기 위치를, 대상물(420)의 바닥면(바닥 평면)과 동일 평면 상으로 설정할 수 있다. As shown in FIG. 6, the grid area 610 may be projected so that it lies on the floor surface corresponding to the lowest part of the object 420. The control unit 130 may set the initial position of the grid area 610 to be coplanar with the bottom surface (floor plane) of the object 420.

도 6에 도시된 것과 같이, 그리드 영역(610)은 복수의 수평선들(621) 및 복수의 수직선들(631)이 수직 교차하여 형성되는 복수의 사각 영역을 포함할 수 있다.As shown in FIG. 6 , the grid area 610 may include a plurality of square areas formed by vertically intersecting a plurality of horizontal lines 621 and a plurality of vertical lines 631.

그리드 영역(610)의 복수의 수평선들(621)은 마커 좌표계{W}의 제1 축(예를 들어, X축)과 평행하며, 그리드 영역(610)의 복수의 수직선들(631)은 마커 좌표계{W}의 제2 축(예를 들어, Y축)과 평행할 수 있다.The plurality of horizontal lines 621 of the grid area 610 are parallel to the first axis (e.g., X-axis) of the marker coordinate system {W}, and the plurality of vertical lines 631 of the grid area 610 are marker It may be parallel to the second axis (eg, Y axis) of the coordinate system {W}.

제어부(130)는 영상(410)에 대해 그리드 영역(610)을 투영 또는 위치시킬 때에, 마커 좌표계{W}에 대해 그리드 영역(610)이 정렬되도록, 그리드 영역(610)의 배치 위치에 대한 제어를 수행할 수 있다. When projecting or positioning the grid area 610 on the image 410, the control unit 130 controls the arrangement position of the grid area 610 so that the grid area 610 is aligned with the marker coordinate system {W}. can be performed.

그리드 영역(610)은 복수의 수평선들(621) 중 어느 하나인 기준 수평선(620)을 포함하고, 상기 복수의 수직선들(631) 중 어느 하나인 기준 수직선(630)을 포함할 수 있다. 기준 수평선(620)과 기준 수직선(630)은 상호 수직 교차하도록 이루어질 수 있다. The grid area 610 may include a reference horizontal line 620, which is one of the plurality of horizontal lines 621, and a reference vertical line 630, which is one of the plurality of vertical lines 631. The reference horizontal line 620 and the reference vertical line 630 may intersect each other at a right angle.

마커 좌표계{W}와 그리드 영역(610)의 정렬은, 상기 복수의 수평선들(621) 중 기준 수평선(620) 및 상기 복수의 수직선들(631) 중 기준 수직선(630)이 마커 좌표계{W}에 대하여 정렬되는 것을 의미할 수 있다. Alignment of the marker coordinate system {W} and the grid area 610 may mean that the reference horizontal line 620 among the plurality of horizontal lines 621 and the reference vertical line 630 among the plurality of vertical lines 631 are aligned with respect to the marker coordinate system {W}.

보다 구체적으로, 도 6에 도시된 것과 같이, 마커 좌표계{W}는 서로 직교하는 제1 축(X축) 및 제2 축(Y축)을 포함하고, 마커 좌표계{W}에 대한 그리드 영역(610)의 정렬은, 마커 좌표계{W}의 제1축(X축)과 그리드 영역(610)의 기준 수평선(620)을 일치시키고, 마커 좌표계{W}의 제2 축(Y축)과 그리드 영역(610)의 기준 수직선(630)을 일치시키는 것을 통해 이루어질 수 있다.More specifically, as shown in FIG. 6, the marker coordinate system {W} includes a first axis (X axis) and a second axis (Y axis) orthogonal to each other, and a grid area for the marker coordinate system {W} ( The alignment of 610) matches the first axis (X-axis) of the marker coordinate system {W} with the reference horizontal line 620 of the grid area 610, and aligns the second axis (Y-axis) of the marker coordinate system {W} with the grid. This can be achieved by matching the reference vertical line 630 of the area 610.

이 경우, 기준 수평선(620)과 기준 수직선(630)의 교차 점은 마커 좌표계{W}의 원점(0, 0, 0)과 일치할 수 있다.In this case, the intersection point of the reference horizontal line 620 and the reference vertical line 630 may coincide with the origin (0, 0, 0) of the marker coordinate system {W}.
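The aligned grid can be sketched as two families of line segments on the Z = 0 plane of {W}, with the reference lines lying on the X and Y axes and crossing at the origin. The function `grid_lines` and its `half_extent` and `spacing` parameters are hypothetical; the specification does not fix the grid size or cell spacing.

```python
def grid_lines(half_extent, spacing):
    """Return (horizontal, vertical) line segments on the Z = 0 plane of {W}.

    Horizontal lines run parallel to the X axis, vertical lines parallel to the
    Y axis. The lines at offset 0 are the reference lines through the origin.
    """
    n = int(round(half_extent / spacing))
    offsets = [k * spacing for k in range(-n, n + 1)]
    horizontal = [((-half_extent, y, 0.0), (half_extent, y, 0.0)) for y in offsets]
    vertical = [((x, -half_extent, 0.0), (x, half_extent, 0.0)) for x in offsets]
    return horizontal, vertical


horizontal, vertical = grid_lines(half_extent=1.0, spacing=0.5)
reference_horizontal = horizontal[len(horizontal) // 2]  # lies on the X axis
reference_vertical = vertical[len(vertical) // 2]        # lies on the Y axis
```

Projecting these segment endpoints into the image with the marker pose would draw the grid over the captured scene.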

이 경우, 마커 좌표계{W}의 제1 축(X축)과 기준 수평선(620)이 이루는 각도는 0도(640 참조)일 수 있다. 마찬가지로, 제2 축(Y축)과 기준 수직선(630)이 이루는 각도는 0도일 수 있다. In this case, the angle between the first axis (X axis) of the marker coordinate system {W} and the reference horizontal line 620 may be 0 degrees (see 640). Likewise, the angle between the second axis (Y axis) and the reference vertical line 630 may be 0 degrees.

이와 같이, 마커 좌표계{W}와 그리드 영역(610) 간의 정렬이 이루어지는 경우, 그리드 영역(610)이 이동되더라도, 마커 좌표계{W} 대비 그리드 영역(610)에 포함된 특정 지점이 얼마만큼 회전 또는 변환되었는지 파악할 수 있으며, 궁극적으로, 카메라 좌표계{C} 또는 마커 좌표계{W}에 대한 그리드 영역(610)의 특정 지점의 자유도 정보(6자유도 자세)를 파악하는 것이 가능하다. As such, once the marker coordinate system {W} and the grid area 610 are aligned, even if the grid area 610 is moved, it is possible to determine how much a specific point in the grid area 610 has been rotated or translated relative to the marker coordinate system {W}, and ultimately to determine the degree-of-freedom information (6-DoF pose) of that point with respect to the camera coordinate system {C} or the marker coordinate system {W}.

위에서 살펴본 것과 같이, 마커 좌표계{W}와 그리드 영역(610) 간의 정렬이 이루어지면, 다음으로 제어부(130)는 그리드 영역(610)을 이동시켜, 대상물(420)에 대한 특정을 수행할 수 있다. As described above, once the marker coordinate system {W} and the grid area 610 are aligned, the control unit 130 can then move the grid area 610 to specify the object 420.

보다 구체적으로, 본 발명에서는 그리드 영역의 이동에 근거하여 그리드 영역에서 대상물을 특정하고, 그리드 영역에서의 대상물에 대한 특정 결과를 이용하여 영상에 포함된 대상물에 대한 학습 데이터를 수집하는 과정이 진행될 수 있다(S350, S360). More specifically, in the present invention, a process of specifying the object in the grid area based on movement of the grid area, and of collecting learning data for the object included in the image using the result of that specification, may be performed (S350, S360).

제어부(130)는 마커 좌표계{W}의 원점(0, 0, 0, 도 6 참조)을 기준으로 그리드 영역(610)을 수평 회전하여, 그리드 영역(610)과 영상(410)에 포함된 대상물(420) 간의 정렬을 수행할 수 있다. The control unit 130 may horizontally rotate the grid area 610 about the origin (0, 0, 0; see FIG. 6) of the marker coordinate system {W} to align the grid area 610 with the object 420 included in the image 410.

제어부(130)는 도 7에 도시된 것과 같이, 마커 좌표계{W}의 원점(0, 0, 0, 도 6 참조)을 기준으로 그리드 영역(610)을 소정 각도(θ) 만큼 수평 회전시켜, 그리드 영역(610)과 대상물(420)에 대한 정렬을 수행할 수 있다. As shown in FIG. 7, the control unit 130 may horizontally rotate the grid area 610 by a predetermined angle (θ) about the origin (0, 0, 0; see FIG. 6) of the marker coordinate system {W} to align the grid area 610 with the object 420.

제어부(130)는 그리드 영역(610)과 대상물(420)이 정렬될 때까지, 마커 좌표계{W}의 원점을 기준으로, 그리드 영역(610)을 회전시킬 수 있다. 이때, 상기 소정 각도(θ)는, 그리드 영역(610)과 대상물(420)이 정렬이 되는 각도를 의미할 수 있다. The control unit 130 may rotate the grid area 610 about the origin of the marker coordinate system {W} until the grid area 610 and the object 420 are aligned. Here, the predetermined angle (θ) may mean the angle at which the grid area 610 and the object 420 become aligned.
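The horizontal rotation about the origin of {W} can be sketched as a rotation about the Z axis applied to grid points. In this sketch the angle θ is computed from one edge of the object's footprint, which is an assumption made for illustration; the text only says the grid is rotated until it is aligned and does not say how θ is found. The function names and the example edge are hypothetical.

```python
import math


def rotate_about_origin(points, theta_deg):
    """Rotate (x, y, z) points about the Z axis of {W} by theta_deg degrees."""
    th = math.radians(theta_deg)
    c, s = math.cos(th), math.sin(th)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def edge_angle_deg(p, q):
    """Angle of the edge p -> q relative to the X axis of {W}, in degrees."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))


# Hypothetical footprint edge of the object, expressed on the Z = 0 plane of {W}.
theta = edge_angle_deg((1.0, 1.0), (2.0, 2.0))

# Endpoints of the reference horizontal line before rotation (on the X axis).
reference_line = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
rotated_line = rotate_about_origin(reference_line, theta)
```

After the rotation, the reference horizontal line points in the same direction as the chosen object edge, while the origin of the grid stays at the origin of {W}.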

한편, 그리드 영역(610)과 대상물(420) 간의 정렬은, 도 8에 도시된 것과 같이, 그리드 영역(610)의 기준 수평선(620) 또는 기준 수직선(630)이, 대상물(420)과 수평 또는 수직이 되는 것을 의미할 수 있다. Meanwhile, alignment between the grid area 610 and the object 420 may mean, as shown in FIG. 8, that the reference horizontal line 620 or the reference vertical line 630 of the grid area 610 becomes parallel or perpendicular to the object 420.

그리드 영역(610)의 기준 수평선(620) 또는 기준 수직선(630)이, 대상물(420)과 수평 또는 수직이 된다고 함은, 영상에서 대상물(420)이 놓여진 일 영역(또는 일면, 제1 면)의 적어도 하나의 모서리와, 그리드 영역(610)의 기준 수평선(620) 또는 기준 수직선(630)이 수평 또는 수직이 되는 것을 의미할 수 있다. That the reference horizontal line 620 or the reference vertical line 630 of the grid area 610 is parallel or perpendicular to the object 420 may mean that at least one edge of the area (or one surface, the first surface) on which the object 420 is placed in the image is parallel or perpendicular to the reference horizontal line 620 or the reference vertical line 630 of the grid area 610.

Here, the first face on which the object 420 is placed may be defined in various ways. As shown in FIG. 9, it may be a plane 920, 930 of a specific shape formed so as to include at least one of the outermost points of the first face (or area, bottom face) on which the object is placed. The specific shape may be a rectangle. The plane 920, 930 of the specific shape may be specified so that it contains the entire area 910 occupied by the object while having the minimum area. For example, the plane 920 in (a) of FIG. 9 has a smaller area than the plane 930 in (b) of FIG. 9, yet still contains the entire area 910 occupied by the object.
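One way to picture the minimum-area rectangle in FIG. 9 is a brute-force search over candidate orientations. The sketch below is not the method of the specification. It assumes the footprint of the object is available as a set of 2D points in the grid plane, and it relies on the known geometric fact that a minimum-area enclosing rectangle has one side collinear with an edge of the convex hull, so trying the direction of every point pair is sufficient. The function name `min_area_rectangle` is hypothetical.

```python
import numpy as np

def min_area_rectangle(points_xy):
    """Return (area, 4x2 corners) of the smallest rectangle enclosing the 2D points.

    Every direction defined by a pair of points is tried as the rectangle's
    x-axis; hull edges are among those pairs, so the true minimum is found.
    """
    pts = np.asarray(points_xy, dtype=float)
    best = None
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            d = pts[j] - pts[i]
            if np.allclose(d, 0.0):
                continue  # coincident points define no direction
            angle = np.arctan2(d[1], d[0])
            c, s = np.cos(angle), np.sin(angle)
            rot = np.array([[c, s], [-s, c]])  # rotates by -angle
            local = pts @ rot.T                # points in the candidate rectangle frame
            lo, hi = local.min(axis=0), local.max(axis=0)
            area = float(np.prod(hi - lo))
            if best is None or area < best[0]:
                corners_local = np.array([[lo[0], lo[1]], [hi[0], lo[1]],
                                          [hi[0], hi[1]], [lo[0], hi[1]]])
                best = (area, corners_local @ rot)  # back to the original frame
    if best is None:
        raise ValueError("need at least two distinct points")
    return best

# Hypothetical footprint: a 2 x 1 rectangle rotated by 30 degrees, plus one interior point.
a = np.pi / 6
r30 = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
footprint = np.array([[0, 0], [2, 0], [2, 1], [0, 1], [1, 0.5]], dtype=float) @ r30.T

area, corners = min_area_rectangle(footprint)
axis_aligned_area = float(np.prod(footprint.max(axis=0) - footprint.min(axis=0)))
```

Comparing `area` with `axis_aligned_area` mirrors the contrast between (a) and (b) of FIG. 9: both rectangles contain the footprint, but only the rotated one is minimal.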

Meanwhile, as shown in FIG. 8, the control unit 130 may align the first face 810 (bottom face or one face) of the object 420 with the grid area 610, so that at least one of the edges bounding the rectangular plane occupied by the object 420 (one face, the first face, or the area 810) is parallel or perpendicular to the reference horizontal line 620 or the reference vertical line 630 of the grid area 610.

Furthermore, the first face 810 of the object may be specified by specifying at least one point 811, 812, 813, 814 on the border of the first face 810 on which the object 420 is located, while the grid area 610 is rotated so as to be parallel (or perpendicular) to the object 420.

The at least one point 811, 812, 813, 814 may be referred to as a keypoint. The control unit 130 may define the first face 810 of the object as a rectangle and may extract the four keypoints 811, 812, 813, 814 of the rectangle corresponding to the first face 810. The keypoints 811, 812, 813, 814 may be specified by the control unit 130 or by the user. When any one keypoint is specified by the user, the control unit 130 may also automatically extract the remaining keypoints (at least three). In this case, the control unit 130 may specify the keypoints 811, 812, 813, 814 so that they define a rectangular first face that contains the entire bottom face of the object 420.

Meanwhile, as shown in FIG. 8, in the present invention an object coordinate system {O} may be specified whose origin is one keypoint 811 among the keypoints 811, 812, 813, 814.

The control unit 130 may extract coordinate information for each of the plurality of points (keypoints 811, 812, 813, 814) corresponding to the first face 810 of the object 420.

Here, the coordinate information may be three-dimensional coordinates. As needed, the control unit 130 may extract, for the plurality of points (keypoints 811, 812, 813, 814), at least one of coordinate information in the object coordinate system {O}, coordinate information in the marker coordinate system {W}, coordinate information in the camera coordinate system {C}, and pixel coordinates in the image 410. This is possible because the object coordinate system {O}, the marker coordinate system {W}, the camera coordinate system {C}, and the pixel coordinates all have relative positional relationships with one another, so that once the coordinate information in any one of them is known, the coordinate information in the others can be extracted (or estimated).

The object coordinate system {O} is a reference coordinate system based on the object, and may be a coordinate system obtained from the marker coordinate system {W} by horizontal and vertical translation and by rotation through a predetermined angle.
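The chain of relationships among {O}, {W}, {C}, and pixel coordinates can be illustrated with homogeneous transforms and a pinhole camera model. The following is a sketch under stated assumptions, not the implementation of the specification: {O} is assumed to differ from {W} by a rotation about the z-axis plus a translation, the camera is assumed to follow an ideal pinhole model without lens distortion, and the intrinsic matrix `K`, the poses `T_cw` and `T_wo`, and all function names are hypothetical values chosen for illustration.

```python
import numpy as np

def make_transform(yaw, translation):
    """4x4 homogeneous transform: rotation by yaw about z, followed by a translation."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T

def transform_points(T_ab, points_b):
    """Map Nx3 points expressed in frame b into frame a using T_ab."""
    pts = np.asarray(points_b, dtype=float)
    homo = np.hstack([pts, np.ones((len(pts), 1))])
    return (homo @ T_ab.T)[:, :3]

def project_to_pixels(K, points_c):
    """Pinhole projection of Nx3 camera-frame points to Nx2 pixel coordinates."""
    uvw = np.asarray(points_c, dtype=float) @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

# Assumed intrinsics and poses (illustrative values only).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
T_cw = make_transform(0.0, [0.0, 0.0, 2.0])   # marker frame {W} seen 2 m in front of the camera
T_wo = make_transform(0.0, [0.1, 0.0, 0.0])   # object frame {O} shifted 0.1 m along x of {W}

keypoints_o = np.array([[0.0, 0.0, 0.0], [0.2, 0.0, 0.0]])  # two keypoints in {O}
keypoints_w = transform_points(T_wo, keypoints_o)
keypoints_c = transform_points(T_cw, keypoints_w)
keypoints_px = project_to_pixels(K, keypoints_c)
```

Because each step is an invertible rigid transform (apart from the final projection), knowing a keypoint in any one of the three 3D frames is enough to compute it in the other two and to project it into the image.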

Meanwhile, the first face 810 defined by the keypoints 811, 812, 813, 814 may be the bottom face of a three-dimensional box area containing the object 420.

Next, the control unit 130 may specify the ceiling face of the three-dimensional box area by vertically moving the grid area 610.

More specifically, as described above, with the grid area 610 aligned with the object 420 by horizontal rotation of the grid area 610 with respect to the marker coordinate system {W}, the control unit 130 may vertically move the grid area 610. As shown in (a), (b), and (c) of FIG. 10, by vertically moving the grid area 610, the control unit 130 may specify the second face 820 that faces the first face 810 of the object 420.

As shown in (c) of FIG. 10, the control unit 130 may specify the second face (ceiling face, 820) by vertically moving the grid area 610 from the first face (bottom face, 810), which corresponds to the lowest part of the object 420, up to the highest part of the object 420.

As shown in (a) and (c) of FIG. 10 and in FIG. 11, the first face 810 and the second face 820 may be mutually opposing faces of the three-dimensional box (three-dimensional box area, 1110) surrounding the object 420.

Here, the first face 810 and the second face 820 may have the same area. Of course, in some cases, the areas of the first face 810 and the second face 820 may differ from each other.
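The construction of the three-dimensional box from the bottom face and a vertical displacement can be sketched as follows. This is a simplified illustration, assuming that the ceiling face has the same footprint as the bottom face, that the +z axis of {O} points from the bottom face toward the ceiling face, and that the vertical displacement of the grid area is available as a single height value. The function name `box_keypoints` is hypothetical.

```python
import numpy as np

def box_keypoints(base_corners_o, height):
    """Stack the four bottom-face keypoints and four ceiling-face keypoints into an 8x3 array.

    The ceiling face is the bottom face shifted by `height` along +z of {O}
    (an assumption; the specification also allows faces of different area).
    """
    base = np.asarray(base_corners_o, dtype=float)
    if base.shape != (4, 3):
        raise ValueError("expected four 3D corners for the bottom face")
    top = base + np.array([0.0, 0.0, float(height)])
    return np.vstack([base, top])

# Bottom face in {O}, with the first corner (playing the role of keypoint 811) at the origin.
bottom_face = [[0.0, 0.0, 0.0],
               [0.3, 0.0, 0.0],
               [0.3, 0.2, 0.0],
               [0.0, 0.2, 0.0]]
box = box_keypoints(bottom_face, height=0.15)
```

Rows 0 to 3 of `box` then correspond to the bottom-face keypoints and rows 4 to 7 to the ceiling-face keypoints, in matching order.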

When the second face 820 is specified in this way, the control unit 130 may, as shown in FIG. 11, extract coordinate information for each of the plurality of points (keypoints 815, 816, 817, 818) corresponding to the second face 820 of the object 420.

Here, the coordinate information may be three-dimensional coordinates. As needed, the control unit 130 may extract, for the plurality of points (keypoints 815, 816, 817, 818), at least one of coordinate information in the object coordinate system {O}, coordinate information in the marker coordinate system {W}, coordinate information in the camera coordinate system {C}, and pixel coordinates in the image 410. This is possible because the object coordinate system {O}, the marker coordinate system {W}, the camera coordinate system {C}, and the pixel coordinates all have relative positional relationships with one another, so that once the coordinate information in any one of them is known, the coordinate information in the others can be extracted (or estimated).

In this way, in the present invention, learning data for the object 420 may be collected by extracting, on the grid area 610, the coordinate information of the keypoints 811, 812, 813, 814 of the first face 810 and the keypoints 815, 816, 817, 818 of the second face 820 of the object 420.

That is, the control unit 130 may define the first face 810 and the second face 820 by specifying the keypoints 811, 812, 813, 814, 815, 816, 817, 818 through the grid area 610, and may further specify the three-dimensional box 1110 defined by the first face 810 and the second face 820, thereby collecting learning data for the object 420.

The learning data for the object 420 may be the coordinate information of the keypoints 811, 812, 813, 814, 815, 816, 817, 818 of the three-dimensional box 1110.

The learning data may include coordinate information (or three-dimensional coordinate information) in the object coordinate system {O} for each of the plurality of points corresponding to the keypoints 811, 812, 813, 814 of the first face 810 of the object 420, and coordinate information (or three-dimensional coordinate information) in the object coordinate system {O} for each of the plurality of points corresponding to the keypoints 815, 816, 817, 818 of the second face 820 of the object 420.

Furthermore, the learning data may further include pixel coordinate information of the image 410 for each of the plurality of points corresponding to the keypoints 811, 812, 813, 814 of the first face 810, and pixel coordinate information of the image 410 for each of the plurality of points corresponding to the keypoints 815, 816, 817, 818 of the second face 820.

The pixel coordinate information may be coordinate information based on the camera coordinate system {C}.

Meanwhile, the learning data may further include degree-of-freedom information (degree-of-freedom pose) of the object coordinate system {O} with respect to the camera coordinate system {C}.

Furthermore, as shown in FIG. 12, the learning data may be stored in the storage unit 120 such that images of the object in various poses are matched with the degree-of-freedom information of the object for each of those poses.
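A single stored item of learning data, as described above, might be organized like the record below. The specification does not define a storage format, so the class name `LearningSample`, its field names, the JSON serialization, and the file name are all hypothetical. The sketch only shows how an image, its eight box keypoints, and the pose of {O} with respect to {C} can be kept together.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class LearningSample:
    """One image matched with its keypoints and pose (hypothetical layout)."""
    image_path: str             # image of the object in one pose
    keypoints_object: list      # eight [x, y, z] keypoints in {O}
    keypoints_pixel: list       # eight [u, v] pixel coordinates in the image
    pose_camera_object: list    # 4x4 transform of {O} with respect to {C}, row-major

    def __post_init__(self):
        # A 3D box has four bottom-face and four ceiling-face keypoints.
        if len(self.keypoints_object) != 8 or len(self.keypoints_pixel) != 8:
            raise ValueError("expected eight keypoints for the 3D box")

def to_json(sample):
    """Serialize a sample so it can be written to storage."""
    return json.dumps(asdict(sample))

sample = LearningSample(
    image_path="pose_000.png",  # placeholder file name
    keypoints_object=[[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0],
                      [0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1]],
    keypoints_pixel=[[0, 0]] * 8,  # placeholder pixel values
    pose_camera_object=[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 2], [0, 0, 0, 1]],
)
restored = json.loads(to_json(sample))
```

A dataset for many poses would then simply be a list of such records, one per captured image.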

Here, the degree-of-freedom information may be information based on any one of the camera coordinate system {C}, the marker coordinate system {W}, and the object coordinate system {O}.

The control unit 130 may also compute the degree-of-freedom information for the various poses of the object in each of the camera coordinate system {C}, the marker coordinate system {W}, and the object coordinate system {O}, in which case a dataset may be built for each coordinate system. This is possible because the object coordinate system {O}, the marker coordinate system {W}, the camera coordinate system {C}, and the pixel coordinates all have relative positional relationships with one another, so that once the control unit 130 knows the degree-of-freedom information with respect to any one coordinate system, it can extract (or estimate) the degree-of-freedom information with respect to the others.
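Re-expressing a pose in another coordinate system amounts to composing and inverting rigid transforms. The sketch below assumes poses are stored as 4x4 homogeneous matrices and, purely for illustration, reports the six degrees of freedom as a translation plus roll, pitch, and yaw in the ZYX Euler convention. The specification does not state which parametrization is used, and the names `invert_transform` and `pose_to_6dof` are hypothetical.

```python
import numpy as np

def invert_transform(T):
    """Inverse of a rigid 4x4 transform, using the transpose of the rotation."""
    R, t = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv

def pose_to_6dof(T):
    """Return [x, y, z, roll, pitch, yaw] using ZYX Euler angles (assumed convention)."""
    R = T[:3, :3]
    yaw = np.arctan2(R[1, 0], R[0, 0])
    pitch = np.arctan2(-R[2, 0], np.hypot(R[2, 1], R[2, 2]))
    roll = np.arctan2(R[2, 1], R[2, 2])
    return np.array([T[0, 3], T[1, 3], T[2, 3], roll, pitch, yaw])

# Illustrative poses: {W} in {C}, and {O} in {W} (yaw of 0.3 rad plus a translation).
T_cw = np.eye(4)
T_cw[:3, 3] = [0.0, 0.0, 2.0]

c, s = np.cos(0.3), np.sin(0.3)
T_wo = np.eye(4)
T_wo[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
T_wo[:3, 3] = [0.1, 0.2, 0.0]

T_co = T_cw @ T_wo                            # pose of {O} with respect to {C}
T_wo_recovered = invert_transform(T_cw) @ T_co  # back to the pose with respect to {W}
dof_camera = pose_to_6dof(T_co)
```

The same two operations suffice to build one dataset per coordinate system from a single set of measured poses.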

Meanwhile, once the control unit 130 has secured learning data for the object, by the method described above, from an image in which the marker and the object are captured together (referred to as the "reference image"), it can collect learning data for the object included in an image captured after the learning data has been secured (referred to as the "follow-up image"), even if the follow-up image does not include the marker.

That is, in the present invention, a follow-up image of the object may be captured through the camera 400, and learning data for the object included in the follow-up image may be obtained using the learning data for the object obtained through the reference image.

This is possible because the relative positional relationship between at least two of the camera coordinate system {C}, the object coordinate system {O}, and the marker coordinate system {W} is known. Using this relationship, the control unit 130 can compute, in the grid area projected onto the follow-up image, the coordinate information of the keypoints corresponding to the first face (bottom face) and the second face (ceiling face) of the object in the manner described above.

Furthermore, according to the present invention, learning data for the object in various poses can also be secured by repeating steps S310 to S360.

Meanwhile, after the learning data for the object has been secured in the manner described above, the control unit 130 may train a deep neural network using the learning data. Based on the learning data already secured, the deep neural network may learn degree-of-freedom information for various poses of the object not included in the learning data.

Once training of the deep neural network is complete, a robot or autonomous driving device that acquires an image of the object through its camera can obtain degree-of-freedom information (six-degree-of-freedom pose information) for the object through the trained deep neural network.

Here, the degree-of-freedom information obtained by the robot or autonomous driving device may be six-degree-of-freedom pose information of the object coordinate system with respect to the coordinate system of the camera provided in the robot or autonomous driving device.

Furthermore, through the trained deep neural network, the robot or autonomous driving device can obtain the pixel coordinates of the keypoints of the three-dimensional box for the object included in an image captured by the robot or autonomous driving device.

Meanwhile, the present invention described above may be implemented as a program that is executed by one or more processes on a computer and that can be stored in a computer-readable medium (or recording medium).

Furthermore, the present invention described above can be implemented as computer-readable code or instructions on a medium on which a program is recorded. That is, the present invention may be provided in the form of a program.

Meanwhile, the computer-readable medium includes all types of recording devices that store data readable by a computer system. Examples of computer-readable media include an HDD (Hard Disk Drive), SSD (Solid State Disk), SDD (Silicon Disk Drive), ROM, RAM, CD-ROM, magnetic tape, floppy disk, and optical data storage device.

Furthermore, the computer-readable medium may be a server or cloud storage that includes storage and that an electronic device can access through communication. In this case, the computer may download the program according to the present invention from the server or cloud storage through wired or wireless communication.

Furthermore, in the present invention, the computer described above is an electronic device equipped with a processor, that is, a CPU (Central Processing Unit), and is not particularly limited in type.

Meanwhile, the above detailed description should not be construed as restrictive in any respect and should be considered illustrative. The scope of the present invention should be determined by reasonable interpretation of the appended claims, and all changes within the equivalent scope of the present invention are included in the scope of the present invention.

Claims

A learning data collection method using a learning data collection system, the method comprising:
obtaining, through a camera, an image of an object and a marker located around the object;
detecting the marker from the image and specifying at least one marker reference point of the marker included in the image;
specifying, based on the marker reference point, a marker coordinate system of the marker with respect to a camera coordinate system of the camera;
projecting onto the image a grid area aligned based on the origin of the marker coordinate system;
specifying the object in the grid area based on movement of the grid area; and
collecting learning data for the object included in the image using a result of specifying the object in the grid area,
wherein the result of specifying the object includes information about a first face of the object and a second face facing the first face, which are specified based on horizontal rotation and vertical movement of the grid area.

The method of claim 1,
wherein the specifying of the marker coordinate system comprises:
specifying, in the image, pixel coordinates corresponding to the marker reference point; and
specifying a degree-of-freedom pose of the marker coordinate system with respect to the camera coordinate system using the pixel coordinates corresponding to the marker reference point and the camera coordinate system of the camera.

The method of claim 2,
wherein the grid area includes a plurality of rectangular areas formed by a plurality of horizontal lines and a plurality of vertical lines intersecting perpendicularly, and
wherein alignment of the grid area with respect to the marker coordinate system
means that the marker coordinate system is aligned with a reference horizontal line among the plurality of horizontal lines and a reference vertical line among the plurality of vertical lines.

The method of claim 3,
wherein the marker coordinate system includes a first axis and a second axis orthogonal to each other, and
wherein alignment of the grid area with respect to the marker coordinate system
means that the first axis of the marker coordinate system coincides with the reference horizontal line and the second axis of the marker coordinate system coincides with the reference vertical line.

The method of claim 4,
wherein the specifying of the object in the grid area comprises
horizontally rotating the grid area about the origin of the marker coordinate system to perform alignment between the grid area and the object included in the image.

The method of claim 5,
wherein the alignment between the grid area and the object
means that the reference horizontal line or the reference vertical line of the grid area is horizontal or vertical with respect to the object.

The method of claim 6,
wherein, through the alignment between the grid area and the object, an object coordinate system for the object is defined that is rotated from the marker coordinate system by a specific angle according to the horizontal rotation.

The method of claim 7,
wherein the specifying of the object in the grid area further comprises
specifying the first face of the object while the grid area is aligned with the object through the horizontal rotation.

The method of claim 8,
wherein the specifying of the first face of the object comprises
extracting coordinate information in the object coordinate system for each of a plurality of points corresponding to the first face of the object.

The method of claim 9,
wherein the specifying of the object in the grid area further comprises
specifying the second face of the object by vertically moving the grid area while the grid area is aligned with the object through the horizontal rotation.

The method of claim 10,
wherein the specifying of the second face of the object comprises
extracting coordinate information in the object coordinate system for each of a plurality of points corresponding to the second face of the object.

The method of claim 11,
wherein, in the specifying of the object,
a three-dimensional box area surrounding the object is specified based on the coordinate information extracted for the first face of the object and the coordinate information extracted for the second face of the object.

The method of claim 12,
wherein the learning data
includes at least one of: coordinate information in the object coordinate system for each of a plurality of points on the first face of the object; coordinate information in the object coordinate system for each of a plurality of points on the second face of the object; pixel coordinate information of the image for each of the plurality of points on the first face; and pixel coordinate information of the image for each of the plurality of points on the second face.

The method of claim 13,
wherein the coordinate information in the object coordinate system for each of the plurality of points on the first face of the object and the coordinate information in the object coordinate system for each of the plurality of points on the second face of the object are three-dimensional coordinate information.

The method of claim 14, further comprising:
capturing a follow-up image of the object through the camera; and
acquiring learning data for the object included in the follow-up image using the learning data for the object obtained through the image.

A learning data collection system comprising:
a camera that captures a reference image of an object and a marker located around the object;
a storage unit that stores the image; and
a control unit that detects the marker from the image, specifies at least one marker reference point of the marker included in the image,
and specifies, based on the marker reference point, a marker coordinate system of the marker with respect to a camera coordinate system of the camera,
wherein the control unit
projects onto the image a grid area aligned based on the origin of the marker coordinate system,
specifies the object in the grid area based on movement of the grid area, and
collects learning data for the object included in the image using a result of specifying the object in the grid area, and
wherein the result of specifying the object includes information about a first face of the object and a second face facing the first face, which are specified based on horizontal rotation and vertical movement of the grid area.

A program stored on a computer-readable recording medium and executed by one or more processes in an electronic device,
the program comprising instructions for performing:
obtaining, through a camera, a reference image of an object and a marker located around the object;
detecting the marker from the image and specifying at least one marker reference point of the marker included in the image;
specifying, based on the marker reference point, a marker coordinate system of the marker with respect to a camera coordinate system of the camera;
projecting onto the image a grid area aligned based on the origin of the marker coordinate system;
specifying the object in the grid area based on movement of the grid area; and
collecting learning data for the object included in the image using a result of specifying the object in the grid area,
wherein the result of specifying the object includes information about a first face of the object and a second face facing the first face, which are specified based on horizontal rotation and vertical movement of the grid area.