KR102310588B1

KR102310588B1 - Method of generating skeleton data for artificial intelligence learning, and computer program recorded on record-medium for executing method thereof

Info

Publication number: KR102310588B1
Application number: KR1020210019296A
Authority: KR
Inventors: 김도훈
Original assignee: 주식회사 인피닉
Priority date: 2021-02-10
Filing date: 2021-02-10
Publication date: 2021-10-13

Abstract

Disclosed is a method for generating skeleton data of an object included in an image when annotating data for training artificial intelligence (AI). According to the present invention, the method comprises the following steps: when an image which is a target of an annotation work for training AI is loaded, identifying a structure template to be utilized for generation of skeleton data; outputting the identified structure template by overlaying the identified structure template on the image; moving positions of one or more key points included in the structure template under control of an operator; and generating skeleton data corresponding to an object included in the image on the basis of position coordinates of the moved key point and a connection relationship between the key points. In this case, the skeleton data is data related to a three-dimensional skeleton of an object for identifying the body shape, pose, or direction of the object included in the image and the structure template can be a data structure having a predefined number of key points according to properties of the object and a connection relationship between the predefined key points.

Description

Method of generating skeleton data for artificial intelligence learning, and computer program recorded on a recording medium for executing the method

The present invention relates to the design of training data for artificial intelligence (AI). More specifically, it relates to a method of generating skeleton data for an object included in an image when annotating AI training data, and to a computer program recorded on a recording medium for executing the method.

Artificial intelligence (AI) refers to technology that uses computer programs to artificially implement some or all of human learning, reasoning, and perception abilities. Within AI, machine learning refers to optimizing the parameters of a model composed of many parameters using given data. Machine learning is classified, according to the form of the training data, into supervised learning, unsupervised learning, and reinforcement learning.

In general, the design of AI training data proceeds through the stages of data structure design, data collection, data cleansing, data processing, data expansion, and data verification.

To describe each stage in more detail, the data structure is designed by defining an ontology, a classification scheme, and the like. Data is collected by direct capture, by web crawling, or through associations and professional organizations. Data cleansing removes duplicate data from the collected data and de-identifies personal information. Data processing consists of entering metadata and performing annotation. Data expansion consists of performing ontology mapping and supplementing or extending the ontology as needed. Data verification uses various verification tools to check validity against the target quality that has been set.

In general, annotation in the data processing stage is carried out by drawing a bounding box around an object included in an image and entering attribute information for the boxed object. Such annotation is also referred to as data labeling. The dataset that results from the annotation work is produced as a JSON (JavaScript Object Notation) file.

When the goal is simply to identify the attributes of an object included in an image, annotation by bounding box is sufficient. However, identifying the body shape, pose, or direction of the object requires skeleton data for that object.

Skeleton data may include one or more key points corresponding to the positions of points (for example, joints) that serve as references for changes in the body shape, pose, or direction of an object. For the key points that make up skeleton data, specifications such as their number, their connection relationships, and the points at which they must be located are defined in advance according to the attributes of the object.

For example, skeleton data following a 3D human pose model consists of 16 key points connected along the main skeleton of the human body, and the position of each key point is predefined: key point 1 is the left hip, key point 2 is the left knee, key point 3 is the left foot, key point 4 is the right hip, key point 5 is the right knee, key point 6 is the right foot, key point 7 is the center of the torso, key point 8 is the upper torso, key point 9 is the neck, key point 10 is the center of the head, key point 11 is the right shoulder, key point 12 is the right elbow, key point 13 is the right hand, key point 14 is the left shoulder, key point 15 is the left elbow, and key point 16 is the left hand.
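As an illustration only, the 16 key points listed above could be held in a small data structure such as the following Python sketch. The key point names follow the list in the text, but the class name `StructureTemplate`, the identifier spellings, and in particular the edge list are assumptions: the text states that the key points are connected along the main skeleton without enumerating the connections.

```python
from dataclasses import dataclass

# Key point names in the order given in the text (key point 1 .. key point 16).
HUMAN_POSE_KEY_POINTS = (
    "left_hip", "left_knee", "left_foot",
    "right_hip", "right_knee", "right_foot",
    "torso_center", "upper_torso", "neck", "head_center",
    "right_shoulder", "right_elbow", "right_hand",
    "left_shoulder", "left_elbow", "left_hand",
)

# ASSUMPTION: the source does not list the connections. These pairs of
# 1-based key point numbers are one plausible skeleton, chosen for illustration.
HUMAN_POSE_EDGES = (
    (1, 2), (2, 3),               # left leg
    (4, 5), (5, 6),               # right leg
    (7, 1), (7, 4),               # torso center to hips
    (7, 8), (8, 9), (9, 10),      # torso, neck, head
    (8, 11), (11, 12), (12, 13),  # right arm
    (8, 14), (14, 15), (15, 16),  # left arm
)


@dataclass(frozen=True)
class StructureTemplate:
    """Hypothetical container: a fixed set of key points plus their connections."""
    object_type: str
    key_points: tuple
    edges: tuple

    def is_consistent(self) -> bool:
        # Every edge must join two different, existing key point numbers.
        n = len(self.key_points)
        return all(1 <= a <= n and 1 <= b <= n and a != b for a, b in self.edges)


HUMAN_TEMPLATE = StructureTemplate("human", HUMAN_POSE_KEY_POINTS, HUMAN_POSE_EDGES)
```

Fixing the number of key points and the edges in the template is what would let an annotation tool detect an omitted key point or a wrong connection.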

Therefore, to generate skeleton data while annotating images for AI training, the operator performing the annotation must identify the skeleton data specification that matches the attributes of the object, ensure that no key point required by that specification is omitted, and place each key point at the correct position, all of which makes the work difficult.

Korean Patent Application Publication No. 10-2018-0122247, "Apparatus and method for generating machine learning data and annotations using skeleton information extracted from heterogeneous sensors" (published November 12, 2018)

One object of the present invention is to provide a method of generating skeleton data for an object included in an image when annotating AI training data.

Another object of the present invention is to provide a computer program recorded on a recording medium for executing a method of generating skeleton data for an object included in an image when annotating AI training data.

The technical objects of the present invention are not limited to those mentioned above, and other technical objects not mentioned will be clearly understood by those skilled in the art from the following description.

To achieve the technical objects described above, the present invention proposes a method of generating skeleton data for an object included in an image when annotating AI training data. The method may include: when an image that is the target of annotation work for artificial intelligence (AI) training is loaded, identifying a structure template that can be used to generate skeleton data; overlaying the identified structure template on the image and outputting it; moving the positions of one or more key points included in the structure template under the control of an operator; and generating skeleton data corresponding to an object included in the image based on the position coordinates of the moved key points and the connection relationships between the key points. Here, the skeleton data is data related to the three-dimensional skeleton of the object, used to identify the body shape, pose, or direction of the object included in the image, and the structure template may be a data structure having a number of key points predefined according to the attributes of the object and predefined connection relationships between those key points.
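The final step of the method, generating skeleton data from the moved key point coordinates and the template's connection relationships, might be sketched as follows. This is a minimal illustration and not the claimed implementation: the function name `generate_skeleton_data`, the dictionary layout, the use of 2D image coordinates, and the three-point `ARM_TEMPLATE` are all assumptions made for the example.

```python
def generate_skeleton_data(template, initial_positions, operator_moves):
    """Combine the template's edges with key point positions after operator edits.

    template          -- dict with "object_type", "key_points" (names), "edges" (name pairs)
    initial_positions -- {name: (x, y)} as placed by the overlay step
    operator_moves    -- {name: (x, y)} for key points the operator dragged
    """
    names = template["key_points"]
    positions = dict(initial_positions)
    for name, xy in operator_moves.items():
        if name not in names:
            raise KeyError(f"unknown key point: {name}")
        positions[name] = xy

    # The template fixes which key points must exist, so omissions are detectable.
    missing = [n for n in names if n not in positions]
    if missing:
        raise ValueError(f"missing key points: {missing}")

    return {
        "object_type": template["object_type"],
        "key_points": [
            {"name": n, "x": positions[n][0], "y": positions[n][1]} for n in names
        ],
        "edges": [list(edge) for edge in template["edges"]],
    }


# Hypothetical three-point template used only to exercise the function.
ARM_TEMPLATE = {
    "object_type": "arm",
    "key_points": ["shoulder", "elbow", "hand"],
    "edges": [("shoulder", "elbow"), ("elbow", "hand")],
}
overlaid = {"shoulder": (100, 50), "elbow": (100, 90), "hand": (100, 130)}
skeleton = generate_skeleton_data(ARM_TEMPLATE, overlaid, {"hand": (140, 95)})
```

Because the edges are copied from the template rather than drawn by the operator, the connection relationships in the output cannot drift from the specification.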

More specifically, the step of identifying the structure template may identify one structure template from a database of structure templates standardized for each object type, according to preset attributes of the project related to the AI training, attributes of the image, or attributes of the operator.
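One way to picture this lookup is the Python sketch below. The text says only that the template is chosen according to project, image, or operator attributes; the in-memory `TEMPLATE_DB`, the `"object_type"` attribute key, and the order in which the three attribute sets are consulted are assumptions made for illustration.

```python
# Hypothetical database: one standardized template per object type.
TEMPLATE_DB = {
    "human": {"object_type": "human", "key_point_count": 16},
    "vehicle": {"object_type": "vehicle", "key_point_count": 8},
}


def identify_template(db, project_attrs, image_attrs, operator_attrs):
    """Return the first template whose object type is named by any attribute set.

    ASSUMPTION: project attributes take precedence over image attributes,
    which take precedence over operator attributes.
    """
    for attrs in (project_attrs, image_attrs, operator_attrs):
        object_type = attrs.get("object_type")
        if object_type in db:
            return db[object_type]
    raise LookupError("no structure template matches the given attributes")


chosen = identify_template(
    TEMPLATE_DB,
    project_attrs={"name": "pose-estimation"},  # no object type set at project level
    image_attrs={"object_type": "human"},
    operator_attrs={"object_type": "vehicle"},
)
```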

Furthermore, the step of identifying the structure template may identify one structure template from the database according to the size, position, or shape that the object to which the structure template is to be applied occupies within the image.

The overlaying and outputting step may overlay the structure template on the image so that one or more preset reference key points among the key points of the structure template are positioned on preset feature points. Here, a feature point may be a point defined in advance according to the position of a joint at which bones branch in the skeleton of the object, or the position of the body part that moves the least among the body parts of the object.
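Placing a reference key point on a feature point can be illustrated with a simple translation of the whole template, as in the sketch below. The function name `overlay_template` and the choice of a pure translation (no scaling or rotation) are assumptions; the text does not specify how the remaining key points are positioned.

```python
def overlay_template(template_positions, reference_name, feature_point):
    """Shift every key point so that the reference key point lands on feature_point.

    template_positions -- {name: (x, y)} in the template's own coordinates
    reference_name     -- name of the preset reference key point
    feature_point      -- (x, y) image coordinates of the preset feature point
    """
    ref_x, ref_y = template_positions[reference_name]
    dx = feature_point[0] - ref_x
    dy = feature_point[1] - ref_y
    return {name: (x + dx, y + dy) for name, (x, y) in template_positions.items()}


# Hypothetical template fragment with the torso center as the reference key point.
template_positions = {"torso_center": (0, 0), "neck": (0, -40), "left_hip": (-15, 30)}
placed = overlay_template(template_positions, "torso_center", (320, 240))
```

The relative layout of the key points is preserved, so the operator starts from a complete skeleton and only needs to adjust individual points.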

The overlaying and outputting step may also identify the position of the feature point at which the reference key point is to be placed by querying a third artificial intelligence, machine-learned on a dataset consisting of object sizes, positions, and shapes together with feature point positions, based on the size, position, or shape that the object to which the structure template is to be applied occupies within the image.

The step of moving the key point positions may output, with different user interface (UI) representations, the key points of the structure template that can be matched to the object included in the image and the key points that cannot be matched to it.

The method may further include, after the step of moving the key point positions, a step of setting attribute information that includes body shape, pose, or direction information for the object. In this case, the step of generating the skeleton data generates the skeleton data with the attribute information included, and the step of setting the attribute information may identify body shape, pose, or direction information to suggest to the operator based on the position coordinates of the moved key points and the connection relationships between the key points, and then output the identified suggestion through a user interface (UI).

The step of moving the key point positions may output, through a user interface (UI), the structure template rotated in three dimensions under the control of the operator.

In addition, the step of moving the key point positions may, based on the position coordinates of the moved key points, output through a user interface (UI) that there is an error in the movement of the key points when the angle between a first edge connecting a first key point and a second key point of the structure template and a second edge connecting the second key point and a third key point falls outside a preset threshold angle range.
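The angle check between two edges that share the second key point can be sketched with basic vector arithmetic. The function names and the example range of 10 to 180 degrees are assumptions chosen for illustration; the text says only that a threshold angle range is preset, without giving values.

```python
import math


def edge_angle_degrees(p1, p2, p3):
    """Angle at p2 between the edge p2-p1 and the edge p2-p3, in degrees."""
    v1 = (p1[0] - p2[0], p1[1] - p2[1])
    v2 = (p3[0] - p2[0], p3[1] - p2[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("coincident key points: edge has zero length")
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


def movement_has_error(p1, p2, p3, min_deg, max_deg):
    """True when the angle between the two edges is outside [min_deg, max_deg]."""
    angle = edge_angle_degrees(p1, p2, p3)
    return not (min_deg <= angle <= max_deg)


# Hypothetical elbow check: shoulder, elbow, hand positions in image coordinates.
shoulder, elbow, hand = (100, 50), (100, 90), (140, 90)
elbow_angle = edge_angle_degrees(shoulder, elbow, hand)
flagged = movement_has_error(shoulder, elbow, hand, min_deg=10, max_deg=180)
```

A tool could run such a check after each drag and highlight the offending key point when the result is flagged.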

To achieve the technical objects described above, the present invention also proposes a computer program recorded on a recording medium for executing the method of generating skeleton data. The computer program may be combined with a computing device that includes a memory, an input/output device, and a processor that processes instructions resident in the memory. The computer program may be recorded on a recording medium in order to execute the steps of: the processor identifying, when an image that is the target of annotation work for AI training is loaded into the memory, a structure template that can be used to generate skeleton data; the processor overlaying the identified structure template on the image and outputting it through the input/output device; the processor moving the positions of one or more key points included in the structure template under the control of an operator entered through the input/output device; and the processor generating skeleton data corresponding to an object included in the image based on the position coordinates of the moved key points and the connection relationships between the key points.

Specific details of other embodiments are included in the detailed description and the drawings.

According to embodiments of the present invention, when generating skeleton data for an object included in an image for AI training, using a template in which the number of key points and their connection relationships are standardized ensures that no key point that must be included in the skeleton data is omitted and that the connection relationships between key points are set correctly.

In addition, according to embodiments of the present invention, key points corresponding to parts of the object that cannot be clearly seen in the image are set so that they can be distinguished, which allows the operator to perform annotation while checking the skeleton data for the object's three-dimensional skeleton more clearly.

As a result, according to embodiments of the present invention, it is possible to generate skeleton data with which the body shape, pose, or direction of an object included in an image can be learned accurately.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those of ordinary skill in the art to which the present invention pertains from the description of the claims.

FIGS. 1 to 3 are configuration diagrams of an artificial intelligence learning system according to various embodiments of the present invention.
FIG. 4 is a logical configuration diagram of an annotation device according to an embodiment of the present invention.
FIG. 5 is a hardware configuration diagram of an annotation device according to an embodiment of the present invention.
FIGS. 6 to 10 are exemplary views for explaining a process of generating skeleton data according to an embodiment of the present invention.
FIGS. 11 to 14 are exemplary views for explaining a process of generating skeleton data for consecutive images according to an embodiment of the present invention.
FIG. 15 is a flowchart for explaining a method of generating skeleton data according to an embodiment of the present invention.
FIG. 16 is a flowchart for explaining a method of identifying a structure template using previously performed annotation results according to an embodiment of the present invention.

It should be noted that the technical terms used in this specification are used only to describe specific embodiments and are not intended to limit the present invention. Unless specifically defined otherwise in this specification, the technical terms used here should be interpreted in the sense generally understood by those of ordinary skill in the art to which the present invention pertains, and should be interpreted neither in an excessively broad sense nor in an excessively narrow sense. When a technical term used in this specification is an incorrect term that fails to express the spirit of the present invention accurately, it should be replaced with, and understood as, a technical term that those skilled in the art can understand correctly. General terms used in the present invention should be interpreted as defined in a dictionary or according to the surrounding context, and should not be interpreted in an excessively narrow sense.

In addition, singular expressions used in this specification include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "comprises" or "has" should not be interpreted as necessarily including all of the components or steps described in the specification; some of those components or steps may not be included, and additional components or steps may be included.

In addition, terms that include ordinal numbers, such as first and second, may be used in this specification to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be named a second component, and similarly a second component may be named a first component.

When a component is said to be "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but another component may also exist between them. In contrast, when a component is said to be "directly connected" or "directly coupled" to another component, it should be understood that no other component exists between them.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. Identical or similar components are given the same reference numerals regardless of the figure in which they appear, and redundant descriptions of them are omitted. In describing the present invention, detailed descriptions of related known technology are omitted where they would obscure the gist of the invention. It should also be noted that the accompanying drawings are provided only to make the spirit of the present invention easy to understand, and the spirit of the present invention should not be construed as limited by them. The spirit of the present invention should be construed as extending, beyond the accompanying drawings, to all modifications, equivalents, and substitutes.

As described above, for skeleton data, specifications such as the number of key points, their connection relationships, and the points at which they must be located are defined in advance according to the attributes of the object, so that the body shape, pose, or direction of the object included in the image can be identified. Therefore, to generate skeleton data while annotating images for AI training, the operator performing the annotation must identify the skeleton data specification that matches the attributes of the object, ensure that no key point required by that specification is omitted, and place each key point at the correct position, all of which makes the work difficult.

To overcome this difficulty, the present invention proposes means of generating skeleton data for an object included in an image either by using a template with a standardized structure (a structure template) or by using the results of annotation already performed on consecutive images.

FIGS. 1 to 3 are configuration diagrams of an artificial intelligence learning system according to various embodiments of the present invention.

As shown in FIG. 1, an artificial intelligence learning system according to an embodiment of the present invention may include one or more annotation devices (100-1, 100-2, 100-3, …, 100-n; 100) and an artificial intelligence learning device 300.

As shown in FIG. 2, an artificial intelligence learning system according to another embodiment of the present invention may additionally include a learning data design device 200 besides the annotation device 100 and the artificial intelligence learning device 300.

In addition, as shown in FIG. 3, in an artificial intelligence learning system according to yet another embodiment of the present invention, the annotation device 100, the learning data design device 200, and the artificial intelligence learning device 300 may be connected to one another through a public network. In this case, some of the annotation devices 100 may be devices that perform annotation through a cloud service.

The components of the artificial intelligence learning system according to these various embodiments merely represent functionally distinct elements. In an actual physical environment, two or more components may therefore be implemented as one integrated unit, or a single component may be implemented as separate units.

Describing each component in turn, the annotation device 100 is a device that can be used to annotate images provided by the learning data design device 200 or the artificial intelligence learning device 300.

In particular, the annotation device 100 according to the present invention is characterized in that, when generating skeleton data for an object included in an image, it can use a template with a standardized structure (a structure template) or the results of annotation already performed on consecutive images.

The annotation device 100 may be any device that can exchange data with the learning data design device 200 or the artificial intelligence learning device 300 and perform computation using the exchanged data.

For example, the annotation device 100 may be a stationary computing device such as a desktop, a workstation, or a server, but is not limited to these. It may also be a mobile computing device such as a smartphone, a laptop, a tablet, a phablet, a portable multimedia player (PMP), a personal digital assistant (PDA), or an e-book reader.

The specific configuration and operation of the annotation device 100 described above will be explained later with reference to FIGS. 4 to 16.

As the next component, the learning data design device 200 is a device that can be used to design and generate AI training data. The learning data design device 200 is basically distinct from the artificial intelligence learning device 300, but in an actual physical environment it may be implemented as part of the artificial intelligence learning device 300.

Specifically, the learning data design device 200 may receive the attributes of a project related to AI learning from the artificial intelligence learning device 300. Based on the user's control and the project attributes, the learning data design device 200 may design the data structure for AI learning, refine the collected data, process the data, expand the data, and verify the data.

In particular, to process data for AI learning, the learning data design device 200 may transmit an image to be annotated to the annotation device 100 and may receive the annotation work result from the annotation device 100. In this case, the annotation work result may be in the JSON (JavaScript Object Notation) file format. Alternatively, the learning data design device 200 may receive an annotation work result in a format other than JSON and then generate a JSON file from the received result. The learning data design device 200 may then inspect the received or generated JSON file, package it, and transmit it to the artificial intelligence learning device 300.
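As an illustration of the JSON-based hand-off described above, the following is a minimal sketch. The source does not define a schema, so every field name (`image`, `objects`, `key_points`, `edges`) and the `inspect` rule are assumptions made only for this example.

```python
import json

# Hypothetical annotation work result for one image.
# All field names are illustrative assumptions, not a schema from the source.
annotation_result = {
    "image": {"file_name": "frame_0001.jpg", "width": 1920, "height": 1080},
    "objects": [
        {
            "type": "person",
            "key_points": [
                {"id": 1, "name": "left_hip", "x": 910, "y": 560},
                {"id": 2, "name": "left_knee", "x": 905, "y": 700},
            ],
            "edges": [[1, 2]],
        }
    ],
}


def to_json_text(result):
    """Serialize an annotation work result to JSON text."""
    return json.dumps(result, ensure_ascii=False, indent=2)


def inspect(json_text):
    """Toy inspection step: every edge must refer to existing key point ids."""
    data = json.loads(json_text)
    for obj in data["objects"]:
        ids = {kp["id"] for kp in obj["key_points"]}
        for a, b in obj["edges"]:
            if a not in ids or b not in ids:
                return False
    return True
```

A real inspection step would likely check far more (coordinate bounds, required attributes, and so on); the single rule here only shows where such checks could sit before packaging.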

The learning data design device 200 may be any device capable of transmitting and receiving data to and from the annotation device 100 and the artificial intelligence learning device 300 and of performing operations using the transmitted and received data. For example, the learning data design device 200 may be a stationary computing device such as a desktop, a workstation, or a server, but is not limited thereto.

Next, the artificial intelligence learning device 300 is a device that can be used to perform machine learning of an artificial intelligence (AI) based on the data for AI learning.

Specifically, the artificial intelligence learning device 300 may receive a JSON file directly from the annotation device 100, or a packaged JSON file from the learning data design device 200. The artificial intelligence learning device 300 may then perform machine learning of the AI using the received JSON file.

The artificial intelligence learning device 300 may be any device capable of transmitting and receiving data to and from the annotation device 100 or the learning data design device 200 and of performing operations using the transmitted and received data. For example, the artificial intelligence learning device 300 may be a stationary computing device such as a desktop, a workstation, or a server, but is not limited thereto.

The one or more annotation devices 100, the learning data design device 200, and the artificial intelligence learning device 300 described above may transmit and receive data over a network that combines one or more of a secure line directly connecting the devices, a public wired communication network, and a mobile communication network.

For example, the public wired communication network may include Ethernet, x Digital Subscriber Line (xDSL), Hybrid Fiber Coax (HFC), and Fiber To The Home (FTTH), but is not limited thereto. The mobile communication network may include Code Division Multiple Access (CDMA), Wideband CDMA (WCDMA), High Speed Packet Access (HSPA), Long Term Evolution (LTE), and 5th generation mobile telecommunication (5G), but is not limited thereto.

Hereinafter, the configuration of the annotation device 100 described above will be explained in more detail.

FIG. 4 is a logical configuration diagram of an annotation device according to an embodiment of the present invention.

As shown in FIG. 4, the annotation device 100 may include a communication unit 105, an input/output unit 110, a storage unit 115, a template providing unit 120, an annotation working unit 125, and a skeleton generating unit 130.

These components of the annotation device 100 merely represent functionally distinct elements. In an actual physical environment, two or more components may be implemented as one integrated component, or a single component may be implemented as several separate components.

Describing each component in turn, the communication unit 105 may transmit and receive data to and from the learning data design device 200 and the artificial intelligence learning device 300.

Specifically, the communication unit 105 may receive one or more images from the learning data design device 200 or the artificial intelligence learning device 300.

Here, an image is an image to be subjected to annotation work for AI learning. According to the data processing plan designed by the learning data design device 200 or the artificial intelligence learning device 300, the communication unit 105 may receive the images to be annotated one at a time or receive a plurality of images in a batch.

The communication unit 105 may transmit an annotation work result, including the skeleton data generated by the skeleton generating unit 130, to the learning data design device 200 or the artificial intelligence learning device 300.

Here, skeleton data is data related to the three-dimensional skeleton of an object, used to identify the body shape, pose, or direction of the object included in the image.

The skeleton data may include one or more key points, each corresponding to the position of a point (for example, a joint) that serves as a reference for changes in the body shape, pose, or direction of the object. For the key points that make up the skeleton data, specifications such as their number, their connection relationships, and the points at which they must be placed are defined in advance according to the attributes of the object.

For example, skeleton data based on a 3D human pose model may consist of 16 key points connected along the main skeleton of the human body. The position of each of the 16 key points is defined in advance: key point 1 is the left hip, key point 2 is the left knee, key point 3 is the left foot, key point 4 is the right hip, key point 5 is the right knee, key point 6 is the right foot, key point 7 is the center of the torso, key point 8 is the upper torso, key point 9 is the neck, key point 10 is the center of the head, key point 11 is the right shoulder, key point 12 is the right elbow, key point 13 is the right hand, key point 14 is the left shoulder, key point 15 is the left elbow, and key point 16 is the left hand.
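The 16-key-point model above can be sketched as a small data structure. The key point numbering and names follow the description. The `EDGES` list, however, is an assumption: the source only says the key points are "connected along the main skeleton" and does not enumerate the connections. The class and function names are likewise hypothetical.

```python
from dataclasses import dataclass, field

# Key point numbering and names as listed in the description.
KEY_POINT_NAMES = {
    1: "left hip", 2: "left knee", 3: "left foot",
    4: "right hip", 5: "right knee", 6: "right foot",
    7: "torso center", 8: "upper torso", 9: "neck", 10: "head center",
    11: "right shoulder", 12: "right elbow", 13: "right hand",
    14: "left shoulder", 15: "left elbow", 16: "left hand",
}

# ASSUMPTION: one plausible set of connections along the main skeleton.
# The source does not list the actual edges.
EDGES = [
    (1, 2), (2, 3), (4, 5), (5, 6),      # legs
    (1, 7), (4, 7), (7, 8),              # hips to torso
    (8, 9), (9, 10),                     # torso to head
    (8, 11), (11, 12), (12, 13),         # right arm
    (8, 14), (14, 15), (15, 16),         # left arm
]


@dataclass
class StructureTemplate:
    """Predefined key points plus their connection relationship."""
    names: dict = field(default_factory=lambda: dict(KEY_POINT_NAMES))
    edges: list = field(default_factory=lambda: list(EDGES))
    # Position of each key point as (x, y, z); all start at the origin.
    positions: dict = field(
        default_factory=lambda: {i: (0.0, 0.0, 0.0) for i in KEY_POINT_NAMES}
    )

    def move(self, key_point_id, x, y, z=0.0):
        """Move one key point, as an operator would during annotation."""
        if key_point_id not in self.names:
            raise KeyError(f"unknown key point id: {key_point_id}")
        self.positions[key_point_id] = (float(x), float(y), float(z))
```

Keeping names, edges, and positions together means the same object can serve both as the overlay shown to the operator and as the basis for the final skeleton data.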

In addition, the communication unit 105 may receive project attributes, image attributes, or operator attributes from the learning data design device 200 or the artificial intelligence learning device 300.

Here, the project attributes may include, for a project related to AI learning, the learning purpose, the learning period, the number of images required for learning, the attributes of the objects to be identified in the images, and the specifications for the skeleton data of the objects, but are not limited thereto.

The image attributes may include the file name of the image, the image size (width, height), resolution, bit depth, compression format, capture device name, exposure time, ISO sensitivity, focal length, aperture value, capture location coordinates (GPS latitude and longitude), and capture time, but are not limited thereto.

The operator attributes may include the operator's name, identification number, assigned workload, cost of the work, and evaluation of the work results, but are not limited thereto.

Next, the input/output unit 110 may receive a signal from the operator through a user interface (UI) or output a computed result to the outside.

Here, the operator is the person who performs the annotation work. The operator may also be referred to as a user, a performer, a labeler, or a data labeler, but is not limited thereto.

Specifically, the input/output unit 110 may output the image to be annotated. In particular, the input/output unit 110 may output a structure template overlaid on the image.

Here, the structure template is a data structure having a predefined number of key points according to the attributes of the object, together with a predefined connection relationship between those key points. The structure template can ultimately be used to generate the skeleton data.

The input/output unit 110 may receive from the operator a control signal for moving the position of a key point to be included in the skeleton data (that is, a key point included in the structure template).

The input/output unit 110 may receive from the operator a control signal for setting the attribute information of the object to be included in the skeleton data.

Here, the attribute information of the object may include the type of the object and information on the body shape, pose, or direction of the object, but is not limited thereto.

In addition, the input/output unit 110 may output the structure template in a three-dimensionally rotated form under the control of the operator.

Next, the storage unit 115 may store the data required for annotation work.

Specifically, the storage unit 115 may store the images received through the communication unit 105. The storage unit 115 may also store the project attributes, image attributes, or operator attributes received through the communication unit 105.

The storage unit 115 may temporarily store a structure template whose key points have been moved according to a control signal input through the input/output unit 110. The storage unit 115 may also temporarily store the object attributes input through the input/output unit 110.

In particular, the storage unit 115 may store a database of structure templates standardized for each type of object.

Next, when an image to be annotated for AI learning is loaded into memory, the template providing unit 120 may provide a structure template that can be used to generate skeleton data.

According to one embodiment of the present invention, the template providing unit 120 may identify the structure template using only the current image to be annotated.

Specifically, the template providing unit 120 may identify one structure template from the database of the storage unit 115 according to the project attributes, image attributes, or operator attributes set in advance through the communication unit 105.

For example, the template providing unit 120 may identify one structure template from the database of the storage unit 115 according to the size, position, or shape that the object to which the structure template will be applied occupies in the image.

The template providing unit 120 may then output the identified structure template overlaid on the image through the input/output unit 110.

In particular, when overlaying the structure template on the image, the template providing unit 120 may place one or more preset reference key points, among the key points included in the structure template, on preset feature points.

Here, a feature point may be a point defined in advance according to the position of a joint at which bones branch in the skeleton of the object, or the position of the body part with the least movement among the body parts of the object.

For example, for a structure template based on the 3D human pose model consisting of 16 key points, key point 10 may be placed at the center of the object's head.

To identify the position of the feature point at which a reference key point should be placed, the template providing unit 120 may query a separate AI, machine-learned on a dataset consisting of object sizes, positions, and shapes together with feature point positions, using the size, position, or shape that the object to which the structure template will be applied occupies in the image.

In contrast to the above, according to another embodiment of the present invention, the template providing unit 120 may identify the structure template by using the result of annotation already performed on a consecutive image.

Specifically, when a first image to be annotated is loaded, the template providing unit 120 may identify a second image that was captured consecutively in time with the first image.

To this end, among the images related to the project that includes the first image as an annotation target, the template providing unit 120 may identify as the second image an image that the operator annotating the first image has already annotated. That is, the second image may be an image, among those belonging to the same project as the first image, on which the operator has already performed annotation work.

If the operator has already annotated a plurality of images, the template providing unit 120 may use the image attributes to identify, among those images, the images whose capture location coordinates and capture device name are the same as those of the first image. Among the images with the same capture location coordinates and capture device name, the template providing unit 120 may then identify as the second image the image whose capture time differs least from that of the first image.
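The selection rule above (same capture location and device, then smallest capture time difference) can be sketched as follows. The `ImageInfo` type, its field names, and `select_second_image` are hypothetical names introduced for this example, and exact equality on GPS coordinates is an assumption; a real system might compare coordinates within a tolerance.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class ImageInfo:
    """Subset of image attributes used for selection (hypothetical type)."""
    file_name: str
    device_name: str
    gps: Tuple[float, float]      # (latitude, longitude)
    captured_at: datetime


def select_second_image(
    first: ImageInfo, annotated: Sequence[ImageInfo]
) -> Optional[ImageInfo]:
    """Pick the already-annotated image closest in time to `first`.

    Only images with the same capture device name and the same capture
    location coordinates as `first` are considered.
    """
    candidates = [
        img for img in annotated
        if img.device_name == first.device_name and img.gps == first.gps
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda img: abs((img.captured_at - first.captured_at).total_seconds()),
    )
```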

After identifying the second image, the template providing unit 120 may identify, among the objects included in the first image, the object that is the same as an object included in the second image.

To this end, the template providing unit 120 may determine whether an object in the first image and an object in the second image are the same object, based on the similarity of the size, position, or shape that each object occupies in its image.

In this case, the template providing unit 120 may perform image processing on the first image and the second image to determine the similarity of the size, position, or shape of the objects.

For example, the template providing unit 120 may split each of the first image and the second image into three images according to the RGB (Red, Green, Blue) channels. The template providing unit 120 may then perform edge detection on each of the three resulting images. More specifically, the template providing unit 120 may use either the Laplacian of Gaussian (LoG) algorithm or the Difference of Gaussian (DoG) algorithm for edge detection.

When using the LoG algorithm, the template providing unit 120 may remove noise from the image using a Gaussian filter. The template providing unit 120 may then apply a Laplacian filter to the denoised image, and extract edges by detecting zero crossings in the Laplacian-filtered image.

When using the DoG algorithm, the template providing unit 120 generates two Gaussian masks with different variances. The template providing unit 120 subtracts one of the generated masks from the other, and may then extract edges by applying the resulting difference mask to the image.
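The DoG steps above (two Gaussian masks with different variances, subtract, apply to the image) can be sketched in plain Python for a single-channel image stored as a list of rows. The sigma values, the mask radius, and the clamped-border handling are assumptions chosen for this example; the source does not specify them.

```python
import math


def gaussian_mask(sigma, radius):
    """Normalized (2*radius+1) x (2*radius+1) Gaussian mask."""
    mask = [
        [math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
         for x in range(-radius, radius + 1)]
        for y in range(-radius, radius + 1)
    ]
    total = sum(sum(row) for row in mask)
    return [[v / total for v in row] for row in mask]


def dog_mask(sigma_small, sigma_large, radius):
    """Difference of two Gaussian masks with different variances."""
    g_small = gaussian_mask(sigma_small, radius)
    g_large = gaussian_mask(sigma_large, radius)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(g_small, g_large)]


def apply_mask(image, mask):
    """Apply a symmetric mask to the image; borders are clamped (replicated)."""
    h, w = len(image), len(image[0])
    r = len(mask) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(-r, r + 1):
                yy = min(max(y + ky, 0), h - 1)
                for kx in range(-r, r + 1):
                    xx = min(max(x + kx, 0), w - 1)
                    acc += image[yy][xx] * mask[ky + r][kx + r]
            out[y][x] = acc
    return out


# A vertical step edge between columns 5 and 6.
step_image = [[0.0] * 6 + [1.0] * 6 for _ in range(6)]
response = apply_mask(step_image, dog_mask(1.0, 2.0, 2))
```

Because each Gaussian mask is normalized, the difference mask sums to zero, so flat regions give a response near zero while the response changes sign across the step. That sign change is what marks the edge.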

The template providing unit 120 may identify one or more closed regions (enclosures) formed by the edges extracted in each image. In this case, the template providing unit 120 may first binarize each image so that it is clear whether an edge region is closed.

The template providing unit 120 may determine the similarity between an object in the first image and an object in the second image based on the size, position, or shape that each identified closed region occupies in its image.

After identifying the same object, the template providing unit 120 may identify the structure template to be applied to the object in the first image based on the skeleton data annotated on the same object in the second image.

The template providing unit 120 may then output the identified structure template overlaid on the first image through the input/output unit 110.

In particular, when overlaying the structure template identified through the second image on the first image, the template providing unit 120 may automatically correct the position coordinates of the structure template so that they correspond to the object included in the first image.

Specifically, the template providing unit 120 may calculate the difference between the position of the object in the second image and the position of the object in the first image. The template providing unit 120 may then change the positions of the key points included in the structure template according to the calculated difference, and output the structure template overlaid on the first image.
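The automatic correction above can be sketched as a simple translation. Representing the object position as the center of a bounding box `(x_min, y_min, x_max, y_max)` is an assumption made for this example; the source only speaks of "the position of the object". The function names are hypothetical.

```python
def box_center(box):
    """Center of a bounding box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def shift_key_points(key_points, box_in_second, box_in_first):
    """Shift key points annotated on the second image onto the first image.

    `key_points` maps key point id -> (x, y) in the second image.
    The shift is the difference between the object positions in the two images.
    """
    cx2, cy2 = box_center(box_in_second)
    cx1, cy1 = box_center(box_in_first)
    dx, dy = cx1 - cx2, cy1 - cy2
    return {kp_id: (x + dx, y + dy) for kp_id, (x, y) in key_points.items()}


# Object moved 10 px right and 5 px down between the two images.
shifted = shift_key_points(
    {9: (100, 50), 10: (100, 30)},
    box_in_second=(80, 20, 120, 200),
    box_in_first=(90, 25, 130, 205),
)
```

This applies the same difference to every key point, which matches the case where the object keeps its shape between images; applying it to only a subset of key points would be a straightforward variation.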

To verify that the automatically corrected position coordinates of the structure template are valid, the template providing unit 120 may use the key point coordinates changed according to the calculated difference to determine whether the angle between a first edge, which connects a first key point and a second key point of the structure template, and a second edge, which connects the second key point and a third key point, falls outside a preset threshold angle range.

If the angle between the first edge and the second edge is within the threshold angle range, the template providing unit 120 may determine that the position coordinates of the structure template are correct. If, instead, the angle between the first edge and the second edge is outside the threshold angle range, the template providing unit 120 may change the position of the first key point or the third key point again so that the angle between the first edge and the second edge falls within the threshold angle range.

Furthermore, when changing the positions of the key points included in the structure template, the template providing unit 120 may compare the size, position, or shape that the object occupies in the first image with the size, position, or shape that it occupies in the second image, and based on the result decide whether to apply the difference to all key points of the structure template or only to some of them.

That is, if the object in the first image has moved while largely keeping the shape it had in the second image, the template providing unit 120 may automatically correct the position coordinates of all key points of the structure template at once. If, instead, the object in the first image has moved with its shape changed from the second image, the template providing unit 120 may automatically correct the position coordinates of only some of the key points of the structure template.

Next, the annotation working unit 125 may perform annotation by moving the positions of one or more key points included in the structure template provided by the template providing unit 120.

Specifically, according to a control signal input through the input/output unit 110, the annotation working unit 125 may move the positions of the key points of the structure template that the template providing unit 120 has output overlaid on the image.

To verify that such a key point movement is valid, the annotation working unit 125 may use the position coordinates of the moved key point to determine whether the angle between a first edge, which connects a first key point and a second key point of the structure template, and a second edge, which connects the second key point and a third key point, falls outside a preset threshold angle range.

If the angle between the first edge and the second edge is within the threshold angle range, the annotation working unit 125 may determine that the key point movement is correct. If, instead, the angle between the first edge and the second edge is outside the threshold angle range, the annotation working unit 125 may report through the user interface (UI) of the input/output unit 110 that there is an error in the key point movement.
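The angle check described above can be sketched as follows for 2D coordinates. The angle is measured at the second key point, which both edges share. The function names and the example threshold range are assumptions; the source states only that a threshold angle range is set in advance.

```python
import math


def angle_between_edges(p1, p2, p3):
    """Angle in degrees at p2 between edge p2-p1 and edge p2-p3."""
    v1 = (p1[0] - p2[0], p1[1] - p2[1])
    v2 = (p3[0] - p2[0], p3[1] - p2[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("key points on an edge must not coincide")
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


def check_key_point_move(p1, p2, p3, min_deg, max_deg):
    """Return None if the angle is within range, else an error message for the UI."""
    angle = angle_between_edges(p1, p2, p3)
    if min_deg <= angle <= max_deg:
        return None
    return (f"key point angle {angle:.1f} deg is outside "
            f"the allowed range [{min_deg}, {max_deg}]")
```

For example, with a hypothetical allowed range of 30 to 180 degrees for a hip-knee-foot chain, a knee bent at a right angle passes, while a nearly folded one produces a message that the UI could show to the operator.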

According to one embodiment of the present invention, the annotation working unit 125 may provide a function for distinguishing the parts of an object that can be clearly seen in the image from the parts that cannot.

Specifically, among the key points included in the structure template, the annotation working unit 125 may output, through the input/output unit 110, the key points that can be matched to the object in the image and the key points that cannot be matched to it using different user interface (UI) representations.

According to another embodiment of the present invention, the annotation working unit 125 may provide a function that lets the operator check whether the key points included in the structure template are located at the correct points.

Specifically, under the control of the operator, the annotation working unit 125 may output a three-dimensionally rotated view of the structure template through the user interface (UI) of the input/output unit 110.
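One way to picture the three-dimensional rotation is as a rotation of the key point coordinates about a vertical axis through the template's centroid. The Python sketch below assumes that each key point carries (x, y, z) coordinates and that the operator supplies a rotation angle; the function name and the choice of the y axis are illustrative assumptions, not details given in the specification.

```python
import math

def rotate_y(points, angle_deg):
    """Rotate 3D points about the vertical (y) axis through their centroid."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cz = sum(p[2] for p in points) / n
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    rotated = []
    for x, y, z in points:
        dx, dz = x - cx, z - cz
        # Standard rotation about the y axis; y is left unchanged.
        rotated.append((cx + c * dx + s * dz, y, cz - s * dx + c * dz))
    return rotated

# Two key points turned by a quarter turn around the vertical axis.
view = rotate_y([(1.0, 0.0, 0.0), (-1.0, 2.0, 0.0)], 90)
```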

When the movement of the key points described above is complete, the annotation working unit 125 may set attribute information for the object included in the image according to the operator's control signal input through the input/output unit 110. Here, the attribute information of the object may include, but is not limited to, the type of the object and information about the body shape, posture, or direction of the object.

According to another embodiment of the present invention, the annotation working unit 125 may proactively suggest attribute information for the object before the operator sets it.

Specifically, the annotation working unit 125 may identify the body shape, posture, or direction information to suggest to the operator based on the position coordinates of the moved key points and the connection relationships between the key points. The annotation working unit 125 may then output the identified suggestion through the user interface (UI) of the input/output unit 110.

Turning to the next component, the skeleton generating unit 130 may generate skeleton data based on the result of the annotation work performed by the annotation working unit 125.

Specifically, the skeleton generating unit 130 may generate skeleton data corresponding to the object included in the image based on the position coordinates of the key points included in the structure template and the connection relationships between the key points. In this case, the skeleton data may be generated so as to include the attribute information of the object.

The skeleton generating unit 130 may then transmit, through the communication unit 105, an annotation work result including the generated skeleton data to the learning data design apparatus 200 or the artificial intelligence learning apparatus 300. In this case, the annotation work result may have a JSON (JavaScript Object Notation) file format, but is not limited thereto.
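To illustrate what a JSON annotation work result might look like, the Python sketch below assembles key point coordinates, connection relationships, and object attributes into one serializable structure. All field names (`image_id`, `skeleton`, `key_points`, `edges`, `attributes`, `visible`) and the sample values are hypothetical; the specification states only that the result may be in JSON format.

```python
import json

def build_annotation_result(image_id, key_points, edges, attributes):
    """Assemble a JSON-serializable annotation result (hypothetical layout)."""
    return {
        "image_id": image_id,
        "skeleton": {
            # Each key point: position coordinates plus a visibility flag.
            "key_points": [
                {"id": i, "x": x, "y": y, "visible": visible}
                for i, (x, y, visible) in enumerate(key_points)
            ],
            # Connection relationships as pairs of key point ids.
            "edges": [list(edge) for edge in edges],
            "attributes": attributes,
        },
    }

result = build_annotation_result(
    image_id="frame_0001",
    key_points=[(120, 80, True), (118, 140, True), (90, 200, False)],
    edges=[(0, 1), (1, 2)],
    attributes={"type": "person", "posture": "standing", "direction": "front"},
)
payload = json.dumps(result, ensure_ascii=False, indent=2)
```

The `payload` string is what would be sent on to the downstream apparatus in this sketch.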

Hereinafter, the hardware for implementing the logical components of the annotation apparatus 100 described above is described in more detail.

FIG. 5 is a hardware configuration diagram of an annotation apparatus according to an embodiment of the present invention.

As shown in FIG. 5, the annotation apparatus 100 may include a processor 150, a memory 155, a transceiver 160, an input/output device 165, a data bus 170, and a storage 175.

The processor 150 may implement the operations and functions of the annotation apparatus 100 based on instructions of the software 180a, resident in the memory 155, that implements the skeleton data generation method. The software 180a implementing the skeleton data generation method may be loaded into the memory 155. The transceiver 160 may transmit data to and receive data from the learning data design apparatus 200 and the artificial intelligence learning apparatus 300. The input/output device 165 may receive data required for the operation of the annotation apparatus 100 and may output the image and the structure template. The data bus 170 is connected to the processor 150, the memory 155, the transceiver 160, the input/output device 165, and the storage 175, and may serve as a path through which these components exchange data with one another.

The storage 175 may store application programming interfaces (APIs), library files, resource files, and the like required to execute the software 180a implementing the skeleton data generation method. The storage 175 may store the software 180b implementing the skeleton data generation method. The storage 175 may also store a database 185 required to perform the skeleton data generation method. Here, the database 185 may store structure templates standardized for each object type, but is not limited thereto.

According to a first embodiment of the present invention, the software 180a, 180b that implements the skeleton data generation method, whether resident in the memory 155 or stored in the storage 175, may be a computer program recorded on a recording medium to execute the steps of: identifying, by the processor 150, a structure template usable for generating skeleton data when an image to be annotated for artificial intelligence (AI) learning is loaded into the memory 155; overlaying, by the processor 150, the structure template on the image and outputting it through the input/output device 165; moving, by the processor 150, the positions of one or more key points included in the structure template under the control of the operator input through the input/output device 165; and generating, by the processor 150, skeleton data corresponding to the object included in the image based on the position coordinates of the moved key points and the connection relationships between the key points.

According to a second embodiment of the present invention, the software 180a, 180b that implements the skeleton data generation method, whether resident in the memory 155 or stored in the storage 175, may be a computer program recorded on a recording medium to execute the steps of: identifying, by the processor 150, a second image captured consecutively in time with a first image when the first image to be annotated for artificial intelligence (AI) learning is loaded into the memory 155; identifying, by the processor 150, among the objects included in the first image, the object that is the same as an object included in the second image; identifying, by the processor 150, a structure template to apply to the object included in the first image based on the skeleton data annotated on the object included in the second image; and overlaying, by the processor 150, the identified structure template on the first image and outputting it through the input/output device 165.

More specifically, the processor 150 may include an application-specific integrated circuit (ASIC), another chipset, a logic circuit, and/or a data processing device. The memory 155 may include read-only memory (ROM), random access memory (RAM), flash memory, a memory card, a storage medium, and/or another storage device. The transceiver 160 may include a baseband circuit for processing wired and wireless signals. The input/output device 165 may include an input device such as a keyboard, a mouse, and/or a joystick; an image output device such as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, and/or an active-matrix OLED (AMOLED) display; and a printing device such as a printer or a plotter.

When an embodiment included in this specification is implemented in software, the method described above may be implemented as modules (procedures, functions, and so on) that perform the functions described above. The modules may reside in the memory 155 and be executed by the processor 150. The memory 155 may be internal or external to the processor 150 and may be connected to the processor 150 by various well-known means.

Each component shown in FIG. 5 may be implemented by various means, for example, hardware, firmware, software, or a combination thereof. In the case of a hardware implementation, an embodiment of the present invention may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, and the like.

In the case of a firmware or software implementation, an embodiment of the present invention may be implemented in the form of modules, procedures, functions, and the like that perform the functions or operations described above, and may be recorded on a recording medium readable by various computer means. Here, the recording medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the recording medium may be specially designed and configured for the present invention, or may be known to and usable by those skilled in computer software. Examples of the recording medium include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM (Compact Disk Read Only Memory) and DVD (Digital Video Disk); magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that a computer can execute using an interpreter or the like. Such a hardware device may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

Hereinafter, the features of the artificial intelligence learning system according to the various embodiments of the present invention described so far are described in detail with reference to the drawings.

FIGS. 6 to 10 are exemplary diagrams for explaining a process of generating skeleton data according to an embodiment of the present invention.

Referring to FIG. 6, when an image 10 to be annotated for artificial intelligence (AI) learning is loaded into memory, the annotation apparatus 100 of the artificial intelligence learning system according to an embodiment of the present invention may identify a structure template 30 for generating skeleton data 40 of one or more objects 20 included in the image 10.

Here, the structure template 30 refers to a data structure having a number of key points 31 predefined according to the properties of the object 20, together with predefined connection relationships 33 between those key points 31.
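The definition above can be sketched as a small immutable data structure: a fixed list of key points plus a fixed list of connections between them. The class name, the field names, and the five-point "person" template below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StructureTemplate:
    """A predefined set of key points and the connections between them."""
    object_type: str
    key_point_names: tuple  # one name per predefined key point
    edges: tuple            # pairs of key point indices

    def __post_init__(self):
        n = len(self.key_point_names)
        for a, b in self.edges:
            # Every connection must join two distinct, existing key points.
            if a == b or not (0 <= a < n and 0 <= b < n):
                raise ValueError(f"invalid edge ({a}, {b}) for {n} key points")

# Hypothetical template for a person with five key points.
PERSON_TEMPLATE = StructureTemplate(
    object_type="person",
    key_point_names=("head", "neck", "pelvis", "left_hand", "right_hand"),
    edges=((0, 1), (1, 2), (1, 3), (1, 4)),
)
```

Because the key point count and the connections are fixed by the template, an annotation built from it cannot omit a key point or connect the wrong pair.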

Referring to FIG. 7, the annotation apparatus 100 may overlay the identified structure template 30 on the object 20 included in the image 10 and output it.

In this case, the annotation apparatus 100 may position one or more preset reference key points 31-sp, among the key points 31 included in the structure template 30, on preset feature points 31-sp.

Here, a feature point 31-sp may be a point predefined according to the position of a joint at which bones branch in the skeleton of the object 20, or the position of the body part that moves the least among the body parts of the object.
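The anchoring of a reference key point on a feature point can be sketched as a plain translation of the whole template. The Python example below assumes 2D key point coordinates and a feature point that is already known; how the feature point is located, and whether the template is also scaled or rotated, is not specified in the text and is left out here. The function name is hypothetical.

```python
def anchor_template(key_points, ref_index, feature_point):
    """Shift all key points so that the reference key point lands on the feature point."""
    rx, ry = key_points[ref_index]
    dx, dy = feature_point[0] - rx, feature_point[1] - ry
    return [(x + dx, y + dy) for x, y in key_points]

# Key point 0 is the reference; place it on a feature point at (10, 10).
placed = anchor_template([(0, 0), (1, 2)], 0, (10, 10))
```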

Referring to FIG. 8, the annotation apparatus 100 may perform annotation by moving the position of one or more key points 31 included in the structure template 30 (from 31-old to 31-new) under the control of the operator.

In particular, the annotation apparatus 100 may provide a function for distinguishing the parts of the object 20 that can be clearly identified in the image 10 from the parts that cannot.

For example, the annotation apparatus 100 may render a key point 31 corresponding to a part that cannot be clearly identified in the image 10 in a different color, or shade it, so that it can be distinguished from the key points 31 corresponding to parts that can be clearly identified in the image 10.

Referring to FIG. 9, the annotation apparatus 100 may verify whether the movement of a key point is valid.

Specifically, based on the position coordinates of the moved key point, the annotation apparatus 100 may determine whether the angle d between a first edge 33-1 connecting the first key point 31-1 and the second key point 31-2 of the structure template 30 and a second edge 33-2 connecting the second key point 31-2 and the third key point 31-3 falls outside a preset threshold angle range.

If the angle d between the first edge 33-1 and the second edge 33-2 is within the threshold angle range, the annotation apparatus 100 may determine that the key point was moved correctly. If, on the other hand, the angle d falls outside the threshold angle range, the annotation apparatus 100 may output, through the user interface (UI), an indication that the movement of the key point contains an error.

The annotation apparatus 100 may then set attribute information for the object 20 included in the image 10 under the control of the operator. Here, the attribute information of the object 20 may include, but is not limited to, the type of the object and information about the body shape, posture, or direction of the object.

Referring to FIG. 10, the annotation apparatus 100 may generate skeleton data 40 that includes the position coordinates of the key points 31 included in the structure template 30, the connection relationships 33 between the key points 31, and the attribute information of the object.

The annotation apparatus 100 may then transmit an annotation work result including the generated skeleton data 40 to the learning data design apparatus 200 or the artificial intelligence learning apparatus 300.

According to the embodiment of the present invention described above, when skeleton data 40 is generated for an object 20 included in an image 10 for artificial intelligence (AI) learning, using a template 30 in which the number of key points 31, the connection relationships 33, and so on are standardized ensures that no key point 31 that must be included in the skeleton data 40 is omitted and that the connection relationships 33 between the key points 31 are set correctly.

In addition, according to the embodiment of the present invention, by marking the key points 31 that correspond to parts of the object 20 that cannot be clearly identified in the image 10 so that they are distinguishable, the operator can perform annotation while seeing more clearly the skeleton data that follows the three-dimensional skeleton of the object 20.

As a result, according to the embodiment of the present invention, it becomes possible to generate skeleton data 40 from which the body shape, posture, or direction of the object 20 included in the image 10 can be learned accurately.

FIGS. 11 to 14 are exemplary diagrams for explaining a process of generating skeleton data for consecutive images according to an embodiment of the present invention.

Referring to FIG. 11, when a first image 10 to be annotated for artificial intelligence (AI) learning is loaded into memory, the annotation apparatus 100 of the artificial intelligence learning system according to an embodiment of the present invention may identify a second image 10-old captured consecutively in time with the first image 10.

Specifically, among the images associated with the project that includes the first image 10 as an annotation target, the annotation apparatus 100 may identify as the second image 10-old an image that the operator annotating the first image 10 has already annotated. That is, the second image 10-old may be an image, among those belonging to the same project as the first image 10, on which the operator has already performed annotation work.

If there are multiple images that the operator has already annotated, the annotation apparatus 100 may identify, based on image attributes, those previously annotated images whose capture location coordinates and capture device name are the same as those of the first image 10. Among these, the annotation apparatus 100 may identify as the second image 10-old the image whose capture time differs least from that of the first image 10.
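The selection rule in this paragraph amounts to a filter followed by a minimum search. The Python sketch below assumes that each image is described by a dictionary with `location`, `device`, and `captured_at` entries; these key names and the sample values are hypothetical.

```python
from datetime import datetime

def select_second_image(first, candidates):
    """Pick the previously annotated image closest in capture time to `first`,
    restricted to images with the same capture location and device name."""
    same_source = [
        c for c in candidates
        if c["location"] == first["location"] and c["device"] == first["device"]
    ]
    if not same_source:
        return None
    return min(
        same_source,
        key=lambda c: abs((c["captured_at"] - first["captured_at"]).total_seconds()),
    )

first_image = {"name": "img_010", "location": (37.5, 127.0), "device": "cam_A",
               "captured_at": datetime(2021, 2, 10, 12, 0, 10)}
annotated = [
    {"name": "img_008", "location": (37.5, 127.0), "device": "cam_A",
     "captured_at": datetime(2021, 2, 10, 12, 0, 8)},
    {"name": "img_009", "location": (37.5, 127.0), "device": "cam_A",
     "captured_at": datetime(2021, 2, 10, 12, 0, 9)},
    {"name": "img_x", "location": (37.5, 127.0), "device": "cam_B",
     "captured_at": datetime(2021, 2, 10, 12, 0, 10)},
]
second_image = select_second_image(first_image, annotated)
```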

Referring to FIG. 12, the annotation apparatus 100 may identify, among the objects 20 included in the first image 10, the object that is the same as an object 20-old included in the second image.

Specifically, the annotation apparatus 100 may determine whether the object 20 included in the first image 10 and the object 20-old included in the second image 10-old are the same object based on the similarity of the size, position, or shape that the objects 20, 20-old occupy within the first image 10 and the second image 10-old.
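One common way to compare the size and position that two objects occupy is the intersection over union (IoU) of their bounding boxes. The Python sketch below uses IoU as a stand-in for the similarity measure; the specification does not name a particular measure, so the use of IoU, the box format (x1, y1, x2, y2), and the 0.5 threshold are all assumptions.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_object(prev_box, current_boxes, threshold=0.5):
    """Index of the box in the first image that best matches the object's box
    from the second image, or None if no overlap reaches the threshold."""
    best_index, best_score = None, threshold
    for i, box in enumerate(current_boxes):
        score = box_iou(prev_box, box)
        if score >= best_score:
            best_index, best_score = i, score
    return best_index
```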

In this case, to determine the similarity of the objects 20, 20-old, the annotation apparatus 100 may also perform image processing on the first image 10 and the second image 10-old, such as edge extraction, binarization, and closed-region identification.
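Two of the image processing steps named here, binarization and closed-region identification, can be sketched in pure Python on a small grayscale grid. The fixed threshold of 128 and the use of 4-connected component labeling to find regions are illustrative assumptions; a production system would more likely rely on an image processing library.

```python
from collections import deque

def binarize(gray, threshold=128):
    """Map each pixel to 1 if it is at or above the threshold, else 0."""
    return [[1 if v >= threshold else 0 for v in row] for row in gray]

def find_regions(binary):
    """Group foreground pixels into 4-connected regions of (row, col) cells."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    regions = []
    for r in range(h):
        for c in range(w):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions

gray = [[200, 200, 0, 0],
        [200, 0, 0, 255],
        [0, 0, 0, 255]]
regions = find_regions(binarize(gray))
```

The size and position of each region could then feed the object comparison between the two images.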

Referring to FIGS. 13 and 14, the annotation apparatus 100 may identify the structure template 30 to apply to the object 20 included in the first image 10 based on the skeleton data 40 annotated on the same object 20-old included in the second image 10-old.
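A simple reading of this step is that the earlier skeleton data seeds the template for the new image: the same key points and connections are reused, with the earlier positions as the starting layout. The Python sketch below assumes a dictionary layout for the skeleton data (`key_points`, `edges`, `attributes`) that is hypothetical, and copies the data so that later edits do not alter the earlier annotation.

```python
def template_from_previous(prev_skeleton):
    """Build a starting template for the first image from the skeleton data
    annotated on the same object in the second image."""
    return {
        "object_type": prev_skeleton["attributes"].get("type"),
        # Copy each key point so the operator's edits leave the old data intact.
        "key_points": [dict(kp) for kp in prev_skeleton["key_points"]],
        "edges": [tuple(edge) for edge in prev_skeleton["edges"]],
    }

previous = {
    "key_points": [{"id": 0, "x": 10, "y": 20}, {"id": 1, "x": 12, "y": 40}],
    "edges": [(0, 1)],
    "attributes": {"type": "person"},
}
template = template_from_previous(previous)
template["key_points"][0]["x"] = 15  # operator adjusts a key point
```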

The process of annotating the first image 10 based on the identified structure template 30 is the same as that described with reference to FIGS. 7 to 10, and is therefore not repeated here.

According to the embodiment of the present invention described above, when skeleton data 40 is generated for an object 20 included in an image 10 for artificial intelligence (AI) learning, reusing the annotation result already produced for the consecutive image 10-old makes faster annotation possible.

Therefore, according to the embodiment of the present invention, the skeleton data 40 for learning the body shape, posture, or direction of the objects 20-old, 20 included in each of a plurality of consecutively captured images 10-old, 10 can be generated more quickly and accurately.

Hereinafter, the operation of the annotation apparatus 100 described above is described in more detail.

FIG. 15 is a flowchart for explaining a skeleton data generation method according to an embodiment of the present invention.

In describing the skeleton data generation method according to an embodiment of the present invention with reference to FIG. 15, descriptions identical to those given with reference to FIGS. 6 to 10 are not repeated.

Referring to FIG. 15, the annotation apparatus 100 may load into memory, under the control of the operator, an image to be annotated for artificial intelligence (AI) learning (S100).

When the image is loaded into memory, the annotation apparatus 100 may identify a structure template that can be used to generate skeleton data (S200).

Specifically, according to an embodiment of the present invention, the annotation apparatus 100 may identify one structure template from the database according to the attributes of the project, the attributes of the image, or the attributes of the operator.

Alternatively, according to another embodiment of the present invention, the annotation apparatus 100 may identify the structure template by using the annotation result already produced for a consecutive image. This process is described later with reference to FIG. 16.

The annotation apparatus 100 may overlay the identified structure template on the image and output it (S300).

In particular, when overlaying the structure template on the image, the annotation apparatus 100 may position one or more preset reference key points, among the key points included in the structure template, on preset feature points.

Positioning the reference key points of the structure template in advance in this way makes the operator's annotation work more convenient.

The annotation apparatus 100 may move the position of one or more key points included in the structure template under the control of the operator (S400). The annotation apparatus 100 may then set attribute information for the object included in the image under the control of the operator (S500).

The annotation apparatus 100 may generate skeleton data based on the position coordinates of the key points included in the structure template, the connection relationships between the key points, and the attribute information of the object (S600). The annotation apparatus 100 may then transmit an annotation work result including the generated skeleton data to the learning data design apparatus 200 or the artificial intelligence learning apparatus 300.

FIG. 16 is a flowchart for explaining a method of identifying a structure template using a previously produced annotation result according to an embodiment of the present invention.

In describing the method of generating skeleton data for consecutive images according to an embodiment of the present invention with reference to FIG. 16, descriptions identical to those given with reference to FIGS. 11 to 14 are not repeated.

Referring to FIG. 16, when a first image that is the target of the annotation work is loaded into memory, the annotation apparatus 100 may identify a second image captured in temporal succession with the first image (S210).

The annotation apparatus 100 may identify, among the objects included in the first image, an object that is the same as an object included in the second image (S220).

The annotation apparatus 100 may then identify a structure template applicable to the object included in the first image, based on the skeleton data annotated on the same object in the second image (S230).
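Steps S210 to S230 can be sketched as a lookup over neighboring frames. The sketch below is illustrative only. It assumes that each object carries a hypothetical `object_id` that is stable across frames (the specification does not state how the same object is recognized), that frames carry timestamps, and that "temporally continuous" means within a configurable `max_gap`. All class and function names are invented for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AnnotatedObject:
    object_id: str                        # hypothetical cross-frame identity
    template_name: Optional[str] = None   # template used when annotated, if any


@dataclass
class Frame:
    timestamp: float
    objects: List[AnnotatedObject] = field(default_factory=list)


def identify_template_from_neighbor(first: Frame, frames: List[Frame],
                                    target_id: str,
                                    max_gap: float = 1.0) -> Optional[str]:
    """Return the template already used for the same object in the
    temporally closest frame, or None if nothing suitable is found."""
    # S210: find the second image, i.e. the frame closest in time to the first.
    candidates = [f for f in frames
                  if f is not first
                  and abs(f.timestamp - first.timestamp) <= max_gap]
    if not candidates:
        return None
    second = min(candidates, key=lambda f: abs(f.timestamp - first.timestamp))
    # S220: look for the same object in the second image.
    for obj in second.objects:
        if obj.object_id == target_id and obj.template_name is not None:
            # S230: reuse the template that was applied to that object.
            return obj.template_name
    return None
```

Returning `None` lets the caller fall back to another way of choosing a template when no adjacent annotated frame exists.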

As described above, preferred embodiments of the present invention have been disclosed in this specification and the drawings. It will be apparent to those of ordinary skill in the art to which the present invention pertains that, in addition to the embodiments disclosed herein, other modifications based on the technical idea of the present invention can be implemented. Although specific terms are used in this specification and the drawings, they are used only in a general sense to explain the technical content of the present invention easily and to aid understanding of the invention, and are not intended to limit its scope. Accordingly, the above detailed description should not be construed as restrictive in any respect and should be considered exemplary. The scope of the present invention should be determined by a reasonable interpretation of the appended claims, and all modifications within the equivalent scope of the present invention are included in the scope of the present invention.

Annotation apparatus: 100-1, 100-2, 100-3, …, 100-n; 100
Learning data design apparatus: 200
Artificial intelligence learning apparatus: 300
Communication unit: 105
Input/output unit: 110
Storage unit: 115
Template providing unit: 120
Annotation work unit: 125
Skeleton generation unit: 130

Claims

1. A method for generating skeleton data, the method comprising:
identifying, by an annotation device, a structure template for generating skeleton data of an object included in an image that is the target of an annotation operation for artificial intelligence (AI) learning;
outputting, by the annotation device, the identified structure template overlaid on the image;
moving, by the annotation device, a position of one or more key points included in the structure template under an operator's control; and
generating, by the annotation device, skeleton data corresponding to the object included in the image based on position coordinates of the moved key points and a connection relationship between the moved key points,
wherein the skeleton data is data on a three-dimensional skeleton of the object for identifying the body shape, pose, or direction of the object included in the image, and
wherein the structure template is a data structure having a number of key points predefined according to the properties of the object and a connection relationship between the predefined key points.

2. The method of claim 1, wherein identifying the structure template comprises:
identifying one structure template from a database that holds structure templates standardized for each type of object.

3. The method of claim 2, wherein identifying the structure template comprises:
identifying one structure template from the database according to the size, position, or shape that the object to which the structure template is to be applied occupies in the image.

4. The method of claim 1, wherein outputting the overlaid structure template comprises:
overlaying the structure template on the image so that one or more preset reference key points among the key points included in the structure template are located on preset feature points,
wherein each feature point is a point defined in advance according to the position of a joint at which bones branch in the skeleton of the object, or the position of the body part that moves least among the body parts of the object.

5. The method of claim 4, wherein outputting the overlaid structure template comprises:
identifying the position of the feature point at which the reference key point is to be located by querying a third artificial intelligence, machine-learned on a dataset consisting of the size, position, and shape of objects and the positions of feature points, based on the size, position, or shape that the object to which the structure template is to be applied occupies in the image.

6. The method of claim 1, wherein moving the position of the key point comprises:
outputting, with different user interfaces (UI), key points that can correspond to the object included in the image and key points that cannot correspond to the object included in the image, among the key points included in the structure template.

7. The method of claim 1, further comprising, after moving the position of the key point:
setting, by the annotation device, attribute information including body shape, posture, or direction information about the object,
wherein generating the skeleton data comprises generating the skeleton data so as to include the attribute information, and
wherein setting the attribute information comprises identifying body shape, posture, or direction information to be suggested to the operator based on the position coordinates of the moved key points and the connection relationship between the moved key points, and then outputting the identified body shape, posture, or direction information through a user interface (UI).

8. The method of claim 1, wherein moving the position of the key point comprises:
outputting, through a user interface (UI), the structure template in a three-dimensionally rotated form under the operator's control.

9. The method of claim 1, wherein moving the position of the key point comprises:
outputting, through a user interface (UI), an error in the position movement of the key point when, based on the position coordinates of the moved key points, the angle between a first edge connecting a first key point and a second key point included in the structure template and a second edge connecting the second key point and a third key point is outside a preset threshold angle range.

10. A computer program recorded on a recording medium, the computer program being combined with a computing device comprising:
a memory;
an input/output device; and
a processor that processes instructions resident in the memory,
to cause the processor to execute:
identifying a structure template for generating skeleton data of an object included in an image to be annotated for artificial intelligence (AI) learning;
overlaying the identified structure template on the image and outputting it through the input/output device;
moving the position of one or more key points included in the structure template according to the operator's control input through the input/output device; and
generating skeleton data corresponding to the object included in the image based on position coordinates of the moved key points and a connection relationship between the moved key points,
wherein the skeleton data is data on a three-dimensional skeleton of the object for identifying the body shape, posture, or direction of the object included in the image, and
wherein the structure template is a data structure having a number of key points predefined according to the properties of the object and a connection relationship between the predefined key points.