KR20220021423A

KR20220021423A - Apparatus and method for collecting learning data for training of surgical tool and organ recognition

Info

Publication number: KR20220021423A
Application number: KR1020210105617A
Authority: KR
Inventors: 윤지훈; 홍승범; 홍슬기; 이지원; 신소연; 박보경; 성낙준; 유하영; 김성재; 박성현; 최민국
Original assignee: (주)휴톰
Priority date: 2020-08-11
Filing date: 2021-08-10
Publication date: 2022-02-22
Also published as: KR102655325B1

Abstract

Provided are a method for collecting learning data for training surgical tool recognition and a program therefor. The method is executed by a computer and comprises: acquiring, by the computer, a plurality of surgical data from among actual surgical data, virtual surgical data, and random surgical data; and composing the acquired plurality of surgical data into a learning dataset and providing it. The present invention can thereby resolve class imbalance.

Description

Apparatus and method for collecting learning data for surgical tool and organ recognition learning training

The present invention relates to an apparatus and method for collecting learning data for training surgical tool and organ recognition.

In recent surgical practice, the share of open surgery has fallen sharply, while the share of minimally invasive surgery, which minimizes patient complications and thereby speeds recovery, has grown substantially.

Among these, the share of robot-assisted minimally invasive surgery is growing even faster. In addition, because laparoscopic surgery is performed by a specialist watching images from a camera, every step of a minimally invasive operation that takes place inside the patient is recorded as video.

Therefore, if the surgical video is treated as a kind of surgical record and the surgical process is recognized from it, not only evaluation and analysis of the operation but also real-time guidance during the operation and even automated surgery may become possible.

Automated analysis and evaluation of the surgical process from surgical video requires localization and motion recognition of surgical tools, as well as recognition of organs. Surgical data, that is, the surgical process, the position and movement of surgical tools, the recognition of organs, and the like, can be acquired and used as learning data.

Republic of Korea Patent Publication No. 10-2013-0100758 (2013.09.11)

An object of the present invention is to provide learning data on surgical tools and organs for accurate training of surgical tool and organ recognition.

Another object of the present invention is to provide learning data capable of resolving class imbalance.

The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood by those skilled in the art from the following description.

To solve the above problems, a method for collecting learning data for training surgical tool and organ recognition according to an embodiment of the present invention is executed by a computer and includes: acquiring, by the computer, a plurality of surgical data from among actual surgical data, virtual surgical data, and random surgical data; and composing the acquired plurality of surgical data into a learning dataset and providing it. The actual surgical data includes actual surgical tool data, which includes the type of surgery actually performed, position information of the tools used, and information on the surgical tools used, and actual surgical organ data on the organs involved in the surgery actually performed. The virtual surgical data includes a virtual surgical environment, virtual surgical tool data used in the virtual surgical environment, and virtual surgical organ data on organs in the virtual surgical environment. The random surgical data may be virtual data generated by domain randomization.

The method may further include: performing, based on the learning dataset, surgical tool recognition training for a specific surgical tool used in a specific surgery; and generating a surgical tool recognition model based on the training performed.

The actual surgical tool data may include at least one of: data in which at least one of the head, wrist (joint), and body of a tool is distinguished and marked with a bounding box or a pixel-level polygon in the actual surgical image; and data in which the tool is marked with a bounding box or a pixel-level polygon in the actual surgical image according to each feature of the tool.

The virtual surgical environment is a 3D surgical environment built by converting specific actual surgical image data into 3D, and the virtual surgical tool data and the virtual surgical organ data may be information on virtual surgical tools and surgical organs derived from a user's operation within the 3D surgical environment.

The random surgical data is virtual data generated by domain randomization and may include data on at least one of a randomly generated specific surgery, a specific target organ, and random surgical tools used in the specific surgery, obtained by randomly changing the settings of the surgical tool, the target organ, and the camera.
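The random variation of tool, organ, and camera settings described above can be illustrated with a minimal Python sketch. The class lists, parameter names, and value ranges are hypothetical assumptions chosen for illustration; the source states only that these settings are changed at random, not which parameters or ranges are used.

```python
import random
from dataclasses import dataclass
from typing import List

# Hypothetical class lists and parameter ranges (not taken from the source).
TOOLS = ["grasper", "scissors", "clip_applier"]
ORGANS = ["stomach", "liver", "gallbladder"]


@dataclass(frozen=True)
class RandomScene:
    tool: str
    organ: str
    tool_rotation_deg: float
    organ_texture_id: int
    camera_distance_mm: float
    light_intensity: float


def sample_scene(rng: random.Random) -> RandomScene:
    """Draw one randomized scene configuration for a renderer to consume."""
    return RandomScene(
        tool=rng.choice(TOOLS),
        organ=rng.choice(ORGANS),
        tool_rotation_deg=rng.uniform(0.0, 360.0),
        organ_texture_id=rng.randrange(10),
        camera_distance_mm=rng.uniform(30.0, 120.0),
        light_intensity=rng.uniform(0.5, 1.5),
    )


def sample_scenes(n: int, seed: int = 0) -> List[RandomScene]:
    """Draw n scene configurations; a fixed seed makes the draw repeatable."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n)]
```

Each sampled configuration would then be passed to a rendering step, which is outside the scope of this sketch, to produce an image together with its labels.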

The learning data may be acquired so that the data on the specific surgical tools and the specific surgical organs for a specific surgery are each of a uniform amount, achieving class balance.

In addition, the present invention further includes, after acquiring the surgical data, generating semantic image data for the acquired surgical data through a segmentation model, and the step of composing and providing the learning dataset may compose the plurality of surgical data and the semantic image data corresponding to each of the plurality of surgical data into the learning dataset and provide it.

Here, in the step of generating the semantic image data: when the plurality of surgical data includes the actual surgical data, actual semantic image data is generated from the actual surgical data through the segmentation model; when the plurality of surgical data includes the virtual surgical data, virtual semantic image data is generated from the virtual surgical data through the segmentation model; and when the plurality of surgical data includes the random surgical data, random semantic image data may be generated from the random surgical data through the segmentation model.

In addition, the semantic image data includes semantic surgical tool data for the tools or semantic surgical organ data for the organs included in each of the plurality of surgical data, and the semantic surgical tool data or the semantic surgical organ data may include at least one of the position range of the tool or organ and the name of the tool or organ.
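One way to picture the semantic image data described above is as a small record type that holds, for each tool or organ, its name and position range. The following Python sketch is an assumption made for illustration: the field names and the bounding-box representation of the position range are not specified in the source.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SemanticObject:
    name: str                         # name of the tool or organ
    kind: str                         # "tool" or "organ"
    bbox: Tuple[int, int, int, int]   # assumed position range: (x_min, y_min, x_max, y_max)


@dataclass
class SemanticImageData:
    source: str                       # "actual", "virtual", or "random"
    objects: List[SemanticObject] = field(default_factory=list)

    def names(self, kind: str) -> List[str]:
        """Return the names of all objects of the given kind."""
        return [o.name for o in self.objects if o.kind == kind]
```

A record with `source="actual"` would correspond to actual semantic image data, and likewise for the virtual and random cases.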

To solve the above problems, a method for collecting learning data for training surgical tool recognition according to an embodiment of the present invention is executed by a computer and includes: acquiring, by the computer, actual surgical data; extracting, from the acquired actual surgical data, the types of data whose annotations are insufficient and therefore cause class imbalance, together with the amount needed for each; and acquiring, as balance data, data of the extracted types in the needed amount from at least one of virtual surgical data and random surgical data. The actual surgical data includes actual surgical tool data, which includes the type of surgery actually performed, position information of the tools used, and information on the surgical tools used. The virtual surgical data includes a virtual surgical environment and virtual surgical tool data used in the virtual surgical environment. The random surgical data may be virtual data generated by domain randomization.
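The extraction step above, which finds the under-annotated data types and the amount needed for each, can be sketched in Python as follows. The balancing target is an assumption: the source does not say how the needed amount is determined, so this sketch uses the count of the most frequent class unless a target is given.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def annotation_deficit(labels: Iterable[str],
                       target: Optional[int] = None) -> Dict[str, int]:
    """Return, per class, how many more annotations are needed to reach the target.

    `labels` holds one class label per annotation found in the actual surgical data.
    If `target` is None, the count of the most frequent class is used (an assumption).
    """
    counts = Counter(labels)
    if not counts:
        return {}
    goal = max(counts.values()) if target is None else target
    return {cls: goal - n for cls, n in counts.items() if n < goal}
```

The resulting mapping says how many samples of each class would be requested from the virtual or random surgical data as balance data. Classes that never appear in the actual data are not reported by this sketch and would need to be supplied separately.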

The method may further include: performing surgical tool recognition training for a specific surgical tool used in a specific surgery, using the acquired balance data and the actual surgical data as learning data; and generating a surgical tool recognition model based on the training performed.

When the specific surgery is a gastrectomy, the specific surgical tools include at least one of robotic surgical tools, laparoscopic surgical tools, and auxiliary surgical tools, and the specific surgical organs include the stomach, liver, gallbladder, spleen, and pancreas. When the specific surgery is a cholecystectomy, the specific surgical tools include at least one of laparoscopic tools and auxiliary surgical tools, and the specific surgical organs may include the gallbladder and liver.
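The surgery-specific tool groups and organs listed above could be held in a simple lookup table. The Python sketch below is illustrative only: the key names and the `tool:`/`organ:` label prefixes are hypothetical, and the tool groups are the candidates from which, per the text, at least one is used.

```python
from typing import Dict, List

# Candidate tool groups and target organs per surgery type, as listed in the text.
SURGERY_PROFILES: Dict[str, Dict[str, List[str]]] = {
    "gastrectomy": {
        "tool_groups": ["robotic", "laparoscopic", "auxiliary"],
        "organs": ["stomach", "liver", "gallbladder", "spleen", "pancreas"],
    },
    "cholecystectomy": {
        "tool_groups": ["laparoscopic", "auxiliary"],
        "organs": ["gallbladder", "liver"],
    },
}


def classes_for(surgery: str) -> List[str]:
    """List the recognition classes (hypothetical label format) for one surgery type."""
    profile = SURGERY_PROFILES[surgery]
    tools = [f"tool:{group}" for group in profile["tool_groups"]]
    organs = [f"organ:{organ}" for organ in profile["organs"]]
    return tools + organs
```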

A program for collecting learning data for training surgical tool recognition according to another embodiment of the present invention may be combined with a computer, which is hardware, and stored in order to execute any one of the above methods.

Other specific details of the invention are included in the detailed description and drawings.

According to the present invention, class imbalance can be resolved by acquiring, as virtual data, data that is difficult to obtain from actual surgical data.

In addition, according to the present invention, a more accurate learning model can be generated by repeatedly producing a large amount of varied data and using it.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

FIG. 1A is a diagram illustrating an apparatus 10 for collecting learning data for training surgical tool and organ recognition according to the present invention.
FIG. 1B is a flowchart of a method for collecting learning data for training surgical tool and organ recognition according to an embodiment of the present invention.
FIG. 2A is a diagram showing an example of the annotation of surgical tools and organs within an image according to the present invention.
FIG. 2B is a diagram illustrating the generation of actual semantic image data from actual surgical data through the segmentation model according to the present invention.
FIG. 2C is a diagram illustrating the generation of virtual semantic image data from virtual surgical data through the segmentation model according to the present invention.
FIG. 2D is a diagram illustrating the generation of random semantic image data from random surgical data through the segmentation model according to the present invention.
FIG. 3 is a flowchart of a method for generating a surgical tool recognition model using a learning dataset according to the present invention.
FIG. 4 is a flowchart of a method for acquiring, as balance data, the data that is insufficient in the actual surgical data according to the present invention.
FIG. 5 is a graph showing the distribution of annotations obtained from actual surgical data according to the present invention.
FIG. 6 is a graph showing how the distribution of annotations changes when virtual surgical data and/or random surgical data are acquired according to the present invention.
FIG. 7 is a graph of the distribution of annotations obtained from actual surgical data (laparoscopic cholecystectomy and gastric cancer videos) according to the present invention.
FIG. 8 is a graph of the distribution of annotations obtained from laparoscopic cholecystectomy and gastrectomy for actual surgical data (gastric cancer surgery data) and virtual surgical data (3D patient model) according to the present invention.
FIG. 9 is a flowchart of a method for generating a surgical tool and organ recognition model based on the acquired actual surgical data and balance data.

The advantages and features of the present invention, and the methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the invention pertains are fully informed of its scope; the present invention is defined only by the scope of the claims.

The terminology used in this specification is for describing the embodiments and is not intended to limit the present invention. In this specification, the singular form also includes the plural unless the context specifically states otherwise. As used in the specification, "comprises" and/or "comprising" do not exclude the presence or addition of one or more components other than those mentioned. Like reference numerals refer to like components throughout the specification, and "and/or" includes each of the mentioned components and every combination of one or more of them. Although "first", "second", and the like are used to describe various components, these components are of course not limited by these terms. The terms are used only to distinguish one component from another. Accordingly, a first component mentioned below may also be a second component within the technical idea of the present invention.

Unless otherwise defined, all terms (including technical and scientific terms) used in this specification have the meaning commonly understood by those of ordinary skill in the art to which the present invention belongs. In addition, terms defined in commonly used dictionaries are not to be interpreted ideally or excessively unless clearly and specifically defined.

In this specification, a 'computer' includes all of the various devices capable of performing arithmetic processing. For example, a computer may be not only a desktop PC or a notebook but also a smartphone, a tablet PC, a cellular phone, a PCS (Personal Communication Service) phone, a synchronous/asynchronous IMT-2000 (International Mobile Telecommunication-2000) mobile terminal, a Palm PC (Palm Personal Computer), a personal digital assistant (PDA), or the like.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1A is a diagram illustrating an apparatus 10 for collecting learning data for training surgical tool and organ recognition according to the present invention.

FIG. 1B is a flowchart showing a process of collecting learning data for training surgical tool and organ recognition according to an embodiment of the present invention.

FIG. 2A is a diagram showing an example of the annotation of surgical tools and organs within an image according to the present invention.

FIG. 2B is a diagram illustrating the generation of actual semantic image data from actual surgical data through the segmentation model according to the present invention.

FIG. 2C is a diagram illustrating the generation of virtual semantic image data from virtual surgical data through the segmentation model according to the present invention.

FIG. 2D is a diagram illustrating the generation of random semantic image data from random surgical data through the segmentation model according to the present invention.

Hereinafter, the apparatus 10 for collecting learning data for training surgical tool and organ recognition according to the present invention will be described with reference to FIGS. 1A to 2D.

The apparatus 10 can provide learning data on surgical tools and organs for accurate training of surgical tool and organ recognition. The apparatus 10 can also provide learning data capable of resolving class imbalance.

Accordingly, the apparatus 10 can resolve class imbalance by acquiring, as virtual data, data that is difficult to obtain from actual surgical data. The apparatus 10 can also generate a more accurate learning model by repeatedly producing a large amount of varied data and using it.

The apparatus 10 may be any of the various devices capable of performing arithmetic processing and providing results to a user, and may take the form of a computer.

For example, the computer may be not only a desktop PC or a notebook but also a smartphone, a tablet PC, a cellular phone, a PCS (Personal Communication Service) phone, a synchronous/asynchronous IMT-2000 (International Mobile Telecommunication-2000) mobile terminal, a Palm PC (Palm Personal Computer), a personal digital assistant (PDA), or the like. In addition, when a head mounted display (HMD) device includes a computing function, the HMD device may be the computer.

The computer may also be a server that receives requests from clients and performs information processing.

The apparatus 10 may include an acquisition unit 110, a memory 120, and a processor 130. The apparatus 10 may include fewer or more components than those shown in FIG. 1A.

The acquisition unit 110 may include one or more modules that enable wireless communication between the apparatus 10 and an external server (not shown) or between the apparatus 10 and a communication network (not shown).

The acquisition unit 110 may also include one or more modules that connect the apparatus 10 to one or more networks.

The acquisition unit 110 may acquire a plurality of surgical data from among actual surgical data, virtual surgical data, and random surgical data. The acquisition unit 110 may acquire the actual surgical data, the virtual surgical data, and the random surgical data from the external server (not shown).

The communication network (not shown) may carry various information between the apparatus 10 and the external server (not shown). Various types of communication network may be used, for example wireless schemes such as WLAN (Wireless LAN), Wi-Fi, WiBro, WiMAX, and HSDPA (High Speed Downlink Packet Access), or wired schemes such as Ethernet, xDSL (ADSL, VDSL), HFC (Hybrid Fiber Coax), FTTC (Fiber To The Curb), and FTTH (Fiber To The Home).

The communication network (not shown) is not limited to the communication schemes listed above and may include any other communication scheme that is widely known or will be developed in the future.

The memory 120 may store data supporting various functions of the apparatus 10. The memory 120 may store a plurality of application programs (applications) run on the apparatus 10, as well as data and instructions for the operation of the apparatus 10. At least some of these application programs may exist for the basic functions of the apparatus 10. An application program may be stored in the memory 120, installed on the apparatus 10, and run by the processor 130 to perform an operation (or function) of the apparatus 10.

The memory 120 may also store, for each patient, the actual surgical data, the virtual surgical data, and the random surgical data acquired by the acquisition unit 110.

The memory 120 may also hold a plurality of processes for collecting learning data for training surgical tool and organ recognition according to the present invention. These processes are described below together with the operation of the processor 130.

In addition to operations related to the application programs, the processor 130 may generally control the overall operation of the apparatus 10. The processor 130 may provide or process appropriate information or functions for the user by processing signals, data, and information input or output through the components described above, or by running the application programs or the plurality of processes stored in the memory 120.

The processor 130 may also control at least some of the components described with reference to FIG. 1A in order to run the application programs or the plurality of processes stored in the memory 120. Furthermore, the processor 130 may operate two or more of the components included in the apparatus 10 in combination with each other in order to run the application programs or the plurality of processes.

Referring to FIG. 1B, the method of the present invention for collecting learning data for training surgical tool and organ recognition is executed by the processor 130 of the apparatus 10 and may include: acquiring, by the processor 130, a plurality of surgical data (S110); after acquiring the surgical data, generating semantic image data from the acquired plurality of surgical data through a segmentation model (S120); and composing the acquired plurality of surgical data and the semantic image data into a learning dataset and providing it (S130).
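The three steps S110 to S130 can be sketched as a small Python pipeline. This is a minimal illustration under stated assumptions: the dictionary layout of a sample and the `segment` callable standing in for the segmentation model are hypothetical and are not the implementation described in the source.

```python
from typing import Any, Callable, Dict, List

Sample = Dict[str, Any]


def acquire(sources: Dict[str, List[Any]]) -> List[Sample]:
    """S110: gather samples from the actual, virtual, and random sources provided."""
    return [
        {"source": name, "image": image}
        for name, images in sources.items()
        for image in images
    ]


def add_semantics(samples: List[Sample],
                  segment: Callable[[Any], Any]) -> List[Sample]:
    """S120: attach semantic image data produced by the segmentation model."""
    return [{**sample, "semantic": segment(sample["image"])} for sample in samples]


def build_dataset(sources: Dict[str, List[Any]],
                  segment: Callable[[Any], Any]) -> List[Sample]:
    """S130: pair each surgical data item with its semantic image data."""
    return add_semantics(acquire(sources), segment)
```

In practice `segment` would wrap a trained segmentation model; any callable that maps an image to its semantic output fits this sketch.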

프로세서(130)는 적어도 복수의 수술데이터를 획득할 수 있다(S110). The processor 130 may acquire at least a plurality of surgical data (S110).

여기서, 프로세서(130)는 복수의 프로세스 중 제1 프로세스를 기반으로 실제 수술데이터, 가상 수술데이터 및 랜덤 수술데이터 중 적어도 복수의 수술데이터를 획득할 수 있다.Here, the processor 130 may acquire at least a plurality of surgical data among actual surgical data, virtual surgical data, and random surgical data based on a first process among the plurality of processes.

여기서, 실제 수술데이터는, 실제로 수행된 수술의 종류, 사용된 도구의 위치정보 및 사용된 수술 도구의 정보를 포함하는 실제 수술 도구데이터와 실제로 수행된 수술의 장기에 대한 실제 장기데이터를 포함할 수 있다.Here, the actual surgical data may include actual surgical tool data including the type of surgery actually performed, location information of the used tool, and information on the used surgical tool, and actual organ data on the organ of the actually performed surgery. there is.

In addition, each item of the actual surgical data may be obtained from actual surgical image data, and the actual surgical image data may be image data captured during an actual operation.

As described above, the actual surgical data includes the type of surgery performed and the position information of the tools used during the actual surgery, and may include actual surgical tool data, which is information on the surgical tools used, and actual surgical organ data, which is information on the organs operated on.

The processor 130 may generate the actual surgical data from the actual surgical image data through a segmentation model. The segmentation model is described in detail below.

Here, in the actual surgical tool data, at least one of the head, wrist (joint), and body of a tool may be distinguished through the segmentation model and marked with a bounding box within the actual surgical image, outlined as a polygon at the pixel level, or both.

Alternatively, in the actual surgical tool data, each feature of a tool may be marked with a bounding box within the actual surgical image, outlined as a polygon at the pixel level, or both, through the segmentation model.

Here, the features of a tool include any distinguishing characteristic, such as the type and size of the tool, and each feature may be distinguished through the segmentation model and marked with a bounding box or outlined as a polygon at the pixel level. Polygon processing may be a method of tracing the outline of a specific part pixel by pixel.

In addition, in the actual surgical organ data, at least one of the pixels containing a visually distinguishable organ in the actual surgical image may be polygon-processed through the segmentation model. Alternatively, the actual surgical organ data may include a bounding box, obtained through the segmentation model, around a specific part of the actual surgical image that contains the organ.

Bounding within the actual surgical image may mean annotating, in the form of a bounding box and through the segmentation model, the part of the image to be specified, that is, the part corresponding to the information to be learned.

Alternatively, polygon processing within the actual surgical image may mean outlining, at the pixel level and through the segmentation model, the part of the image to be specified, that is, the part corresponding to the information to be learned.

Here, an annotation may be a ground-truth label that includes the specific part marked by a bounding box or a polygon.
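The relationship between the two annotation forms can be illustrated with a short sketch: a pixel-level polygon determines the tightest axis-aligned bounding box that encloses it. The function name `polygon_to_bbox`, the `(x, y)` vertex convention, and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for this example and do not come from the specification.

```python
from typing import List, Tuple

Point = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), assumed layout


def polygon_to_bbox(polygon: List[Point]) -> Box:
    """Return the tightest axis-aligned box enclosing a polygon outline."""
    if not polygon:
        raise ValueError("polygon must contain at least one vertex")
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


# A hypothetical annotation record holding both forms plus the class label.
outline = [(2, 3), (5, 1), (4, 6)]
annotation = {
    "label": "Stapler",
    "polygon": outline,
    "bbox": polygon_to_bbox(outline),
}
```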

Referring to FIG. 2A, the method of annotating a surgical tool by marking it with a bounding box in a surgical image applies not only to actual surgical images but equally to images of a 3D surgical environment or a 3D virtual space.

The surgical tool is divided into a head, a wrist, and a body, and each part is given a bounding box, in order to obtain accurate position information for the tool: once the position of the head and the position of the wrist and/or body are specified, the position and orientation of the surgical tool can be obtained.
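One simple way to derive position and orientation from part-level boxes is to take the head box center as the tool position and the unit vector from the body center to the head center as its direction. The specification does not state how the orientation is computed, so this is an assumed approach; `box_center` and `tool_pose` are hypothetical names, and boxes use an `(x_min, y_min, x_max, y_max)` layout.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def box_center(box: Box) -> Tuple[float, float]:
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def tool_pose(head_box: Box, body_box: Box):
    """Return (head position, unit direction pointing from body to head)."""
    hx, hy = box_center(head_box)
    bx, by = box_center(body_box)
    dx, dy = hx - bx, hy - by
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("head and body centers coincide; direction undefined")
    return (hx, hy), (dx / norm, dy / norm)
```

For a tool that also has a wrist box, the same function could be applied to the head and wrist boxes to estimate the direction of the tip segment.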

However, depending on its kinematic characteristics, a surgical tool may be divided into only a head and a body, or into a head, a wrist, and a body.

Surgical tools may include, by way of example, laparoscopic surgical tools and robotic surgical tools, as well as needles, specimen bags, and surgical tubes frequently used during surgery. Beyond the examples listed, any tool used during surgery may be included.

The virtual surgical data may include a virtual surgical environment, and virtual surgical tool data and virtual surgical organ data used in the virtual surgical environment. Here, the virtual surgical environment is a 3D surgical environment produced by converting specific actual surgical image data into 3D, and the virtual surgical tool data and the virtual surgical organ data may be virtual surgical tool information derived from a user's use within the 3D surgical environment.

The processor 130 may generate the virtual surgical data from the virtual surgical environment through the segmentation model. The segmentation model is described in detail below.

The virtual surgical environment may be an environment implemented through a virtual body model. Here, the virtual surgical environment may be a digital-twin virtual environment, built as follows, for an effective label distribution when training on surgical tools and organs.

In addition, the virtual body model may be generated from abdominal computed tomography images that take the surgical situation into account, using labels for segmenting five reconstructed organs and skin reflecting the patient's pneumoperitoneum for the 3D reconstruction.

In addition, for surgical tools, the virtual surgical environment may be generated by reconstructing 3D models from the catalogs and actual measurements of robotic and laparoscopic instruments, using the commercial software tools 3D Max, Zbrush, and Substance Painter.

In addition, to reproduce the camera viewpoint of the surgery, the virtual surgical environment may be generated by building a virtual camera environment with the same intrinsic and extrinsic parameters as the endoscope used in robotic surgery.

The 3D surgical environment produced by converting actual surgical image data into 3D may be produced by projection from 2D information corresponding to a specific camera viewpoint.

Here, the actual surgical image data may include, by way of example, Computed Tomography (CT), Nuclear Magnetic Resonance Computed Tomography (NMR-CT), Positron Emission Tomography (PET), Cone Beam CT (CBCT), Electron Beam Tomography (EBT), X-ray, and Magnetic Resonance Imaging (MRI) images. It is not limited to these examples, however, and may include any image data captured during surgery, including video.

The virtual surgical tool data and the virtual surgical organ data may be information on surgical tools and surgical organs that a user uses arbitrarily within the constructed 3D surgical environment.

The processor 130 may annotate, through the segmentation model, a surgical tool or surgical organ that is used virtually, even a surgical tool not used in actual surgery, with a bounding box or a pixel-level polygon, and may perform training using the result as learning data.

For the virtual surgical tool data and the virtual surgical organ data, however, an image-to-image translation technique using a generative adversarial network may also be applied for use in training.

The generative adversarial network is described below together with the deep neural network.

The random surgical data is virtual data generated by domain randomization. By randomly varying the settings of the surgical tool, the target organ, and the camera, it may include data on at least one of a randomly generated specific surgery, a specific target organ, and a random surgical tool used in the specific surgery.

The processor 130 may generate the random surgical data from the virtual surgical environment through the segmentation model. The segmentation model is described in detail below.

That is, for the random surgical data, which is virtual data generated by domain randomization, the processor 130 may annotate a surgical tool or surgical organ with a bounding box or a pixel-level polygon through the segmentation model.

The 3D virtual space implemented to acquire the random surgical data may include a 3D virtual space of an arbitrary surgery, and may include a 3D surgical environment produced in the same way as the 3D surgical environment included in the virtual surgical data described above.

For the target organ, the color, texture, deformation, and position of the organ may be set randomly. For the surgical tool, the type, texture, kinematic motion, and position of the tool may be set randomly. For the camera, the field of view (FOV), position, type, and degree of distortion may be set randomly.
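The randomized settings listed above can be sketched as a sampler that draws one scene configuration per call. Everything concrete here is an assumption for illustration: the function name `sample_random_scene`, the dictionary keys, and all numeric ranges (for example a 60 to 90 degree field of view) are hypothetical and are not given in the specification.

```python
import random


def sample_random_scene(rng, tool_types, camera_types):
    """Draw one randomized scene configuration (all ranges are illustrative)."""

    def xyz():
        # Random position inside a unit cube centered at the origin (assumed).
        return tuple(rng.uniform(-1.0, 1.0) for _ in range(3))

    return {
        "organ": {
            "color_rgb": tuple(rng.random() for _ in range(3)),
            "texture_id": rng.randrange(10),
            "deformation": rng.uniform(0.0, 1.0),
            "position": xyz(),
        },
        "tool": {
            "type": rng.choice(tool_types),
            "texture_id": rng.randrange(10),
            "joint_angle_deg": rng.uniform(-90.0, 90.0),  # stands in for kinematic motion
            "position": xyz(),
        },
        "camera": {
            "type": rng.choice(camera_types),
            "fov_deg": rng.uniform(60.0, 90.0),
            "position": xyz(),
            "distortion": rng.uniform(0.0, 0.3),
        },
    }
```

Passing a seeded `random.Random` instance makes the generated configurations reproducible, which helps when a rendered frame has to be regenerated with its labels.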

That is, regardless of which surgical tools are generally used in existing actual surgeries, the processor 130 may randomly set all target organs, surgical tools, camera conditions, and the like, and turn the randomly obtained information into data.

For example, when acquiring the random surgical data, 3D models may be produced for one or more robotic tools (e.g., Harmonic Ace, Maryland Bipolar Forceps, Cadiere Forceps, Stapler, Medium-Large Clip Applier, Small Clip Applier), one or more laparoscopic tools (e.g., Atraumatic Grasper, Electric Hook, Curved Atraumatic Grasper, Suction-Irrigation, Clip Applier (metal), Scissors, Overholt, Ligasure), and one or more organs. The invention is not limited to this embodiment, however, and 3D models may be produced for various robotic tools and various laparoscopic tools.

Alternatively, when acquiring the random surgical data, 3D models may be produced for tools used in gastrectomy (e.g., atraumatic grasper, Cadiere forceps, curved atraumatic grasper, harmonic ace, Maryland bipolar forceps, medium-large clip applier, Overholt, scissors, small clip applier, stapler, and suction-irrigation) and for one or more laparoscopic tools (e.g., atraumatic grasper, clip applier (hem-o-lok), clip applier (metal), curved atraumatic grasper, electric hook, Ligasure, Overholt, scissors, and suction-irrigation). The invention is not limited to this embodiment, however, and 3D models may be produced for various robotic tools and various laparoscopic tools.

The processor 130 may generate semantic image data from the acquired plurality of surgical data through the segmentation model (S120).

Here, the segmentation model may include a semantic segmentation model and an instance segmentation model. The semantic segmentation model may include at least one of DeepLabV2, HRNet, and HRNet+OCR. The instance segmentation model may include at least one of Cascade R-CNN, HRNet, and HRNet+OCR.

Here, the semantic segmentation model may infer a class for every pixel in the image, and the instance segmentation model may infer instance information within a bounding box at the pixel level.
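The difference between the two kinds of output can be shown on a toy label map. In this sketch, offered only as an illustration, a semantic result assigns a class id to every pixel, while an instance result is a binary mask restricted to one bounding box. The helper names, the integer class ids, and the half-open `(x0, y0, x1, y1)` box convention are assumptions of this example.

```python
from typing import List, Set, Tuple

LabelMap = List[List[int]]
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), x1 and y1 exclusive


def semantic_classes(label_map: LabelMap) -> Set[int]:
    """Every pixel carries a class id; collect the ids present in the image."""
    return {cls for row in label_map for cls in row}


def instance_mask(label_map: LabelMap, box: Box, class_id: int) -> LabelMap:
    """Binary mask of one class, limited to the pixels inside a bounding box."""
    x0, y0, x1, y1 = box
    return [
        [1 if label_map[y][x] == class_id else 0 for x in range(x0, x1)]
        for y in range(y0, y1)
    ]


# Toy 3x4 label map: 0 = background, 1 and 2 = two hypothetical classes.
toy_map = [
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [0, 2, 2, 0],
]
```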

More specifically, the processor 130 may generate the semantic image data, through the segmentation model, from frames extracted from each of the actual surgical data, the virtual surgical data, and the random surgical data, together with the label maps corresponding to those frames.

Here, the semantic image data may include semantic surgical tool data for a tool, or semantic surgical organ data for an organ, included in each of the plurality of surgical data.

In the semantic surgical tool data or the semantic surgical organ data, the tool or organ included in each of the plurality of surgical data may be marked with a bounding box, outlined as a polygon at the pixel level, or both.

In addition, for each bounded or polygon-processed part, the semantic surgical tool data or the semantic surgical organ data may include at least one of the position of the tool or organ, the name of the tool or organ, and the state of the tool or organ (for example, for a tool, a state in which the wrist is folded; for an organ, a state in which a part has been cut).

Specifically, referring to FIG. 2B, when the plurality of surgical data includes the actual surgical data, the processor 130 may generate actual semantic image data from the actual surgical data through the segmentation model.

In addition, referring to FIG. 2C, when the plurality of surgical data includes the virtual surgical data, the processor 130 may generate virtual semantic image data from the virtual surgical data through the segmentation model.

In addition, referring to FIG. 2D, when the plurality of surgical data includes the random surgical data, the processor 130 may generate random semantic image data from the random surgical data through the segmentation model.

The processor 130 may compose the acquired plurality of surgical data and the semantic image data into a learning dataset and provide it (S130). Here, the recipient of the learning dataset may include a user, a computer, a system, or the like that makes use of the learning data.

More specifically, based on a third process among the plurality of processes, the processor 130 may compose the plurality of surgical data and the semantic image data for each of the plurality of surgical data into a learning dataset and provide it.

Accordingly, because actual surgical images are scarce and actual surgeries are likely to involve only the tools that are commonly used, training performance can be improved by adding virtual surgical data and random surgical data, and by further adding semantic image data.

As an example, the processor 130 may compose the actual surgical data and the semantic image data for the actual surgical data into a learning dataset and provide it.

FIG. 3 is a flowchart of a method for generating a surgical tool recognition learning model using the learning dataset of the present invention.

FIG. 3 adds to the method of FIG. 1B a step (S150) of performing surgical tool recognition training and a step (S170) of generating a surgical tool recognition learning model.

In the step of performing surgical tool recognition training (S150), surgical tool recognition training is performed, based on the learning dataset, for the specific surgical tools used in a specific surgery.

Here, the learning dataset is obtained from the actual surgical data, virtual surgical data, and/or random surgical data described above, and the obtained learning dataset may be organized by at least one of surgery type, organ size, organ condition, and patient age.

In addition, the learning data may be acquired with class balance, so that the data for each specific surgical tool used in a specific surgery is uniform in amount.

The specific surgical tools may include, by way of example, laparoscopic surgical tools and robotic surgical tools, as well as needles, specimen bags, and surgical tubes frequently used during surgery. Beyond the examples listed, any tool used during surgery may be included.

As a specific example, the laparoscopic tools and auxiliary surgical tools used in cholecystectomy may be Atraumatic Grasper, Electric Hook, Curved Atraumatic Grasper, Suction-Irrigation, Clip Applier (metal), Scissors, Overholt, Ligasure, Clip Applier (Hem-o-lok), and Specimen Bag. Here, for cholecystectomy, the surgical organs may include the gallbladder and the liver.

For gastrectomy, the tools may include robotic tools, laparoscopic tools, and auxiliary tools, and may be Harmonic Ace, Maryland Bipolar Forceps, Cadiere Forceps, Medium-Large Clip Applier, Small Clip Applier, Curved Atraumatic Grasper, Stapler, Suction, Suction-Irrigation, Scissors, Atraumatic Grasper, Overholt, Gauze Needle, Endotip, Drain Tube, Needle, Needle Holder, and Specimen Bag. Here, for gastrectomy, the surgical organs may include the stomach, liver, gallbladder, spleen, and pancreas.

Specifically, the robotic tools may include Harmonic Ace, Maryland Bipolar Forceps, Cadiere Forceps, Medium-Large Clip Applier, Small Clip Applier, and the like; the laparoscopic tools may include Curved Atraumatic Grasper, Stapler, Suction, and the like; and the auxiliary surgical tools may include Gauze Needle, Endotip, Specimen Bag, Drain Tube, and the like.

In the step of generating the surgical tool recognition learning model (S170), a surgical tool and organ recognition learning model is generated based on the surgical tool recognition training that was performed.

A deep neural network (DNN) according to embodiments of the present invention means a system or network that builds one or more layers in one or more computers and makes determinations based on a plurality of data. For example, a deep neural network may be implemented as a set of layers including a convolutional pooling layer, a locally-connected layer, and a fully-connected layer. The convolutional pooling layer or the locally-connected layer may be configured to extract features from an image. The fully-connected layer may determine correlations between the features of the image. In some embodiments, the overall structure of the deep neural network may be a convolutional pooling layer followed by a locally-connected layer, which is in turn followed by a fully-connected layer. The deep neural network may include various decision criteria (that is, parameters) and may add new decision criteria (that is, parameters) through analysis of input images.

The deep neural network according to embodiments of the present invention has a structure called a convolutional neural network (CNN), which is suited to image analysis. It may be configured as an integrated structure of a feature extraction layer, which learns by itself the features with the greatest discriminative power from the given image data, and a prediction layer, which trains a prediction model to achieve the highest prediction performance based on the extracted features.

The feature extraction layer may be formed by alternating, several times, a convolution layer, which produces a feature map by applying a plurality of filters to each region of the image, and a pooling layer, which spatially aggregates the feature map so that features invariant to changes in position or rotation can be extracted. In this way, features at various levels can be extracted, from low-level features such as points, lines, and surfaces to complex and meaningful high-level features.

The convolution layer obtains a feature map by applying a nonlinear activation function to the inner product of a filter and the local receptive field for each patch of the input image. Compared with other network structures, a CNN is characterized by filters with sparse connectivity and shared weights. This connection structure reduces the number of parameters to be learned, makes learning through the backpropagation algorithm efficient, and consequently improves prediction performance.
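The feature map computation described here can be written out for a single filter in plain Python. This is a minimal sketch under stated assumptions: ReLU is chosen as the nonlinear activation (the text does not name one), there is no padding and the stride is 1, and `conv2d_relu` is a hypothetical name. The same kernel is reused at every position, which is the weight sharing the paragraph refers to.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_relu(image: Matrix, kernel: Matrix, bias: float = 0.0) -> Matrix:
    """Slide one shared kernel over the image and apply ReLU to each response."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Inner product of the filter with the local receptive field at (i, j).
            response = bias
            for a in range(kh):
                for b in range(kw):
                    response += image[i + a][j + b] * kernel[a][b]
            # Nonlinear activation (ReLU assumed for this sketch).
            row.append(max(0.0, response))
        feature_map.append(row)
    return feature_map
```

Because one small kernel serves the whole image, the number of learned parameters depends on the kernel size rather than on the image size.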

The pooling layer (also called a sub-sampling layer) generates a new feature map using the local information of the feature map obtained from the preceding convolution layer. In general, the feature map newly generated by the pooling layer is smaller than the original feature map. Representative pooling methods include max pooling, which selects the maximum value of the corresponding region of the feature map, and average pooling, which takes the mean value of the corresponding region of the feature map. The feature map of a pooling layer is generally less affected by the position of any structure or pattern in the input image than the feature map of the preceding layer. That is, the pooling layer can extract features that are more robust to local changes such as noise or distortion in the input image or the previous feature map, and such features can play an important role in classification performance. Another role of the pooling layer is to allow features of a wider area to be reflected as one moves up to higher learning layers in a deep structure: as feature extraction layers are stacked, lower layers reflect local features, while higher layers generate features that reflect increasingly abstract characteristics of the entire image.
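Max pooling and average pooling can be compared side by side with a short sketch. It assumes non-overlapping square windows (stride equal to the window size), drops any incomplete window at the border, and uses the hypothetical name `pool2d`; none of these choices is specified in the text.

```python
from typing import List

Matrix = List[List[float]]


def pool2d(feature_map: Matrix, size: int = 2, mode: str = "max") -> Matrix:
    """Reduce a feature map with non-overlapping size x size windows."""
    if mode not in ("max", "avg"):
        raise ValueError("mode must be 'max' or 'avg'")
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + a][j + b]
                for a in range(size)
                for b in range(size)
            ]
            if mode == "max":
                row.append(max(window))                 # max pooling
            else:
                row.append(sum(window) / len(window))   # average pooling
        pooled.append(row)
    return pooled
```

With a 2 x 2 window the output has half the height and width of the input, which is the size reduction the paragraph describes.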

The features finally extracted through the repetition of convolution layers and pooling layers may be used for classification model training and prediction, with a classification model such as a multi-layer perceptron (MLP) or a support vector machine (SVM) attached in the form of a fully-connected layer.

However, the structure of the deep neural network according to embodiments of the present invention is not limited to this, and the network may be formed as a neural network of various structures.

The generative adversarial network (GAN) learning model is an unsupervised learning model composed of two models: a discriminator D, which is responsible for classification, and a generator G, which produces data from random noise.

In the generative adversarial network learning model, the generator G and the discriminator D compete and improve each other's performance. The discriminator D tries to judge only the original data as real, while the generator G produces fake data that the discriminator D cannot identify as fake, so that the performance of both models rises together.

The generator G tries to learn the probability distribution of the original data and to reproduce that distribution so that the fake data is indistinguishable from the real data. The discriminator D distinguishes whether the data under examination is real data or data produced by the generator G, and estimates the probability of each.

The formula of the generative adversarial network learning model is given by Equation 1 below.

[Equation 1]
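The body of Equation 1 is not present in this text. The surrounding description (a generator G mapping random noise to data, and a discriminator D estimating the probability that a sample is real) matches the standard GAN minimax objective, so the following is offered as an assumed reconstruction and not as a transcription of the original equation. The symbols p_data (distribution of the real data) and p_z (distribution of the input noise) are introduced here for the example.

```latex
\min_{G} \max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_{z}(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

Under this objective, D is trained to make D(x) large for real samples and D(G(z)) small for generated ones, while G is trained to make D(G(z)) large.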

FIG. 4 is a flowchart of a method for acquiring, as balancing data, the data that is lacking in the actual surgical data.

도 5는 본 발명에 따른 실제 수술데이터로부터 획득된 어노테이션의 분포를 나타낸 그래프이다.5 is a graph showing the distribution of annotations obtained from actual surgical data according to the present invention.

도 6은 본 발명에 따른 가상 수술데이터 및/또는 랜덤 수술데이터를 획득함으로써 달라진 어노테이션의 분포를 나타낸 그래프이다.6 is a graph showing the distribution of annotations changed by acquiring virtual surgical data and/or random surgical data according to the present invention.

도 7은 본 발명에 따른 실제 수술데이터(복강경 담낭 절제술과 위암 비디오)에서 얻은 어노테이션의 분포를 설명한 그래프이다.7 is a graph illustrating the distribution of annotations obtained from actual surgical data (laparoscopic cholecystectomy and gastric cancer video) according to the present invention.

도 8은 본 발명에 따른 실제 수술데이터(위암 수술데이터) 및 가상 수술데이터(3D 환자 모델)에 대한 복강경 담낭 절제술 및 위 절제술에서 얻은 주석 분포를 설명한 그래프이다.8 is a graph explaining the distribution of annotations obtained from laparoscopic cholecystectomy and gastrectomy for actual surgical data (gastric cancer surgery data) and virtual surgery data (3D patient model) according to the present invention.

Referring to FIG. 4, the learning data collection method for surgical tool recognition training of the present invention includes acquiring actual surgical data (S210), extracting the type and required amount of data lacking annotations (S230), and acquiring data corresponding to the extracted data type as balance data (S250).

The actual surgical data, virtual surgical data, and random surgical data in FIGS. 4 to 9 are the same as described above with reference to FIGS. 1A to 3.

In the step of extracting the type and required amount of data lacking annotations (S230), the type and required amount of data whose annotations are insufficient, and which therefore causes class imbalance, are extracted from the acquired actual surgical data.
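The extraction in step S230 can be sketched as a per-class count followed by a comparison against a target amount. This is a minimal sketch under stated assumptions, not the method of the invention: the function name `extract_deficits`, the tool names, and the choice of the largest class count as the default target are all hypothetical, since the text does not fix how the required amount is determined.

```python
from collections import Counter

def extract_deficits(labels, classes=None, target=None):
    """Return {class: number of additional annotations needed}.

    labels:  one class name per annotation in the actual surgical data.
    classes: optional full list of classes, so that classes never seen in
             the actual data are counted as zero (hypothetical parameter).
    target:  desired annotations per class; defaults to the largest class
             count (an assumption, the source does not specify a target).
    """
    counts = Counter(labels)
    for cls in classes or ():
        counts.setdefault(cls, 0)
    if not counts:
        return {}
    if target is None:
        target = max(counts.values())
    # Keep only the classes that fall short of the target.
    return {cls: target - n for cls, n in counts.items() if n < target}

# Hypothetical annotation labels from actual surgical data.
real_labels = ["grasper"] * 5 + ["clipper"] * 2 + ["scissors"] * 1
deficits = extract_deficits(real_labels)
print(deficits)  # {'clipper': 3, 'scissors': 4}
```

The returned mapping gives both the type of data that is lacking (the keys) and the required amount (the values).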

The tools used in an actual surgery may be limited, some actions may never be performed in an actual surgery, and the tools and surgical actions may differ entirely from one surgical case to another. It is therefore difficult to obtain data on every tool use and action from actual surgeries. For this reason, class imbalance may occur when only actual surgical data is acquired.

In the present invention, the class imbalance that arises when only actual surgical data is used is resolved by acquiring virtual surgical data and/or random surgical data. Accordingly, the type and required amount of data causing the class imbalance are extracted from the acquired actual surgical data, and the required types of data can be obtained in the required amounts from the virtual surgical data and/or the random surgical data.

In the step of acquiring the data as balance data (S250), data corresponding to the extracted data type is acquired as balance data, in the required amount, from at least one of the virtual surgical data and the random surgical data.

Balance data means the additional data obtained from virtual surgical data and/or random surgical data in order to resolve the class imbalance that arises from actual surgical data alone.
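Acquiring balance data can be sketched as drawing, for each lacking class, the required number of samples from the virtual and random surgical data. The sketch below assumes a hypothetical layout in which each pool is a dictionary from class name to a list of samples; the function name `acquire_balance_data` and the sample identifiers are illustrative only.

```python
import random

def acquire_balance_data(deficits, virtual_pool, random_pool, seed=0):
    """Draw up to deficits[cls] samples per class from the synthetic pools.

    deficits:     {class: required amount}, e.g. the result of step S230.
    virtual_pool: {class: list of samples} from virtual surgical data.
    random_pool:  {class: list of samples} from random surgical data.
    """
    rng = random.Random(seed)
    balance = {}
    for cls, needed in deficits.items():
        candidates = virtual_pool.get(cls, []) + random_pool.get(cls, [])
        # The pools may hold fewer samples than required.
        k = min(needed, len(candidates))
        balance[cls] = rng.sample(candidates, k)
    return balance

# Hypothetical deficits and pools.
deficits = {"clipper": 3, "scissors": 4}
virtual_pool = {"clipper": ["v_clip_0", "v_clip_1"], "scissors": ["v_sci_0"]}
random_pool = {"clipper": ["r_clip_0", "r_clip_1"],
               "scissors": ["r_sci_0", "r_sci_1"]}

balance = acquire_balance_data(deficits, virtual_pool, random_pool)
print({cls: len(items) for cls, items in balance.items()})
# {'clipper': 3, 'scissors': 3}
```

In this example only three scissors samples exist across both pools, so the deficit of four cannot be fully covered; a real pipeline would generate more virtual or random data for that class.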

The type of data lacking annotations may include at least one of a surgical action, a surgical tool type, a specific surgical site, and a location where a surgical tool is used.

In this embodiment, the actual surgical data includes actual surgical tool data, which includes the type of surgery actually performed, location information of the tools used, and information on the surgical tools used. The virtual surgical data includes a virtual surgical environment and virtual surgical tool data used in the virtual surgical environment. The random surgical data is virtual data generated by domain randomization. Each of these is the same as described above with reference to FIGS. 1A to 3.
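Domain randomization, as used for the random surgical data, can be illustrated by randomly sampling the setting conditions of a scene before it is rendered. The following is a minimal sketch: the `SceneConfig` fields, the tool names, and the parameter ranges are hypothetical, and the rendering step that would turn each configuration into an image is omitted.

```python
import random
from dataclasses import dataclass

@dataclass
class SceneConfig:
    tool: str                  # surgical tool placed in the scene
    organ: str                 # surgical target organ
    camera_distance_cm: float  # camera distance from the target
    camera_angle_deg: float    # camera viewing angle

# Hypothetical value sets; a real system would use its own tool and organ models.
TOOLS = ["grasper", "clipper", "scissors"]
ORGANS = ["stomach", "liver", "gallbladder"]

def randomize_scene(rng):
    """Sample one scene by randomly changing tool, organ, and camera settings."""
    return SceneConfig(
        tool=rng.choice(TOOLS),
        organ=rng.choice(ORGANS),
        camera_distance_cm=rng.uniform(2.0, 15.0),
        camera_angle_deg=rng.uniform(0.0, 360.0),
    )

rng = random.Random(42)
scenes = [randomize_scene(rng) for _ in range(5)]
for scene in scenes:
    print(scene)
```

Because every condition is sampled independently, rare tool and organ combinations can be generated as often as needed, which is what makes such data useful as balance data.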

FIG. 5 shows the distribution of actual surgical data classified by surgical tool. For both the actual surgical data for cholecystectomy (upper graph of FIG. 5) and the actual surgical data for gastrectomy (lower graph of FIG. 5), the acquired data itself is imbalanced, which causes class imbalance.

FIG. 6 shows the distribution of annotations as changed by acquiring virtual surgical data and/or random surgical data. The values are those of the actual surgical data combined with the balance data, classified by surgical tool. Both the combined actual surgical data and balance data for cholecystectomy (upper graph of FIG. 6) and the combined actual surgical data and balance data for gastrectomy (lower graph of FIG. 6) can be seen to be balanced.

As a result, the present invention resolves the class imbalance problem, and training on this data yields more accurate results.

FIG. 7 shows the distribution of annotations obtained from actual surgical data (laparoscopic cholecystectomy and gastric cancer videos).

Here, the number of annotations may be plotted on a logarithmic scale. For both cholecystectomy and gastrectomy, the actual surgical data shows a severe imbalance for several tools.

FIG. 8 shows the distribution of annotations obtained for laparoscopic cholecystectomy and gastrectomy from actual surgical data (gastric cancer videos) and 3D patient models.

Compared with the distribution of annotations obtained from actual surgical data alone (for example, gastric cancer surgery data), the class imbalance is alleviated for both surgeries (laparoscopic cholecystectomy and gastrectomy) in the combined distribution of actual surgical data, virtual surgical data, and random surgical data.

FIG. 9 is a flowchart of a method for generating a surgical tool recognition learning model based on the acquired actual surgical data and balance data.

Referring to FIG. 9, the method for generating a surgical tool recognition learning model further includes, in addition to the steps of FIG. 4, performing surgical tool recognition training (S270) and generating a surgical tool recognition learning model (S290).

In the step of performing surgical tool recognition training (S270), surgical tool recognition training is performed for a specific surgical tool used in a specific surgery, using the acquired balance data and actual surgical data as learning data.
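Before training, the balance data and the actual surgical data are combined into one set of learning data. The sketch below shows only that assembly step and a check of the resulting class distribution; the `(sample_id, tool_class)` pair layout, the function name `build_learning_data`, and the sample identifiers are hypothetical, and the recognition model itself is not shown.

```python
import random
from collections import Counter

def build_learning_data(real, balance, seed=0):
    """Merge actual and balance samples into one shuffled list of learning data.

    real, balance: lists of (sample_id, tool_class) pairs (hypothetical layout).
    """
    data = list(real) + list(balance)
    # Shuffle so that real and synthetic samples are interleaved during training.
    random.Random(seed).shuffle(data)
    return data

# Hypothetical samples: the real data lacks "clipper" annotations.
real = [("real_0", "grasper"), ("real_1", "grasper"), ("real_2", "clipper")]
balance = [("virtual_0", "clipper")]

learning_data = build_learning_data(real, balance)
class_counts = Counter(cls for _, cls in learning_data)
print(dict(class_counts))
```

After merging, each tool class has the same number of samples, which is the condition under which the training in step S270 is performed.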

In the step of generating a surgical tool recognition learning model (S290), a surgical tool recognition learning model is generated based on the training performed.

The step of performing surgical tool recognition training (S270) and the step of generating a surgical tool recognition learning model (S290) in FIG. 9 differ from the step of performing surgical tool recognition training (S150) and the step of generating a surgical tool recognition learning model (S170) in FIG. 3 only in the learning data used; they are carried out in the same way.

The steps of a method or algorithm described in connection with the embodiments of the present invention may be implemented directly in hardware, in a software module executed by hardware, or in a combination of the two. A software module may reside in random access memory (RAM), read only memory (ROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, a hard disk, a removable disk, a CD-ROM, or any other form of computer-readable recording medium well known in the art to which the present invention pertains.

Although embodiments of the present invention have been described above with reference to the accompanying drawings, those of ordinary skill in the art to which the present invention pertains will understand that the present invention may be embodied in other specific forms without changing its technical spirit or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive.

10: device
110: communication unit
120: memory
130: processor

Claims

1. A learning data collection method for surgical tool and organ recognition training, performed by a device, the method comprising:
acquiring a plurality of surgical data from among actual surgical data, virtual surgical data, and random surgical data; and
composing the acquired plurality of surgical data into a learning data set and providing the learning data set,
wherein the actual surgical data includes actual surgical tool data, which includes the type of surgery actually performed, location information of the tool used, and information on the surgical tool used, and actual surgical organ data on the organs in the surgery actually performed,
wherein the virtual surgical data includes a virtual surgical environment, virtual surgical tool data used in the virtual surgical environment, and virtual surgical organ data on organs in the virtual surgical environment, and
wherein the random surgical data is virtual data generated by domain randomization.

2. The method of claim 1, further comprising:
performing surgical tool recognition training for a specific surgical tool used in a specific surgery based on the learning data set; and
generating a surgical tool recognition learning model based on the surgical tool recognition training performed.

3. The method of claim 1,
wherein the actual surgical tool data includes at least one of: data in which the tool is divided into at least one of a head, a wrist, and a body, and each part is marked with a bounding box within the actual surgical image or annotated as a polygon in units of pixels; and data in which each feature of the tool is marked with a bounding box within the actual surgical image or annotated as a polygon in units of pixels, and
wherein the actual surgical organ data includes at least one of: data in which at least one of the pixels containing a visually distinguishable organ in the actual surgical image is annotated as a polygon; and data in which a specific region containing the organ in the actual surgical image is marked with a bounding box.

4. The method of claim 1,
wherein the virtual surgical environment is a 3D surgical environment created based on specific actual surgical image data, and
wherein the virtual surgical tool data and the virtual surgical organ data are surgical tool information and surgical organ information derived from use by a user within the 3D surgical environment.

5. The method of claim 1,
wherein the random surgical data is virtual data generated by domain randomization and includes data on at least one of a randomly generated specific surgery, an organ targeted in the specific surgery, and a surgical tool used in the specific surgery, the data being generated by randomly changing the setting conditions of a surgical tool, a surgical target organ, and a camera.

6. The method of claim 1,
wherein, in the learning data, the data on a specific surgical tool and the data on a specific surgical organ for a specific surgery are each acquired in uniform amounts so that the classes are balanced.

7. The method of claim 1, further comprising:
after acquiring the plurality of surgical data, generating semantic image data for the acquired plurality of surgical data through a segmentation model,
wherein composing and providing the learning data set comprises providing, as the learning data set, the plurality of surgical data and the semantic image data corresponding to each of the plurality of surgical data.

8. The method of claim 7,
wherein generating the semantic image data comprises:
when the actual surgical data is included in the plurality of surgical data, generating actual semantic image data based on the actual surgical data through the segmentation model;
when the virtual surgical data is included in the plurality of surgical data, generating virtual semantic image data based on the virtual surgical data through the segmentation model; and
when the random surgical data is included in the plurality of surgical data, generating random semantic image data based on the random surgical data through the segmentation model.

9. The method of claim 8,
wherein the semantic image data includes semantic surgical tool data on the tool or semantic surgical organ data on the organ included in each of the plurality of surgical data, and
wherein the semantic surgical tool data or the semantic surgical organ data includes at least one of a location range of the tool or the organ and a name of the tool or the organ.

10. A learning data collection method for surgical tool recognition training, performed by a device, the method comprising:
acquiring actual surgical data;
extracting, from the acquired actual surgical data, the type and required amount of data that lacks annotations and thereby causes class imbalance; and
acquiring, as balance data, data corresponding to the extracted data type in the required amount from at least one of virtual surgical data and random surgical data,
wherein the actual surgical data includes actual surgical tool data, which includes the type of surgery actually performed, location information of the tool used, and information on the surgical tool used,
wherein the virtual surgical data includes a virtual surgical environment and virtual surgical tool data used in the virtual surgical environment, and
wherein the random surgical data is virtual data generated by domain randomization.

11. The method of claim 10, further comprising:
performing surgical tool recognition training for a specific surgical tool used in a specific surgery, using the acquired balance data and the actual surgical data as learning data; and
generating a surgical tool recognition learning model based on the surgical tool recognition training performed.

12. The method of claim 2 or 11,
wherein, when the specific surgery is a gastrectomy,
the specific surgical tool includes at least one of a robotic surgical tool, a laparoscopic surgical tool, and an auxiliary surgical tool, and
the specific surgical organ includes the stomach, liver, gallbladder, spleen, and pancreas, and
wherein, when the specific surgery is a cholecystectomy,
the specific surgical tool includes at least one of a laparoscopic surgical tool and an auxiliary surgical tool, and
the specific surgical organ includes the gallbladder and liver.

13. A learning data collection program for surgical tool and organ recognition training, which is combined with a computer, which is hardware, and stored in order to execute the method of any one of claims 1 to 12.