KR102580750B1

KR102580750B1 - 3d image registration method based on markerless, method for tracking 3d object and apparatus implementing the same method

Info

Publication number: KR102580750B1
Application number: KR1020200188364A
Authority: KR
Inventors: 이원진; 최시은; 최민혁
Original assignee: 서울대학교산학협력단
Priority date: 2020-12-30
Filing date: 2020-12-30
Publication date: 2023-09-19
Also published as: KR102580750B9; KR20220096157A

Abstract

The present invention relates to a markerless 3D image registration method and to a 3D object tracking method and device using the same. A method of operating a computing device operated by at least one processor includes: inputting a 2D soft tissue image generated from a user's medical image into a trained feature point extraction model to extract one or more first feature points based on anatomical location, and reconstructing the first feature points into 3D coordinates based on the medical image; inputting a captured image of the user taken with a depth recognition camera into the trained feature point extraction model to extract one or more second feature points based on anatomical location, and applying depth information of the captured image to reconstruct the second feature points into 3D coordinates; and registering the medical image and the captured image by matching the 3D coordinates of the first feature points with the 3D coordinates of the second feature points.

Description

Markerless-based 3D image registration method and 3D object tracking method and device using the same {3D IMAGE REGISTRATION METHOD BASED ON MARKERLESS, METHOD FOR TRACKING 3D OBJECT AND APPARATUS IMPLEMENTING THE SAME METHOD}

A technology for markerless registration and tracking of 3D images is provided.

Recently, minimally invasive surgery (MIS), which keeps the incision site as small as possible, has been developed and applied in surgical fields such as dentistry, otolaryngology, and orthopedic surgery/neurosurgery.

Minimally invasive surgery damages less tissue, so pain and complications occur less often and recovery and return to daily life are faster, but the surgical technique is more complex than conventional open surgery. In particular, because the limited field of view makes it difficult to localize the affected area or structures accurately, an image-guided surgical navigation system combined with a 3D position tracking device is used.

In particular, an augmented reality-based image-guided surgical navigation system that uses 3D medical images such as MRI, CT, or ultrasound requires two things: a registration step between the coordinate system of the medical image captured before surgery and the physical coordinate system based on images captured during surgery, and, after the registration step, a technique for tracking the position of a tool moving in the physical coordinate system by attaching a position tracking device to the tool.

Accordingly, existing registration methods register the medical image coordinate system and the physical coordinate system with reference to contact points, which are obtained either from a specific device (marker) mounted on the patient's body or by touching one or more points on the patient's body in a scratching manner.

In this case, the volume of the separately used device obscures the surgical site; the accuracy of registration depends on the operator's skill, so errors increase the surgical preparation time; and contact with the patient's registration site creates a possibility of cross-infection between patients.

Therefore, there is a need for a technology that registers the image coordinate system and the physical coordinate system automatically and without contact, without separate tools, and that tracks the patient with high precision and at high speed in the registered coordinates.

One embodiment of the present invention provides a method that extracts the coordinates of anatomical feature points from a medical image and a captured image through a machine learning-based model, without separate equipment such as markers, registers the medical image and the captured image based on the coordinates of the extracted feature points, and tracks an object in the image through those feature point coordinates based on continuously captured images.

One embodiment of the present invention provides a 3D object tracking method and device that perform fast and accurate object tracking by using a machine learning model together with an optical tracking algorithm.

In addition to the above objectives, embodiments according to the present invention may be used to achieve other objectives not specifically mentioned.

According to an embodiment of the present invention, a method of operating a computing device operated by at least one processor includes: inputting a 2D soft tissue image generated from a user's medical image into a trained feature point extraction model to extract one or more first feature points based on anatomical location, and reconstructing the first feature points into 3D coordinates based on the medical image; inputting a captured image of the user taken with a depth recognition camera into the feature point extraction model to extract one or more second feature points based on anatomical location, and applying depth information of the captured image to reconstruct the second feature points into 3D coordinates; and registering the medical image and the captured image by matching the 3D coordinates of the first feature points with the 3D coordinates of the second feature points.

The feature point extraction model may convert the input image into a gradient information-based image and then output the 2D coordinates of pre-labeled feature points from the gradient information-based image.
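As a rough illustration of the conversion to a gradient information-based image, the sketch below computes a gradient-magnitude image with NumPy. The source does not state which gradient operator is used, so the finite-difference operator of `np.gradient` and the function name `gradient_magnitude` are assumptions made for illustration only.

```python
import numpy as np


def gradient_magnitude(image: np.ndarray) -> np.ndarray:
    """Return the per-pixel gradient magnitude of a 2D grayscale image.

    Hypothetical helper: the patent only says the input is converted to a
    "gradient information-based image"; the operator here is an assumption.
    """
    img = image.astype(np.float64)
    # np.gradient returns derivatives along axis 0 (rows) and axis 1 (columns).
    grad_rows, grad_cols = np.gradient(img)
    return np.hypot(grad_cols, grad_rows)
```

An image of this kind depends on intensity changes rather than absolute intensity, which is one plausible reason for feeding both rendered medical images and camera images through the same model.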

When the image includes a face, the feature point extraction model may select and label as feature points one or more of the outer end points of the eyes, the inner end points of the eyes, the corners of the mouth, and the nose starting point, at which movement according to the face position is minimized.

The step of reconstructing the first feature points into 3D coordinates may render a 3D soft tissue model based on the medical image, generate the 2D soft tissue image by projecting the 3D soft tissue model in the direction of the coronal plane, and input that image into the feature point extraction model.

The step of reconstructing the first feature points into 3D coordinates may obtain the 2D coordinates of the first feature points in the 2D soft tissue image and, using those 2D coordinates as reference points, extract the 3D coordinates at which they make contact with the 3D soft tissue model.
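One way to picture this lifting step is to walk from the 2D feature point along the projection direction until the soft tissue surface is reached. The sketch below assumes the 3D soft tissue model is available as a boolean voxel mask indexed as `mask[row, depth, column]`, with `depth` as the projection axis; the mask layout and the name `lift_to_surface` are hypothetical and not taken from the source.

```python
from typing import Optional, Tuple

import numpy as np


def lift_to_surface(mask: np.ndarray, u: int, v: int) -> Optional[Tuple[int, int, int]]:
    """Return (u, v, d) for the first soft tissue voxel met along the depth axis.

    mask: boolean volume indexed [row v, depth d, column u] (assumed layout).
    u, v: column and row of the 2D feature point in the projected image.
    Returns None when the ray through (u, v) never touches the model.
    """
    ray = mask[v, :, u]            # all voxels behind pixel (u, v)
    hits = np.flatnonzero(ray)     # depth indices that contain tissue
    if hits.size == 0:
        return None
    return (u, v, int(hits[0]))    # nearest contact with the surface
```

A mesh-based model would use ray-triangle intersection instead; the voxel version is used here only because it fits in a few lines.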

The step of registering the captured image may perform point-to-point matching between feature points with matching labels among the 3D coordinates of the first feature points and the 3D coordinates of the second feature points, and calculate a coordinate transformation value between the matched feature points.

The step of registering the captured image may apply the coordinate transformation value to the medical image to convert the coordinate system of the medical image into the coordinate system of the captured image.
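The source does not name the algorithm used to compute the coordinate transformation value. As one common choice for labeled point pairs, the sketch below estimates a rigid transform (rotation `R`, translation `t`) with the SVD-based Kabsch method; the function names and the dictionary-of-labels input format are assumptions made for illustration.

```python
from typing import Dict, Sequence, Tuple

import numpy as np

Points3D = Dict[str, Sequence[float]]


def rigid_transform(src: Points3D, dst: Points3D) -> Tuple[np.ndarray, np.ndarray]:
    """Estimate R, t such that R @ src[label] + t approximates dst[label].

    Only labels present in both inputs are paired (point-to-point matching).
    Kabsch/SVD is an assumed technique, not one named in the source.
    """
    labels = sorted(set(src) & set(dst))
    if len(labels) < 3:
        raise ValueError("at least three shared labels are needed")
    p = np.array([src[k] for k in labels], dtype=float)
    q = np.array([dst[k] for k in labels], dtype=float)
    p_mean, q_mean = p.mean(axis=0), q.mean(axis=0)
    h = (p - p_mean).T @ (q - q_mean)          # 3x3 cross-covariance
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    t = q_mean - r @ p_mean
    return r, t


def apply_transform(r: np.ndarray, t: np.ndarray, points: np.ndarray) -> np.ndarray:
    """Map an (N, 3) array from the source frame into the destination frame."""
    return points @ r.T + t
```

Here `src` would hold the first feature points (medical image) and `dst` the second feature points (captured image), so `apply_transform` moves medical image coordinates into the camera coordinate system.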

According to an embodiment of the present invention, a method of operating a computing device operated by at least one processor includes: collecting captured images in which a user is continuously captured through a depth recognition camera; extracting one or more labeled feature point coordinates for each captured image by selectively using a trained feature point extraction model and an optical tracking algorithm on the captured images; and reconstructing the labeled feature point coordinates into 3D coordinates based on the depth information of each captured image, and tracking an object having the feature points in the captured images by matching the 3D coordinates of the labeled feature points between captured images that are adjacent in collection order.

The method may further include sequentially grouping the captured images based on a preset unit. In the step of extracting the feature point coordinates, the feature point coordinates of the first captured image among the captured images may be extracted using the trained feature point extraction model, and from the second captured image onward, the feature point coordinates may be extracted using the optical tracking algorithm based on the feature point coordinates extracted from the immediately preceding captured image.
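The alternation between the two extractors can be sketched as a simple loop: the model runs on the first frame of each group and the optical tracker handles the remaining frames. The callables `detect` and `track`, the parameter `group_size`, and the function name are placeholders introduced here; the source does not specify the tracker or the group length.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Points2D = Dict[str, Tuple[float, float]]


def extract_sequence(
    frames: Iterable,
    group_size: int,
    detect: Callable[[object], Points2D],
    track: Callable[[object, object, Points2D], Points2D],
) -> List[Points2D]:
    """Return labeled 2D feature points for every frame.

    detect(frame): stands in for the trained feature point extraction model.
    track(prev_frame, frame, prev_points): stands in for the optical tracker.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    results: List[Points2D] = []
    prev_frame, prev_points = None, None
    for index, frame in enumerate(frames):
        if index % group_size == 0:
            points = detect(frame)                         # first frame of a group
        else:
            points = track(prev_frame, frame, prev_points)  # later frames
        results.append(points)
        prev_frame, prev_points = frame, points
    return results
```

Re-running the model at the start of every group bounds how long tracker drift can accumulate, while the cheaper tracker handles the frames in between.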

The feature point extraction model may convert the input image into a gradient information-based image and then output the 2D coordinates of feature points that were pre-labeled based on anatomical location in the gradient information-based image.

The step of tracking the object having the feature points may perform point-to-point matching of the 3D coordinates of feature points with the same label between captured images that are adjacent in collection order, and calculate the 3D change amount between the matched 3D coordinates.
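In its simplest form, the 3D change amount is the per-label displacement between two adjacent frames. The sketch below assumes each frame's feature points are stored as a dictionary from label to an (x, y, z) tuple; that storage format and the function name are illustrative assumptions.

```python
from typing import Dict, Tuple

Point3D = Tuple[float, float, float]


def displacements(prev: Dict[str, Point3D], curr: Dict[str, Point3D]) -> Dict[str, Point3D]:
    """Return curr - prev for every label present in both frames."""
    shared = prev.keys() & curr.keys()   # point-to-point pairing by label
    return {
        label: tuple(c - p for p, c in zip(prev[label], curr[label]))
        for label in shared
    }
```

Labels missing from either frame are skipped rather than guessed, so a temporarily occluded landmark does not corrupt the motion estimate.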

The method may further include collecting a medical image of the user that has feature points extracted based on anatomical location and registering it to the coordinate system of the captured image, and applying the calculated 3D change amount to the medical image to generate a medical image corresponding to the captured image.

According to one embodiment of the present invention, a device includes a communication device, a memory, and at least one processor that executes instructions of a program loaded in the memory. The program includes instructions written to: input a user's medical image and a captured image of the user taken with a depth recognition camera into a trained feature point extraction model to obtain the coordinates of one or more first feature points and second feature points, each labeled based on anatomical location; register the medical image and the captured image by matching the coordinates of the first feature points with the coordinates of the second feature points; extract one or more feature point coordinates for each captured image by selectively using the feature point extraction model and an optical tracking algorithm on the collected continuous captured images; match the feature point coordinates between consecutive captured images; and track the object in the captured images by calculating the change amount of the matched feature point coordinates.

The program may include instructions written to render a 3D soft tissue model based on the medical image, generate a 2D soft tissue image by projecting the 3D soft tissue model in the direction of the coronal plane, obtain the 2D coordinates of the first feature points in the 2D soft tissue image using the feature point extraction model, and, using those 2D coordinates as reference points, extract the 3D coordinates at which they make contact with the 3D soft tissue model.

The program may include instructions written to perform point-to-point matching between feature points with matching labels among the 3D coordinates of the second feature points, reconstructed by applying the depth information of the captured image, and the 3D coordinates of the first feature points, and to calculate a coordinate transformation value between the matched feature points.

The program may include instructions written to sequentially group the continuous captured images based on a preset unit, extract the feature point coordinates of the first captured image in a group using the trained feature point extraction model, and, from the second captured image onward, extract the feature point coordinates using the optical tracking algorithm based on the feature point coordinates extracted from the immediately preceding captured image.

According to one embodiment of the present invention, registration between 3D images and object tracking are performed in an unconstrained manner using anatomical feature points rather than markers, so fast and accurate registration results and tracking data can be obtained with a minimized amount of computation and without additional equipment such as markers.

According to one embodiment of the present invention, registration of the medical image and the depth image is performed automatically in real time, so the result does not depend on the skill of the operator performing 3D image registration. Errors caused by differences in skill are thereby minimized, and consistently accurate registration results can be obtained.

According to one embodiment of the present invention, an object is tracked using a machine learning model together with an optical tracking algorithm based on the anatomical feature points of the tracked object, so the object can be tracked at high speed while the accumulation of errors is prevented.

According to one embodiment of the present invention, registration between images and object tracking within images are provided quickly and accurately based on the patient's anatomical feature points, which can improve surgical accuracy in image-guided surgery.

Figure 1 is a configuration diagram showing a computing device for 3D image registration and object tracking according to an embodiment of the present invention.
Figure 2 is a flowchart showing a method of operating a computing device according to an embodiment of the present invention.
Figure 3 is a flowchart showing a method of operating a computing device according to another embodiment of the present invention.
Figure 4 is an example diagram for explaining the operation of a feature point extraction model according to an embodiment of the present invention.
Figure 5 is an exemplary diagram illustrating a process of reconstructing the coordinates of feature points extracted from a medical image in three dimensions according to an embodiment of the present invention.
Figure 6 is an example diagram to explain the process of reconstructing the coordinates of feature points extracted from a captured image in three dimensions according to an embodiment of the present invention.
Figure 7 is an example diagram to explain the process of extracting feature point coordinates from continuously captured images according to an embodiment of the present invention.
Figure 8 is an example diagram for explaining a tracking process through matching feature point coordinates according to an embodiment of the present invention.
Figure 9 is a hardware configuration diagram of a computing device in one embodiment of the present invention.

Embodiments of the present invention are described in detail below with reference to the attached drawings so that those skilled in the art can readily implement them. The present invention may be implemented in many different forms and is not limited to the embodiments described herein. To describe the present invention clearly, parts unrelated to the description are omitted from the drawings, and the same reference numerals are used for identical or similar components throughout the specification. Detailed descriptions of well-known technologies are also omitted.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

In addition, terms such as "...unit", "...device", and "...module" used in the specification refer to a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software.

The device described in the present invention consists of hardware including at least one processor, a memory device, a communication device, and the like, and a program that is executed in combination with the hardware is stored in a designated location. The hardware has the configuration and performance needed to execute the method of the present invention. The program includes instructions that implement the operating method of the present invention described with reference to the drawings, and executes the present invention in combination with hardware such as the processor and the memory device.

In this specification, "transmission or provision" may include not only direct transmission or provision but also indirect transmission or provision through another device or by way of a detour path.

In this specification, terms including ordinal numbers, such as first and second, may be used to describe various components, but the components are not limited by these terms. The terms are used only to distinguish one component from another. For example, a first component may be called a second component, and similarly a second component may be called a first component, without departing from the scope of the present disclosure.

Each of the embodiments described in this specification with reference to the drawings may be implemented on its own, several embodiments may be merged or divided, and specific operations in each embodiment may be omitted.

Figure 1 is a configuration diagram showing a computing device for 3D image registration and object tracking according to an embodiment of the present invention.

As shown in Figure 1, the computing device (100) includes an image registration module (110) and an object tracking module (120), and may further include a learning module (130).

For the purposes of explanation, these are referred to as the image registration module (110), the object tracking module (120), and the learning module (130), but they are computing devices operated by at least one processor. The image registration module (110), the object tracking module (120), and the learning module (130) may be implemented in a single computing device or distributed across separate computing devices. When distributed across separate computing devices, the modules may communicate with each other through a communication interface. Any device capable of executing a software program written to carry out the present invention suffices as the computing device, for example a server or a laptop computer.

For convenience, the description is divided between the image registration module (110) and the object tracking module (120), but a component that performs the same function in both modules need not be provided separately and may be shared between them.

The image registration module (110) collects a medical image (A) and a captured image (B) taken with a depth recognition camera.

The image registration module (110) may collect the medical image linked to the user's ID by accessing a device that generates medical images or a database in which medical images are stored.

Here, a medical image is a 3D image captured by medical equipment, such as magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), or ultrasound imaging.

The image registration module (110) may be linked to the depth recognition camera or to its storage device to collect, in real time, captured images of the user.

Here, the depth recognition camera (not shown) includes a triangulation-based 3D laser scanner, an endoscope device, a depth camera using a structured light pattern, a time-of-flight (TOF) depth camera using the difference in reflection time of infrared (IR) light, a C-arm device, an optical coherence tomography device, and the like.

Because the depth recognition camera generates a color image and a depth image at the same time when capturing, the captured image includes both a color image and a depth image.

The image registration module (110) extracts the preset labeled feature points from each collected image.

Here, a labeled feature point is a preset specific point that lies on soft tissue but at which movement is minimized, as determined from the anatomical structure or by analysis.

For example, seven points on the facial soft tissue at which movement is minimized may be specified and labeled: the outer end points of the eyes (exocanthion, left and right), the inner end points of the eyes (endocanthion, left and right), the corners of the mouth (cheilion, left and right), and the nose starting point (pronasale).

Accordingly, the image registration module (110) may extract the labeled points from the collected images as feature points.

Meanwhile, the image registration module (110) may extract feature points from the medical image or the captured image using the trained feature point extraction model, and when extracting feature points from continuously captured images, it may use the trained feature point extraction model together with the optical tracking algorithm.

The image registration module (110) uses the feature points extracted from the medical image and from the captured image to reconstruct the feature point coordinates for the medical image and the feature point coordinates for the captured image, respectively.

In other words, because the feature points are extracted as 2D positions, the image registration module (110) converts them into 3D position information.

Specifically, for the medical image, the image registration module (110) sets each extracted feature point as a reference point and extracts the coordinates in the medical image that make contact with that reference point; for the captured image, it applies the depth information contained in the captured image to the corresponding 2D position to convert it into 3D coordinates.
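For the captured image, applying depth to a 2D position is commonly done with a pinhole camera model. The sketch below assumes known camera intrinsics (`fx`, `fy`, `cx`, `cy`) and a depth value in metric units at the pixel; neither the camera model nor these parameter names appear in the source, so this is an illustrative assumption rather than the described implementation.

```python
from typing import Tuple


def backproject(
    u: float, v: float, depth: float,
    fx: float, fy: float, cx: float, cy: float,
) -> Tuple[float, float, float]:
    """Convert pixel (u, v) with depth into camera-frame (x, y, z).

    fx, fy: focal lengths in pixels; cx, cy: principal point (assumed known).
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)
```

Depth sensors often report missing measurements as zero, which is why non-positive depth is rejected here instead of producing a point at the camera origin.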

The image registration module (110) performs registration by matching the reconstructed feature point coordinates of the medical image with the reconstructed feature point coordinates of the captured image.

Here, the image registration module (110) may perform registration so that the coordinate system of the medical image is converted into the coordinate system of the captured image through matching between the feature point coordinates.

The object tracking module 120 is linked to the depth recognition camera, or to a storage device of the depth recognition camera, and collects consecutive captured images (B).

The object tracking module 120 may collect captured images in real time while grouping them according to a preset unit.

Based on the order of the captured images within each group, the object tracking module 120 selectively applies the trained feature point extraction model or the optical tracking algorithm to extract, for each captured image, the two-dimensional coordinates of the preset labeled feature points.

The object tracking module 120 then applies the depth information contained in each captured image to the corresponding two-dimensional positions to convert them into three-dimensional coordinates.

For consecutive captured images, the object tracking module 120 calculates the amount of three-dimensional position change by matching the three-dimensional coordinates of the feature points between captured images in time order. The object tracking module 120 may then apply the calculated three-dimensional position change to the registered medical image so that the result of tracking the object in the real-time captured images is reflected in the medical image.

The learning module 130 trains a feature point extraction model that extracts preset anatomical feature points (labeled feature points) from an input image.

The feature point extraction model is a machine learning model that converts the input image into a gradient-information-based image and then extracts the coordinates of the labeled feature points based on the gradient information.

The feature point extraction model can be implemented with machine learning algorithms such as a support vector machine (SVM), a random forest (RF), or a convolutional neural network (CNN).

The learning module 130 may thus train the feature point extraction model inside the computing device 100; alternatively, the learning module 130 may provide the computing device 100 with a feature point extraction model whose training was completed on a separate device.

Meanwhile, the feature point extraction model extracts the positions of pre-labeled feature points from a two-dimensional image. The same feature point extraction model may be used by both the image registration module 110 and the object tracking module 120, but depending on the type of input image, separate models may be used: a feature point extraction model for medical images and a feature point extraction model for captured images.

Hereinafter, a method by which the computing device performs registration between three-dimensional images and a method of tracking an object in a three-dimensional image are described in detail with reference to FIGS. 2 and 3.

FIG. 2 is a flowchart showing a method of operating a computing device according to an embodiment of the present invention.

As shown in FIG. 2, the computing device 100 collects a medical image of the user (S110). The computing device 100 then generates a two-dimensional soft tissue image based on the medical image (S120).

The computing device 100 may perform three-dimensional rendering on a three-dimensional medical image to generate a three-dimensional soft tissue model, and generate the two-dimensional soft tissue image from the generated three-dimensional soft tissue model.

Here, the soft tissue model and the soft tissue image refer to a model and an image of soft tissue, that is, tissue of low rigidity, such as a face model or a face image.

Specifically, the computing device 100 generates the three-dimensional soft tissue model of the medical image using a surface rendering algorithm such as marching cubes. The computing device 100 then projects the three-dimensional soft tissue model onto the XZ plane (coronal plane) to generate the two-dimensional soft tissue image.
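
The projection onto the XZ plane can be pictured as an orthographic projection that drops the Y component of each surface vertex and keeps, per pixel, the vertex nearest the viewer. The following is a minimal sketch under stated assumptions, not the implementation described here: the function name `project_to_xz`, the `pixel_size` parameter, and the convention that a smaller Y value lies nearer the front of the face are all hypothetical.

```python
def project_to_xz(vertices, pixel_size=1.0):
    """Orthographically project 3D vertices (x, y, z) onto the XZ plane.

    Returns a dict mapping pixel (col, row) to the y value of the
    front-most vertex that lands on it.
    Assumption: a smaller y means closer to the front of the face.
    """
    image = {}
    for x, y, z in vertices:
        pixel = (int(x // pixel_size), int(z // pixel_size))
        if pixel not in image or y < image[pixel]:
            image[pixel] = y
    return image


# Two vertices fall on pixel (0, 0); only the front-most one is kept.
front = project_to_xz([(0.2, 5.0, 0.7), (0.4, 3.0, 0.1), (2.5, 1.0, 0.0)])
```

Keeping the per-pixel Y value alongside the projection is a design choice of this sketch; it makes it straightforward to return later from a 2D pixel to a 3D surface point.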

Here, because the three-dimensional soft tissue model and the two-dimensional soft tissue image generated by the computing device 100 are both based on the medical image, they share the coordinate system of the medical image (the coordinate system of the medical imaging device).

Because the coordinate system of the medical imaging device is aligned with the global horizontal, when the soft tissue is a facial region the computing device 100 projects onto the XZ plane, the coronal plane direction, so that the front of the face is represented.

Although FIG. 2 shows the medical image being collected and the two-dimensional soft tissue image being generated, the computing device 100 may collect only the user's medical image itself, or may collect one or more of the three-dimensional medical image, the three-dimensional soft tissue model, and the two-dimensional soft tissue image of the user.

For example, the computing device 100 may collect, from another device, a three-dimensional soft tissue model or a two-dimensional soft tissue image generated based on the three-dimensional medical image.

Next, the computing device 100 extracts two-dimensional feature points using the trained feature point extraction model (S130).

For the medical image, on receiving the two-dimensional soft tissue image, the computing device 100 inputs it into the trained feature point extraction model. The computing device 100 then extracts the two-dimensional coordinates P^i_n of the first feature points from the trained feature point extraction model. (P^i_n denotes the two-dimensional coordinates of a feature point in the two-dimensional soft tissue image, n is the number of feature points, and i denotes the medical image.)

Specifically, through the trained feature point extraction model, the computing device 100 converts the two-dimensional soft tissue image into a gradient-information-based image and extracts the labeled first feature points from the gradient-information-based image.

Likewise, for the color image of the captured image, the computing device 100 converts the color image into a gradient-information-based image through the trained feature point extraction model and extracts the labeled second feature points from the gradient-information-based image.

Here, the output of the trained feature point extraction model is the coordinates of each labeled feature point.

Next, the computing device 100 reconstructs the two-dimensional coordinates of the extracted feature points into three-dimensional coordinates based on ray casting (S140).

The computing device 100 projects the first feature point coordinates in the two-dimensional soft tissue image onto the three-dimensional soft tissue model to reconstruct them into three-dimensional coordinates.

At this time, the computing device 100 may use a ray casting algorithm to cast a ray from each feature point coordinate and then extract the three-dimensional coordinates of the three-dimensional soft tissue model at the contact point.

In other words, using the two-dimensional coordinates P^i_n of a first feature point as the reference point, the computing device 100 casts a ray from the two-dimensional soft tissue image toward the three-dimensional soft tissue model and extracts the three-dimensional coordinates f(P^i_n) of the three-dimensional soft tissue model at the first contact point of the cast ray. (f denotes the ray casting algorithm.)

In this way, the computing device 100 can reconstruct the two-dimensional coordinates P^i_n of the first feature point into its three-dimensional coordinates f(P^i_n).
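
The first-contact idea behind the ray casting step can be sketched as follows. This is an illustration only: it assumes the soft tissue model is available as a set of occupied integer voxels rather than the surface mesh described above, and that the ray travels along the +Y axis from the image plane. The names `cast_ray`, `occupied`, and `y_max` are hypothetical.

```python
def cast_ray(occupied, x, z, y_max):
    """Return the first occupied voxel hit by a ray cast along +Y from (x, z).

    `occupied` is a set of (x, y, z) integer voxels standing in for the
    3D soft tissue model (an assumption of this sketch).
    """
    for y in range(y_max):
        if (x, y, z) in occupied:
            return (x, y, z)
    return None  # the ray missed the model


occupied = {(1, 4, 2), (1, 6, 2), (3, 0, 0)}
hit = cast_ray(occupied, x=1, z=2, y_max=10)   # stops at the first contact
miss = cast_ray(occupied, x=0, z=0, y_max=10)  # no voxel on this ray
```

Stopping at the first contact matters because a ray through a face model generally meets the surface more than once; only the nearest intersection corresponds to the visible feature point.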

Meanwhile, steps S110 through S140 described above may be performed in real time, or each step may be performed at an earlier time and its results stored, in which case the computing device 100 collects the stored three-dimensional coordinates before performing registration between the images.

In other words, steps S110 through S140 are not necessarily performed in real time, and their timing may vary depending on the environment in which the method is applied.

Next, the computing device 100 collects a captured image of the user (S150).

Here, the captured image is an image taken by a depth camera and includes a two-dimensional color image and a depth image.

The computing device 100 then extracts two-dimensional feature points using the trained feature point extraction model (S160).

Through the trained feature point extraction model, the computing device 100 converts the color image of the captured image into a gradient-information-based image and extracts the labeled feature points from the gradient-information-based image.

In other words, the computing device 100 inputs the two-dimensional color image into the trained feature point extraction model and extracts the two-dimensional coordinates P^c_n of the second feature points from the two-dimensional color image. (P^c_n denotes the two-dimensional coordinates of a feature point in the two-dimensional color image, n is the number of feature points, and c denotes the captured image.)

Next, the computing device 100 reconstructs the two-dimensional coordinates of the extracted feature points into three-dimensional coordinates based on depth information (S170).

The computing device 100 reconstructs the two-dimensional coordinates P^c_n of the second feature points into three-dimensional coordinates p(P^c_n) using the depth image information. (p denotes the pinhole camera model.)

This can be expressed as Equation 1 below.

[Equation 1]

Here, p(P^c_n) denotes the three-dimensional coordinates of the second feature point; f_x and f_y are the focal lengths of the depth camera, and c_x and c_y are the principal point of the camera, all of which are camera intrinsic parameters.
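
Under the standard pinhole camera model, a pixel (u, v) with measured depth d maps to X = (u - c_x) * d / f_x, Y = (v - c_y) * d / f_y, Z = d. The sketch below assumes this standard form, which may differ in detail from Equation 1; the names `backproject`, `u`, `v`, and `depth`, and the intrinsic values in the example, are illustrative and not taken from the text.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth into camera-frame 3D coordinates.

    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    Assumes the standard pinhole model with no lens distortion.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# A pixel 100 px to the right of the principal point, half a metre away.
point = backproject(u=420.0, v=240.0, depth=0.5,
                    fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

Because the color image and the depth image share a pixel grid, `depth` here would simply be the depth image value read at the feature point's pixel.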

Next, the computing device 100 performs registration based on the reconstructed three-dimensional coordinates of the feature points (S180).

The computing device 100 performs three-dimensional registration by point-to-point matching of the three-dimensional coordinates f(P^i_n) of the first feature points for the medical image with the three-dimensional coordinates p(P^c_n) of the second feature points for the captured image.

For example, the computing device 100 matches the three-dimensional coordinates of seven labeled first feature points extracted from the medical image with the three-dimensional coordinates of seven labeled second feature points extracted from the captured image, point to point, pairing the points that carry the same label.

That is, the computing device 100 performs point-to-point matching between the three-dimensional coordinates of the first feature points and the three-dimensional coordinates of the second feature points.

The computing device 100 then calculates coordinate transformation information from the medical image to the captured image (S190).

Through the registration between the feature point coordinates, the computing device 100 calculates a transformation for converting the three-dimensional coordinate system of the medical image into the three-dimensional coordinate system of the captured image. The computing device 100 can then apply this transformation to the entire medical image to register the medical image with the captured image.

In doing so, the computing device 100 computes the transformation T^c_i from the coordinate system of the medical image to the coordinate system of the captured image while performing point-to-point matching of the three-dimensional coordinates of the seven feature points.

Expressed as a formula, this is Equation 2 below.

[Equation 2]

$$p(P^c_n) = T^c_i \cdot f(P^i_n)$$

The computing device 100 can transform the entire medical image into the coordinate system of the captured image using the transformation T^c_i, and it stores that transformation.
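
The text does not state how the transformation in Equation 2 is solved. As one hedged illustration, a rigid transformation can be recovered exactly from three noise-free, non-collinear labeled point pairs by building an orthonormal frame on each triple and aligning the frames. With seven noisy points a least-squares solver would be the more typical choice, so this is a sketch of the idea rather than the method used here. All function names below are hypothetical.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    norm = math.sqrt(dot(a, a))
    return tuple(x / norm for x in a)


def frame(p0, p1, p2):
    """Right-handed orthonormal axes built from three non-collinear points."""
    e1 = unit(sub(p1, p0))
    e3 = unit(cross(e1, sub(p2, p0)))
    e2 = cross(e3, e1)
    return (e1, e2, e3)


def rigid_from_three(src, dst):
    """Return (R, t) such that dst[k] = R @ src[k] + t for the three pairs."""
    fs, fd = frame(*src), frame(*dst)
    # R maps each source axis onto the matching destination axis.
    R = [[sum(fd[k][r] * fs[k][c] for k in range(3)) for c in range(3)]
         for r in range(3)]
    rotated_origin = tuple(dot(R[r], src[0]) for r in range(3))
    t = sub(dst[0], rotated_origin)
    return R, t


def apply_rigid(R, t, p):
    return tuple(dot(R[r], p) + t[r] for r in range(3))


# Medical-image points and the same labels seen by the camera after a
# 90 degree rotation about Z followed by a shift of (1, 2, 3).
src = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
dst = [(1.0, 2.0, 3.0), (1.0, 3.0, 3.0), (0.0, 2.0, 3.0)]
R, t = rigid_from_three(src, dst)
moved = apply_rigid(R, t, (1.0, 1.0, 1.0))
```

Once `R` and `t` are known, the same pair can be applied to every point of the medical image, which is the sense in which the transformation is applied to the entire image.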

FIG. 3 is a flowchart showing a method of operating a computing device according to another embodiment of the present invention.

As shown in FIG. 3, the computing device 100 collects consecutive captured images and groups them in order into units of a fixed size (S210).

The computing device 100 may group the captured images into these units while collecting them in real time.

For example, when the unit is set to eight frames, the computing device 100 groups the continuously collected captured images eight frames at a time.

The computing device 100 groups the consecutive captured images by order so that the trained feature point extraction model and the optical tracking algorithm can be applied selectively according to each image's position within its group.
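
The selection rule implied by the grouping can be written in a few lines. This is a minimal sketch assuming zero-based frame indices and a group size of eight as in the example above; the function name `choose_method` and the string labels are hypothetical.

```python
def choose_method(frame_index, group_size=8):
    """Pick the extractor for a frame from its position within its group.

    The first frame of each group goes to the trained model; every other
    frame is handled by the optical tracking algorithm.
    """
    return "model" if frame_index % group_size == 0 else "optical_flow"


methods = [choose_method(i) for i in range(10)]
```

With ten frames, indices 0 and 8 start a group and are sent to the model, and the remaining eight frames are tracked.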

Next, for the first captured image in a group, as determined by position within the group, the computing device 100 extracts the two-dimensional coordinates of the feature points using the feature point extraction model (S220).

For the first image, the computing device 100 extracts the two-dimensional coordinates P^{c_M}_n of the feature points using the feature point extraction model. (P^{c_M}_n denotes the two-dimensional coordinates of a feature point in the M-th acquired color image, where M is a natural number.)

It is desirable for the computing device 100 to extract all of the preset labeled feature points from the captured image; however, because the shape in the captured image changes as the user moves, the computing device 100 extracts at least three of the labeled feature points located in the captured image. Here, three is the minimum number of feature points needed to apply a three-dimensional transformation, and the number is not necessarily limited thereto.

Here, the step of inputting the captured image into the feature point extraction model to extract the feature points is the same as step S160, so redundant description is omitted.

Next, for each captured image from the second through the last in the group, the computing device 100 extracts the two-dimensional coordinates of the feature points with the optical tracking algorithm, based on the two-dimensional feature point coordinates of the immediately preceding captured image (S230).

For the captured images after the first one in the group, the computing device 100 extracts the two-dimensional feature points P^{c_{M+1}}_n = O(P^{c_M}_n) through the optical tracking algorithm, based on the two-dimensional coordinates extracted from the immediately preceding image. (P^{c_M}_n denotes the two-dimensional coordinates of a feature point in the M-th acquired color image, P^{c_{M+1}}_n denotes the two-dimensional coordinates of the feature point in the (M+1)-th acquired color image, and O denotes the optical tracking algorithm.)
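
The chaining P^{c_{M+1}}_n = O(P^{c_M}_n) within one group can be sketched as a loop that detects once and then tracks from each previous result. The `toy_detect` and `toy_track` functions below are deliberately trivial stand-ins, not a real feature point model or optical flow implementation; each toy frame simply carries the landmark motion it represents. All names are hypothetical.

```python
def extract_group(frames, detect, track):
    """Detect landmarks on the first frame, then track them frame to frame."""
    landmarks = [detect(frames[0])]
    for frame in frames[1:]:
        # Each frame is tracked from the result of the frame just before it.
        landmarks.append(track(frame, landmarks[-1]))
    return landmarks


def toy_detect(frame):
    """Stand-in for the trained model: read landmarks stored in the frame."""
    return list(frame["landmarks"])


def toy_track(frame, previous):
    """Stand-in for optical flow: shift the previous landmarks by a known step."""
    dx, dy = frame["shift"]
    return [(x + dx, y + dy) for x, y in previous]


frames = [{"landmarks": [(10, 20)]}, {"shift": (1, 0)}, {"shift": (0, 2)}]
result = extract_group(frames, toy_detect, toy_track)
```

Passing `detect` and `track` as arguments keeps the scheduling logic separate from whichever model and tracker are actually used.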

For each captured image, the computing device 100 reconstructs the two-dimensional coordinates of the extracted feature points into three-dimensional coordinates based on depth information (S240).

In the same manner as in step S170, the computing device 100 reconstructs the two-dimensional coordinates P^{c_M}_n of the feature points of each captured image into three-dimensional coordinates p(P^{c_M}_n), based on the depth image information and the pinhole camera model.

Next, the computing device 100 tracks the position of the object in the captured images by matching the reconstructed three-dimensional coordinates of the feature points in order (S250).

The computing device 100 matches and tracks the three-dimensional coordinates of identically labeled feature points between consecutive captured images.

Specifically, the seven feature points are each matched point to point according to their labels, for example by matching the three-dimensional coordinates of the feature points labeled as the outer corner of the eye.

The computing device 100 then calculates the amount of change in the three-dimensional coordinates of the feature points between captured images (S260).

Expressed as a formula, this is Equation 3 below.

[Equation 3]

$$p(P^{c_{M+1}}_n) = T_M^{M+1} \cdot p(P^{c_M}_n)$$

The computing device 100 can apply the change T_M^{M+1} to the M-th captured image to transform it so that it corresponds to the (M+1)-th captured image, and it stores that change T_M^{M+1}.

While performing point-to-point matching of the three-dimensional coordinates of the seven feature points and computing the change T_M^{M+1} for two consecutive captured images, the computing device 100 may apply the registration transformation T^c_i to the three-dimensional captured images to provide a three-dimensional medical image corresponding to each captured image.

Expressed as a formula, this is Equation 4 below.

[Equation 4]

$$f(P^{i_{M+1}}_n) = (T^c_i)^{-1} \cdot T_M^{M+1} \cdot p(P^{c_M}_n)$$

In other words, by applying the change of the tracked points in the camera coordinate system together with the registration transformation, a medical image reflecting the change in feature point coordinates between captured images can be provided.

Equation 4 estimates the feature points in the (M+1)-th medical image from the feature points of the M-th captured image; the reverse also holds.
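
The composition in Equation 4 can be illustrated with 4x4 homogeneous matrices. The sketch assumes both transformations are rigid, so the inverse of [R | d] is [R^T | -R^T d]; the pure translations used in the example and all function names are hypothetical and chosen only so the result is easy to check by hand.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def rigid_inverse(t):
    """Invert a 4x4 rigid transform [R | d]; the inverse is [R^T | -R^T d]."""
    rt = [[t[c][r] for c in range(3)] for r in range(3)]
    d = [-sum(rt[r][k] * t[k][3] for k in range(3)) for r in range(3)]
    return [rt[r] + [d[r]] for r in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def transform(t, p):
    """Apply a 4x4 transform to a 3D point."""
    x, y, z = p
    return tuple(t[r][0] * x + t[r][1] * y + t[r][2] * z + t[r][3]
                 for r in range(3))


def translation(dx, dy, dz):
    return [[1.0, 0.0, 0.0, dx],
            [0.0, 1.0, 0.0, dy],
            [0.0, 0.0, 1.0, dz],
            [0.0, 0.0, 0.0, 1.0]]


T_reg = translation(10.0, 0.0, 0.0)    # medical -> camera, stands in for T^c_i
T_motion = translation(0.0, 2.0, 0.0)  # frame M -> M+1, stands in for T_M^{M+1}
to_medical = matmul(rigid_inverse(T_reg), T_motion)

p_camera_M = (11.0, 5.0, 3.0)          # p(P^{c_M}_n) in the camera frame
p_medical_next = transform(to_medical, p_camera_M)
```

Reading the product right to left, the camera-frame point is first moved to its position in frame M+1 and then carried back into the medical image coordinate system.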

FIG. 4 is an exemplary diagram for explaining the operation of a feature point extraction model according to an embodiment of the present invention.

As shown in FIG. 4, the computing device 100 converts the input image into a gradient information image (HOG features, Histogram of Oriented Gradients) and uses a feature point extraction model (SVM and random forest) that extracts feature points from the converted gradient information image.

For example, the computing device 100 may remove non-essential information (for example, a uniformly colored background) from the input two-dimensional image, convert the image into a gradient image in which contours are emphasized, and perform normalization to minimize the influence of lighting changes, thereby generating the final gradient information image.
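
The gradient and normalization steps can be illustrated with a central-difference gradient magnitude scaled to the range [0, 1]. This is a simplified sketch, not a full HOG descriptor (it has no orientation histograms or cell and block structure), and the function name `gradient_magnitude` is hypothetical.

```python
import math


def gradient_magnitude(image):
    """Central-difference gradient magnitude of a 2D image, scaled to [0, 1].

    Border pixels are left at zero. Dividing by the peak value removes the
    effect of a uniform brightness scaling.
    """
    height, width = len(image), len(image[0])
    mag = [[0.0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = image[r][c + 1] - image[r][c - 1]
            gy = image[r + 1][c] - image[r - 1][c]
            mag[r][c] = math.hypot(gx, gy)
    peak = max(max(row) for row in mag)
    if peak > 0:
        mag = [[value / peak for value in row] for row in mag]
    return mag


# A vertical edge: dark on the left, bright on the right.
image = [[0, 0, 10, 10] for _ in range(4)]
edges = gradient_magnitude(image)

# The same scene under twice the illumination gives the same result.
brighter = gradient_magnitude([[0, 0, 20, 20] for _ in range(4)])
```

The comparison between `edges` and `brighter` shows why normalization is applied: the contour response is unchanged when the overall brightness is scaled.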

The computing device 100 may then input the gradient information image into the feature point extraction model and extract the pre-labeled feature points from that gradient information image.

FIG. 5 is an exemplary diagram showing a process of reconstructing feature point coordinates extracted from a medical image into three dimensions according to an embodiment of the present invention.

FIG. 5(a) shows the two-dimensional coordinates (A-4) of the feature points extracted from the medical image (A-1), and in FIG. 5(b) the two-dimensional coordinates of the extracted feature points are converted into three-dimensional coordinates.

As shown in FIG. 5(a), to prepare the input for the feature point extraction model from the medical image (A-1) captured by a medical device, the computing device 100 may render the medical image (A-1) in three dimensions to generate a three-dimensional soft tissue model (A-2) and then project it onto the XZ plane to generate a two-dimensional soft tissue image (A-3).

However, as described above, the computing device 100 may collect one or more of the medical image (A-1), the three-dimensional soft tissue model (A-2), and the two-dimensional soft tissue image (A-3), and depending on the type of medical image collected, it may either go through the image generation process or skip it and input the image directly into the feature point extraction model.

As shown in FIG. 5(b), the computing device 100 may cast a ray from the two-dimensional coordinates (A-4) of a feature point toward the three-dimensional soft tissue model (A-2) and reconstruct the feature point as the three-dimensional coordinates of the first contact point. Through this process, the two-dimensional coordinates of the feature points extracted from the medical image are converted into three-dimensional coordinates.

FIG. 6 is an exemplary diagram for explaining a process of reconstructing feature point coordinates extracted from a captured image into three dimensions according to an embodiment of the present invention.

As shown in FIG. 6, the captured image (B) has a two-dimensional color image and a two-dimensional depth image, and the computing device 100 extracts the two-dimensional coordinates of the labeled feature points from the two-dimensional color image.

Because the two-dimensional color image and the two-dimensional depth image share the same coordinate system, the computing device 100 can obtain the corresponding depth information by looking up the two-dimensional coordinates of a feature point in the two-dimensional depth image.

Accordingly, the computing device 100 can reconstruct the two-dimensional coordinates of the seven feature points into three-dimensional coordinates by applying, to each of them, the depth information at that point taken from the two-dimensional depth image of the captured image.

Hereinafter, the process of tracking an object in consecutive captured images is described in detail with reference to FIGS. 7 and 8.

FIG. 7 is an exemplary diagram for explaining a process of extracting feature point coordinates from consecutive captured images according to an embodiment of the present invention.

As shown in FIG. 7, for the consecutive captured images collected over time, the computing device 100 extracts feature points by using a machine learning model together with an optical tracking algorithm.

도 7에는 #1부터 #M까지의 프레임(예를 들어 8개 프레임)을 하나의 그룹으로 가정하고 그룹 내에서 수집된 프레임에 대해 특징점을 추출하는 과정을 시간에 기초하여 설명한다. (M은 자연수)In Figure 7, the frames #1 to #M (for example, 8 frames) are assumed to be one group, and the process of extracting feature points for the frames collected within the group is explained based on time. (M is a natural number)

설명의 편의상 각 시점마다 하나의 프레임을 수신한다고 하면, t₀ 시점에서 컴퓨팅 장치(100)는 #1 프레임을 수신하면 기계학습 모델(Machine learning)인 학습된 특징점 추출 모델을 통해 특징점(Landmark)을 추출한다. 여기서 특징점은 앞서 설명한 바와 같이, 해부학적 구조에 기초하여 움직임이 최소화되는 지점을 의미한다. For convenience of explanation, if one frame is received at each time point, when the computing device 100 receives frame #1 at time t ₀ , it extracts a landmark through a learned feature point extraction model, which is a machine learning model. Extract. Here, as previously explained, the feature point refers to a point where movement is minimized based on the anatomical structure.

컴퓨팅 장치(100)는 #1 프레임에 대한 특징점 좌표를 추출과 동시에 저장할 수 있다. The computing device 100 may extract and simultaneously store the feature point coordinates for frame #1.

그룹핑되는 프레임의 개수에 대해서 8개로 가정하였으므로, 컴퓨팅 장치(100)는 t₀시점부터 t₇시점까지 연속적인 프레임들을 수신하면 버퍼(save buffer)에 임시 저장한다.Since the number of grouped frames is assumed to be 8, when the computing device 100 receives consecutive frames from time t ₀ to time t ₇ , they temporarily store them in a buffer (save buffer).

이에 컴퓨팅 장치(100)는 t₈이 되는 시점에서 저장하였던 #1 프레임에서 추출한 특징점(#1 Landmark)을 기초하여 광학 추적 알고리즘(Optical flow)을 이용하여 #2 프레임에서 특징점(#2 Landmark)을 추출하여 저장한다. Accordingly, the computing device 100 extracts the feature point (#2 Landmark) from the #2 frame using an optical tracking algorithm (Optical flow) based on the feature point (#1 Landmark) extracted from the #1 frame stored at t ₈ . Extract and save.

그리고 t₉ 시점부터 t₁₄시점까지 이전 프레임에서 추출한 특징점에 기초하여 광학 추적 알고리즘을 이용하여 해당 프레임에 대한 특징점을 추출하여 저장한다. Then, based on the feature points extracted from the previous frame from time t ₉ to time t _14, the feature points for the frame are extracted and stored using an optical tracking algorithm.

한편, 도 7에서는 하나의 그룹에 대해서 설명하지만, 실시간 촬영 영상에 적용함에 있어서, 연속되는 촬영 영상에서 임의의 단위로 그룹핑을 수행하여 복수개의 그룹에 대해 앞서 설명한 바와 같이 특징점을 추출할 수 있다. Meanwhile, one group is described in FIG. 7, but when applied to real-time captured images, feature points can be extracted for a plurality of groups as described above by grouping successive captured images in arbitrary units.

이때, 실시간성을 위해 하나의 그룹에 대한 특징점 추출 과정과 동시에 다른 그룹에 속하는 촬영 영상들을 수집하고 특징점을 추출할 수 있다. At this time, for real-time purposes, captured images belonging to other groups can be collected and feature points extracted at the same time as the feature point extraction process for one group.

예를 들어, t₀에서 t₇ 시점에서 하나의 그룹에 대한 프레임을 수신한 후, t₈시점에서 두번째 프레임에 대한 특징점 좌표를 추출함과 동시에 다른 그룹에서 첫번째 프레임을 수신할 수 있다. For example, after receiving a frame for one group from time t ₀ to t ₇ , the feature point coordinates for the second frame can be extracted at time t ₈ and the first frame from another group can be received at the same time.

This configuration can be expressed as shown in Table 1 below.

Current Image    | M   | M+1 | M+2 | M+3 | M+4 | M+5 | M+6 | M+7
Machine learning | M   |     |     |     |     |     |     |
Optical flow     | M-7 | M-6 | M-5 | M-4 | M-3 | M-2 | M-1 |
Save buffer      | M   | M+1 | M+2 | M+3 | M+4 | M+5 | M+6 | M+7
Display          | M-8 | M-7 | M-6 | M-5 | M-4 | M-3 | M-2 | M-1

Table 1 shows how the frames of two groups are processed: a first group (M-8, …, M-1), whose first frame is M-8, and a second group (M, …, M+7), whose first frame is M.

Specifically, for frames collected in the order given by the frame number (Current Image), the table shows the frame number from which feature points are extracted by the feature point extraction model (Machine learning), the frame number from which feature points are extracted by the optical tracking algorithm (Optical flow), the frame number temporarily stored in the buffer (Save buffer), and the frame number shown on the linked screen (Display).

For example, when the first frame of the second group (frame M) is collected according to the frame numbering, the collected frame M is input into the feature point extraction model (Machine learning), the extracted feature points are stored, and frame M itself is also temporarily stored in the buffer. Then, based on the feature points of frame M-8 that were extracted when frame M-8 was collected, the feature points of frame M-7 of the first group, which is temporarily stored in the buffer, are extracted with the optical tracking algorithm. Frame M-8, the first frame of the first group, is shown on the display screen.

Next, frame M+1 of the second group is collected and temporarily stored in the buffer while, at the same time, the feature points of frame M-6 of the first group are extracted with the optical tracking algorithm based on the extracted feature points of frame M-7. Frame M-7 is then shown on the display screen.

When this process is repeated as in Table 1, feature points are extracted, while the second group is being collected, from the first frame of the second group and from the second through the last frames of the first group, so that frames can be shown on the display screen without interruption.
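The interleaving of Table 1 can be reproduced with a short schedule generator. This is a sketch under stated assumptions: the function name `pipeline_schedule` and the dictionary keys are hypothetical, and the alignment of the optical flow row (frame M-7 processed while frame M arrives, nothing processed in the last step) is inferred from the description above rather than stated as a rule in the source.

```python
from typing import Dict, List, Optional


def pipeline_schedule(m: int, group_size: int = 8) -> List[Dict[str, Optional[int]]]:
    """List, for each arriving frame of the group starting at m, what each
    stage works on. None means the stage is idle in that step."""
    steps = []
    for k in range(group_size):
        steps.append({
            # Frame arriving from the camera and stored in the save buffer.
            "current": m + k,
            "save_buffer": m + k,
            # The detector runs only on the first frame of the new group.
            "machine_learning": m if k == 0 else None,
            # The tracker walks frames 2..N of the previous group.
            "optical_flow": m - group_size + 1 + k if k < group_size - 1 else None,
            # The display lags one full group behind the camera.
            "display": m - group_size + k,
        })
    return steps


schedule = pipeline_schedule(m=8)
for step in schedule:
    print(step)
```

With `m=8` the first step corresponds to the first column of Table 1: frame 8 (M) arrives and is detected, frame 1 (M-7) is tracked, and frame 0 (M-8) is displayed.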

In this way, the computing device 100 extracts feature points from the consecutive captured images collected in real time, at high speed and without interruption.

In other words, by combining a machine learning model, which extracts feature points with high accuracy but takes somewhat longer, with an optical tracking algorithm, which tracks at high speed but accumulates error, the computing device 100 can track an object in the images at high speed while extracting highly accurate feature points.

FIG. 8 is an exemplary diagram for explaining a tracking process through matching of feature point coordinates according to an embodiment of the present invention.

As shown in FIG. 8, the computing device 100 may track an object in the images based on the 3D coordinates reconstructed from the M-th color image belonging to a group and from the consecutive (M+1)-th color image.

When the computing device 100 extracts the 2D coordinates of the feature points from each color image (captured image), it reconstructs them into 3D feature point coordinates by projecting the depth information of the color image or by using a pinhole camera model.
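As an illustration of the pinhole camera option, a 2D feature point can be lifted to 3D camera coordinates from its pixel position and depth value. This is a minimal sketch: the function name `backproject` and the intrinsic parameters `fx`, `fy`, `cx`, `cy` (focal lengths and principal point in pixels) are assumptions not given in the source, and the depth map is assumed to be aligned to the color image and free of lens distortion.

```python
from typing import Tuple


def backproject(
    u: float, v: float, depth: float,
    fx: float, fy: float, cx: float, cy: float,
) -> Tuple[float, float, float]:
    """Back-project pixel (u, v) with depth Z into camera coordinates.

    Inverts the pinhole projection u = fx * X / Z + cx, v = fy * Y / Z + cy.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# Hypothetical values: a landmark 100 px right of the principal point,
# observed at a depth of 0.5 (in the depth sensor's length unit).
point = backproject(u=420.0, v=240.0, depth=0.5,
                    fx=600.0, fy=600.0, cx=320.0, cy=240.0)
```

The returned coordinates are in the same length unit as the depth value, so depth maps delivered in millimeters should be scaled consistently before registration.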

The computing device 100 then tracks the object in the images (the feature point coordinates) by performing point-to-point matching on the reconstructed 3D coordinates of the feature points.

Following the frame order, the 3D change from the reconstructed 3D feature point coordinates of the M-th color image to those of the (M+1)-th color image can be calculated.

The computing device 100 may then apply the 3D change to the registered medical image so that the object continues to be tracked even as the captured image changes.
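One way to realize the 3D change between frames is a least-squares rigid transform over the label-matched feature points. The source does not name a specific solver, so the Kabsch (SVD-based) method below is an assumption chosen for illustration, as are the function names, the sample coordinates, and the 4x4 homogeneous matrix used to represent the existing registration.

```python
import numpy as np


def rigid_transform(src, dst):
    """Least-squares R, t such that dst_i ~ R @ src_i + t (Kabsch method).

    src and dst are (N, 3) arrays whose rows are matched by landmark label.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    h = (src - src_c).T @ (dst - dst_c)
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    t = dst_c - r @ src_c
    return r, t


def to_homogeneous(r, t):
    """Pack R (3x3) and t (3,) into a 4x4 homogeneous matrix."""
    m = np.eye(4)
    m[:3, :3] = r
    m[:3, 3] = t
    return m


# Hypothetical landmarks of frame M and the same landmarks in frame M+1,
# after a 10 degree rotation about the camera z-axis plus a small shift.
theta = np.deg2rad(10.0)
r_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta), np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.01, -0.02, 0.03])
landmarks_m = np.array([[0.00, 0.00, 0.50],
                        [0.06, 0.00, 0.50],
                        [0.00, 0.08, 0.52],
                        [0.03, 0.04, 0.45]])
landmarks_m1 = landmarks_m @ r_true.T + t_true

r_est, t_est = rigid_transform(landmarks_m, landmarks_m1)

# Assumed registration (medical image -> camera) valid at frame M; the
# frame-to-frame motion is composed on top of it to follow the object.
registration_m = to_homogeneous(np.eye(3), np.array([0.1, 0.0, 0.0]))
registration_m1 = to_homogeneous(r_est, t_est) @ registration_m
```

Composing the motion with the earlier registration means the medical image does not have to be registered again for every frame; only the landmark motion is estimated.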

FIG. 9 is a hardware configuration diagram of a computing device according to an embodiment of the present invention.

Referring to FIG. 9, the image registration module 110, the object tracking module 120, and the learning module 130 execute, on a computing device 300 operated by at least one processor, a program containing instructions written to carry out the operations of the present invention.

The hardware of the computing device 300 may include at least one processor 310, a memory 320, a storage 330, and a communication interface 340, which may be connected through a bus. Hardware such as input devices and output devices may also be included. The computing device 300 may be loaded with various software, including an operating system capable of running the program.

The processor 310 is a device that controls the operation of the computing device 300 and may be any of various types of processor that process the instructions included in a program, for example a CPU (Central Processing Unit), an MPU (Micro Processor Unit), an MCU (Micro Controller Unit), or a GPU (Graphic Processing Unit). The memory 320 loads the program so that the instructions written to carry out the operations of the present invention are processed by the processor 310. The memory 320 may be, for example, a ROM (read only memory) or a RAM (random access memory). The storage 330 stores the various data, programs, and the like required to carry out the operations of the present invention. The communication interface 340 may be a wired/wireless communication module.

According to the present invention, registration between 3D images and object tracking are performed in an unconstrained manner using anatomical feature points, without markers, so that fast and accurate registration results and tracking data can be obtained with a minimized amount of computation and without additional equipment such as markers.

According to the present invention, registration of the medical image and the depth image is performed automatically in real time, so the result does not depend on the skill of the technician performing 3D image registration; errors arising from differences in skill are minimized, and registration results of consistently high accuracy can be obtained.

According to the present invention, an object is tracked using a machine learning model together with an optical tracking algorithm, based on the anatomical feature points of the tracked object, so that the object can be tracked at high speed while the accumulation of error is prevented.

Although preferred embodiments of the present invention have been described in detail above, the scope of rights of the present invention is not limited thereto, and various modifications and improvements made by those skilled in the art using the basic concept of the present invention defined in the following claims also fall within the scope of rights of the present invention.

Claims

1. A method of operating a computing device operated by at least one processor, the method comprising:
inputting a two-dimensional soft tissue image generated based on a medical image of a user into a trained feature point extraction model to extract one or more first feature points based on anatomical location, and reconstructing the first feature points into three-dimensional coordinates based on the medical image;
inputting a captured image of the user taken with a depth recognition camera into the feature point extraction model to extract one or more second feature points based on anatomical location, and reconstructing the second feature points into three-dimensional coordinates by applying depth information of the captured image; and
registering the medical image and the captured image by matching the three-dimensional coordinates of the first feature points with the three-dimensional coordinates of the second feature points.

The method of claim 1,
wherein the feature point extraction model
converts an input image into an image based on gradient information and then outputs two-dimensional coordinates of feature points labeled in advance in the image based on gradient information.

The method of claim 1,
wherein, for the feature point extraction model,
when an image contains a face, one or more points at which movement with changes in face position is minimized, among the outer end point of the eye, the inner end point of the eye, the corner of the mouth, and the starting point of the nose, are selected and labeled as feature points.

The method of claim 2,
wherein the step of reconstructing the first feature points into three-dimensional coordinates comprises
rendering a 3D soft tissue model based on the medical image, generating a 2D soft tissue image by projecting the 3D soft tissue model in the coronal plane direction, and inputting the 2D soft tissue image into the feature point extraction model.

The method of claim 4,
wherein the step of reconstructing the first feature points into three-dimensional coordinates comprises
acquiring 2D coordinates of the first feature points in the 2D soft tissue image and, using the 2D coordinates of the first feature points as reference points, extracting 3D coordinates at the points of contact with the 3D soft tissue model.

The method of claim 5,
wherein the step of registering the medical image and the captured image comprises
performing point-to-point matching between feature points whose labels match, among the 3D coordinates of the first feature points and the 3D coordinates of the second feature points, and calculating a coordinate transformation value between the matched feature points.

The method of claim 6,
wherein the step of registering the medical image and the captured image comprises
converting the coordinate system of the medical image into the coordinate system of the captured image by applying the coordinate transformation value to the medical image.

delete

A computing device comprising:
a communication device;
a memory; and
at least one processor that executes instructions of a program loaded into the memory,
wherein the program comprises instructions written to:
input a medical image of a user and a captured image of the user taken with a depth recognition camera into a trained feature point extraction model to obtain coordinates of one or more first feature points and coordinates of second feature points, each labeled based on anatomical location, and register the medical image and the captured image by matching the coordinates of the first feature points with the coordinates of the second feature points; and
when one or more feature point coordinates are extracted for each captured image from collected consecutive captured images by selectively using the feature point extraction model and an optical tracking algorithm, match the feature point coordinates between the consecutive captured images, calculate the amount of change in the matched feature point coordinates, and track an object in the captured images.

The computing device of claim 13,
wherein the program comprises instructions written to:
render a three-dimensional soft tissue model based on the medical image and generate a two-dimensional soft tissue image by projecting the three-dimensional soft tissue model in the coronal plane direction; and
obtain two-dimensional coordinates of the first feature points in the two-dimensional soft tissue image using the feature point extraction model and, using the two-dimensional coordinates of the first feature points as reference points, extract three-dimensional coordinates at the points of contact with the three-dimensional soft tissue model.

The computing device of claim 14,
wherein the program comprises instructions written to:
perform point-to-point matching between feature points whose labels match, among the three-dimensional coordinates of the second feature points reconstructed by applying the depth information of the captured image and the three-dimensional coordinates of the first feature points, and calculate a coordinate transformation value between the matched feature points.

The computing device of claim 13,
wherein the program comprises instructions written to:
sequentially group the consecutive captured images based on a preset unit, extract feature point coordinates for the first captured image in each group using the trained feature point extraction model, and, from the second captured image onward, extract feature point coordinates using the optical tracking algorithm based on the feature point coordinates extracted from the preceding captured image.