KR20240044196A - Method and apparatus for multi-stage segmentation of organs based on semi-supervised learning

Info

Publication number: KR20240044196A
Application number: KR1020220123683A
Authority: KR
Inventors: 홍헬렌; 윤현담
Original assignee: 서울여자대학교 산학협력단
Priority date: 2022-09-28
Filing date: 2022-09-28
Publication date: 2024-04-04

Abstract

A method and apparatus for identifying a target organ based on semi-supervised learning are disclosed. The method includes: performing preprocessing on a first labeled dataset and a first unlabeled dataset of image data containing the target organ; obtaining, based on the preprocessed datasets, a second labeled dataset and a second unlabeled dataset for a region of the image data in which the target organ is located; and inputting the second labeled dataset and the second unlabeled dataset into a first AI model and a second AI model to obtain an identification result for the target organ. Each of the first AI model and the second AI model includes a regression layer that performs a regression task and a segmentation layer that performs a segmentation task. The second AI model uses, as its weights, an exponential moving average (EMA) of the weights used in the first AI model. A loss function for obtaining the identification result for the target organ may be constructed based on multi-scale data output through the regression layer and the segmentation layer of each of the first AI model and the second AI model.

Description

{METHOD AND APPARATUS FOR MULTI-STAGE SEGMENTATION OF ORGANS BASED ON SEMI-SUPERVISED LEARNING}

The present disclosure relates to a method and apparatus for organ segmentation. More specifically, the present disclosure relates to a method and apparatus for multi-stage organ segmentation based on semi-supervised learning.

Accurate identification and segmentation of the pancreas in abdominal CT images can be used to understand the shape of the pancreas when diagnosing pancreatic cancer and when planning surgery and treatment.

However, the location and shape of the pancreas vary from patient to patient, and the pancreas is not easy to distinguish from the surrounding organs. Accurately identifying and segmenting the pancreas in abdominal CT images is therefore technically very difficult.

With the recent development of AI-based image processing technology, methods of identifying the pancreas from abdominal CT images are being introduced. Until now, supervised learning, which identifies the pancreas based on labeled data, has been widely used.

However, although labeled data plays an important role in supervised learning, little labeled data for analyzing the pancreas is currently available.

Korean Registered Patent Publication No. 10-2202361, 2021.01.07

An object of an embodiment of the present disclosure is to provide a method and apparatus for multi-stage organ segmentation based on semi-supervised learning.

Another object of an embodiment of the present disclosure is to provide a method and apparatus for more accurately segmenting and identifying organs that differ greatly in shape or are located at different positions.

The problems to be solved by the present disclosure are not limited to those mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the description below.

According to an embodiment of the present disclosure, a method of identifying a target organ based on semi-supervised learning, performed by an apparatus, includes: performing preprocessing on a first labeled dataset and a first unlabeled dataset of image data containing the target organ; obtaining, based on the preprocessed datasets, a second labeled dataset and a second unlabeled dataset for a region of the image data in which the target organ is located; and inputting the second labeled dataset and the second unlabeled dataset into a first AI model and a second AI model to obtain an identification result for the target organ. Each of the first AI model and the second AI model includes a regression layer that performs a regression task and a segmentation layer that performs a segmentation task. The second AI model uses, as its weights, an exponential moving average (EMA) of the weights used in the first AI model. A loss function for obtaining the identification result for the target organ may be constructed based on multi-scale data output through the regression layer and the segmentation layer of each of the first AI model and the second AI model.
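The EMA relationship between the two models can be illustrated with a minimal sketch. The disclosure states only that the second AI model uses an EMA of the first AI model's weights; the function name `ema_update`, the dictionary representation of weights, and the decay value are assumptions made for illustration.

```python
from typing import Dict


def ema_update(
    teacher: Dict[str, float],
    student: Dict[str, float],
    decay: float = 0.99,  # hypothetical value; the disclosure does not give one
) -> Dict[str, float]:
    """Return new second-model (teacher) weights as an exponential moving
    average of the first-model (student) weights."""
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must lie in [0, 1]")
    # Each teacher weight moves a fraction (1 - decay) toward the student weight.
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in student
    }
```

In a scheme like this, the update would typically be applied after each training step of the first model, so the second model changes slowly and is not trained by gradient descent itself.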

The performing of the preprocessing may include performing intensity normalization, spacing normalization, and data augmentation on the first labeled dataset and the first unlabeled dataset.

The intensity normalization may include cropping the intensity values of the abdominal images included in the first labeled dataset and the first unlabeled dataset by a predefined value, removing regions unrelated to the target organ, and then performing normalization.
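One common way to read the intensity normalization step is to clip each intensity to a fixed window and rescale the result to [0, 1]. The sketch below assumes that reading; the function name and the window bounds in the usage note are hypothetical and are not taken from the disclosure.

```python
from typing import List, Sequence


def clip_and_normalize(
    values: Sequence[float], lower: float, upper: float
) -> List[float]:
    """Clip intensities to [lower, upper], then rescale linearly to [0, 1]."""
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    span = upper - lower
    # min/max performs the clipping; the division performs the normalization.
    return [(min(max(v, lower), upper) - lower) / span for v in values]
```

For example, with an assumed window of -100 to 240, `clip_and_normalize([-1000, 70, 3000], -100, 240)` maps the three values to 0.0, 0.5, and 1.0, so values far outside the window no longer dominate the intensity range.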

The obtaining of the second labeled dataset and the second unlabeled dataset may include obtaining the second labeled dataset based on a result of comparing the preprocessed first labeled dataset with ground truth data for the target organ.

The obtaining of the second labeled dataset and the second unlabeled dataset may also include obtaining the second unlabeled dataset by using a pseudo label based on the first unlabeled dataset.
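Obtaining a dataset for the region where the target organ is located can be pictured as cropping each image to a bounding box around a mask, where the mask is ground truth for labeled data or a pseudo label for unlabeled data. The sketch below is a 2D illustration under that assumption; the helper names and the `margin` parameter are hypothetical, and the disclosure does not specify how the region is computed.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (row_start, row_stop, col_start, col_stop)


def bounding_box(mask: List[List[int]], margin: int = 0) -> Box:
    """Return the smallest box containing all nonzero mask pixels,
    expanded by `margin` pixels and clamped to the image bounds."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        raise ValueError("mask is empty")
    height, width = len(mask), len(mask[0])
    return (
        max(min(rows) - margin, 0),
        min(max(rows) + margin + 1, height),
        max(min(cols) - margin, 0),
        min(max(cols) + margin + 1, width),
    )


def crop(image: List[List[float]], box: Box) -> List[List[float]]:
    """Cut the boxed region out of a 2D image."""
    r0, r1, c0, c1 = box
    return [row[c0:c1] for row in image[r0:r1]]
```

The same box would be applied to the image and to its mask so that the two stay aligned after cropping.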

The loss function may be based on a confidence map derived from first multi-scale probability data output through the segmentation layer of the first AI model, and on uncertainty data for the boundary of the target organ that is output based on second multi-scale probability data output through the segmentation layer of the second AI model.

The loss function may be based on supervised learning error data obtained from the difference between the first multi-scale probability data and ground truth data.

The loss function may be based on inter-model error data obtained from the difference between a first multi-scale distance map output through the regression layer of the first AI model and a second multi-scale distance map output through the regression layer of the second AI model.

The loss function may be based on inter-layer error data obtained from the difference between the first multi-scale probability data and the second multi-scale probability data.
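The difference-based terms above can be sketched as follows. The disclosure says only that each term is obtained "through the difference" between two outputs; the use of mean squared error, the averaging over scales, the weighted sum, and the names `multi_scale_mse`, `total_loss`, `w_model`, and `w_layer` are all assumptions for illustration. The confidence-map and uncertainty weighting is omitted from this sketch.

```python
from typing import Sequence


def multi_scale_mse(
    a: Sequence[Sequence[float]], b: Sequence[Sequence[float]]
) -> float:
    """Average, over scales, of the mean squared difference between two
    multi-scale outputs. Each inner sequence is one flattened scale."""
    if len(a) != len(b) or not a:
        raise ValueError("both inputs need the same nonzero number of scales")
    per_scale = []
    for xa, xb in zip(a, b):
        if len(xa) != len(xb) or not xa:
            raise ValueError("each scale must have matching nonzero length")
        per_scale.append(sum((p - q) ** 2 for p, q in zip(xa, xb)) / len(xa))
    return sum(per_scale) / len(per_scale)


def total_loss(
    supervised: float,
    inter_model: float,
    inter_layer: float,
    w_model: float = 1.0,  # hypothetical weight on the inter-model term
    w_layer: float = 1.0,  # hypothetical weight on the inter-layer term
) -> float:
    """Combine the error terms as a weighted sum (an assumed combination)."""
    return supervised + w_model * inter_model + w_layer * inter_layer
```

Under these assumptions, the inter-model term would be `multi_scale_mse` applied to the two models' distance maps, and the supervised term would compare the first model's probabilities with ground truth on labeled data only.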

In another embodiment of the present disclosure, an apparatus for identifying a target organ based on semi-supervised learning includes one or more memories and one or more processors. The one or more processors are configured to: perform preprocessing on a first labeled dataset and a first unlabeled dataset of image data containing the target organ; obtain, based on the preprocessed datasets, a second labeled dataset and a second unlabeled dataset for a region of the image data in which the target organ is located; and input the second labeled dataset and the second unlabeled dataset into a first AI model and a second AI model to obtain an identification result for the target organ. Each of the first AI model and the second AI model includes a regression layer that performs a regression task and a segmentation layer that performs a segmentation task. The second AI model uses, as its weights, an exponential moving average (EMA) of the weights used in the first AI model. A loss function for obtaining the identification result for the target organ may be constructed based on multi-scale data output through the regression layer and the segmentation layer of each of the first AI model and the second AI model.

The one or more processors may perform intensity normalization, spacing normalization, and data augmentation on the first labeled dataset and the first unlabeled dataset.

The one or more processors may obtain the second labeled dataset based on a result of comparing the preprocessed first labeled dataset with ground truth data for the target organ.

The one or more processors may obtain the second unlabeled dataset by using a pseudo label based on the first unlabeled dataset.

In addition, a computer program stored in a computer-readable recording medium and executed to implement the present disclosure may be further provided.

In addition, a computer-readable recording medium on which a computer program for executing a method of implementing the present disclosure is recorded may be further provided.

According to the above-described means of the present disclosure, a method and apparatus for multi-stage organ segmentation based on semi-supervised learning can be provided.

Also, according to the above-described means of the present disclosure, the structure of a pancreas that differs greatly in shape or is located at a different position can be analyzed more accurately.

The effects of the present disclosure are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the description below.

FIG. 1 is a schematic diagram of a system for implementing a multi-stage organ segmentation method based on semi-supervised learning, according to an embodiment of the present disclosure.
FIG. 2 is a block diagram illustrating the configuration of a multi-stage organ segmentation apparatus based on semi-supervised learning, according to an embodiment of the present disclosure.
FIG. 3 is a flowchart illustrating a multi-stage organ segmentation method based on semi-supervised learning, according to an embodiment of the present disclosure.
FIG. 4 is a diagram illustrating the configuration and operation of each model for implementing a multi-stage organ segmentation method based on semi-supervised learning, according to an embodiment of the present disclosure.
FIG. 5 is a diagram illustrating a method by which the apparatus performs preprocessing, according to an embodiment of the present disclosure.
FIGS. 6 and 7 are diagrams illustrating the configuration and operation of a model applied to a multi-stage organ segmentation method based on semi-supervised learning, according to an embodiment of the present disclosure.
FIG. 8 is a diagram showing results of multi-stage organ segmentation based on semi-supervised learning, according to an embodiment of the present disclosure.

Like reference numerals refer to like elements throughout the present disclosure. The present disclosure does not describe every element of the embodiments, and content that is general in the technical field to which the present disclosure pertains, or that overlaps between embodiments, is omitted. The terms 'unit, module, member, block' used in the specification may be implemented in software or hardware, and depending on the embodiment, a plurality of 'units, modules, members, blocks' may be implemented as a single component, or a single 'unit, module, member, block' may include a plurality of components.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only a direct connection but also an indirect connection, and an indirect connection includes a connection through a wireless communication network.

Also, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

Throughout the specification, when a member is said to be located "on" another member, this includes not only the case where the member is in contact with the other member but also the case where a further member exists between the two.

Terms such as first and second are used to distinguish one component from another, and the components are not limited by these terms.

A singular expression includes the plural unless the context clearly indicates otherwise.

The identification codes of the steps are used for convenience of explanation and do not describe the order of the steps; the steps may be performed in an order different from the stated order unless the context clearly specifies a particular order.

Hereinafter, the operating principle and embodiments of the present disclosure will be described with reference to the accompanying drawings.

In this specification, a 'multi-stage organ segmentation apparatus based on semi-supervised learning' includes any of various devices that can perform computation and provide results to a user. For example, the multi-stage organ segmentation apparatus based on semi-supervised learning according to the present disclosure may include all of a computer, a server device, and a portable terminal, or may take the form of any one of them.

Here, the computer may include, for example, a notebook, desktop, laptop, tablet PC, or slate PC equipped with a web browser.

The server device is a server that processes information by communicating with external devices, and may include an application server, a computing server, a database server, a file server, a game server, a mail server, a proxy server, a web server, and the like.

The portable terminal is, for example, a wireless communication device that guarantees portability and mobility, and may include all kinds of handheld wireless communication devices such as PCS (Personal Communication System), GSM (Global System for Mobile communications), PDC (Personal Digital Cellular), PHS (Personal Handyphone System), PDA (Personal Digital Assistant), IMT (International Mobile Telecommunication)-2000, CDMA (Code Division Multiple Access)-2000, W-CDMA (W-Code Division Multiple Access), and WiBro (Wireless Broadband Internet) terminals and smartphones, as well as wearable devices such as watches, rings, bracelets, anklets, necklaces, glasses, contact lenses, and head-mounted devices (HMDs).

In describing the present disclosure, "image data" may include, but is not limited to, a CT (computed tomography) image of a part of a patient's body.

In describing the present disclosure, an "organ" may be a human (e.g., a patient) or an animal, or a part or all of a human or an animal. For example, the organ may include at least one of organs such as the liver, heart, uterus, brain, breast, and abdomen, blood vessels (e.g., arteries or veins), fatty tissue, and the like. A "target organ" may mean a part of the body that is the subject of an actual surgery, or a part of the body to be identified or segmented through an AI model (e.g., the pancreas).

FIG. 1 is a schematic diagram of a system 1000 for implementing a multi-stage organ segmentation method based on semi-supervised learning (or a method of identifying a target organ based on semi-supervised learning), according to an embodiment of the present disclosure.

As shown in FIG. 1, the system 1000 for implementing the multi-stage organ segmentation method based on semi-supervised learning may include a device 100, a hospital server 200, a database 300, and an AI model 400.

Although FIG. 1 shows the device 100 implemented as a single desktop computer, the device is not limited to this form. As described above, the device 100 may be any of various types of devices, or a group of devices in which one or more types of devices are connected.

The device 100, the hospital server 200, the database 300, and the artificial intelligence (AI) model 400 included in the system 1000 may communicate through a network W. Here, the network W may include a wired network and a wireless network. For example, the network may include various networks such as a local area network (LAN), a metropolitan area network (MAN), and a wide area network (WAN).

The network W may also include the well-known World Wide Web (WWW). However, the network W according to an embodiment of the present disclosure is not limited to the networks listed above, and may include, at least in part, a known wireless data network, a known telephone network, or a known wired or wireless television network.

The device 100 may provide a multi-stage organ segmentation method based on semi-supervised learning (or a method of identifying a target organ based on semi-supervised learning).

For example, the device 100 may perform preprocessing on a first labeled dataset and a second unlabeled dataset of image data containing the target organ. Based on the preprocessed data, the device 100 may obtain a third labeled dataset and a fourth unlabeled dataset for the region of the image data in which the target organ is located. The device 100 may then input the third labeled dataset and the fourth unlabeled dataset into the AI model 400 to obtain an identification result for the target organ.

The operations performed by the device described above will be described in detail with reference to FIGS. 2 to 8.

The hospital server 200 (e.g., a cloud server) may store image data (e.g., computed tomography (CT) images) obtained by scanning a patient's body. The hospital server 200 may transmit the stored image data to the device 100, the database 300, or the AI model 400.

The database 300 may store image data containing the target organ obtained from the hospital server 200. The database 300 may also store the various labeled and unlabeled data generated by the device 100, as well as the various parameters required for training and inference of the AI model 400.

The AI model 400 is a model trained to segment or identify a specific organ in image data. The AI model may include a first AI model (e.g., a student model) and a second AI model (e.g., a teacher model).

Here, each of the first AI model and the second AI model may include a regression layer that performs a regression task and a segmentation layer that performs a segmentation task. That is, each of the first AI model and the second AI model may be composed of a plurality of layers that perform dual tasks. The configuration and operation of each AI model will be described in detail with reference to FIGS. 2 to 8.

Although FIG. 1 shows the AI model 400 implemented outside the device 100 (e.g., as a cloud-based implementation), it is not limited to this form and may be implemented as a component of the device 100.

FIG. 2 is a block diagram illustrating the configuration of the device 100 that identifies a target organ based on semi-supervised learning, according to an embodiment of the present disclosure.

As shown in FIG. 2, the device 100 may include a memory 110, a communication module 120, a display 130, an input module 140, and a processor 150. However, the device 100 is not limited to this configuration, and its software and hardware configuration may be modified, added to, or omitted according to the required operation, within a range obvious to those skilled in the art.

The memory 110 may store data supporting various functions of the device 100 and programs for the operation of the processor 150, may store input and output data (e.g., music files, still images, videos), and may store a number of application programs (applications) running on the device 100 as well as data and instructions for the operation of the device 100. At least some of these application programs may be downloaded from an external server through wireless communication.

The memory 110 may include at least one type of storage medium among a flash memory type, a hard disk type, an SSD (Solid State Disk) type, an SDD (Silicon Disk Drive) type, a multimedia card micro type, a card-type memory (e.g., SD or XD memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, and an optical disk.

The memory 110 may also include a database that is separate from the device but connected to it by wire or wirelessly. That is, the database shown in FIG. 1 may be implemented as a component of the memory 110.

통신 모듈(120)는 외부 장치와 통신을 가능하게 하는 하나 이상의 구성 요소를 포함할 수 있으며, 예를 들어, 방송 수신 모듈, 유선통신 모듈, 무선통신 모듈, 근거리 통신 모듈, 위치정보 모듈 중 적어도 하나를 포함할 수 있다.The communication module 120 may include one or more components that enable communication with an external device, for example, at least one of a broadcast reception module, a wired communication module, a wireless communication module, a short-range communication module, and a location information module. may include.

유선 통신 모듈은, 지역 통신(Local Area Network; LAN) 모듈, 광역 통신(Wide Area Network; WAN) 모듈 또는 부가가치 통신(Value Added Network; VAN) 모듈 등 다양한 유선 통신 모듈뿐만 아니라, USB(Universal Serial Bus), HDMI(High Definition Multimedia Interface), DVI(Digital Visual Interface), RS-232(recommended standard232), 전력선 통신, 또는 POTS(plain old telephone service) 등 다양한 케이블 통신 모듈을 포함할 수 있다. Wired communication modules include various wired communication modules such as Local Area Network (LAN) modules, Wide Area Network (WAN) modules, or Value Added Network (VAN) modules, as well as USB (Universal Serial Bus) modules. ), HDMI (High Definition Multimedia Interface), DVI (Digital Visual Interface), RS-232 (recommended standard 232), power line communication, or POTS (plain old telephone service).

In addition to a Wi-Fi module and a WiBro (Wireless broadband) module, the wireless communication module may include a wireless communication module that supports various wireless communication methods such as GSM (Global System for Mobile Communication), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), UMTS (Universal Mobile Telecommunications System), TDMA (Time Division Multiple Access), LTE (Long Term Evolution), 4G, 5G, and 6G.

The display 130 displays (outputs) information processed by the device 100 (e.g., image data including the target organ, various labeled datasets, and unlabeled datasets). For example, the display may show execution screen information of an application running on the device 100, or UI (User Interface) and GUI (Graphic User Interface) information according to that execution screen information.

The input module 140 receives information from the user. When information is input through the user input unit, the processor 150 may control the operation of the device 100 to correspond to the input information.

The input module 140 may include hardware physical keys (e.g., buttons, dome switches, jog wheels, or jog switches located on at least one of the front, rear, and side surfaces of the device) and software touch keys. As an example, a touch key may be a virtual key, soft key, or visual key displayed on the touchscreen-type display 130 through software processing, or a touch key placed on a part other than the touchscreen. The virtual key or visual key may be displayed on the touchscreen in various forms, for example, as a graphic, text, an icon, a video, or a combination thereof.

The processor 150 may control the overall operation and functions of the device 100. Specifically, the processor 150 may be implemented with a memory that stores data for an algorithm for controlling the operation of components within the device 100, or for a program that reproduces the algorithm, and with at least one processor (not shown) that performs the above-described operations using the data stored in the memory. The memory and the processor may be implemented as separate chips. Alternatively, the memory and the processor may be implemented as a single chip.

In addition, the processor 150 may control any one or a combination of the components described above in order to implement, on the device 100, the various embodiments of the present disclosure described below with reference to FIGS. 3 to 8.

FIG. 3 is a flowchart illustrating a method, performed by a device, of identifying a target organ based on semi-supervised learning, according to an embodiment of the present disclosure.

The device may perform preprocessing on a first labeled dataset and a first unlabeled dataset for image data including the target organ (S310).

Specifically, the device may acquire image data including the target organ. For example, when the target organ is the pancreas, the device may acquire abdominal CT image data including the pancreas from another device (e.g., a hospital server).

Referring to FIG. 4, the device may acquire a first labeled dataset (410-1) and a first unlabeled dataset (410-2) for image data (e.g., abdominal CT image data) including the target organ (e.g., the pancreas). The device may then perform preprocessing (420) on each dataset (410-1, 410-2).

Here, the first labeled dataset (410-1) may include image data including the target organ and correct-answer data (i.e., ground truth data) for the target organ.

As shown in FIG. 4, each dataset (410-1, 410-2) may include image data of size 512x512x181 and may consist of 466 pixels, but is not limited thereto.

By performing preprocessing, the device may adjust the intensity and spacing values of the entire dataset. The types of preprocessing performed by the device may be divided into intensity normalization, spacing normalization, and data augmentation.

As an example, (min-max) intensity normalization may include an operation of cropping the intensity values of the abdominal images included in the first labeled dataset and the first unlabeled dataset to a predefined range (e.g., -100 HU to 240 HU), removing regions unrelated to the target organ, and then performing normalization (e.g., to the range 0 to 1).
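The intensity normalization described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the disclosed implementation: the function name `normalize_intensity` and the choice to realize the "crop" as a clip to the HU window are assumptions; only the example range of -100 HU to 240 HU and the target range of 0 to 1 come from the text.

```python
import numpy as np

def normalize_intensity(volume_hu, hu_min=-100.0, hu_max=240.0):
    """Clip a CT volume to [hu_min, hu_max] HU, then min-max scale to [0, 1].

    Assumption: the "crop" of intensity values is modeled as a clip, so
    values outside the window saturate at 0 or 1.
    """
    clipped = np.clip(np.asarray(volume_hu, dtype=np.float64), hu_min, hu_max)
    return (clipped - hu_min) / (hu_max - hu_min)

# Air, window minimum, soft tissue, window maximum, dense bone.
ct_values = np.array([-1000.0, -100.0, 70.0, 240.0, 1500.0])
normalized = normalize_intensity(ct_values)
```

Clipping before scaling keeps very dark or very bright voxels from compressing the contrast of the soft-tissue range in which an organ such as the pancreas appears.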

Spacing normalization may resample all image data to the central pixel spacing value (e.g., 0.8594 mm²) so that the pixel spacing of the image data included in each dataset is the same.
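As a rough illustration of spacing normalization, the sketch below resamples a single 2D slice to a common in-plane spacing using nearest-neighbor lookup. The function name `resample_slice_nearest`, the per-axis interpretation of the 0.8594 value, and the nearest-neighbor scheme are all assumptions made for brevity; a real pipeline would more likely use an interpolating resampler.

```python
import numpy as np

def resample_slice_nearest(image, spacing_mm, target_mm=0.8594):
    """Resample a 2D slice to target_mm spacing on both axes (nearest neighbor).

    spacing_mm is the (row, column) pixel spacing of the input slice.
    The physical extent is preserved, so the pixel count scales by
    spacing / target.
    """
    spacing = np.asarray(spacing_mm, dtype=np.float64)
    shape = np.array(image.shape[:2], dtype=np.float64)
    new_shape = np.maximum(np.round(shape * spacing / target_mm), 1).astype(int)
    # Map each output pixel center back to the input pixel that contains it.
    rows = ((np.arange(new_shape[0]) + 0.5) * target_mm / spacing[0]).astype(int)
    cols = ((np.arange(new_shape[1]) + 0.5) * target_mm / spacing[1]).astype(int)
    rows = np.minimum(rows, image.shape[0] - 1)
    cols = np.minimum(cols, image.shape[1] - 1)
    return image[np.ix_(rows, cols)]

# A 4x4 slice at 2.0 mm spacing becomes 8x8 at 1.0 mm spacing.
demo = resample_slice_nearest(np.arange(16).reshape(4, 4), (2.0, 2.0), target_mm=1.0)
```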

Data augmentation may apply random rotation (e.g., -20° to 20°), random translation (e.g., up to 20 px along the x and y axes), and random scaling (e.g., 0.8 to 1.2 times) to the image data included in each dataset. Accordingly, the device may increase the amount of data included in each dataset (e.g., by about 10 times).
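The augmentation ranges above can be expressed as a randomly sampled 2D affine transform. The sketch below only draws the parameters and builds the matrix; the function name `sample_augmentation`, the uniform sampling, and the composition order (scaled rotation, then translation) are assumptions, and applying the matrix to an image is left to whichever warping routine a pipeline uses.

```python
import numpy as np

def sample_augmentation(rng):
    """Draw one random transform within the example ranges from the text.

    Returns a 3x3 homogeneous matrix: rotation in [-20, 20] degrees,
    scale in [0.8, 1.2], translation in [-20, 20] px on each axis.
    """
    angle = np.deg2rad(rng.uniform(-20.0, 20.0))
    tx, ty = rng.uniform(-20.0, 20.0, size=2)
    scale = rng.uniform(0.8, 1.2)
    cos_a, sin_a = np.cos(angle), np.sin(angle)
    return np.array([
        [scale * cos_a, -scale * sin_a, tx],
        [scale * sin_a,  scale * cos_a, ty],
        [0.0,            0.0,           1.0],
    ])

# Ten transforms per image correspond to the roughly 10x increase mentioned above.
rng = np.random.default_rng(0)
matrices = [sample_augmentation(rng) for _ in range(10)]
```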

As an example, as shown in (a) of FIG. 5, the device may acquire original image data of the abdomen (i.e., CT image data) and construct each dataset based on the acquired image data.

The device may obtain image data as shown in (b) of FIG. 5 by performing intensity normalization on the original image. As another example, the device may obtain image data as shown in (c) of FIG. 5 by performing spacing normalization on the original image.

Based on the preprocessed datasets, the device may obtain a second labeled dataset and a second unlabeled dataset for the region of the image data where the target organ is located (S320).

Here, the region of the image data where the target organ is located may mean the region inside a bounding box drawn around the target organ (i.e., a region that delimits the surroundings centered on the target organ). The second labeled dataset may mean a localized labeled dataset, and the second unlabeled dataset may mean a localized unlabeled dataset.

As an example, the device may obtain the second labeled dataset based on a result of comparing the preprocessed first labeled dataset with ground truth data for the target organ. Specifically, the device may obtain the second labeled dataset by applying a crop and resize algorithm to the image data with respect to the region where the target organ is located.
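The crop and resize step can be pictured as follows. This is a minimal sketch assuming a 2D slice and a non-empty binary mask of the target organ (from ground truth or a pseudo label); the helper names `bounding_box` and `crop_and_resize`, the optional margin, and the nearest-neighbor resize are illustrative choices, not details taken from the disclosure.

```python
import numpy as np

def bounding_box(mask, margin=0):
    """Return (r0, r1, c0, c1) of the tightest box around a non-empty mask."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    r0 = max(int(rows[0]) - margin, 0)
    r1 = min(int(rows[-1]) + 1 + margin, mask.shape[0])
    c0 = max(int(cols[0]) - margin, 0)
    c1 = min(int(cols[-1]) + 1 + margin, mask.shape[1])
    return r0, r1, c0, c1

def crop_and_resize(image, mask, out_size=(64, 64), margin=0):
    """Crop the image to the mask's bounding box, then resize (nearest neighbor)."""
    r0, r1, c0, c1 = bounding_box(mask, margin)
    patch = image[r0:r1, c0:c1]
    # Map each output pixel center to the patch pixel that contains it.
    rows = ((np.arange(out_size[0]) + 0.5) * patch.shape[0] / out_size[0]).astype(int)
    cols = ((np.arange(out_size[1]) + 0.5) * patch.shape[1] / out_size[1]).astype(int)
    return patch[np.ix_(rows, cols)]

image = np.arange(100).reshape(10, 10)
mask = np.zeros((10, 10), dtype=bool)
mask[2:4, 5:8] = True          # hypothetical organ location
localized = crop_and_resize(image, mask, out_size=(4, 6))
```

Localizing first means the later segmentation models see the organ at a larger relative size, which is the usual motivation for a coarse-to-fine, multi-stage design.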

As another example, the device may obtain the second unlabeled dataset using a pseudo label based on the first unlabeled dataset. Specifically, the device may obtain the second unlabeled dataset by applying the crop and resize algorithm to the pseudo label based on the first unlabeled dataset.

Referring to FIG. 4, the device may train a specific model (430) by comparing the preprocessed first labeled dataset with ground truth data (450) for the target organ (e.g., the ground truth data included in the first labeled dataset). The device may then obtain the second labeled dataset (410-3) based on the comparison result (440-1) (e.g., the probability of segmenting/identifying a specific organ).

The device may also obtain the second unlabeled dataset (410-4) by applying the crop and resize algorithm to the pseudo label (440-2) based on the first unlabeled dataset (410-2).

The device may input the second labeled dataset and the second unlabeled dataset into a first AI model and a second AI model to obtain an identification result of the target organ (S330).

Here, the first AI model may be implemented as a student model, and the second AI model may be implemented as a teacher model. That is, a mean-teacher model may be constructed from the first AI model and the second AI model.

Referring to FIG. 4, each of the first AI model (460-1) and the second AI model (460-2) may include a regression layer (470-2, 470-3) that performs a regression task and a segmentation layer (470-1, 470-4) that performs a segmentation task. That is, the first AI model (460-1) and the second AI model (460-2) may be composed of layers that perform dual tasks.

A loss function (or cost function) for obtaining the identification result of the target organ may be constructed based on the multi-scale data output through the regression layers (470-2, 470-3) and the segmentation layers (470-1, 470-4) of the first AI model (460-1) and the second AI model (460-2).

Here, the loss function means a function that outputs the error between the values output by the first AI model (460-1) and the second AI model (460-2) and the output values desired by the user of the device.

The loss function (L_total) for obtaining the identification result of the target organ may be implemented as shown in Equation 1. In Equation 1, α may be 0.1, but is not limited thereto.

Here, the supervised learning error data (L_supervised) may be a value obtained from the difference between the first multi-scale probability data (480-3) output through the segmentation layer (470-1) of the first AI model and the ground truth data (480-5). The supervised learning error data (L_supervised(X,Y)) may be constructed as shown in Equation 2.

In Equation 2, L_DICE denotes the DICE loss, which may be used in a semantic segmentation network structure where data imbalance is present, but is not limited thereto. f_seg denotes the output function of the segmentation layer.

Equation 2 also uses the parameters (or weights) of the first AI model and a binary cross-entropy loss. S denotes the number of multi-scale probability values. N and M denote the number of labeled datasets and unlabeled datasets, respectively. The per-scale weights (subscripts 0, 1, 2, and s) may be 0.4, 0.3, 0.2, and 0.1, respectively, but are not limited thereto.
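To make the ingredients named for Equation 2 concrete, the sketch below combines a DICE loss and a binary cross-entropy loss over several output scales using the example weights 0.4, 0.3, 0.2, and 0.1. The exact form of Equation 2 is not reproduced in this text, so the weighted sum of (DICE + BCE) per scale is an assumption, as are the function names `dice_loss`, `bce_loss`, and `supervised_loss`.

```python
import numpy as np

def dice_loss(prob, target, eps=1e-6):
    """Soft DICE loss: 1 - 2|P∩T| / (|P| + |T|)."""
    intersection = np.sum(prob * target)
    return 1.0 - (2.0 * intersection + eps) / (np.sum(prob) + np.sum(target) + eps)

def bce_loss(prob, target, eps=1e-7):
    """Mean binary cross-entropy between probabilities and a binary target."""
    p = np.clip(prob, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(p) + (1.0 - target) * np.log(1.0 - p)))

def supervised_loss(probs_per_scale, targets_per_scale,
                    weights=(0.4, 0.3, 0.2, 0.1)):
    """Assumed combination: weighted sum over scales of DICE + BCE."""
    return sum(
        w * (dice_loss(p, t) + bce_loss(p, t))
        for w, p, t in zip(weights, probs_per_scale, targets_per_scale)
    )

target = np.array([1.0, 0.0, 1.0, 0.0])
perfect = supervised_loss([target] * 4, [target] * 4)
inverted = supervised_loss([1.0 - target] * 4, [target] * 4)
```

Giving the largest weight to the first scale is one plausible reading of the example values: the full-resolution output would dominate the loss while coarser outputs act as auxiliary supervision.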

The inter-model error data (L_inter-model) may be a value obtained from the difference between the first multi-scale distance map (480-4) output through the regression layer (470-2) of the first AI model and the second multi-scale distance map (480-1) output through the regression layer (470-3) of the second AI model. The inter-model error data (L_inter-model(X)) may be constructed as shown in Equation 3.

In Equation 3, L_MSE denotes the mean squared error loss value, and f_reg denotes the output function of the regression layer.

The parameter symbol with the subscript tea denotes the parameters (or weights) of the second AI model.
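The inter-model term can be illustrated as a mean squared error between the distance maps predicted by the student (first AI model) and the teacher (second AI model) at each scale. Equation 3 itself is not reproduced in this text, so averaging the per-scale errors with equal weight is an assumption, and `inter_model_loss` is a hypothetical name.

```python
import numpy as np

def inter_model_loss(student_maps, teacher_maps):
    """Mean over scales of the MSE between student and teacher distance maps.

    Each list holds one array per scale; arrays at different scales may
    have different shapes, but each student/teacher pair must match.
    """
    per_scale = [np.mean((s - t) ** 2) for s, t in zip(student_maps, teacher_maps)]
    return float(np.mean(per_scale))

student = [np.zeros((4, 4)), np.zeros((2, 2))]
teacher = [np.ones((4, 4)), np.ones((2, 2))]
gap = inter_model_loss(student, teacher)
```

Because this term compares two model outputs rather than an output and a label, it can be computed on unlabeled images as well as labeled ones.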

The uncertainty data (L_uncertainty) for the boundary of the target organ may be a value output based on a confidence map (480-6), which is based on the first multi-scale probability data (480-3) output through the segmentation layer (470-1) of the first AI model, and on the second multi-scale probability data (480-2) output through the segmentation layer (470-4) of the second AI model.

Here, the confidence map may mean plot data indicating, within the image data, the probability of belonging to a specific organ.

The uncertainty data (L_uncertainty(X,C,Y)) for the boundary of the target organ may be constructed as shown in Equation 4.

In Equation 4, C may denote the confidence map.

The inter-layer error data (or inter-task error data, L_inter-task) may be a value obtained from the difference between the first multi-scale probability data (480-3) and the second multi-scale probability data (480-2). The inter-layer error data may be constructed as shown in Equation 5.

The device may train the first AI model and the second AI model again using the obtained identification result (490-1) of the target organ and a separate pseudo label (490-2).

FIGS. 6 and 7 are diagrams for explaining the configuration and operation of the first AI model and the second AI model according to an embodiment of the present disclosure.

Referring to FIG. 6, the first AI model (student model) may be trained based on input data (610-1) and a label (620-2). For example, the first AI model may be trained so that the error between the input data (610-1) and the label (620-2) (e.g., classification loss) is reduced. At this time, a specific noise (η) may be applied to a hidden layer included in the first AI model.

The second AI model (teacher model) may use, as its weights, the exponential moving average (EMA) value (θ') of the weights (θ) used in the first AI model. The first AI model and the second AI model may be trained so that the error between their output values (e.g., consistency cost) is reduced.
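The EMA relationship between the student weights θ and the teacher weights θ' can be sketched as a single update applied after each training step. The decay value 0.99, the dictionary representation of the weights, and the name `ema_update` are assumptions for illustration; the text only states that the teacher uses the EMA of the student's weights.

```python
import numpy as np

def ema_update(teacher, student, decay=0.99):
    """Return new teacher weights: decay * teacher + (1 - decay) * student.

    Both arguments map parameter names to arrays of matching shape.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }

teacher_weights = {"w": np.zeros(2)}
student_weights = {"w": np.ones(2)}
teacher_weights = ema_update(teacher_weights, student_weights, decay=0.9)
```

In this arrangement only the student is updated by gradient descent; the teacher changes slowly, which tends to make its predictions a more stable target for the consistency cost.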

As shown in FIG. 7, the first AI model and the second AI model may be configured as an uncertainty-aware multi-scale measurement network.

FIG. 8 is a diagram showing results of identifying the target organ in image data, according to an embodiment of the present disclosure. As shown in FIG. 8, the device may output results of identifying the target organ in a total of four cases (i.e., four kinds of image data).

In this case, the device may output the results of identifying the target organ using each of the following: 1) a model trained on the image data and the ground truth; 2) a model trained based on the mean-teacher model; 3) a model trained based on the mean-teacher model and the UA-MT network; 4) a model trained based on the mean-teacher model, the UA-MT network, and the dual task; and 5) a model trained based on the mean-teacher model, the UA-MT network, the dual task, and refinement.

Meanwhile, the disclosed embodiments may be implemented in the form of a recording medium that stores instructions executable by a computer. The instructions may be stored in the form of program code and, when executed by a processor, may generate program modules to perform the operations of the disclosed embodiments. The recording medium may be implemented as a computer-readable recording medium.

Computer-readable recording media include all kinds of recording media that store instructions readable by a computer. Examples include read-only memory (ROM), random access memory (RAM), magnetic tape, magnetic disks, flash memory, and optical data storage devices.

The disclosed embodiments have been described above with reference to the attached drawings. A person of ordinary skill in the art to which the present disclosure pertains will understand that the present disclosure may be practiced in forms different from the disclosed embodiments without changing its technical idea or essential features. The disclosed embodiments are illustrative and should not be construed as limiting.

100: device
110: memory
120: communication module
130: display
140: input module
150: processor

Claims

A method, performed by a device, for identifying a target organ based on semi-supervised learning, the method comprising:
performing preprocessing on a first labeled dataset and a first unlabeled dataset for image data including the target organ;
obtaining, based on the preprocessed datasets, a second labeled dataset and a second unlabeled dataset for a region of the image data where the target organ is located; and
inputting the second labeled dataset and the second unlabeled dataset into a first AI model and a second AI model to obtain an identification result of the target organ,
wherein each of the first AI model and the second AI model includes a regression layer that performs a regression task and a segmentation layer that performs a segmentation task,
wherein the second AI model uses, as its weights, an exponential moving average (EMA) value of the weights used in the first AI model, and
wherein a loss function for obtaining the identification result of the target organ is constructed based on multi-scale data output through the regression layer and the segmentation layer of each of the first AI model and the second AI model.

The method of claim 1,
wherein performing the preprocessing comprises
performing intensity normalization, spacing normalization, and data augmentation on the first labeled dataset and the first unlabeled dataset.

The method of claim 2,
wherein the intensity normalization comprises
an operation of cropping the intensity values of the abdominal images included in the first labeled dataset and the first unlabeled dataset by a predefined value, removing regions unrelated to the target organ, and then performing normalization.

The method of claim 1,
wherein obtaining the second labeled dataset and the second unlabeled dataset comprises
obtaining the second labeled dataset based on a result of comparing the preprocessed first labeled dataset with ground truth data for the target organ.

The method of claim 4,
wherein obtaining the second labeled dataset and the second unlabeled dataset comprises
obtaining the second unlabeled dataset using a pseudo label based on the first unlabeled dataset.

The method of claim 1,
wherein the loss function is
based on uncertainty data for the boundary of the target organ, the uncertainty data being output based on a confidence map, which is based on first multi-scale probability data output through the segmentation layer of the first AI model, and on second multi-scale probability data output through the segmentation layer of the second AI model.

The method of claim 6,
wherein the loss function is
based on supervised learning error data obtained from the difference between the first multi-scale probability data and ground truth data.

The method of claim 7,
wherein the loss function is
based on inter-model error data obtained from the difference between a first multi-scale distance map output through the regression layer of the first AI model and a second multi-scale distance map output through the regression layer of the second AI model.

The method of claim 8,
wherein the loss function is
based on inter-layer error data obtained from the difference between the first multi-scale probability data and the second multi-scale probability data.

A device for identifying a target organ based on semi-supervised learning, the device comprising:
one or more memories; and
one or more processors,
wherein the one or more processors are configured to:
perform preprocessing on a first labeled dataset and a first unlabeled dataset for image data including the target organ;
obtain, based on the preprocessed datasets, a second labeled dataset and a second unlabeled dataset for a region of the image data where the target organ is located; and
input the second labeled dataset and the second unlabeled dataset into a first AI model and a second AI model to obtain an identification result of the target organ,
wherein each of the first AI model and the second AI model includes a regression layer that performs a regression task and a segmentation layer that performs a segmentation task,
wherein the second AI model uses, as its weights, an exponential moving average (EMA) value of the weights used in the first AI model, and
wherein a loss function for obtaining the identification result of the target organ is constructed based on multi-scale data output through the regression layer and the segmentation layer of each of the first AI model and the second AI model.

The device of claim 10,
wherein the one or more processors are configured to
perform intensity normalization, spacing normalization, and data augmentation on the first labeled dataset and the first unlabeled dataset.

The device of claim 11,
wherein the intensity normalization comprises
an operation of cropping the intensity values of the abdominal images included in the first labeled dataset and the first unlabeled dataset by a predefined value, removing regions unrelated to the target organ, and then performing normalization.

The device of claim 10,
wherein the one or more processors are configured to
obtain the second labeled dataset based on a result of comparing the preprocessed first labeled dataset with ground truth data for the target organ.

The device of claim 13,
wherein the one or more processors are configured to
obtain the second unlabeled dataset using a pseudo label based on the first unlabeled dataset.

The device of claim 10,
wherein the loss function is
based on uncertainty data for the boundary of the target organ, the uncertainty data being output based on a confidence map, which is based on first multi-scale probability data output through the segmentation layer of the first AI model, and on second multi-scale probability data output through the segmentation layer of the second AI model.

The device of claim 15,
wherein the loss function is
based on supervised learning error data obtained from the difference between the first multi-scale probability data and ground truth data.

The device of claim 16,
wherein the loss function is
based on inter-model error data obtained from the difference between a first multi-scale distance map output through the regression layer of the first AI model and a second multi-scale distance map output through the regression layer of the second AI model.

The device of claim 17,
wherein the loss function is
based on inter-layer error data obtained from the difference between the first multi-scale probability data and the second multi-scale probability data.