KR20230139257A

KR20230139257A - Device and method for classifying and segmenting CT image based on machine learning model

Info

Publication number: KR20230139257A
Application number: KR1020220037668A
Authority: KR
Inventors: 김남국; 경성구; 신기원; 홍길선
Original assignee: 재단법인 아산사회복지재단; 울산대학교 산학협력단
Priority date: 2022-03-25
Filing date: 2022-03-25
Publication date: 2023-10-05

Abstract

An electronic device according to an embodiment may include a processor that: extracts a feature map by applying an encoder of a machine learning model to a training input image; obtains a first possibility score that a lesion appears in the training input image by applying a general-purpose classification module of the machine learning model to the feature map; extracts a lesion region from the training input image by applying a general-purpose segmentation module of the machine learning model to the feature map; calculates, based on the obtained first possibility score and the extracted lesion region, an objective function value having a consistency loss that represents the consistency between the outputs of the general-purpose classification module and the general-purpose segmentation module; and updates parameters of the machine learning model using the calculated objective function value.

Description

Device and method for classifying and segmenting CT images based on a machine learning model {DEVICE AND METHOD FOR CLASSIFYING AND SEGMENTING CT IMAGE BASED ON MACHINE LEARNING MODEL}

Hereinafter, a technology for classifying medical images based on a machine learning model is disclosed.

There may be studies on machine learning models for a single task, either classification (CLS) or segmentation (SEG), of intracranial hemorrhage (ICH).

Patient-level ICH classification and segmentation tasks may be of two types.

First, there may be 3D-based models for ICH classification and segmentation that address the volumetric problem. A variable 3D U-Net structure may be used for ICH segmentation; for example, the architecture and preprocessing of the model may be selected based on a statistical analysis of the training CT scans. As another example, a 3D version of an 18-layer ResNet may be used for the ICH classification task. In 3D-based methods, a patch-based approach and 3D convolutional layers of small depth are generally used because of limited GPU resources. However, patch-based approaches can often cause false positives and over-sensitivity.

Second, there may be 2D slice-level models for the ICH classification and segmentation tasks. It may be argued that a 2D-based approach, although weak at capturing 3D volumetric information, is advantageous for anisotropic CT data with thick slices. A 2D slice-wise U-Net may be used, and the slice segmentation results may be accumulated for the patient-level ICH segmentation task. In addition, after the results of a 2D-based CNN are accumulated, the accumulated features may be passed to a bidirectional LSTM to extend the model to the patient-level ICH classification task. It may be argued that, when transfer learning is performed for this extension, a three-stage end-to-end learning process has to be carried out.

Multi-task learning may be used to solve the ICH classification task. For example, the classification result may be converted into information about the amount of bleeding, and this information may be connected to a multi-layer perceptron (MLP) layer to improve the performance of the classification task. As another example, a Dilated Residual Network may be adopted as the backbone to preserve resolution, and multiple tasks may be performed by branching at the last layer. As another example, five consecutive CT slices may be used as input; to pass them to Mask R-CNN, the 3D features obtained by a 3D convolution operator may be projected into 2D features. Because this must be repeated for every slice to obtain the final 3D result, excessive and redundant 3D convolution operations may occur. As another example, multi-task learning may be performed to increase classification performance: an auxiliary segmentation task may be added after each DenseNet block, and a bidirectional LSTM layer may be used to combine the spatial dependencies between slices. As another example, patient-level multi-tasking may be performed using ConvLSTM at the bottleneck of a 3D U-Net [24] structure, with three Hounsfield unit (HU) ranges used as three channels. However, the demand for GPU resources may increase exponentially as the number of CT slices increases. As another example, semi-supervised multi-task learning may be performed to improve segmentation performance, in which the decoder for the segmentation task is trained with supervision and the decoder for the reconstruction task is trained without supervision. Multi-task learning of this kind may not have been applied to ICH tasks, but it can be used to effectively address medical problems such as COVID-19.
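The idea of feeding three Hounsfield unit (HU) ranges as three input channels can be sketched as follows. This is only an illustration: the source does not state which HU ranges are used, so the window bounds in `WINDOWS` and the function names are hypothetical.

```python
def window_hu(hu_value, low, high):
    """Clip one HU value to [low, high] and rescale it to [0, 1]."""
    clipped = min(max(hu_value, low), high)
    return (clipped - low) / (high - low)


# Hypothetical (low, high) HU windows; illustrative values, not from the source.
WINDOWS = [(0, 80), (-20, 180), (-150, 230)]


def slice_to_three_channels(hu_slice):
    """Convert one 2D slice (list of rows of HU values) into three windowed channels."""
    return [
        [[window_hu(value, low, high) for value in row] for row in hu_slice]
        for (low, high) in WINDOWS
    ]


channels = slice_to_three_channels([[-1000, 40], [80, 3000]])
print(len(channels))  # one channel per window
```

Each channel emphasizes a different intensity range of the same slice, so a network with a three-channel input can see several contrast settings at once.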

Finally, there may be models closely related to the transfer learning approach.

Transfer learning is often used for ICH tasks. ImageNet pre-trained models are popular, easy-to-apply transfer learning models and can be useful for capturing low-level features of objects (e.g., edges and textures). However, because ImageNet pre-trained models handle three RGB channels, they may be inappropriate in the medical domain, where a single channel is mainly used. There is also the inconvenience that volumetric CT data can only be processed slice by slice. As an example, a framework may be introduced that effectively generates suitable pre-trained weights for 3D and 2D medical domains using representation learning through transfer learning. This framework may include two steps: the model first learns a representation of the CT itself by restoring a distorted medical image to its original, and the segmentation task is then improved through transfer learning. It may be argued that this approach helps when solving medical problems with relatively small single-channel data sets.

Intracranial hemorrhage (ICH) can be a fatal disease that requires urgent treatment even when it is mild. If a triage system is automated by applying computer-aided diagnosis (CAD) to non-contrast head CT (NCCT), the workload of a busy emergency room can be reduced, and the survival rate of patients with traumatic ICH may therefore increase. Accordingly, a model may be required that has both high sensitivity, to detect ICH patients accurately, and high specificity, to reduce unnecessary workload in emergencies. Although several well-performing studies have been carried out, still higher performance may be required, and the unstable performance on external data sets and on data sets with real-world incidence may need to be improved.

Depending on anatomy, intracranial hemorrhage (ICH) can be classified into one of five subtypes: cerebral parenchymal hemorrhage (CPH), intraventricular hemorrhage (IVH), epidural hemorrhage (EDH), subdural hemorrhage (SDH), and subarachnoid hemorrhage (SAH). Every type of ICH, even when mild, can suddenly become catastrophic and may therefore require urgent treatment. A classification system using deep learning-based computer-aided diagnosis (CAD) may be proposed to address preventable deaths caused by treatment delays and assessment errors in emergencies. A radiologist can quickly identify patients with hemorrhage from the classification (CLS) task and accurately recognize the location and amount of hemorrhage from the segmentation (SEG) task. With a deep learning model that performs both the classification task and the segmentation task in an emergency, patients with cerebral hemorrhage can be prioritized quickly. The model may therefore be required to have both high sensitivity for detecting ICH patients and high specificity for normal patients, so that appropriate treatment can be allocated to more severely ill patients. Models from recent deep learning-based ICH studies can perform the classification and segmentation tasks automatically.
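For reference, sensitivity and specificity for a binary normal-versus-ICH classifier can be computed as in the sketch below. The label convention (1 for ICH, 0 for normal) and the function names are assumptions made for illustration. Balanced accuracy, which this description uses elsewhere as a performance measure, is included as the mean of the two.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = ICH, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one ICH patient and one normal patient")
    return tp / (tp + fn), tn / (tn + fp)


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity."""
    sensitivity, specificity = sensitivity_specificity(y_true, y_pred)
    return (sensitivity + specificity) / 2
```

Because balanced accuracy weights both classes equally, it is less affected than plain accuracy by a data set in which normal patients greatly outnumber ICH patients.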

However, problems may still remain when such models are applied to real emergency situations.

Heterogeneity: Because the texture, intensity, size, and position of hemorrhages on non-contrast head CT (NCCT) vary widely, detection can be difficult even for expert radiologists. In particular, the CT intensity of a cerebral hemorrhage may decrease over time.

Class imbalance: The five types of ICH can occur in combination at several intracranial locations and with varying severity. Because ICH is unevenly distributed and may in practice occur rarely, collecting a balanced data set can be difficult.

Robustness to external data: Because most existing ICH studies did not consider robustness in their models, robustness to external data may be difficult to measure. Existing models are validated on only a few external data sets, so they may produce unreliable results when applied to external data or to a real clinical environment with a high proportion of normal patients. For example, it may be reported that classification performance on an external validation data set can be unexpectedly lower than in previous reports because of the type and amount of ICH in each data set.

Patient-level tasks: ICH is diagnosed mainly using CT. Processing a 3D volumetric CT scan may require a considerable amount of GPU resources. To work around this problem, existing studies mainly use slice-based and patch-based methods. Patch-based methods can show excellent sensitivity but may produce many false positives. Slice-based methods, on the other hand, can save GPU resources by stacking features. However, slice-based methods must eventually be extended to the patient level, and their training can be unstable and complicated during the extension to 3D CT scans.

Multi-task learning (MTL) can improve the generalization performance of all tasks by exploiting the semantic information contained in several related tasks. Because a model trained by multi-task learning has to capture features that satisfy all tasks, the risk of overfitting can be greatly reduced and the tasks can benefit from one another. However, although overall average performance may improve in a multi-task setting, the performance of the multi-task model on some specific tasks may be worse than that of a single-task model. For example, one task may dominate the entire multi-task learning, which degrades performance on the other tasks and causes negative transfer; several issues may therefore exist, such as fine-tuning, model capacity, and the covariance between tasks.

Transfer learning refers to reusing a model pre-trained on a predefined task by applying it to a target task. Transfer learning is frequently used in computer vision tasks. In particular, the ImageNet pre-training approach, which uses a large natural-image data set of 1000 classes, can be applied to a variety of tasks. Training with transfer learning can also converge faster and perform better than training without it. Pre-trained models are therefore often used as effective feature extractors on relatively small data sets (e.g., data sets in the medical domain). Transfer learning can vary depending on the predefined task. In general, it can be effective to first learn a representation, either by extracting features from the input data or by restoring the original image from a distorted or compressed image as in Autoencoder and Model Genesis (an approach called representation learning), and then to perform the target task.

To solve the patient-level ICH detection problem, strong augmentation methods such as image rescaling, intensity permutation, and noise addition may be used to address the heterogeneity of ICH.

To address the class imbalance of ICH, all ICH subtypes may be binarized so that normal-versus-ICH patient classification is performed. Although this discards subtype information, it may suit the purpose of a triage system.
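The binarization of ICH labels can be sketched as follows. The label format (a dictionary of per-subtype flags) and the function name are assumptions for illustration; the source does not specify how labels are stored.

```python
# The five ICH subtypes named in this description.
SUBTYPES = ("CPH", "IVH", "EDH", "SDH", "SAH")


def binarize_ich_label(subtype_flags):
    """Return 1 (ICH) if any subtype flag is set, otherwise 0 (normal).

    `subtype_flags` is a hypothetical dict such as {"SDH": 1, "SAH": 0};
    missing subtypes are treated as absent.
    """
    return int(any(subtype_flags.get(name, 0) for name in SUBTYPES))


print(binarize_ich_label({"SDH": 1, "SAH": 1}))  # patient with two subtypes
print(binarize_ich_label({}))                    # patient with no hemorrhage
```

Collapsing the five subtypes into one positive class removes the imbalance between subtypes, although the imbalance between normal and ICH patients remains.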

For robustness to external data, a machine learning model according to an embodiment, which uses multi-task transfer learning, may be proposed for patient-level classification and segmentation of ICH. The machine learning model according to an embodiment may share a 2D-based encoder and perform three tasks: classification (CLS), segmentation (SEG), and reconstruction (REC). In addition, training of the machine learning model according to an embodiment may use a consistency (CON) loss, which represents the consistency between the outputs of the classification and segmentation tasks, to update the parameters of the machine learning model. Through multi-task learning, the encoder can be trained to capture globalized semantic features covering all three tasks, which can increase balanced accuracy and robustness to external data sets. Finally, for the patient-level tasks, the features accumulated by the encoder may be passed to an LSTM-based network serving as a 3D classifier, or to a Conv3D-based network serving as a 3D segmenter.

However, the technical problems are not limited to those described above, and other technical problems may exist.

A method performed by a processor according to an embodiment may include: extracting a feature map by applying an encoder of a machine learning model to a training input image; obtaining a first possibility score that a lesion appears in the training input image by applying a general-purpose classification module of the machine learning model to the feature map; extracting a lesion region from the training input image by applying a general-purpose segmentation module of the machine learning model to the feature map; calculating, based on the obtained first possibility score and the extracted lesion region, an objective function value having a consistency loss that represents the consistency between the outputs of the general-purpose classification module and the general-purpose segmentation module; and updating parameters of the machine learning model using the calculated objective function value.

Calculating the objective function value may include calculating the objective function value so that it further has a classification loss for the general-purpose classification module and a segmentation loss for the general-purpose segmentation module.
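One simple way to combine the individual loss terms into a single objective function value is a weighted sum, sketched below. The weighted-sum form, the default weights, and the function name are assumptions for illustration; the source states only that the objective function value has these loss terms, not how they are combined.

```python
def total_objective(cls_loss, seg_loss, con_loss, rec_loss=0.0,
                    weights=(1.0, 1.0, 1.0, 1.0)):
    """Combine classification, segmentation, consistency, and (optionally)
    reconstruction losses into one scalar.

    The weights are hypothetical hyperparameters, not values from the source.
    """
    w_cls, w_seg, w_con, w_rec = weights
    return (w_cls * cls_loss + w_seg * seg_loss
            + w_con * con_loss + w_rec * rec_loss)


# Example with placeholder loss values for one training batch.
print(total_objective(0.5, 0.25, 0.125))
```

Adjusting the weights is one possible way to keep a single task from dominating training, which is the risk of multi-task learning noted earlier in this description.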

Calculating the objective function value may include: obtaining, using the extracted lesion region, a second possibility score that a lesion appears in the training input image; and calculating the consistency loss based on the difference between the first possibility score and the second possibility score.

Obtaining the second possibility score may include obtaining the second possibility score by applying at least one of an average pooling layer and a max pooling layer to the extracted lesion region.
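Deriving the second possibility score from the segmentation output by pooling can be sketched as follows. The sketch assumes that the extracted lesion region is available as a per-pixel probability map with values in [0, 1] and that the pooling is global (over the whole map); both are assumptions, as are the function names.

```python
def global_max_pool(prob_map):
    """Largest per-pixel lesion probability in the map."""
    return max(max(row) for row in prob_map)


def global_avg_pool(prob_map):
    """Mean per-pixel lesion probability over the map."""
    values = [value for row in prob_map for value in row]
    return sum(values) / len(values)


def second_possibility_score(prob_map, mode="max"):
    """Reduce a segmentation probability map to one image-level score."""
    if mode == "max":
        return global_max_pool(prob_map)
    if mode == "avg":
        return global_avg_pool(prob_map)
    raise ValueError("mode must be 'max' or 'avg'")
```

Max pooling responds to any confidently segmented pixel, even in a small lesion, whereas average pooling scales with the segmented area; which of the two suits a given data set is a design choice the source leaves open.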

Calculating the consistency loss may include: calculating the square of the difference between the first possibility score and the second possibility score; and calculating the consistency loss by averaging the squared differences between the first and second possibility scores calculated for each of the images included in a batch.
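The consistency loss described here is the mean, over the images in a batch, of the squared difference between the two possibility scores. A minimal sketch, with hypothetical function and argument names:

```python
def consistency_loss(first_scores, second_scores):
    """Mean squared difference between classification-based (first) and
    segmentation-based (second) possibility scores over a batch."""
    if not first_scores or len(first_scores) != len(second_scores):
        raise ValueError("both score lists must be non-empty and the same length")
    squared = [(p1 - p2) ** 2 for p1, p2 in zip(first_scores, second_scores)]
    return sum(squared) / len(squared)


# Batch of two images: the two heads disagree on the first and agree on the second.
print(consistency_loss([1.0, 0.0], [0.5, 0.0]))
```

The loss is zero only when the classification module and the segmentation module assign the same image-level score to every image in the batch.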

The method according to an embodiment may further include obtaining a reconstruction image from the feature map by applying a reconstruction module of the machine learning model to the feature map, and calculating the objective function value may include calculating a reconstruction loss for the reconstruction module based on the training input image and the reconstruction image.

Calculating the reconstruction loss may include calculating the mean absolute error of the intensity differences between each pixel of the training input image and the corresponding pixel of the reconstruction image.
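The reconstruction loss is a per-pixel mean absolute error between the training input image and the reconstruction image. A minimal sketch, assuming both images are given as equally sized 2D lists of intensities (the representation and the function name are illustrative):

```python
def reconstruction_loss(input_image, reconstruction_image):
    """Mean absolute intensity difference between corresponding pixels."""
    diffs = [
        abs(a - b)
        for row_in, row_rec in zip(input_image, reconstruction_image)
        for a, b in zip(row_in, row_rec)
    ]
    if not diffs:
        raise ValueError("images must contain at least one pixel")
    return sum(diffs) / len(diffs)


original = [[0.0, 1.0], [0.5, 0.5]]
restored = [[0.0, 0.5], [0.5, 1.0]]
print(reconstruction_loss(original, restored))
```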

The method according to an embodiment may include: generating a classification model that includes the encoder of the machine learning model and a patient-level classification module connected to the encoder; obtaining a plurality of feature maps by applying the encoder of the classification model to a plurality of images acquired from a CT scan of a target patient in a training data set; obtaining a possibility score that a lesion appears in at least one of the plurality of images by applying the patient-level classification module to the obtained feature maps; and updating parameters of the classification model based on an objective function value calculated using the possibility score for the plurality of images and the ground-truth class mapped to the plurality of images in the training data set.

Obtaining the plurality of feature maps may include repeatedly obtaining a feature map by applying the encoder of the classification model to each of the plurality of images.

Updating the parameters of the classification model may include: updating the parameters of the patient-level classification module while keeping the parameters of the encoder of the classification model fixed in at least some of the training iterations; and, in other training iterations, releasing the encoder parameters and updating the parameters of both the encoder of the classification model and the patient-level classification module.
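The two-phase update can be sketched as a schedule that decides which parameter groups are trainable at each iteration. The sketch assumes that the frozen-encoder iterations come first and last for `frozen_iterations` steps; the source says only that the encoder is fixed in some iterations and released in others, so the ordering, the threshold, and all names here are hypothetical.

```python
def trainable_groups(iteration, frozen_iterations):
    """Return the names of the parameter groups updated at this iteration.

    While iteration < frozen_iterations, only the patient-level module is
    updated and the encoder stays fixed; afterwards both are updated.
    """
    if iteration < frozen_iterations:
        return {"patient_level_module"}
    return {"encoder", "patient_level_module"}


# Example schedule: encoder frozen for the first 2 of 4 iterations.
for step in range(4):
    print(step, sorted(trainable_groups(step, frozen_iterations=2)))
```

In a deep learning framework, the returned set would typically control which parameters are marked as requiring gradients before each optimizer step.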

The method according to an embodiment may further include: generating a segmentation model that includes the encoder of the machine learning model, a decoder connected to the encoder, and a patient-level segmentation module; obtaining a plurality of feature maps by applying the encoder and the decoder to a plurality of images acquired from a CT scan of a target patient in a training data set; extracting a lesion region from the plurality of images by applying the patient-level segmentation module to the obtained feature maps; and updating parameters of the segmentation model based on an objective function value calculated using the extracted lesion region and the ground-truth lesion region mapped to the plurality of images in the training data set.

Obtaining the plurality of feature maps may include repeatedly obtaining a feature map by applying the encoder and the decoder of the segmentation model to each of the plurality of images.

Updating the parameters of the segmentation model may include: updating the parameters of the patient-level segmentation module while keeping the parameters of the encoder of the segmentation model fixed in at least some of the training iterations; and, in other training iterations, releasing the encoder parameters and updating the parameters of both the encoder of the segmentation model and the patient-level segmentation module.

The processor may: generate a classification model that includes the encoder of the machine learning model and a patient-level classification module connected to the encoder; obtain a plurality of feature maps by applying the encoder of the classification model to a plurality of images acquired from a CT scan of a target patient in a training data set; obtain a possibility score that a lesion appears in at least one of the plurality of images by applying the patient-level classification module to the obtained feature maps; and update parameters of the classification model based on an objective function value calculated using the possibility score for the plurality of images and the ground-truth class mapped to the plurality of images in the training data set.

The processor may: generate a segmentation model that includes the encoder of the machine learning model, a decoder connected to the encoder, and a patient-level segmentation module; obtain a plurality of feature maps by applying the encoder and the decoder to a plurality of images acquired from a CT scan of a target patient in a training data set; extract a lesion region from the plurality of images by applying the patient-level segmentation module to the obtained feature maps; and update parameters of the segmentation model based on an objective function value calculated using the extracted lesion region and the ground-truth lesion region mapped to the plurality of images in the training data set.

일 실시예에 따른 기계 학습 모델의 인코더는 3개의 다중 태스크를 모두 충족하는 전역화된 피처를 캡처하도록 학습될 수 있다. 인코더는 균형 정확도(balanced accuracy)의 측면에서 개선될 수 있고, 외부 데이터 세트(external data set)에서 강력할 수 있다. The encoder of the machine learning model according to one embodiment may be trained to capture globalized features that satisfy all three multiple tasks. Encoders can be improved in terms of balanced accuracy and can be robust on external data sets.

일 실시예에 따른 기계 학습 모델의 인코더는, 다중 태스크를 통해 학습되어 연속적이거나 체적 피처를 이용하는 3D 오퍼레이터(operator)에 대한 전이 학습을 통해 환자 레벨의 태스크(예: 분류, 분할)로 확장될 수 있다. The encoder of the machine learning model according to one embodiment is learned through multiple tasks and can be extended to patient-level tasks (e.g., classification, segmentation) through transfer learning for a 3D operator that uses continuous or volumetric features. there is.

일 실시예에 따른 기계 학습 모델은, 실제 임상 환경을 반영한 2개의 외부 데이터 세트 및 ICH의 감지하기 어려운 사례가 있는 공개 데이터 세트를 통해 비교 실시예에 따른 기계 학습 모델보다 우수한 성능을 가질 수 있다.The machine learning model according to one embodiment may have better performance than the machine learning model according to the comparative example through two external data sets reflecting actual clinical environments and a public data set with difficult-to-detect cases of ICH.

Figure 1 shows a patient-level classification model and a patient-level segmentation model according to an embodiment.
Figure 2 shows pre-training of an encoder through multi-task learning according to an embodiment.
Figure 3 shows training of a patient-level classification model according to an embodiment.
Figure 4 shows training of a patient-level segmentation model according to an embodiment.
Figure 5 may show the frequency by amount of ICH in the internal data set and the external data sets according to an embodiment.
Figure 6 may show the volume and intensity of ICH in the internal data set and the external data sets according to an embodiment.
Figure 7 may show activation maps of encoders trained using multi-task models according to an embodiment and a comparative example.
Figure 8 shows a performance comparison between a patient-level classification model according to an embodiment and a classification model according to a comparative example.
Figure 9 may show the differences and robustness between a patient-level classification model according to an embodiment and classification models according to comparative examples.
Figure 10 shows results of segmentation models according to an embodiment and comparative examples.
Figure 11 shows ICH volumes of segmentation models according to an embodiment and comparative examples.

Specific structural or functional descriptions of the embodiments are disclosed for illustrative purposes only and may be changed and implemented in various forms. Accordingly, the actual implementation is not limited to the specific embodiments disclosed, and the scope of the present specification includes changes, equivalents, or substitutes that fall within the technical idea described in the embodiments.

Terms such as first or second may be used to describe various components, but these terms should be interpreted only for the purpose of distinguishing one component from another. For example, a first component may be named a second component, and similarly, a second component may be named a first component.

When a component is referred to as being "connected" to another component, it should be understood that it may be directly connected or coupled to the other component, but that other components may also exist in between.

Singular expressions include plural expressions unless the context clearly dictates otherwise. In this specification, terms such as "comprise" or "have" are intended to designate the presence of the described features, numbers, steps, operations, components, parts, or combinations thereof, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art and, unless clearly defined in this specification, are not to be interpreted in an idealized or overly formal sense.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings. In the description with reference to the accompanying drawings, identical components are assigned the same reference numerals regardless of the figure, and duplicate descriptions thereof are omitted.

Transfer learning may be used to train a machine learning model. Transfer learning may refer to a technique of training a second machine learning model by using at least a portion of a trained first machine learning model as at least a portion of the second machine learning model. The first machine learning model may represent a model for performing a pretext task, and the second machine learning model may represent a model for performing a target task. The pretext task (also referred to as an upstream task) may represent the task performed by the first machine learning model that is used in training the second machine learning model. The target task (also referred to as a downstream task) is the task that is ultimately to be performed and may represent the task performed by the second machine learning model.

The second machine learning model may be difficult to train from an initial state because of various problems, such as limited training data and/or limited training speed.

Instead of training the second machine learning model from an initial state, the first machine learning model may be trained first, and the second machine learning model may then be trained using at least a portion of the trained first machine learning model as at least a portion of the second machine learning model.

The first machine learning model for the pretext task may be trained. Training of the first machine learning model may be performed as pre-training of the second machine learning model.

The initial model for training the second machine learning model may include at least a portion of the trained first machine learning model. For example, the first machine learning model and the second machine learning model may share the structure of an encoder. The encoder of the trained first machine learning model may be used as the encoder of the initial model of the second machine learning model.

The initial model of the second machine learning model may be trained. Training the second machine learning model using at least a portion of the trained first machine learning model as the initial model may include fine-tuning at least the portion of the second machine learning model that was obtained from the first machine learning model.
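The encoder hand-off described above can be illustrated with a minimal sketch. The dictionary layout, the parameter counts, and the names `init_model`, `pretext_head`, and `target_head` are hypothetical and chosen only for illustration; the training steps themselves are omitted.

```python
import copy
import random


def init_model(head_name, encoder=None, seed=0):
    """Build a toy model as a dict of parameter lists (hypothetical layout)."""
    rng = random.Random(seed)
    if encoder is not None:
        # Reuse the trained encoder weights as the initial encoder.
        encoder_params = copy.deepcopy(encoder)
    else:
        # No pre-trained encoder available: start from random weights.
        encoder_params = [rng.uniform(-1, 1) for _ in range(4)]
    # The task-specific head always starts from a fresh initialization.
    head_params = [rng.uniform(-1, 1) for _ in range(2)]
    return {"encoder": encoder_params, head_name: head_params}


# First model: assumed to have been trained on the pretext task.
first_model = init_model("pretext_head", seed=1)

# Second model: its initial encoder is a copy of the first model's encoder,
# so later fine-tuning does not modify the first model.
second_model = init_model("target_head", encoder=first_model["encoder"], seed=2)
```

Copying the weights rather than sharing the same object keeps the pre-trained model intact while the second model is fine-tuned.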

Hereinafter, a device and a method are described with reference to Figures 1 to 11 that pre-train the encoder of a multi-task model using slice-level multiple tasks (e.g., a classification task, a segmentation task, and a reconstruction task) as the pretext task, and that train a machine learning model (e.g., a patient-level classification model or a patient-level segmentation model) using a patient-level task (e.g., one of a patient-level task of classifying the presence or absence of a lesion and a patient-level task of segmenting a lesion area) as the target task.

Figure 1 shows a patient-level classification model and a patient-level segmentation model according to an embodiment.

The patient-level classification device 110 may represent a device that, in order to classify the presence or absence of a lesion at the patient level, outputs a likelihood score (possibility score) 114 that a target body part of a target patient contains a lesion, based on a plurality of images 112 (e.g., CT slices). The patient-level segmentation device 120 may represent a device that, in order to segment a lesion area at the patient level, segments a lesion area 124 within the target body part of the target patient based on a plurality of images 122 (e.g., CT slices).

The target patient may represent a patient for whom the presence of a lesion in a target body part is to be classified, or for whom a lesion area is to be segmented by extracting it from the target body part. The target body part may represent the body part imaged by CT and may include, for example, the brain. The lesion may represent a physical or physiological change caused by intracranial hemorrhage (ICH); for example, the lesion area may represent the site in the brain where the intracranial hemorrhage occurred.

The plurality of images may be acquired from a CT scan of the target patient. The plurality of images may represent CT slices acquired from a single CT scan. A CT slice may represent a cross-sectional image obtained by reconstructing, through an inverse Radon transform computation, X-ray images (e.g., a sinogram) captured from all directions as an X-ray generator and an X-ray detector rotate as a pair around the target patient. Each pixel of a CT slice may represent the attenuation rate of the portion corresponding to that pixel.

Hereinafter, the patient-level classification device 110 and the patient-level segmentation device 120 are each described.

The patient-level classification device 110 according to an embodiment may include a processor 111. The patient-level classification device 110 may obtain the likelihood score 114 by applying a patient-level classification model 113 (also referred to as a classification model) to the plurality of images 112 (e.g., CT slices).

The classification model 113 may represent a machine learning model for classifying the presence or absence of a lesion in the target patient. The classification model 113 may include an encoder 113a and a patient-level classification module 113b connected to the encoder 113a. As described later, the encoder 113a may be pre-trained through a multi-task model.

The processor 111 may obtain a plurality of feature maps by applying the encoder 113a to the plurality of images 112. The processor 111 may obtain one feature map by applying the encoder 113a to one image. The processor 111 may obtain the plurality of feature maps by repeating this step for each of the plurality of images 112.

The processor 111 may obtain the likelihood score 114 by applying the patient-level classification module 113b to the plurality of feature maps. The likelihood score 114 may indicate the likelihood that a lesion exists in at least a portion of the target body part of the target patient shown in the plurality of images. For example, the likelihood score 114 may indicate the likelihood that a lesion appears in at least one of the plurality of images.

The processor 111 may classify the presence or absence of a lesion based on the likelihood score 114. For example, if the likelihood score 114 exceeds a threshold score, the processor 111 may classify the target body part of the target patient as containing a lesion (e.g., ICH class). If the likelihood score 114 is less than the threshold score, the processor 111 may classify the target body part of the target patient as not containing a lesion (e.g., normal class). The threshold score may represent the score that serves as the criterion for classifying the presence or absence of a lesion.
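The threshold rule above can be sketched as follows. The default threshold of 0.5 and the handling of a score exactly equal to the threshold (treated here as the normal class) are assumptions; the description only covers scores above or below the threshold.

```python
def classify_patient(likelihood_score, threshold=0.5):
    """Map a patient-level likelihood score to a class label.

    Scores above the threshold are labeled "ICH"; all other scores,
    including a score equal to the threshold (an assumption), are
    labeled "normal".
    """
    return "ICH" if likelihood_score > threshold else "normal"
```

For instance, `classify_patient(0.9)` yields the ICH class and `classify_patient(0.1)` yields the normal class under the assumed threshold.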

The patient-level segmentation device 120 according to an embodiment may include a processor 121. The patient-level segmentation device 120 may obtain a lesion area by applying a patient-level segmentation model 123 (also referred to as a segmentation model) to the plurality of images 122 (e.g., CT slices). The patient-level segmentation task may be performed to localize a lesion (e.g., ICH) and to measure cerebral hemorrhage at the patient level.

The patient-level segmentation model 123 may represent a machine learning model for segmenting a lesion area within the body of the target patient. The patient-level segmentation model 123 may include an encoder 123a, a decoder 123b connected to the encoder 123a, and a patient-level segmentation module 123c. As described later, the encoder 123a may be pre-trained through a multi-task model.

The processor 121 may obtain a plurality of feature maps by applying the encoder 123a and the decoder 123b to the plurality of images 122. The processor 121 may obtain one feature map by applying the encoder 123a and the decoder 123b to one image. The processor 121 may obtain the plurality of feature maps by repeating this step for each of the plurality of images 122.

The processor 121 may obtain the lesion area 124 by applying the patient-level segmentation module 123c to the plurality of feature maps. The lesion area 124 may include one or more lesion areas shown in the plurality of images. The lesion area 124 may represent the accumulation of the lesion areas shown in each of the plurality of images 122.
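As an illustration of a lesion area accumulated over slices, the sketch below collects the lesion pixels of per-slice binary masks into a set of voxel coordinates and converts the count to a volume. It does not model the learned patient-level segmentation module; the mask representation, the function names, and the per-voxel volume are assumptions.

```python
def accumulate_lesion_area(slice_masks):
    """Collect lesion pixels from per-slice binary masks.

    slice_masks: list of 2D lists (one per CT slice) with 1 for lesion
    pixels and 0 otherwise. Returns a set of (slice, row, col) voxels.
    """
    voxels = set()
    for z, mask in enumerate(slice_masks):
        for y, row in enumerate(mask):
            for x, value in enumerate(row):
                if value:
                    voxels.add((z, y, x))
    return voxels


def lesion_volume_mm3(voxels, voxel_volume_mm3):
    """Estimate lesion volume from a voxel count (spacing is an assumption)."""
    return len(voxels) * voxel_volume_mm3
```

With two 2x2 masks containing one and two lesion pixels respectively, the accumulated area holds three voxels, and the volume scales linearly with the assumed voxel size.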

Figure 2 shows pre-training of an encoder through multi-task learning according to an embodiment.

An electronic device according to an embodiment may include a processor. The electronic device may pre-train the encoder of the machine learning model for the target task using the machine learning model for the pretext task. When the pretext task consists of multiple tasks (multi-task), the multi-task model may represent the machine learning model for performing the pretext task.

The multi-task model 200 may be used to pre-train the encoder of a classification model (e.g., the patient-level classification model 113 of Figure 1) and/or a segmentation model (e.g., the patient-level segmentation model 123 of Figure 1).

The multi-task model 200 according to an embodiment may include an encoder 210, a general-purpose classification module 220 connected to the encoder 210, a general-purpose segmentation module 230 connected to the encoder 210, and a reconstruction module 240 connected to the encoder 210.

The multi-task model 200 may have a hard parameter sharing architecture including one shared encoder and a plurality of task-specific modules. A hard parameter sharing architecture may represent an architecture in which a root model (e.g., an encoder) is shared and each task-specific representation is learned afterward. The encoder 210 may be trained to capture various ICH features. For example, the encoder may be implemented as ResNet-50. The structure of ResNet-50 is relatively lightweight, and ResNet-50 may have four stages that are easy to control with ResBlocks to perform operations.
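The hard parameter sharing layout described above, one shared encoder feeding several task-specific modules, can be sketched as follows. The class name `MultiTaskModel` and the toy encoder and heads are hypothetical stand-ins, not the ResNet-50 encoder or the modules of the embodiment.

```python
class MultiTaskModel:
    """One shared encoder followed by several task-specific heads."""

    def __init__(self, encoder, heads):
        self.encoder = encoder  # shared by every task
        self.heads = heads      # task name -> task-specific callable

    def forward(self, image):
        # The feature map is computed once and reused by every head.
        features = self.encoder(image)
        return {name: head(features) for name, head in self.heads.items()}


# Toy stand-ins that only demonstrate the data flow.
calls = {"encoder": 0}


def toy_encoder(image):
    calls["encoder"] += 1
    return [sum(row) for row in image]  # one feature per image row


model = MultiTaskModel(
    toy_encoder,
    {
        "classification": lambda f: 1.0 if max(f) > 4 else 0.0,
        "segmentation": lambda f: [v > 4 for v in f],
        "reconstruction": lambda f: list(f),
    },
)
outputs = model.forward([[0, 1], [2, 3]])
```

Because every head reads the same features, gradients from all tasks would update the same encoder parameters during training, which is the point of sharing.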

When applied to an input image, the multi-task model 200 may output a first likelihood score and a segmented area. The multi-task model 200 may further output a reconstructed image together with the first likelihood score and the segmented area.

The training data set of the multi-task model 200 may include a training input image 211, a ground-truth class, and a ground-truth lesion area. The ground-truth class and the ground-truth lesion area may be mapped to the training input image 211. The ground-truth class may represent a class indicating whether a lesion appears in the training input image 211. For example, a training input image 211 in which a lesion appears may be mapped to a lesion class (e.g., ICH class), and a training input image 211 in which no lesion appears may be mapped to a normal class. The ground-truth lesion area may represent the area of the training input image 211 that corresponds to the actual lesion.

The processor of the electronic device may output at least one of a first likelihood score 221, a segmented area 231, and a reconstructed image 241 by applying the multi-task model 200 to the training input image 211. As described later, the ground-truth class and the ground-truth lesion area may be used to calculate the objective function value of the multi-task model 200.

The processor of the electronic device may extract a feature map 212 by applying the encoder 210 to the training input image 211. The feature map 212 contains features extracted from the training input image 211 and may represent the result obtained by performing convolution on the training input image 211.

The processor of the electronic device may obtain the first likelihood score 221 by applying the general-purpose classification module 220 to the feature map 212. The general-purpose classification module 220 is a module for the slice-level classification task and may represent a module that outputs the first likelihood score 221 from the feature map 212. The first likelihood score 221 may indicate the likelihood that a lesion appears in the training input image 211. For example, the first likelihood score 221 may be a real value greater than or equal to 0 and less than or equal to 1; a value closer to 0 indicates a lower likelihood that a lesion appears in the training input image 211, and a value closer to 1 indicates a higher likelihood.

The slice-level classification task may be performed to determine whether ICH is present in each CT slice. Through the slice-level classification task, the encoder may be trained to find key features. For example, the general-purpose classification module 220 may include a linear classifier (e.g., a linear block).
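A linear classifier that maps a feature map to a score between 0 and 1 can be sketched as below. The global average pooling step and the sigmoid squashing are assumptions added so that the output falls in the stated range; the description specifies only a linear classifier and the range of the score.

```python
import math


def slice_likelihood_score(feature_map, weights, bias):
    """Compute a slice-level likelihood score in [0, 1].

    feature_map: list of channels, each a 2D list of floats.
    weights, bias: parameters of a linear layer with one output.
    """
    # Global average pooling: one value per channel (an assumption).
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    logit = sum(w * p for w, p in zip(weights, pooled)) + bias
    # Sigmoid squashes the logit into the range of the score (an assumption).
    return 1.0 / (1.0 + math.exp(-logit))
```

A logit of zero corresponds to a score of 0.5, with positive logits moving the score toward 1 and negative logits toward 0.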

The processor of the electronic device may extract the lesion area 231 by applying the general-purpose segmentation module 230 to the feature map 212. The general-purpose segmentation module 230 is a module for the slice-level segmentation task and may represent a module that, when applied to the feature map 212, outputs the lesion area 231 extracted from the training input image 211. The lesion area 231 may represent the area of the training input image 211 that corresponds to a lesion.

The slice-level segmentation task may be performed to localize ICH and to measure cerebral hemorrhage in each CT slice. Through the slice-level segmentation task, the encoder may be trained to find local features. For example, the general-purpose segmentation module 230 may include a U-Net decoder with skip connections. The general-purpose segmentation module 230 may have a depth of five and may pass the last feature of each stage of ResNet-50 to the decoder through the skip connections. In addition, spatial and channel squeeze and excitation (SCSE) blocks may be attached at the beginning and end of each decoder block so that attention is focused on the features more effectively.

The processor of the electronic device may obtain the reconstructed image 241 by applying the reconstruction module 240 to the feature map 212. The reconstruction module 240 is a module for the slice-level reconstruction task and may output the reconstructed image 241 from the feature map 212. The reconstructed image 241 may represent an image generated from the feature map 212 so as to reconstruct the training input image 211.

The slice-level reconstruction task may be performed, for each CT slice, to reconstruct the original CT slice from the output compressed by the encoder. Through the slice-level reconstruction task, the encoder may learn a representation of ICH. For example, the reconstruction module 240 may include a PixelShuffle decoder. A PixelShuffle decoder may super-resolve low-resolution data into high-resolution data at a lower computational cost than a deconvolution layer. The reconstruction module 240 may have an autoencoder-like structure without skip connections in order to prevent trivial solutions. As a result, the encoder may extract more meaningful features.
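The rearrangement at the core of a PixelShuffle layer can be sketched in plain Python. The sketch shows only the channel-to-space reshuffle with upscale factor `r`, using the same index convention as the PyTorch `PixelShuffle` layer; the convolutions that would precede it in a decoder are omitted, and the nested-list tensor layout is an assumption for illustration.

```python
def pixel_shuffle(tensor, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r)."""
    c_in, h, w = len(tensor), len(tensor[0]), len(tensor[0][0])
    assert c_in % (r * r) == 0, "channel count must be divisible by r*r"
    c_out = c_in // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                # Each group of r*r input channels fills one r x r block
                # of sub-pixel positions in the output channel c.
                source = tensor[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = source[row][col]
    return out
```

Because the upsampling is a pure rearrangement of values already computed at low resolution, no extra multiplications are needed at the high resolution, which is one reason the operation is inexpensive.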

The processor of the electronic device may calculate the objective function of the multi-task model 200.

The objective function of the multi-task model 200 may include a classification loss, a segmentation loss, a reconstruction loss, and a consistency loss. For example, the objective function value of the multi-task model 200 may be calculated with the same weight for all four losses as follows:

Total Loss = SEG Loss + CLS Loss + CON Loss + REC Loss

Here, Total Loss represents the objective function value of the multi-task model 200, SEG Loss represents the segmentation loss, CLS Loss represents the classification loss, CON Loss represents the consistency loss, and REC Loss represents the reconstruction loss.
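The equal-weight combination of the four losses can be sketched as a weighted sum whose weights default to the same value. The function name and the default weight of 1.0 per term are assumptions; the description states only that all four losses carry the same weight.

```python
def total_loss(seg_loss, cls_loss, con_loss, rec_loss,
               weights=(1.0, 1.0, 1.0, 1.0)):
    """Combine the four loss terms; equal weights by default (assumed 1.0)."""
    terms = (seg_loss, cls_loss, con_loss, rec_loss)
    return sum(w * t for w, t in zip(weights, terms))
```

Keeping the weights as a parameter makes it easy to experiment with unequal weighting without changing the call sites.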

The processor of the electronic device may calculate the classification loss. The classification loss may relate to the performance of the encoder 210 and the general-purpose classification module 220. The processor of the electronic device may calculate the classification loss based on the first likelihood score 221 and the ground-truth class mapped to the training input image 211. For example, the classification loss may be calculated as a binary cross-entropy (BCE) loss.
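A binary cross-entropy loss for a single slice can be sketched as follows, with the ground-truth class encoded as 1 for the ICH class and 0 for the normal class. The encoding and the clamping constant `eps` are assumptions added for illustration and numerical safety.

```python
import math


def bce_loss(score, target, eps=1e-7):
    """Binary cross-entropy between a score in [0, 1] and a 0/1 target."""
    # Clamp the score so that log() never receives 0.
    p = min(max(score, eps), 1.0 - eps)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))
```

The loss shrinks as the score approaches the ground-truth class and grows sharply for confident wrong predictions.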

The processor of the electronic device may calculate the segmentation loss. The segmentation loss may relate to the performance of the encoder 210 and the general-purpose segmentation module 230. The processor of the electronic device may calculate the segmentation loss based on the lesion area 231 and the ground-truth lesion area. For example, the segmentation loss may be calculated as a one-to-one combination of a binary cross-entropy (BCE) loss and an overlap-based Dice coefficient (DICE) loss (e.g., the sum of the BCE loss and the Dice coefficient loss).
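The sum of a pixel-wise BCE loss and a Dice coefficient loss can be sketched as below, operating on flattened lists of predicted probabilities and 0/1 ground-truth labels. The smoothing constant in the Dice term and the clamping constant in the BCE term are assumptions commonly used to avoid division by zero and log(0).

```python
import math


def pixel_bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flattened pixels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(pred)


def dice_loss(pred, target, smooth=1.0):
    """One minus the (smoothed) Dice coefficient of prediction and target."""
    intersection = sum(p * t for p, t in zip(pred, target))
    dice = (2.0 * intersection + smooth) / (sum(pred) + sum(target) + smooth)
    return 1.0 - dice


def segmentation_loss(pred, target):
    """One-to-one combination: BCE loss plus Dice loss."""
    return pixel_bce(pred, target) + dice_loss(pred, target)
```

The BCE term penalizes each pixel independently, while the Dice term depends on overlap, which helps when lesion pixels are a small fraction of the slice.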

The processor of the electronic device may calculate the restoration loss. The restoration loss may be related to the performance of the encoder 210 and the restoration module 240. The processor of the electronic device may calculate the restoration loss based on the training input image 211 and the restored image 241. For example, the processor of the electronic device may calculate the mean absolute error of the intensity differences between pixels of the training input image 211 and the corresponding pixels of the restored image 241. For example, the restoration loss may be calculated as a mean absolute error loss (e.g., L1 loss).
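The mean absolute error over corresponding pixels can be sketched as follows. The function name `reconstruction_loss` and the requirement that both images share one shape are assumptions made for illustration.

```python
import numpy as np

def reconstruction_loss(image, restored):
    """Mean absolute error (L1) between corresponding pixel intensities."""
    image = np.asarray(image, dtype=np.float64)
    restored = np.asarray(restored, dtype=np.float64)
    if image.shape != restored.shape:
        raise ValueError("input image and restored image must have the same shape")
    return float(np.mean(np.abs(image - restored)))
```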

The processor of the electronic device may calculate the consistency loss 250. The consistency loss may indicate the consistency between the outputs of the general-purpose classification module 220 and the general-purpose segmentation module 230 (e.g., the first likelihood score 221 and the lesion area 231).

When the output of the general-purpose classification module 220 and the output of the general-purpose segmentation module 230 disagree on the presence or absence of a lesion, the consistency loss 250 may have a high value. For example, when the first likelihood score 221 has a value close to 1 and no lesion area 231 is extracted, the consistency loss 250 may have a high value. As another example, when the first likelihood score 221 has a value close to 0 and a lesion area 231 is extracted, the consistency loss 250 may have a high value.

When the output of the general-purpose classification module 220 and the output of the general-purpose segmentation module 230 agree on the presence or absence of a lesion, the consistency loss 250 may have a low value. For example, when the first likelihood score 221 has a value close to 1 and a lesion area 231 is extracted, the consistency loss 250 may have a low value. As another example, when the first likelihood score 221 has a value close to 0 and no lesion area 231 is extracted, the consistency loss 250 may have a low value.

The consistency loss 250 may be calculated so that the general-purpose classification module 220 and the general-purpose segmentation module 230 agree on each CT slice. Through the calculation of the consistency loss 250, the encoder 210 may learn features that are more common to the slice-level classification task and the slice-level segmentation task.

The processor of the electronic device may calculate the consistency loss 250 based on the first likelihood score 221 and a second likelihood score. The second likelihood score may indicate the likelihood that a lesion appears in the training input image 211.

The processor of the electronic device may obtain the second likelihood score using the extracted lesion area 231. The processor of the electronic device may obtain the second likelihood score by applying at least one of an average pooling layer and a max pooling layer to the extracted lesion area 231. For example, because the last feature of ResNet-50 has a size of 16x16, the processor of the electronic device may apply four average pooling layers and one max pooling layer to the extracted lesion area 231. By obtaining the second likelihood score, the processor of the electronic device may match the resolution of the output of the general-purpose segmentation module 230 to the resolution of the output of the general-purpose classification module 220.
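One way to read the pooling step is sketched below. It assumes a 2D per-pixel lesion probability map, non-overlapping 2x2 average pooling, and a final max pooling taken over the whole remaining grid; the kernel sizes, strides, and function names are assumptions, since the description only states the number of layers of each kind.

```python
import numpy as np

def avg_pool_2x2(x):
    """Non-overlapping 2x2 average pooling on an (H, W) map with even H and W."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def second_likelihood_score(prob_map, num_avg_pools=4):
    """Collapse a per-pixel lesion probability map into one score per slice."""
    x = np.asarray(prob_map, dtype=np.float64)
    for _ in range(num_avg_pools):
        x = avg_pool_2x2(x)
    # Assumed global max pooling over the remaining grid (32x32 for a 512x512 input).
    return float(x.max())
```

Under these assumptions, the average pooling suppresses isolated false-positive pixels before the max pooling selects the most lesion-like region, so the resulting scalar is comparable with the first likelihood score.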

For reference, both the first likelihood score and the second likelihood score may indicate the likelihood that a lesion appears in the training input image 211. However, the first likelihood score 221 may be a score obtained from the general-purpose classification module 220, and the second likelihood score may be a score obtained from the general-purpose segmentation module 230. Accordingly, when the difference between the first likelihood score 221 and the second likelihood score is large, the consistency between the outputs of the general-purpose classification module 220 and the general-purpose segmentation module 230 may be low. Conversely, when the difference between the first likelihood score 221 and the second likelihood score is small, the consistency between the outputs of the general-purpose classification module 220 and the general-purpose segmentation module 230 may be high.

The processor of the electronic device may calculate the consistency loss 250 based on the difference between the first likelihood score 221 and the second likelihood score. For example, the processor of the electronic device may calculate, for each image, the square of the difference between the first likelihood score 221 and the second likelihood score, and may average the squared differences. A batch may include one or more images. The processor of the electronic device may calculate the consistency loss by averaging the squared difference between the first likelihood score 221 and the second likelihood score over the batch.
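The batch-averaged squared difference can be sketched as follows, assuming one first likelihood score and one second likelihood score per image in the batch. The function name `consistency_loss` is an illustrative assumption.

```python
import numpy as np

def consistency_loss(first_scores, second_scores):
    """Mean over the batch of the squared difference between the two scores."""
    first = np.asarray(first_scores, dtype=np.float64)
    second = np.asarray(second_scores, dtype=np.float64)
    if first.shape != second.shape:
        raise ValueError("both score arrays must hold one value per image")
    return float(np.mean((first - second) ** 2))
```

For instance, a batch in which the classifier and the segmenter agree on every image yields a loss of 0, while complete disagreement (scores of 1 against 0 on every image) yields a loss of 1.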

For example, the consistency loss (CON Loss) may be calculated by the following equation.

Here, the two output terms of the equation may represent the output of the general-purpose classification module 220 and the output of the general-purpose segmentation module 230, respectively, N may represent the number of CT slices, and the pooling operator may represent the combination of four average pooling layers and one max pooling layer.
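A form of the consistency loss that agrees with the surrounding description (a squared difference averaged over N CT slices, with pooling applied to the segmentation output) may be written as below. The symbol names are assumptions chosen for illustration: \hat{y}^{cls}_{i} for the output of the general-purpose classification module 220 on the i-th slice, \hat{y}^{seg}_{i} for the output of the general-purpose segmentation module 230 on the i-th slice, and P for the combination of four average pooling layers and one max pooling layer.

```latex
\mathrm{CON\ Loss} = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}^{cls}_{i} - P\left( \hat{y}^{seg}_{i} \right) \right)^{2}
```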

FIG. 3 illustrates training of a patient-level classification model according to an embodiment.

As described above with reference to FIG. 1, the classification model 300 may include an encoder 310 and a patient-level classification module 320 connected to the encoder 310. The classification model 300 may output a likelihood score 322 when applied to a plurality of images 311.

The training data set of the classification model 300 may include the plurality of images 311 and a ground-truth class mapped to the plurality of images 311. As described later, the ground-truth class may be used to calculate the objective function value of the classification model 300.

The processor of the electronic device may generate the classification model 300. The classification model 300 may include the encoder 310 and the patient-level classification module 320 (e.g., the 3D Classifier of FIG. 3) connected to the encoder 310. The encoder 310 may be obtained based on the encoder of a trained multi-task model (e.g., the multi-task model 200 of FIG. 2). For example, the processor of the electronic device may pre-train the encoder 310 using the multi-task model. The processor of the electronic device may generate the classification model 300 including the pre-trained encoder of the multi-task model.

The processor of the electronic device may obtain a plurality of feature maps by applying the encoder 310 to the plurality of images 311. The processor of the electronic device may obtain one feature map by applying the encoder 310 to one image. The processor of the electronic device may obtain the plurality of feature maps by repeating this operation for each of the plurality of images 311.

One feature map may represent a 2D feature of a two-dimensional medical image (e.g., one CT slice), and a plurality of stacked feature maps may represent a 3D feature of a three-dimensional medical image (e.g., a plurality of CT slices). For example, the processor of the electronic device may extract a 3D feature by repeatedly stacking 2D features as many times as the number of CT slices. The extracted 3D features may be input to the patient-level classification module 320 to supplement volumetric information.
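The stacking of per-slice 2D features into a 3D feature can be sketched as follows. The (C, H, W) layout of each feature map and the choice of the leading axis as the slice (depth) axis are assumptions made for illustration.

```python
import numpy as np

def stack_slice_features(feature_maps):
    """Stack per-slice (C, H, W) feature maps into one (D, C, H, W) array.

    D is the number of CT slices; the slice order of the input list is kept.
    """
    if len(feature_maps) == 0:
        raise ValueError("at least one feature map is required")
    return np.stack(feature_maps, axis=0)
```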

The processor of the electronic device may obtain the likelihood score 322 by applying the patient-level classification module 320 to the plurality of feature maps. The likelihood score 322 may indicate the likelihood that a lesion appears in at least one of the plurality of images.

In the patient-level classification task, a lesion may appear in at least some consecutive images among the plurality of images. The patient-level classification module 320 may be implemented with a long short-term memory (LSTM) network so that it can capture sequential information about a lesion appearing in consecutive images and handle a variable number of CT slices. For example, the patient-level classification module 320 may have two linear layers and a bi-directional LSTM. The patient-level classification module 320 may improve ICH detection performance through the bi-directional LSTM.

The processor of the electronic device may calculate the objective function value of the classification model 300. The processor of the electronic device may calculate the objective function value based on the likelihood score 322 and the ground-truth class mapped to the plurality of images 311. For example, the processor of the electronic device may calculate the objective function value of the classification model 300 as a binary cross-entropy loss (BCE loss).

The processor of the electronic device may update the parameters of the classification model 300 based on the objective function value.

The processor of the electronic device may update the parameters of the part of the classification model 300 that is not pre-trained while keeping the parameters of the pre-trained part fixed (e.g., frozen). For example, in at least some of the training iterations, the processor of the electronic device may update the parameters of the patient-level classification module 320 while keeping the parameters of the encoder 310 of the classification model 300 fixed.

The processor of the electronic device may release the fixation of the parameters of the pre-trained part of the classification model 300 (e.g., unfreeze them) and update at least some of the parameters of the classification model 300. For example, in other training iterations, the processor of the electronic device may unfreeze the parameters of the encoder 310 of the classification model 300 and update the parameters of both the encoder 310 and the patient-level classification module 320.

The processor of the electronic device according to an embodiment may not fine-tune the encoder 310 immediately in the expansion process. Instead, the processor of the electronic device may perform the training of the classification model 300 and the fine-tuning of the encoder 310 in stages. For example, the processor of the electronic device may use a gradual unfreeze approach. The processor of the electronic device may train the patient-level classification model for 100 epochs while keeping the parameters of the encoder 310 fixed. After that, the processor of the electronic device may gradually unfreeze the parameters of the encoder 310. The processor of the electronic device may unfreeze parameters of the encoder 310 every 10 epochs.
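The gradual unfreeze schedule can be sketched as a function of the epoch index. The description gives the 100-epoch frozen phase and the 10-epoch interval; the split of the encoder into four layer groups, the 0-based epoch index, and the function name are assumptions made for illustration.

```python
def unfrozen_encoder_groups(epoch, num_groups=4, warmup_epochs=100, interval=10):
    """Return how many encoder layer groups are trainable at a 0-based epoch.

    The encoder stays fully frozen for `warmup_epochs`; afterwards one more
    group (an assumed granularity) is unfrozen every `interval` epochs.
    """
    if epoch < warmup_epochs:
        return 0
    steps = (epoch - warmup_epochs) // interval + 1
    return min(steps, num_groups)
```

A training loop could call this at the start of each epoch and enable gradient updates for that many groups, for example starting from the group closest to the encoder output.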

FIG. 4 illustrates training of a patient-level segmentation model according to an embodiment.

As described above with reference to FIG. 1, the segmentation model 400 may include an encoder 410, a decoder 420 connected to the encoder 410, and a patient-level segmentation module 430 (e.g., the 3D Segmentor of FIG. 4). The segmentation model 400 may output a lesion area 432 when applied to a plurality of images 411.

The training data set of the segmentation model 400 may include the plurality of images 411 and a ground-truth lesion area mapped to the plurality of images 411. As described later, the ground-truth lesion area may be used to calculate the objective function value of the segmentation model 400.

The processor of the electronic device may generate the segmentation model 400. The segmentation model 400 may include the encoder 410, the decoder 420 connected to the encoder 410, and the patient-level segmentation module 430. The encoder 410 may be obtained based on the encoder of a trained multi-task model (e.g., the multi-task model 200 of FIG. 2). For example, the processor of the electronic device may pre-train the encoder 410 using the multi-task model. The processor of the electronic device may generate the segmentation model 400 including the pre-trained encoder of the multi-task model.

The processor of the electronic device may obtain a plurality of feature maps by applying the encoder 410 and the decoder 420 to the plurality of images 411. The processor of the electronic device may obtain one feature map by applying the encoder 410 and the decoder 420 to one image. The processor of the electronic device may obtain the plurality of feature maps by repeating this operation for each of the plurality of images 411.

One feature map may represent a 2D feature of a two-dimensional medical image (e.g., one CT slice), and a plurality of stacked feature maps may represent a 3D feature of a three-dimensional medical image (e.g., a plurality of CT slices). For example, the processor of the electronic device may extract a 3D feature by repeatedly stacking 2D features as many times as the number of CT slices. The extracted 3D features may be input to the patient-level segmentation module 430 to supplement volumetric information.

The processor of the electronic device may extract the lesion area 432 by applying the patient-level segmentation module 430 to the plurality of feature maps. The lesion area 432 may represent an area corresponding to a lesion shown in the plurality of images 411. The segmentation module 430 may have a 3D convolution layer (Conv3D).

The processor of the electronic device may calculate the objective function value of the segmentation model 400. The processor of the electronic device may calculate the objective function value based on the lesion area 432 and the ground-truth lesion area mapped to the plurality of images 411. For example, the processor of the electronic device may calculate the objective function value of the segmentation model 400 as a combination of a binary cross-entropy loss (BCE loss) and a Dice-coefficient loss (DICE loss).

The processor of the electronic device may update the parameters of the segmentation model 400 based on the objective function value.

The processor of the electronic device may update the parameters of the part of the segmentation model 400 that is not pre-trained while keeping the parameters of the pre-trained part fixed (e.g., frozen). For example, in at least some of the training iterations, the processor of the electronic device may update the parameters of the decoder 420 and the patient-level segmentation module 430 while keeping the parameters of the encoder 410 of the segmentation model 400 fixed.

The processor of the electronic device may release the fixation of the parameters of the pre-trained part of the segmentation model 400 (e.g., unfreeze them) and update at least some of the parameters of the segmentation model 400. For example, in other training iterations, the processor of the electronic device may unfreeze the parameters of the encoder 410 of the segmentation model 400 and update the parameters of the encoder 410, the decoder 420, and the patient-level segmentation module 430.

The processor of the electronic device according to an embodiment may not fine-tune the encoder 410 immediately in the expansion process. Instead, the processor of the electronic device may perform the training of the segmentation model 400 and the fine-tuning of the encoder 410 in stages. For example, the processor of the electronic device may use a gradual unfreeze approach. The processor of the electronic device may train the patient-level segmentation model 400 for 100 epochs while keeping the parameters of the encoder 410 fixed. After that, the processor of the electronic device may gradually unfreeze the parameters of the encoder 410. The processor of the electronic device may unfreeze parameters of the encoder 410 every 10 epochs.

FIG. 5 may show the frequency according to the amount of ICH in the internal data set and the external data sets according to an embodiment. FIG. 6 may show the volume and intensity of ICH in the internal data set and the external data sets according to an embodiment.

The machine learning model according to an embodiment may be evaluated using an internal test set (AMC in FIGS. 5 and 6) together with three external test sets: two external test sets with a high proportion of normal patients (External1 and External2 in FIGS. 5 and 6), used to demonstrate performance in real emergency situations, and another external test set (External3 in FIGS. 5 and 6; e.g., the PhysioNet data set, an open-source ICH data set), used to verify robustness on cases that are difficult to detect because the hemorrhage is small and faint.

The internal data set may represent a data set used for training, validation, and testing of the machine learning model. For example, the internal data set may have a training data set, a validation data set, and a test data set.

FIG. 7 may show activation maps of encoders trained using multi-task models according to an embodiment and comparative embodiments.

The activation maps may be obtained by averaging the feature map output from the last convolutional layer of the encoder across channels, normalizing the result with a sigmoid activation, and then interpolating it to match the input resolution.
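The three steps (channel averaging, sigmoid normalization, interpolation) can be sketched as follows. The (C, h, w) feature layout, a square input whose size is a multiple of the feature size, and nearest-neighbor interpolation are assumptions; the description does not state which interpolation method is used.

```python
import numpy as np

def activation_map(feature_map, input_size):
    """Turn a (C, h, w) feature map into an (input_size, input_size) activation map."""
    fmap = np.asarray(feature_map, dtype=np.float64)
    mean_map = fmap.mean(axis=0)                  # average across channels
    norm_map = 1.0 / (1.0 + np.exp(-mean_map))    # sigmoid normalization into (0, 1)
    h, w = norm_map.shape
    if input_size % h != 0 or input_size % w != 0:
        raise ValueError("input_size must be a multiple of the feature map size")
    # Assumed nearest-neighbor upsampling: repeat each cell along both axes.
    upsampled = np.repeat(norm_map, input_size // h, axis=0)
    return np.repeat(upsampled, input_size // w, axis=1)
```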

For a fair comparison, the encoders of the multi-task models according to the embodiment and the comparative embodiments may be trained and evaluated with the same encoder structure (e.g., a ResNet-50 encoder) and the same training settings (e.g., input size, learning rate, optimization, augmentation, etc.). The only difference between the encoders of the multi-task models according to the embodiment and the comparative embodiments may be the modules that follow from the combination of slice-level tasks (e.g., the general-purpose classification module, the general-purpose segmentation module, and the restoration module).

To compare ICH cases by severity, two severe cases, two moderate cases, and four mild cases may be compared.

The encoder may be trained to intensively extract global information through the slice-level classification task, whereas it may be trained to densely capture local features through the slice-level segmentation task. When the encoder is trained through a multi-task model that performs a wider variety of tasks, more semantic features may be extracted.

As shown in FIG. 7, the activation map, which represents the area on which the encoder focuses, may vary depending on the combination of slice-level tasks.

The individual character of each of the slice-level classification task, segmentation task, restoration task, and consistency loss may appear clearly in the activation maps. For example, the encoder of the model for the slice-level classification task (denoted CLS in FIG. 7) may have an activation map that focuses on the most salient point of the ICH. The encoder of the model for the slice-level restoration task (denoted REC in FIG. 7) and the encoder of the model for the slice-level segmentation task (denoted SEG in FIG. 7) may perform pixel-wise dense prediction and may have activation maps that capture features over the entire image. The encoder of the model for the slice-level restoration task (REC) may have an activation map that is activated relatively evenly within the brain region compared to the encoder of the model for the slice-level segmentation task (SEG). The encoder of the model for the slice-level segmentation task (SEG) may have an activation map that is focused somewhat sporadically across the entire image. The encoder of the model that uses the consistency loss together with the slice-level classification task and segmentation task (denoted CLS+SEG+CON in FIG. 7) may have an activation map that balances the tendency to concentrate on the salient part, seen in the encoder of the model for the slice-level classification task (CLS), against the tendency toward sporadic focus, seen in the encoder of the model for the slice-level segmentation task (SEG).

For the encoder of a model for a plurality of slice-level tasks, the diversity of the slice-level tasks may affect the activation map. When the encoder is trained using a model for many tasks, a prominent and clear activation area may appear.

The encoder of the multi-task model according to an embodiment (denoted MRI-Net in FIG. 7) may have the most ideal activation map compared to the encoders according to the comparative embodiments, and the ideal activation map may balance the contributions of the tasks. The encoder of the multi-task model according to an embodiment may be strongly activated in the edge and boundary regions of the brain compared to the encoders according to the comparative embodiments (denoted ImageNet and Model Genesis in FIG. 7). In addition, the encoder (e.g., a ResNet-50 encoder) according to a comparative embodiment (denoted Model Genesis in FIG. 7) may have an activation map with a pattern similar to that of the encoder of the model for the slice-level restoration task (denoted REC in FIG. 7), which is activated evenly within the brain region.

Table 1 may show details of the internal data set and the external data sets.

[Table 1]

The robustness of the patient-level classification model and the patient-level segmentation model may be verified on external data sets with a high proportion of normal patients. Their robustness may also be verified, on the other external data set described above, for cases that are difficult to detect because the hemorrhage is small and faint.

By way of example, the data set may include data on ICH patients of five types (e.g., CPH, IVH, EDH, SDH, and SAH) with a high class imbalance. All ICH classes may be merged into a single class (e.g., an ICH class) for the patient-level classification task (e.g., a binary classification task). Merging the ICH classes may resolve the class imbalance among the ICH types and allow the machine learning model to focus on whether a cerebral hemorrhage is present in an emergency situation.
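The merging of ICH subtypes into a single binary label can be sketched as follows. This is a minimal illustration only: the dictionary layout and the function name `to_binary_ich_label` are assumptions made for the example and are not taken from the disclosure.

```python
# Subtype names as listed in the description; the per-patient dictionary
# layout used below is a hypothetical choice for this sketch.
ICH_SUBTYPES = ("CPH", "IVH", "EDH", "SDH", "SAH")


def to_binary_ich_label(subtype_flags):
    """Return 1 if any ICH subtype is flagged for the patient, else 0."""
    return int(any(subtype_flags.get(name, 0) for name in ICH_SUBTYPES))


patient_with_sdh = {"SDH": 1, "SAH": 0}
normal_patient = {}
labels = [to_binary_ich_label(patient_with_sdh), to_binary_ich_label(normal_patient)]
```

Collapsing the subtypes this way turns a heavily imbalanced five-class problem into a single ICH versus normal decision.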

Labeling of the segmentation masks of the ICH patients may be performed by three senior radiologists with more than 14 years of experience. All CT scans may have a size of 512×512 pixels and may have various depths (e.g., numbers of CT slices).

CT scans of the four ICH data sets (e.g., the internal data set and the three external data sets) may be acquired at a thickness of approximately 5.0 mm according to the emergency-room non-contrast CT (NCCT) protocol, as shown in Table 1. The depth of the internal data set may range from 28 to 50 slices, inclusive, and the average number of slices in the internal data set may be 38.89.

The internal data set may be a data set approved by the institutional ethics committee of Asan Medical Center (AMC). The CT scans of the internal data set may be obtained by retrospectively searching a database for patients who consecutively underwent NCCT examinations during a certain period (e.g., from September 2009 to June 2017). Based on the clinical radiology reports of the acquired CT scans, the internal data set may include data corresponding to a total of 811 ICH patients and 521 normal patients.

Two external data sets with a high proportion of normal patients may be used to validate the model in a real clinical environment. The first external data set (External1) may be data consecutively collected at Nowon Eulji Hospital in Korea from July 2018 to October 2018. The second external data set (External2) may be data consecutively collected at Pohang Stroke and Spine Hospital from March 2019 to June 2019. The first and second external data sets may not be manually selected, and the proportion of ICH patients in them may reflect the incidence in actual clinical practice.

The third external data set (External3) may be used to check the performance of the model on a data set with small, faint hemorrhages. For example, the public PhysioNet data set, version 1.3.1, containing 75 cases (e.g., 75 CT scans), may be used as the third external data set. Two CT scans of ICH patients in the third external data set may be excluded after being judged by a radiologist to be of poor quality. The average ICH volume of the third external data set (e.g., the PhysioNet data set) may be 15.10 mL. Mainly small cerebral hemorrhages may be observed in the external data set.

The performance of the machine learning model according to one embodiment may be compared with that of the machine learning models according to the comparative examples.

The machine learning model according to one embodiment (e.g., the patient-level classification model and the patient-level segmentation model) may be implemented, by way of example, in PyTorch accelerated with an NVIDIA TITAN RTX 24 GB GPU.

Preprocessing may include applying a brain window with a width of 40 HU and a level of 140 HU to an image (e.g., a CT slice). In addition, contrast limited adaptive histogram equalization (CLAHE) from the Albumentations library may be applied to increase the contrast of the image. Due to limited GPU resources, the image may be downscaled to 256×256 by linear interpolation.
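The windowing step can be illustrated with a short NumPy sketch. It assumes the usual convention that a window of a given level and width keeps the range [level - width/2, level + width/2] and rescales it to [0, 1]; the function name `apply_window` and the toy input are hypothetical, and the CLAHE and resizing steps are not shown.

```python
import numpy as np


def apply_window(hu_image, level, width):
    """Clip a CT slice given in Hounsfield units to a window and scale to [0, 1]."""
    lower = level - width / 2.0
    upper = level + width / 2.0
    clipped = np.clip(hu_image.astype(np.float32), lower, upper)
    return (clipped - lower) / (upper - lower)


# Toy 2x2 "slice" in HU; the level and width follow the values stated in the text.
ct_slice = np.array([[100.0, 120.0], [140.0, 160.0]])
windowed = apply_window(ct_slice, level=140.0, width=40.0)
```

Values below the lower bound of the window map to 0 and values above the upper bound map to 1, so only intensities inside the window keep their contrast.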

Data augmentation may use the Albumentations library and the MONAI framework. For example, to address the heterogeneity of ICH, augmentation techniques such as ShiftScaleRotate, RandShiftIntensity, HorizontalFlip, Brightness-Contrast, Gauss Noise, and Blur may be used.

The settings may include the batch size. In all experiments, the batch size may be set to the maximum that fits in the memory of a single GPU, so the batch size may differ from model to model. The machine learning model according to one embodiment and the machine learning models according to the comparative examples may be initialized with uniform Xavier initialization and trained using the Adam optimizer with a learning rate of 1e-4, a warm-up of 5 epochs, a weight decay of 5e-4, and betas of (0.9, 0.999). The learning rate may be decreased during training according to a poly learning rate schedule [equation omitted in the source]. The number of epochs may be at most 1000 for pre-training and at most 500 for training and fine-tuning (e.g., training on a downstream task). However, each model used in the experiments may be the converged model that recorded the highest validation score.
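Because the schedule expression is not available in this text, the following is only a sketch of a commonly used poly schedule with linear warm-up. The exponent `power=0.9`, the linear shape of the warm-up, and the choice to measure progress from the end of the warm-up are all assumptions of this example, not values from the disclosure; the base learning rate and warm-up length follow the text.

```python
def poly_lr_with_warmup(epoch, max_epochs, base_lr=1e-4, warmup_epochs=5, power=0.9):
    """Learning rate for a given epoch: linear warm-up, then polynomial decay.

    `power` and the warm-up shape are assumed for illustration.
    """
    if epoch < warmup_epochs:
        # Ramp linearly up to base_lr over the warm-up epochs.
        return base_lr * (epoch + 1) / warmup_epochs
    # Fraction of the post-warm-up training that has elapsed, capped at 1.
    progress = min(1.0, (epoch - warmup_epochs) / max(1, max_epochs - warmup_epochs))
    return base_lr * (1.0 - progress) ** power


# Learning rates over a 500-epoch fine-tuning run.
schedule = [poly_lr_with_warmup(e, max_epochs=500) for e in range(501)]
```

In PyTorch, such a function could be wrapped in a scheduler that multiplies the optimizer's base learning rate, but that wiring is left out here to keep the sketch dependency-free.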

Among the machine learning model according to one embodiment and the machine learning models according to the comparative examples, the best-performing model may be determined by evaluation on a validation data set. The validation data set may be a data set separate from the training data set. Experiments on the machine learning model according to one embodiment and the machine learning models according to the comparative examples may be performed under the same training settings (e.g., input size, learning rate, optimizer, augmentation combinations, etc.).

Table 2 may show a comparison of performance between the patient-level classification model according to one embodiment and the patient-level classification models according to the comparative examples.

[Table 2]

The classification models according to Comparative Examples 1 to 6 may be classification models according to the prior art. The classification models according to Comparative Examples 7 to 14 may be models in which, relative to the classification model according to one embodiment, the pretext tasks of the multi-task model used for pre-training the encoder are adjusted (e.g., ablated). For each evaluation metric of each data set in Table 2, among the classification models according to one embodiment and the comparative examples, underlining may indicate the largest value and bold may indicate the second largest value.

Specifically, the classification model according to Comparative Example 1 is the full version, with three-stage training, of the model disclosed in the paper 'A. Patel, S. C. Van De Leemput, M. Prokop, B. Van Ginneken, and R. Manniesing, "Image Level Training and Prediction: Intracranial Hemorrhage Identification in 3D Non-Contrast CT," IEEE Access, vol. 7, pp. 92355-92364, 2019, doi: 10.1109/Access.2019.2927792', and the classification model according to Comparative Example 2 is the simple version of the model disclosed in the same paper. The classification model according to Comparative Example 3 is disclosed in the paper 'S. P. Singh, L. Wang, S. Gupta, B. Gulyás, and P. Padmanabhan, "Shallow 3D CNN for detecting acute brain hemorrhage from medical imaging sensors," IEEE Sensors Journal, 2020'. The classification model according to Comparative Example 4 is the Scratch model. The classification model according to Comparative Example 5 is disclosed in the paper 'K. He, R. Girshick, and P. Dollár, "Rethinking ImageNet pre-training," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 4918-4927'. The classification model according to Comparative Example 6 may be disclosed in the paper 'Z. Zhou, V. Sodha, J. Pang, M. B. Gotway, and J. Liang, "Models genesis," Medical Image Analysis, vol. 67, p. 101840, 2021'.

To find the best multi-task model for pre-training the encoder, multi-task models for all combinations of the slice-level classification task, the segmentation task, the restoration task, and the consistency loss may be tested.

The classification model according to Comparative Example 7 (which may be denoted as CLS) is a model whose encoder is pre-trained using a model that performs the slice-level classification task. The classification model according to Comparative Example 8 (which may be denoted as SEG) is a model whose encoder is pre-trained using a model that performs the slice-level segmentation task. The classification model according to Comparative Example 9 (which may be denoted as REC) may be a model whose encoder is pre-trained using a model that performs the slice-level restoration task.

The classification model according to Comparative Example 10 (which may be denoted as CLS+SEG) is a model whose encoder is pre-trained using a model that performs the slice-level classification task and segmentation task. The classification model according to Comparative Example 11 (which may be denoted as CLS+REC) is a model whose encoder is pre-trained using a model that performs the slice-level classification task and restoration task. The classification model according to Comparative Example 12 (which may be denoted as SEG+REC) may be a model whose encoder is pre-trained using a model that performs the slice-level segmentation task and restoration task.

The classification model according to Comparative Example 13 (which may be denoted as CLS+SEG+CON) is a model that includes an encoder pre-trained using a model that performs the slice-level classification task and segmentation task, with an objective function value that has a consistency loss. The classification model according to Comparative Example 14 (which may be denoted as CLS+SEG+REC) may be a model whose encoder is pre-trained using a model that performs the slice-level classification task, segmentation task, and restoration task.
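The set of pretext-task combinations described above can be enumerated with a short sketch. It assumes that the consistency loss (CON) is only meaningful when both the classification and segmentation tasks are present, since it measures consistency between their outputs; the function name `pretext_configurations` is hypothetical.

```python
from itertools import combinations

TASKS = ("CLS", "SEG", "REC")


def pretext_configurations():
    """List every non-empty task subset, plus a CON variant where CLS and SEG co-occur."""
    configs = []
    for size in range(1, len(TASKS) + 1):
        for subset in combinations(TASKS, size):
            configs.append("+".join(subset))
            # Assumption: CON needs both a classification and a segmentation output.
            if "CLS" in subset and "SEG" in subset:
                configs.append("+".join(subset + ("CON",)))
    return configs


configs = pretext_configurations()
```

Under this assumption the enumeration yields nine configurations: the eight ablations corresponding to Comparative Examples 7 to 14 and the full CLS+SEG+REC+CON setting that uses all three tasks and the consistency loss.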

The patient-level classification model according to one embodiment and the classification models according to the comparative examples may be quantitatively evaluated using evaluation metrics. The evaluation metrics for the classification task may include the area under the ROC curve (AUC), sensitivity (SEN), specificity (SPE), balanced accuracy (B-ACC), and the F1 score.
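For reference, the listed metrics can be computed from binary labels and predictions as in the sketch below. The standard textbook definitions are used; the function names and the toy data are illustrative and not taken from the disclosure. The AUC is computed as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, balanced accuracy, and F1 for binary labels (1 = ICH)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sen = tp / (tp + fn) if (tp + fn) else 0.0
    spe = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {"SEN": sen, "SPE": spe, "B-ACC": (sen + spe) / 2.0, "F1": f1}


def auc_from_scores(y_true, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 3 ICH patients and 4 normal patients (hypothetical values).
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.1, 0.2, 0.3, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
metrics = classification_metrics(y_true, y_pred)
auc = auc_from_scores(y_true, scores)
```

Balanced accuracy averages sensitivity and specificity, which keeps the score informative on data sets where normal patients greatly outnumber ICH patients.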

Several comparative experiments may be performed to verify the extensibility of transfer learning to the patient-level classification task and the effectiveness of multi-task learning. Comparative Examples 1 to 3 may be classification models according to the prior art, and Comparative Examples 5 and 6 may be classification models using conventional transfer learning.

Table 2 may show the performance of one embodiment and the comparative examples on each data set. The upper part of Table 2 (labeled "previous works") corresponds to comparative examples concerning conventional models and transfer learning approaches, and the lower part of Table 2 (labeled "ablation studies") may correspond to comparative examples concerning the ablation study of the machine learning model according to one embodiment.

The patient-level classification model according to one embodiment may be well balanced in terms of AUROC score, including on the external test sets, and may perform considerably better than most of the classification models according to the comparative examples. In terms of F1, B-ACC, SEN, and SPE as well, the classification model according to one embodiment may perform better than the classification models according to the comparative examples. In particular, the classification models according to Comparative Examples 1 to 3 may produce unstable results on the external data sets. On most data sets, the classification model according to one embodiment may be considerably better than the classification models according to Comparative Examples 5 and 6, which use conventional transfer learning. The classification models according to Comparative Examples 5 and 6 may also produce unstable results on the external data sets.

FIG. 8 shows a performance comparison between the patient-level classification model according to one embodiment and the classification models according to the comparative examples.

In FIG. 8, the ROC of the patient-level classification model according to one embodiment (denoted as MRI-Net in FIG. 8) may be compared with the ROCs of the patient-level classification models according to the comparative examples. Statistical significance may be determined by the method of DeLong et al. A comparative example with a p-value below 0.05 may be marked with *, a comparative example with a p-value below 0.01 with **, and a comparative example with a p-value below 0.001 with ***.
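The marking convention above amounts to a simple threshold lookup, sketched below. Returning an empty string for p-values of 0.05 or more is an assumption of this example, and the function name `significance_marker` is hypothetical; the significance test itself is not implemented here.

```python
def significance_marker(p_value):
    """Map a p-value to the asterisk marker used in the figures and tables."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""  # assumed: no marker when the difference is not significant


markers = [significance_marker(p) for p in (0.2, 0.03, 0.004, 0.0002)]
```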

The AUC of the classification model according to one embodiment may be larger than the AUCs of the classification models according to Comparative Examples 1 to 6. The difference between the classification model according to one embodiment and the classification models according to the comparative examples may be more pronounced in the ROC graphs 820, 830, and 840 for the external data sets than in the ROC graph 810 for the internal data set.

FIG. 9 may show the differences between, and the robustness of, the patient-level classification model according to one embodiment and the classification models according to the comparative examples.

Regarding the pre-training of the encoder of the classification model according to one embodiment, an ablation study over all combinations of pretext tasks may be performed to identify the pretext tasks of the multi-task model that are suitable for extending the encoder of the slice-level multi-task model to the patient-level classification task. The performance of the classification model may depend on the combination of pretext tasks of the multi-task model.

In Table 2, among the comparative examples that include an encoder pre-trained using a single-task model, the classification model according to Comparative Example 7 (CLS) has an average F1 of 0.663 over the four data sets, so the classification task may be the single task best suited for extension to the patient-level classification task. Among the comparative examples that include an encoder pre-trained using a dual-task model, the classification model according to Comparative Example 11 (CLS+REC) may have the best performance, with an average F1 of 0.766 over the four data sets. Of Comparative Examples 13 and 14, the classification model according to Comparative Example 14 (CLS+SEG+REC) may have an average F1 of 0.799 over the four data sets.

The classification model according to Comparative Example 10 (CLS+SEG) may be compared with the classification model according to Comparative Example 13 (CLS+SEG+CON). By additionally using the consistency loss, the classification model according to Comparative Example 13 (CLS+SEG+CON) may have an F1 score 2.8% higher than that of the classification model according to Comparative Example 10 (CLS+SEG). Compared with the classification model according to Comparative Example 4, a scratch model trained directly, one embodiment may have an F1 score increased by 21.1%, a balanced accuracy (B-ACC) increased by 13.1%, a sensitivity (SEN) increased by 18.8%, and a specificity (SPE) increased by 12.8%. The classification model according to one embodiment may obtain the greatest gain through the three tasks and the consistency loss.

Table 3 shows the performance of the patient-level classification model according to ICH volume.

[Table 3]

To determine whether the classification model according to one embodiment can be applied clinically to a classification system, the performance of the classification model in classifying the presence or absence of ICH (e.g., ICH detection performance) may be analyzed. Within this performance, the ability to classify ICH of 30 mL or more may be important in terms of mortality. As shown in Table 3, the error rate of ICH patient classification may be analyzed according to ICH volume. The classification model according to one embodiment may correctly classify all ICH cases of 30 mL or more in the internal test set of the internal data set and in the three external data sets. The classification model may have false positive rates of 3.0%, 12.3%, 5.4%, and 25.6% on the internal test set of the internal data set and the three external data sets, respectively. The classification model may also correctly classify all patients with an ICH of 15 mL to 30 mL in the internal test set and the external data sets. However, the classification model may produce false negative cases for ICH patients with a volume of less than 15 mL. The error rate for the images of the ICH class with a volume of less than 15 mL may be 7.1%, 16.6%, 44.4%, and 6.9% on the internal test set and the three external data sets, respectively.
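The volume-stratified analysis described above can be sketched as a miss-rate computation per volume bin. The bin edges follow the text (less than 15 mL, 15 mL to below 30 mL, 30 mL or more); the function name `miss_rate_by_volume` and the sample volumes and predictions are hypothetical and are not results from the disclosure.

```python
def miss_rate_by_volume(volumes_ml, predicted_positive):
    """False-negative rate of ICH patients per volume bin; None for an empty bin."""
    counts = {"<15 mL": [0, 0], "15-30 mL": [0, 0], ">=30 mL": [0, 0]}
    for volume, detected in zip(volumes_ml, predicted_positive):
        if volume < 15:
            key = "<15 mL"
        elif volume < 30:
            key = "15-30 mL"
        else:
            key = ">=30 mL"
        counts[key][0] += 1                     # ICH patients in this bin
        counts[key][1] += 0 if detected else 1  # missed ICH patients
    return {k: (missed / total if total else None) for k, (total, missed) in counts.items()}


# Hypothetical ICH patients: hemorrhage volume and whether the model flagged ICH.
volumes = [5.0, 10.0, 12.0, 14.0, 20.0, 25.0, 30.0, 45.0]
flagged = [True, False, True, True, True, True, True, True]
rates = miss_rate_by_volume(volumes, flagged)
```

Stratifying by volume separates clinically critical large hemorrhages from small, faint ones, where misses are more likely.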

Table 4 may show a comparison of performance between the patient-level segmentation model according to one embodiment and the patient-level segmentation models according to the comparative examples.

[Table 4]

The segmentation models according to Comparative Examples 4 to 6 and 15 to 17 may be segmentation models according to the prior art. The models according to Comparative Examples 7 to 14 may be models in which, relative to the segmentation model according to one embodiment, the pretext tasks of the multi-task model used for pre-training the encoder are adjusted (e.g., ablated). For each evaluation metric of each data set in Table 4, among the segmentation models according to one embodiment and the comparative examples, underlining may indicate the largest value and bold may indicate the second largest value. A paired t-test may be performed between the patient-level segmentation model according to one embodiment and each of the segmentation models according to the comparative examples. A comparative example with a p-value below 0.05 may be marked with *, a comparative example with a p-value below 0.01 with **, and a comparative example with a p-value below 0.001 with ***.

Specifically, the segmentation model according to Comparative Example 4 is the Scratch model. The segmentation model according to Comparative Example 5 is disclosed in the paper 'K. He, R. Girshick, and P. Dollár, "Rethinking ImageNet pre-training," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 4918-4927'. The segmentation model according to Comparative Example 6 may be disclosed in the paper 'Z. Zhou, V. Sodha, J. Pang, M. B. Gotway, and J. Liang, "Models genesis," Medical Image Analysis, vol. 67, p. 101840, 2021'. The segmentation model according to Comparative Example 15 may be disclosed in the paper 'A. Patel et al., "Intracerebral Haemorrhage Segmentation in Non-Contrast CT," Sci Rep, vol. 9, no. 1, p. 17858, Nov 28 2019, doi: 10.1038/s41598-019-54491-6'. The segmentation models according to Comparative Examples 16 and 17 may be the 2D version and the 3D version, respectively, of the model disclosed in the paper 'F. Isensee, P. F. Jaeger, S. A. Kohl, J. Petersen, and K. H. Maier-Hein, "nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation," Nature Methods, vol. 18, no. 2, pp. 203-211, 2021'.

The segmentation model according to Comparative Example 7 (which may be denoted as CLS) is a model whose encoder is pre-trained using a model that performs the slice-level classification task. The segmentation model according to Comparative Example 8 (which may be denoted as SEG) is a model whose encoder is pre-trained using a model that performs the slice-level segmentation task. The segmentation model according to Comparative Example 9 (which may be denoted as REC) may be a model whose encoder is pre-trained using a model that performs the slice-level restoration task.

The segmentation model according to Comparative Example 10 (which may be denoted as CLS+SEG) is a model whose encoder is pre-trained using a model that performs the slice-level classification task and segmentation task. The segmentation model according to Comparative Example 11 (which may be denoted as CLS+REC) is a model whose encoder is pre-trained using a model that performs the slice-level classification task and restoration task. The segmentation model according to Comparative Example 12 (which may be denoted as SEG+REC) may be a model whose encoder is pre-trained using a model that performs the slice-level segmentation task and restoration task.

The segmentation model according to Comparative Example 13 (which may be denoted CLS+SEG+CON) includes an encoder pre-trained using a model that performs a slice-level classification task and a slice-level segmentation task, with an objective function value that has a consistency loss. The segmentation model according to Comparative Example 14 (which may be denoted CLS+SEG+REC) uses a model that performs a slice-level classification task, a segmentation task, and a reconstruction task for pre-training of the encoder.

The patient-level segmentation model according to an embodiment and the segmentation models according to the comparative examples may be evaluated quantitatively using evaluation metrics.

The evaluation metrics for the segmentation task may include the Dice similarity coefficient (DSC). The DSC is an overlap metric that is measured only for positive cases. The evaluation metrics for the segmentation task may also include the false positive volume (FPV). The FPV is the volume of the region, within the normal region of a normal patient, that the model incorrectly extracts as a lesion area. The normal region of a normal patient is the region corresponding to the entirety of the plurality of images of a patient labeled as normal. The FPV may be obtained by converting image pixels into a volume (e.g., a value in units of μL), so that the patient's spacing information is reflected in the clinical interpretation. For example, the false positive volume (FPV) may be calculated according to the following formula:

Here, the model-output term represents the output of the patient-level segmentation model, and spacing_xyz represents the meta information of the CT slice used for conversion to volume units.
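The conversion from predicted pixels to a volume can be sketched as follows. This is a minimal illustration under stated assumptions, not the formula of the embodiment: it assumes the model output is a binary mask stored as nested lists indexed [slice][row][column], that `spacing_xyz` holds voxel spacings in millimeters, and it relies on 1 mm³ being equal to 1 μL. The function name `false_positive_volume_ul` is hypothetical.

```python
from math import prod


def false_positive_volume_ul(pred_mask, spacing_xyz):
    """Volume, in microliters, of the voxels predicted as lesion.

    For a patient labeled normal, every predicted lesion voxel is a
    false positive, so the predicted volume equals the FPV.
    pred_mask: nested lists [slice][row][col] of 0/1 values (assumed layout).
    spacing_xyz: (x, y, z) voxel spacing in millimeters (assumed units).
    """
    voxel_volume_mm3 = prod(spacing_xyz)  # 1 mm^3 == 1 microliter
    n_positive = sum(v for sl in pred_mask for row in sl for v in row)
    return n_positive * voxel_volume_mm3


# Two 2x2 slices of a normal patient with three voxels wrongly marked as lesion.
mask = [[[0, 1], [0, 0]], [[1, 1], [0, 0]]]
fpv = false_positive_volume_ul(mask, (0.5, 0.5, 5.0))
```

Multiplying by the voxel volume rather than reporting a raw pixel count is what lets scans acquired with different slice thicknesses be compared on the same scale.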

The segmentation model may be evaluated by the Dice similarity coefficient (DSC) for ICH patients and by the false positive volume (FPV) for normal patients. Evaluating the segmentation model with both the DSC and the FPV shows whether it is a balanced model.
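The Dice similarity coefficient used for ICH patients can be illustrated with its standard definition, 2|A∩B| / (|A| + |B|). The sketch below assumes flattened binary masks given as Python lists. The function name `dice_similarity` and the choice to return `None` for two empty masks are assumptions made for illustration.

```python
def dice_similarity(pred_mask, true_mask):
    """Dice similarity coefficient for two flat lists of 0/1 values."""
    intersection = sum(p * t for p, t in zip(pred_mask, true_mask))
    total = sum(pred_mask) + sum(true_mask)
    if total == 0:
        # Both masks are empty, so the ratio is undefined. This is the
        # situation for normal patients, which are scored by FPV instead.
        return None
    return 2.0 * intersection / total


pred = [1, 1, 0, 0, 1]
true = [1, 0, 0, 1, 1]
dsc = dice_similarity(pred, true)
```

The undefined case is why the DSC is measured only on positive cases and a separate metric is needed for patients without a lesion.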

Several comparative experiments may be performed to verify the extensibility of transfer learning to the patient-level segmentation task and the efficiency of multi-task learning. Comparative Examples 15 to 17 may be segmentation models according to the prior art, and Comparative Examples 5 and 6 may be segmentation models using conventional transfer learning.

Table 4 shows the performance of an embodiment and the comparative examples on each data set. The upper part of Table 4 (labeled "previous works") corresponds to comparative examples of conventional models and transfer learning approaches, and the lower part of Table 4 (labeled "ablation studies") corresponds to comparative examples for the ablation study of the machine learning model according to an embodiment.

In terms of the Dice similarity coefficient (DSC) and the false positive volume (FPV), the segmentation model according to an embodiment may perform much better than most of the segmentation models according to the comparative examples, not only on the internal test set but also on the external data sets.

Compared with the segmentation model according to Comparative Example 15, the segmentation model according to an embodiment may have an 11.5% increase in Dice similarity coefficient (DSC) and a 75.8 μL reduction in false positive volume (FPV) over the four data sets, and may perform much better than the segmentation model according to Comparative Example 15 on most of the data sets.

The segmentation model according to Comparative Example 17 may give unstable results depending on the data set. In particular, despite its high sensitivity, which results from overfitting of the patch-based method with a 3D convolution layer operator (Conv3D operator), it may produce a very large number of false positives. The segmentation model according to Comparative Example 16 has high sensitivity but may have an average false positive volume (FPV) of 660 μL.

The segmentation model according to Comparative Example 16 may be better balanced between sensitivity and specificity than the segmentation model according to Comparative Example 17. Compared with the segmentation model according to Comparative Example 16, the segmentation model according to an embodiment may have a 1.1% increase in Dice similarity coefficient (DSC) and a 75.8 μL reduction in false positive volume (FPV) over the four data sets.

From the perspective of transfer learning, the segmentation model according to Comparative Example 5 is more sensitive than the segmentation model according to Comparative Example 6, whereas the segmentation model according to Comparative Example 6 may perform better than that of Comparative Example 5 in reducing the false positive volume (FPV). The patient-level segmentation model according to an embodiment may have a 2.1% higher Dice similarity coefficient (DSC) than the segmentation model according to Comparative Example 5 and a 50.4 μL lower false positive volume (FPV) than the segmentation model according to Comparative Example 6. On the internal data set, the p-value between the performance of the segmentation model according to an embodiment and that of the segmentation model according to Comparative Example 5 may be less than 0.05, indicating statistical significance.

Regarding pre-training of the encoder of the segmentation model according to an embodiment, an ablation study over all combinations of pretext tasks may be performed to evaluate which pretext tasks are suitable for extending the slice-level multi-task model to the patient-level segmentation task. The performance of the patient-level segmentation model may depend on the combination of pretext tasks.

In Table 4, among the comparative examples whose encoder is pre-trained using a single-task model, the segmentation model according to Comparative Example 8 (SEG) has an average Dice similarity coefficient (DSC) of 0.554 and a false positive volume (FPV) of 22.1 μL over the four data sets, so the slice-level segmentation task may be the most suitable task for extension to the patient-level segmentation task. Among the comparative examples whose encoder is pre-trained using a dual-task model, the segmentation model according to Comparative Example 10 (CLS+SEG) may have the highest average DSC over the four data sets, 0.553, and the segmentation model according to Comparative Example 11 (CLS+REC) may have the lowest average FPV over the four data sets, 15.6 μL.

Of Comparative Examples 13 and 14, the segmentation model according to Comparative Example 14 (CLS+SEG+REC) may have the highest average Dice similarity coefficient (DSC) over the four data sets, 0.557, and the lowest average false positive volume (FPV), 6.8 μL.

The segmentation model according to Comparative Example 10 (CLS+SEG) may be compared with the segmentation model according to Comparative Example 13 (CLS+SEG+CON). By additionally using the consistency loss, the segmentation model according to Comparative Example 13 (CLS+SEG+CON) may have a Dice similarity coefficient (DSC) that is 0.4% higher and a false positive volume (FPV) that is 52.5 μL lower than those of the segmentation model according to Comparative Example 10 (CLS+SEG).

Compared with the segmentation model according to Comparative Example 4, which is a scratch model trained directly, the segmentation model according to an embodiment may have a Dice similarity coefficient (DSC) that is 13.5% higher and a false positive volume (FPV) that is 21.8 μL lower. The segmentation model according to an embodiment may obtain the best gain with the three tasks and the consistency loss.
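How the three task losses and the consistency loss might be combined into one objective function value can be sketched as below. This is an illustrative sketch only: the text states that the objective has classification, segmentation, reconstruction, and consistency terms and that the reconstruction loss is a mean absolute error of pixel intensities, but the weighted-sum form, the unit weights, and the names `mean_absolute_error` and `objective_value` are assumptions.

```python
def mean_absolute_error(image, reconstruction):
    """Mean absolute intensity difference between corresponding pixels."""
    return sum(abs(a - b) for a, b in zip(image, reconstruction)) / len(image)


def objective_value(cls_loss, seg_loss, rec_loss, con_loss,
                    weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four loss terms (the weighting is an assumption)."""
    terms = (cls_loss, seg_loss, rec_loss, con_loss)
    return sum(w * t for w, t in zip(weights, terms))


# Toy flattened image and its reconstruction, with placeholder loss values.
rec = mean_absolute_error([0.0, 0.5, 1.0, 0.25], [0.0, 0.25, 1.0, 0.5])
total = objective_value(cls_loss=0.5, seg_loss=0.25, rec_loss=rec, con_loss=0.0625)
```

Setting a weight to zero removes the corresponding task, which is one simple way to reproduce the task combinations compared in the ablation study.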

FIG. 10 shows results of the segmentation models according to an embodiment and the comparative examples. FIG. 11 shows ICH volumes of the segmentation models according to an embodiment and the comparative examples.

For visual inspection, the performance of the segmentation models may be compared on a total of four cases: severe brain hemorrhage, mild brain hemorrhage, a normal case with beam hardening artifact, and a normal case.

The segmentation model according to an embodiment may be not only numerically superior to the segmentation models according to the comparative examples but also visually superior in both severe and mild brain hemorrhage. In particular, for a normal patient (e.g., either the normal case with beam hardening artifact or the normal case), the segmentation models according to the comparative examples respond sensitively to beam hardening artifacts, whereas the segmentation model according to an embodiment may be robust with respect to the false positive volume (FPV).

As shown in FIG. 11, the segmentation model according to an embodiment may reduce the false positive volume (FPV) for normal patients far more than the segmentation models according to the comparative examples. The segmentation model according to an embodiment may also perform better than the segmentation models according to the comparative examples on the external data sets.

A transfer learning method using a multi-task model may be proposed for classifying the presence or absence of ICH at the patient level and segmenting the lesion area (e.g., the ICH area) in three-dimensional non-contrast head CT (NCCT). The transfer learning method using the multi-task model has high balanced accuracy and robustness on external data sets, and may be applicable to patient-level ICH tasks (e.g., classification and segmentation) in real clinical environments. The encoder of the patient-level segmentation model and/or classification model according to an embodiment may be trained, through various tasks, to capture features from more diverse perspectives than a scratch model trained directly. The encoder may be pre-trained using a multi-task model for a pretext task consisting of four tasks.
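Transferring the pre-trained encoder to a patient-level model can be pictured as a two-phase schedule that follows the freeze and unfreeze steps recited in the claims. The sketch below is a hypothetical illustration: the group names, the function `trainable_groups`, and the fixed number of warm-up iterations are assumptions, since the text only says the encoder is kept fixed in some iterations and unfrozen in others.

```python
def trainable_groups(iteration, warmup_iterations):
    """Return the parameter groups to update at a given training iteration.

    While iteration < warmup_iterations the pre-trained encoder stays frozen
    and only the patient-level module is updated; afterwards the encoder is
    unfrozen and both groups are updated together.
    """
    if iteration < warmup_iterations:
        return ["patient_level_module"]
    return ["encoder", "patient_level_module"]


# Four iterations with the encoder frozen for the first two.
schedule = [trainable_groups(i, warmup_iterations=2) for i in range(4)]
```

Training the newly attached patient-level module first is a common way to avoid disturbing pre-trained encoder weights with gradients from a randomly initialized head.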

The ablation study may show that the transferability and performance of at least part of the machine learning model vary with the task combination of the multi-task model and with the target task (e.g., the patient-level classification task or the patient-level segmentation task). When the pretext task is a single task, an encoder pre-trained with the same kind of pretext task as the target task may be the most effective for the performance of the machine learning model on the target task. When the pretext task comprises several tasks, however, the more tasks the encoder is trained on together, the more the transferability of the encoder to the patient-level task and the performance on the target task may improve, and the model for the target task may be more robust in real clinical environments and on external data sets.

For example, among encoders pre-trained using dual tasks, the encoder pre-trained with the slice-level classification and reconstruction tasks (also denoted CLS+REC) may have the highest performance for the patient-level classification model but the lowest performance for the patient-level segmentation model. The consistency loss (also denoted CON) may balance the sparse prediction of the slice-level classification task against the dense prediction of the slice-level segmentation task, so that the patient-level classification model and segmentation model are balanced and accurate on the target task. In the Model Genesis approach, an encoder pre-trained through a model for pixel-wise dense prediction tasks (e.g., the slice-level segmentation task or the slice-level reconstruction task) may show low transferability to the patient-level classification model. In particular, owing to negative transfer, the Model Genesis approach may perform worse than the scratch model in the patient-level classification model. Representation learning through dense prediction tasks may therefore be unsuitable for pre-training for the patient-level classification task. With the easily applicable ImageNet approach, which extracts low-level features such as object edges and textures for ICH detection, both the patient-level classification model and the patient-level segmentation model may outperform the scratch model.
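The consistency loss can be sketched from the steps recited in the claims: a second likelihood score is derived from the segmentation output by max or average pooling, and the loss is the batch mean of the squared difference between the first and second scores. The code below is a minimal sketch assuming flattened per-pixel lesion probabilities in plain Python lists. The function names `second_likelihood` and `consistency_loss` are hypothetical.

```python
def second_likelihood(seg_probabilities, pooling="max"):
    """Collapse a per-pixel lesion map into one image-level score."""
    if pooling == "max":
        return max(seg_probabilities)
    return sum(seg_probabilities) / len(seg_probabilities)  # average pooling


def consistency_loss(first_scores, seg_maps, pooling="max"):
    """Batch mean of (first score - second score) squared."""
    squared = [
        (p_cls - second_likelihood(seg, pooling)) ** 2
        for p_cls, seg in zip(first_scores, seg_maps)
    ]
    return sum(squared) / len(squared)


# Batch of two images: classifier scores and flattened segmentation maps.
first_scores = [0.75, 0.25]
seg_maps = [[0.0, 0.5, 0.25], [0.0, 0.0, 0.25]]
loss = consistency_loss(first_scores, seg_maps)
```

In this toy batch the second image contributes nothing to the loss because its classifier score already matches the pooled segmentation score, while the first image is penalized for the disagreement.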

The embodiments described above may be implemented with hardware components, software components, and/or a combination of hardware components and software components. For example, the devices, methods, and components described in the embodiments may be implemented using a general-purpose computer or a special-purpose computer, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, a single processing device is sometimes described as being used, but those skilled in the art will understand that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on a computer-readable recording medium.

The method according to an embodiment may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination, and the program instructions recorded on the medium may be specially designed and constructed for the embodiment or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine language code, such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter or the like.

The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

Although the embodiments have been described above with reference to a limited set of drawings, those skilled in the art may apply various technical modifications and variations based on them. For example, appropriate results may be achieved even if the described techniques are performed in an order different from that described, and/or if components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from that described, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

Claims

A method performed by a processor, the method comprising:
extracting a feature map by applying an encoder of a machine learning model to a training input image;
obtaining a first likelihood score that a lesion appears in the training input image by applying a general-purpose classification module of the machine learning model to the feature map;
extracting a lesion area from the training input image by applying a general-purpose segmentation module of the machine learning model to the feature map;
calculating, based on the obtained first likelihood score and the extracted lesion area, an objective function value having a consistency loss that represents consistency between outputs of the general-purpose classification module and the general-purpose segmentation module; and
updating parameters of the machine learning model using the calculated objective function value.

The method of claim 1,
wherein calculating the objective function value comprises
calculating the objective function value further having a classification loss for the general-purpose classification module and a segmentation loss for the general-purpose segmentation module.

The method of claim 1,
wherein calculating the objective function value comprises:
obtaining a second likelihood score that a lesion appears in the training input image using the extracted lesion area; and
calculating the consistency loss based on a difference between the first likelihood score and the second likelihood score.

The method of claim 3,
wherein obtaining the second likelihood score comprises
obtaining the second likelihood score by applying at least one of an average pooling layer and a max pooling layer to the extracted lesion area.

The method of claim 3,
wherein calculating the consistency loss comprises:
calculating a square of the difference between the first likelihood score and the second likelihood score; and
calculating the consistency loss by averaging the squares of the differences between the first likelihood score and the second likelihood score calculated for each of the images included in a batch.

The method of claim 1, further comprising
obtaining a reconstruction image from the feature map by applying a reconstruction module of the machine learning model to the feature map,
wherein calculating the objective function value comprises
calculating a reconstruction loss for the reconstruction module based on the training input image and the reconstruction image.

The method of claim 6,
wherein calculating the reconstruction loss comprises
calculating a mean absolute error of the intensity difference between each pixel of the training input image and the corresponding pixel of the reconstruction image.

The method of claim 1, further comprising:
generating a classification model including the encoder of the machine learning model and a patient-level classification module connected to the encoder;
obtaining a plurality of feature maps by applying the encoder of the classification model to a plurality of images obtained from a CT scan of a target patient in a training data set;
obtaining a likelihood score that a lesion appears in at least one of the plurality of images by applying the patient-level classification module to the obtained plurality of feature maps; and
updating parameters of the classification model based on an objective function value calculated using the likelihood score for the plurality of images and a ground-truth class mapped to the plurality of images in the training data set.

The method of claim 8,
wherein obtaining the plurality of feature maps comprises
repeating, for each of the plurality of images, obtaining a feature map by applying the encoder of the classification model to the image.

The method of claim 8,
wherein updating the parameters of the classification model comprises:
updating parameters of the patient-level classification module while keeping parameters of the encoder of the classification model fixed in at least some of the training iterations; and
unfreezing the parameters of the encoder of the classification model and updating the parameters of the encoder of the classification model and of the patient-level classification module in others of the training iterations.

The method of claim 1, further comprising:
generating a segmentation model including the encoder of the machine learning model, a decoder connected to the encoder, and a patient-level segmentation module;
obtaining a plurality of feature maps by applying the encoder and the decoder to a plurality of images obtained from a CT scan of a target patient in a training data set;
extracting a lesion area from the plurality of images by applying the patient-level segmentation module to the obtained plurality of feature maps; and
updating parameters of the segmentation model based on an objective function value calculated using the extracted lesion area and a ground-truth lesion area mapped to the plurality of images in the training data set.

The method of claim 11,
wherein obtaining the plurality of feature maps comprises
repeating, for each of the plurality of images, obtaining a feature map by applying the encoder and the decoder of the segmentation model to the image.

The method of claim 11,
wherein updating the parameters of the segmentation model comprises:
updating parameters of the patient-level segmentation module while keeping parameters of the encoder of the segmentation model fixed in at least some of the training iterations; and
unfreezing the parameters of the encoder of the segmentation model and updating the parameters of the encoder of the segmentation model and of the patient-level segmentation module in others of the training iterations.

A computer program combined with hardware and stored in a computer-readable recording medium to execute the method of any one of claims 1 to 13.

An electronic device comprising
a processor configured to: extract a feature map by applying an encoder of a machine learning model to a training input image; obtain a first likelihood score that a lesion appears in the training input image by applying a general-purpose classification module of the machine learning model to the feature map; extract a lesion area from the training input image by applying a general-purpose segmentation module of the machine learning model to the feature map; calculate, based on the obtained first likelihood score and the extracted lesion area, an objective function value having a consistency loss that represents consistency between outputs of the general-purpose classification module and the general-purpose segmentation module; and update parameters of the machine learning model using the calculated objective function value.

The electronic device of claim 15,
wherein the processor is configured to:
generate a classification model including the encoder of the machine learning model and a patient-level classification module connected to the encoder;
obtain a plurality of feature maps by applying the encoder of the classification model to a plurality of images obtained from a CT scan of a target patient in a training data set;
obtain a likelihood score that a lesion appears in at least one of the plurality of images by applying the patient-level classification module to the obtained plurality of feature maps; and
update parameters of the classification model based on an objective function value calculated using the likelihood score for the plurality of images and a ground-truth class mapped to the plurality of images in the training data set.

The electronic device of claim 15,
wherein the processor is configured to:
generate a segmentation model including the encoder of the machine learning model, a decoder connected to the encoder, and a patient-level segmentation module;
obtain a plurality of feature maps by applying the encoder and the decoder to a plurality of images obtained from a CT scan of a target patient in a training data set;
extract a lesion area from the plurality of images by applying the patient-level segmentation module to the obtained plurality of feature maps; and
update parameters of the segmentation model based on an objective function value calculated using the extracted lesion area and a ground-truth lesion area mapped to the plurality of images in the training data set.