KR20230171160A

KR20230171160A - Method and apparatus for adjusting quality of medical image

Info

Publication number: KR20230171160A
Application number: KR1020220071393A
Authority: KR
Inventors: 이지나; 홍영택; 심학준; 맹신희
Original assignee: 주식회사 온택트헬스
Priority date: 2022-06-13
Filing date: 2022-06-13
Publication date: 2023-12-20

Abstract

A method for adjusting the quality of a medical image may include obtaining an input image as an input to an artificial intelligence model, extracting feature maps from the input image, generating compressed and re-calibrated feature maps by performing a channel-wise squeeze operation and a spatial excitation operation on the feature maps, and generating a quality-adjusted output image based on the compressed and re-calibrated feature maps.

Description

Method and apparatus for adjusting the quality of medical images {METHOD AND APPARATUS FOR ADJUSTING QUALITY OF MEDICAL IMAGE}

The present invention relates to medical images, and in particular to a method and apparatus for adjusting the quality of medical images.

A disease is a condition that impairs the human mind or body and impedes normal functioning. Depending on the disease, a person may suffer and may even die. Accordingly, social systems and technologies for diagnosing, treating, and preventing diseases have developed throughout human history. Although technological progress has produced many tools and methods for diagnosis and treatment, the outcome still ultimately depends on the judgment of a physician.

Meanwhile, artificial intelligence (AI) technology has advanced considerably in recent years and is attracting attention in many fields. In the medical field in particular, the large volume of accumulated medical data and the image-centered nature of diagnostic data have prompted many attempts to apply AI algorithms. Specifically, research is under way to use AI algorithms for tasks that have traditionally relied on clinical judgment, such as diagnosing and predicting diseases. Research is also under way to use AI algorithms to process and analyze medical data as an intermediate step toward diagnosis.

US Patent Application Publication US 2021/0312242

US Patent Application Publication US 2018/0293712

An object of the present invention is to provide a method and apparatus for effectively adjusting the quality of medical images using an artificial intelligence (AI) algorithm.

An object of the present invention is to provide a method and apparatus for removing noise from medical images.

An object of the present invention is to provide a method and apparatus for generating a standard-dose or high-dose CT image from a low-dose computed tomography (CT) image.

The technical problems to be solved by the present invention are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood by those of ordinary skill in the art from the description below.

According to an embodiment of the present invention, a method for adjusting the quality of a medical image may include obtaining an input image as an input to an artificial intelligence model, extracting feature maps from the input image, generating compressed and re-calibrated feature maps by performing a channel-wise squeeze operation and a spatial excitation operation on the feature maps, and generating a quality-adjusted output image based on the compressed and re-calibrated feature maps.
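As a rough illustration of a squeeze along the channel dimension followed by spatial excitation, the sketch below collapses the channels at each pixel into one score, passes that score through a sigmoid, and rescales every channel at that pixel by the resulting gate. The function name `spatial_excitation`, the per-channel weights (standing in for a 1x1 convolution), and the sigmoid gate are assumptions made for illustration; the source does not specify these details.

```python
import math


def sigmoid(value):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def spatial_excitation(feature_maps, channel_weights):
    """Re-calibrate C feature maps (each an H x W list of lists).

    channel_weights holds one hypothetical weight per channel; it plays
    the role of a 1x1 convolution that squeezes C channels into one map.
    """
    channels = len(feature_maps)
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])

    # Squeeze along the channel axis: one gate value per spatial position.
    gate = [
        [
            sigmoid(sum(channel_weights[c] * feature_maps[c][i][j]
                        for c in range(channels)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Excite spatially: scale every channel by the gate at each position.
    recalibrated = [
        [[gate[i][j] * feature_maps[c][i][j] for j in range(width)]
         for i in range(height)]
        for c in range(channels)
    ]
    return recalibrated, gate
```

Positions where the weighted channel sum is large keep most of their activation, while positions with a strongly negative sum are suppressed toward zero.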

According to an embodiment of the present invention, the method may further include applying at least one dropout pattern to the artificial intelligence model.

According to an embodiment of the present invention, the at least one dropout pattern may be selected independently of the dropout patterns that were applied differently for each epoch during training of the artificial intelligence model.

According to an embodiment of the present invention, the performance may be evaluated based on at least one of radiomics, peak signal-to-noise ratio (PSNR), and structural similarity index (SSIM) for output images generated during the training.
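Of the metrics listed above, PSNR is the simplest to state in code. The sketch below computes it from the mean squared error between a reference image and a generated image. The function name `psnr` and the default intensity range `max_value=1.0` are assumptions for illustration; the source does not say how intensities are normalized.

```python
import math


def psnr(reference, generated, max_value=1.0):
    """Peak signal-to-noise ratio in dB for two equally sized 2D images.

    Images are lists of rows; max_value is the assumed peak intensity.
    """
    squared_errors = [
        (ref_pixel - gen_pixel) ** 2
        for ref_row, gen_row in zip(reference, generated)
        for ref_pixel, gen_pixel in zip(ref_row, gen_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        # Identical images: the ratio is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the generated image is closer to the reference; identical images give infinity.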

According to an embodiment of the present invention, the artificial intelligence model may include a generator network that generates the output image, and a discriminator network that distinguishes between a fake image generated by the generator network and a real image.

According to an embodiment of the present invention, the generator network and the discriminator network are trained to minimize a discriminator loss and a generator loss. The discriminator loss may be defined based on the Wasserstein distance, and the generator loss may be defined based on an adversarial loss and a similarity loss.

According to an embodiment of the present invention, the adversarial loss may be determined based on patches extracted from the fake image generated by the generator network, and the similarity loss may be determined based on the whole of the fake image generated by the generator network.
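The loss structure described above can be sketched as follows: a Wasserstein-style discriminator loss, and a generator loss that adds a patch-level adversarial term to a whole-image similarity term. Everything specific here is an assumption made for illustration: the function names, the use of mean squared error as the similarity measure, the `similarity_weight` parameter, and the omission of any Lipschitz constraint (weight clipping or gradient penalty) that a Wasserstein GAN would normally need.

```python
def mean(values):
    """Arithmetic mean of any iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def critic_loss(real_scores, fake_scores):
    """Wasserstein-style discriminator loss.

    Minimizing it pushes scores for real images up and scores for
    fake images down, widening the estimated Wasserstein distance.
    """
    return mean(fake_scores) - mean(real_scores)


def generator_loss(fake_patch_scores, fake_image, target_image,
                   similarity_weight=1.0):
    """Adversarial term on patches plus similarity term on the whole image.

    fake_patch_scores: discriminator scores for patches of the fake image.
    fake_image, target_image: full 2D images as lists of rows.
    similarity_weight: hypothetical balance between the two terms.
    """
    adversarial = -mean(fake_patch_scores)
    similarity = mean(
        (fake_pixel - target_pixel) ** 2
        for fake_row, target_row in zip(fake_image, target_image)
        for fake_pixel, target_pixel in zip(fake_row, target_row)
    )
    return adversarial + similarity_weight * similarity
```

Scoring patches keeps the adversarial signal local, while the similarity term compares the complete generated image with its target.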

According to an embodiment of the present invention, the input image may include a low-dose computed tomography (CT) image, and the output image may include a virtual standard-dose or high-dose CT image.

According to an embodiment of the present invention, an apparatus for adjusting the quality of a medical image includes a storage unit that stores a set of instructions for operating the apparatus, and at least one processor connected to the storage unit. The at least one processor may perform control to obtain an input image as an input to an artificial intelligence model, extract feature maps from the input image, generate compressed and re-calibrated feature maps by performing a channel-wise squeeze operation and a spatial excitation operation on the feature maps, and generate a quality-adjusted output image based on the compressed and re-calibrated feature maps.

According to an embodiment of the present invention, a program stored in a computer-readable medium may execute the above-described method when run by a processor.

The features briefly summarized above are merely exemplary aspects of the detailed description that follows and do not limit the scope of the present invention.

According to the present invention, medical images for training artificial intelligence models can be generated more effectively.

The effects obtainable from the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those of ordinary skill in the art from the description below.

Figure 1 shows a system according to an embodiment of the present invention.
Figure 2 shows the structure of a device for predicting the possibility of disease occurrence according to an embodiment of the present invention.
Figure 3 shows an example of a perceptron constituting an artificial intelligence model applicable to the present invention.
Figure 4 shows an example of an artificial neural network constituting an artificial intelligence model applicable to the present invention.
Figure 5 shows an example of a generative adversarial network (GAN) applicable to the present invention.
Figure 6 shows an example of the structure of an artificial intelligence model according to an embodiment of the present invention.
Figure 7 shows an example of a procedure for adjusting the quality of medical images according to an embodiment of the present invention.
Figure 8 shows an example of a procedure for training and operating an artificial intelligence model according to an embodiment of the present invention.
Figure 9 shows an example of loss values for training an artificial intelligence model according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art can readily practice them. However, the present invention may be implemented in many different forms and is not limited to the embodiments described herein.

In describing the embodiments, detailed descriptions of well-known configurations or functions are omitted where they could obscure the gist of the present invention. In the drawings, parts unrelated to the description are omitted, and similar parts are given similar reference numerals.

The present invention concerns adjusting the quality of medical images. Specifically, it relates to a technique for adjusting the quality of a given medical image using an artificial intelligence model based on a generative adversarial network (GAN), and various embodiments of the structure and operation of the artificial intelligence model are described.

FIG. 1 shows a system according to an embodiment of the present invention.

Referring to FIG. 1, the system includes a service server 110, a data server 120, and at least one client device 130.

The service server 110 provides services based on an artificial intelligence model. That is, the service server 110 performs training and prediction operations using the artificial intelligence model. The service server 110 may communicate with the data server 120 or the at least one client device 130 through a network. For example, the service server 110 may receive training data for the artificial intelligence model from the data server 120 and perform training. The service server 110 may receive data required for training and prediction operations from the at least one client device 130, and may transmit information about a prediction result to the at least one client device 130.

The data server 120 provides training data for the artificial intelligence model stored in the service server 110. According to various embodiments, the data server 120 may provide public data that anyone can access or data that requires permission. If necessary, the training data may be preprocessed by the data server 120 or by the service server 110. According to another embodiment, the data server 120 may be omitted. In this case, the service server 110 may use an externally trained artificial intelligence model, or training data may be provided to the service server 110 offline.

The at least one client device 130 exchanges data related to the artificial intelligence model operated by the service server 110 with the service server 110. The at least one client device 130 is equipment used by a user; it transmits information entered by the user to the service server 110, and stores information received from the service server 110 or provides it to the user (e.g., by displaying it). In some cases, a prediction operation may be performed based on data transmitted from one client, and information about the prediction result may be provided to another client. The at least one client device 130 may be any of various types of computing device, such as a desktop computer, laptop computer, smartphone, tablet, or wearable device.

Although not shown in FIG. 1, the system may further include a management device for managing the service server 110. The management device is used by the entity that manages the service, and monitors the status of the service server 110 or controls its settings. The management device may connect to the service server 110 through a network or directly through a cable. Under the control of the management device, the service server 110 may set its operating parameters.

As described with reference to FIG. 1, the service server 110, the data server 120, the at least one client device 130, the management device, and the like may be connected through a network and interact with one another. The network may include at least one of a wired network and a wireless network, and may consist of any one or a combination of two or more of a cellular network, a local area network, and a wide area network. For example, the network may be implemented based on at least one of LAN (local area network), WLAN (wireless LAN), Bluetooth, LTE (long term evolution), LTE-A (LTE-advanced), and 5G (5th generation).

FIG. 2 shows the structure of a device for predicting the possibility of disease occurrence according to an embodiment of the present invention. The structure illustrated in FIG. 2 may be understood as the structure of the service server 110, the data server 120, and the at least one client device 130 of FIG. 1.

Referring to FIG. 2, the device includes a communication unit 210, a storage unit 220, and a control unit 230.

The communication unit 210 connects to a network and communicates with other devices. The communication unit 210 may support at least one of wired and wireless communication. For communication, the communication unit 210 may include at least one of a radio frequency (RF) processing circuit and a digital data processing circuit. In some cases, the communication unit 210 may be understood as a component that includes a terminal for connecting a cable. Because the communication unit 210 transmits and receives data and signals, it may be referred to as a 'transceiver'.

The storage unit 220 stores data, programs, microcode, instruction sets, applications, and the like required for the operation of the device. The storage unit 220 may be implemented as a transitory or non-transitory storage medium, and may be fixed to the device or detachable. For example, the storage unit 220 may be implemented as at least one of a NAND flash memory, such as a compact flash (CF) card, a secure digital (SD) card, a memory stick, a solid-state drive (SSD), or a micro SD card, and a magnetic computer storage device, such as a hard disk drive (HDD).

The control unit 230 controls the overall operation of the device. To this end, the control unit 230 may include at least one processor, at least one microprocessor, and the like. The control unit 230 may execute programs stored in the storage unit 220 and access the network through the communication unit 210. In particular, the control unit 230 may perform algorithms according to the various embodiments described below and control the device to operate according to those embodiments.

Based on the structure described with reference to FIGS. 1 and 2, services based on artificial intelligence algorithms may be provided according to various embodiments of the present invention. An artificial intelligence model consisting of an artificial neural network may be used to implement the artificial intelligence algorithm. The concepts of the perceptron, which is the building block of an artificial neural network, and of the artificial neural network itself are as follows.

A perceptron models a biological neuron and takes multiple signals as input to produce a single output signal. FIG. 3 shows an example of a perceptron constituting an artificial intelligence model applicable to the present invention. Referring to FIG. 3, the perceptron multiplies each of the input values (e.g., x₁, x₂, x₃, …, x_n) by the corresponding weight 302-1 to 302-n (e.g., w_1j, w_2j, w_3j, …, w_nj), and then sums the weighted input values using a transfer function 304. A bias value (e.g., b_k) may be added during the summation. The perceptron generates an output value (e.g., o_j) by applying an activation function 406 to the net input value (e.g., net_j), which is the output of the transfer function 304. In some cases, the activation function 406 may operate based on a threshold (e.g., θ_j). The activation function may be defined in various ways. Although the present invention is not limited thereto, a step function, sigmoid, ReLU, tanh, or the like may be used as the activation function.
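The perceptron computation described above (weighted sum, optional bias, then activation) can be sketched in a few lines. The function names and the default choice of a sigmoid activation are illustrative assumptions, not part of the source.

```python
import math


def step(net, threshold=0.0):
    """Step activation: fires when the net input reaches the threshold."""
    return 1.0 if net >= threshold else 0.0


def sigmoid(net):
    """Sigmoid activation mapping the net input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-net))


def perceptron(inputs, weights, bias=0.0, activation=sigmoid):
    """Weighted sum of the inputs plus bias, passed through an activation."""
    # Transfer function: sum of weighted inputs, with the bias added.
    net = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Activation function turns the net input into the output value.
    return activation(net)
```

For example, with weights of 1.0 on two binary inputs, a bias of -1.5, and the step activation, the unit outputs 1.0 only when both inputs are 1, which is a logical AND.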

An artificial neural network can be designed by arranging perceptrons such as the one in FIG. 3 into layers. FIG. 4 shows an example of an artificial neural network constituting an artificial intelligence model applicable to the present invention. In FIG. 4, each node drawn as a circle may be understood as the perceptron of FIG. 3. Referring to FIG. 4, the artificial neural network includes an input layer 402, a plurality of hidden layers 404a and 404b, and an output layer 406.

When prediction is performed, input data provided to each node of the input layer 402 is forward-propagated to the output layer 406 through the weighting, transfer function, and activation function operations of the perceptrons that make up the input layer 402 and the hidden layers 404a and 404b. Conversely, when training is performed, the error is calculated through backward propagation from the output layer 406 toward the input layer 402, and the weight values defined in each perceptron may be updated according to the calculated error.
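To make forward propagation and the backward weight update concrete, the following minimal sketch uses a network with one sigmoid hidden layer and a single linear output, trained on a squared-error loss. The layer sizes, the loss, the learning rate `lr`, and all function names are assumptions chosen for brevity; the network in FIG. 4 has more layers.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def forward(x, w_hidden, w_out):
    """Forward pass: input -> sigmoid hidden layer -> linear output."""
    hidden = [sigmoid(sum(xi * wi for xi, wi in zip(x, row)))
              for row in w_hidden]
    output = sum(h * w for h, w in zip(hidden, w_out))
    return hidden, output


def train_step(x, target, w_hidden, w_out, lr=0.1):
    """One backpropagation update for the loss 0.5 * (output - target)^2."""
    hidden, output = forward(x, w_hidden, w_out)
    error = output - target  # derivative of the loss w.r.t. the output

    # Gradients flow backward: output weights first, then hidden weights.
    grad_out = [error * h for h in hidden]
    grad_hidden = [
        [error * w_out[j] * hidden[j] * (1.0 - hidden[j]) * xi for xi in x]
        for j in range(len(hidden))
    ]

    new_w_out = [w - lr * g for w, g in zip(w_out, grad_out)]
    new_w_hidden = [
        [w - lr * g for w, g in zip(row, grad_row)]
        for row, grad_row in zip(w_hidden, grad_hidden)
    ]
    return new_w_hidden, new_w_out, 0.5 * error ** 2
```

Calling `train_step` repeatedly on the same example should lower the returned loss as the weights move against the gradient.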

A generative adversarial network (GAN) is an artificial intelligence network designed for adversarial learning, in which a generator and a discriminator learn in competition and both improve. That is, the generator tries to lower the discriminator's probability of classifying correctly, while the discriminator tries to raise it, so that each drives the other to improve. The structure of a GAN is shown in FIG. 5.

FIG. 5 shows an example of a GAN applicable to the present invention, illustrating the structure of a GAN that classifies samples (e.g., images). Referring to FIG. 5, the GAN includes a generator 510 and a discriminator 520. The generator 510 and the discriminator 520 each include an artificial neural network; the generator 510 generates fake samples, and the discriminator 520 distinguishes the fake samples from real samples. Specifically, the generator 510 generates a synthetic sample based on a random input, which may include, for example, a latent vector. The discriminator 520 is trained in advance using real samples and is then trained further to distinguish real samples from synthetic samples.

To this end, the discriminator 520 uses real samples and synthetic samples as training data and outputs at least one loss value, for example, a discriminator loss and a generator loss. The discriminator loss carries information about cases in which a real sample was misclassified as synthetic or a synthetic sample was misclassified as real, and acts as a penalty on the discriminator 520. When backpropagation is performed using the discriminator loss, the weights of the artificial neural network in the discriminator 520 are updated. The generator 510 is trained based on feedback (e.g., the generator loss) from the discriminator 520. The generator loss carries information about cases in which the discriminator 520 classified without error, and acts as a penalty on the generator 510. When backpropagation is performed using the generator loss, the weights of the artificial neural network in the generator 510 are updated. Through this training, the generator 510 can improve so that it generates synthetic samples that are more similar to real samples.
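One common way to realize the two penalties described above is with binary cross-entropy on the discriminator's "real" probabilities. The sketch below is written under that assumption; the source does not state which loss formula FIG. 5 uses, and the function names and the non-saturating form of the generator loss are illustrative choices.

```python
import math


def discriminator_loss(real_probs, fake_probs):
    """Penalize the discriminator for misclassifying either kind of sample.

    real_probs: predicted probability of "real" for real samples.
    fake_probs: predicted probability of "real" for synthetic samples.
    """
    real_term = -sum(math.log(p) for p in real_probs) / len(real_probs)
    fake_term = -sum(math.log(1.0 - p) for p in fake_probs) / len(fake_probs)
    return real_term + fake_term


def generator_loss(fake_probs):
    """Penalize the generator when synthetic samples are correctly rejected.

    Non-saturating form: the loss is large when the discriminator assigns
    a low "real" probability to synthetic samples.
    """
    return -sum(math.log(p) for p in fake_probs) / len(fake_probs)
```

The two losses pull in opposite directions on `fake_probs`, which is the competition the text describes: the discriminator wants those probabilities low and the generator wants them high.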

The GAN structure described above may be applied to an artificial intelligence model for adjusting medical images according to various embodiments. Hereinafter, embodiments of quality adjustment for computed tomography (CT) images are described as an example of medical images. However, the various embodiments of the present invention may be applied not only to CT images but also to other types of medical images, such as X-ray, ultrasound, magnetic resonance imaging (MRI), positron emission tomography (PET), and single photon emission computed tomography (SPECT) images.

CT is one of the reliable imaging examinations for diagnosing disease. The quality of a CT image depends on scan factors such as voltage and current, and the X-ray dose is directly related to image quality. A high dose yields high-quality images but can cause DNA damage and cell abnormalities that may lead to disease or tumors. A low dose reduces the radiation delivered to the patient but can degrade image quality through noise or artifacts. A reconstruction algorithm can improve the quality of CT images acquired with low-dose parameters. Recently, deep learning has shown promising performance in computer vision and medical image processing. Generative adversarial networks (GANs), an unsupervised learning approach, show excellent performance in CT denoising research. For example, the Wasserstein GAN (WGAN) is an improved GAN framework based on the Wasserstein distance and shows higher training stability than conventional GANs. However, the clinical adoption of GAN-based post-processing algorithms still has limitations.

To solve the problem described above, the present invention proposes a new CT image denoising model that includes uncertainty estimation. The artificial intelligence model according to various embodiments improves performance through a residual dense block and skip connections. In addition, the model uses an attention module so that it passes on the data that is actually important rather than simply forwarding information. Furthermore, to make the denoising performance more stable, the model measures model uncertainty using a Bayesian approximation.

The relationship between low-dose CT (LDCT) and the corresponding standard-dose CT (SDCT) can be expressed as [Equation 1] below.

In [Equation 1], y denotes the standard-dose CT, x denotes the low-dose CT, and T() denotes the transformation function.

In general, the noise distribution of a CT image can be modeled as a complex combination of Poisson quantum noise and Gaussian electronic noise. Because the relationship T between the noises cannot be defined exactly, conventional denoising techniques have not achieved good performance in CT denoising. The network is instead trained to generate images by solving the inverse problem. The inverse-transformation function can be expressed as [Equation 2] below.

In [Equation 2], T^-1() is the inverse of the transformation function, y is the standard-dose CT, ŷ is the image generated from the low-dose CT by the inverse-transformation function, and x is the low-dose CT. The network efficiently learns high-dimensional features through nonlinear functions. As a result, a denoised image close to the standard-dose CT y can be generated.
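Under the definitions given for [Equation 1] and [Equation 2], one plausible reading of the two relations is sketched here. The direction of T (mapping the standard-dose CT to the low-dose CT) and the symbol \hat{y} for the generated image are assumptions made for illustration, not the exact notation of the equations.

```latex
\begin{align}
x &= T(y) && \text{(assumed form of Equation 1)} \\
\hat{y} &= T^{-1}(x) \approx y && \text{(assumed form of Equation 2)}
\end{align}
```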

If a GAN-based artificial intelligence model is designed with a fully convolutional structure, the kernels in the model can be trained using images of arbitrary size. In that case, the convolution kernels in the generator network could be trained using randomly sampled local patches. However, the convolution kernels of the artificial intelligence model according to embodiments of the present invention may be trained using images of the original size (e.g., 512×512) so that an output image of the original size is generated. In this way, the perceptual difference between the standard-dose CT and the fake standard-dose CT is measured at the original dimensions rather than at the dimensions of locally sampled patches. The discriminator network, meanwhile, may be trained using randomly sampled local patches so that it can distinguish the local textures of the standard-dose CT and the fake standard-dose CT. A specific example of the structure of the artificial intelligence model according to various embodiments is shown in FIG. 6.

FIG. 6 shows an example of the structure of an artificial intelligence model according to an embodiment of the present invention. The description below with reference to FIG. 6 gives specific sizes for the blocks, modules, and layers constituting the artificial intelligence model, but these sizes may be modified depending on the specific embodiment.

The generator network 610 takes a low-dose CT of the original size (e.g., 512×512) as input and produces a fake standard-dose CT of the same size. The convolution layer 612 encodes the input low-dose CT images into 32 feature maps through 3×3 convolution kernels with leaky ReLU activation. The encoded feature maps are provided to a dense residual block with dropout (DRBD). The DRBD block includes three consecutive convolution layers 613-1 to 613-3 that are connected by residual operations. The DRBD block also includes an additional convolution layer 613-4, which processes the results of the three convolution layers 613-1 to 613-3, and a squeeze and spatial excitation (sSE) module 614.

Specifically, the output of the convolution layer 613-1 and the output of the convolution layer 612 are summed by the summer 617a and then input to the convolution layer 613-2. The outputs of the convolution layers 613-1, 613-2, and 612 are summed by the summer 617b and then input to the convolution layer 613-3. The outputs of the convolution layers 613-1, 613-2, 613-3, and 612 are summed by the summer 617c and then input to the convolution layer 613-4. Each of the convolution layers 613-1 to 613-4 may have leaky ReLU activation. For example, the leaky ReLU activation function may have a negative slope coefficient of 0.3.
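The wiring of the summers 617a to 617c can be illustrated with a minimal NumPy sketch. It assumes that layer 613-1 takes the output of layer 612 directly and that all layers share one channel count so the sums are well defined; the function names and the kernel layout are hypothetical and chosen only for illustration.

```python
import numpy as np

def leaky_relu(x, negative_slope=0.3):
    """Leaky ReLU with the 0.3 negative-slope coefficient given in the text."""
    return np.where(x >= 0, x, negative_slope * x)

def conv3x3_same(x, kernel):
    """3x3 convolution with zero padding.

    x is (H, W, Cin); kernel is (3, 3, Cin, Cout) (assumed layout).
    """
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((h, w, kernel.shape[-1]))
    for dy in range(3):
        for dx in range(3):
            # Each kernel tap contributes a shifted copy of the input.
            out += padded[dy:dy + h, dx:dx + w, :] @ kernel[dy, dx]
    return out

def dense_residual_stack(f0, kernels):
    """Feed each layer the sum of the layer-612 output and all earlier outputs.

    f0 stands for the output of layer 612; kernels holds hypothetical
    weights for layers 613-1 to 613-4.
    """
    k1, k2, k3, k4 = kernels
    f1 = leaky_relu(conv3x3_same(f0, k1))                    # 613-1 (assumed input: f0)
    f2 = leaky_relu(conv3x3_same(f0 + f1, k2))               # summer 617a -> 613-2
    f3 = leaky_relu(conv3x3_same(f0 + f1 + f2, k3))          # summer 617b -> 613-3
    return leaky_relu(conv3x3_same(f0 + f1 + f2 + f3, k4))   # summer 617c -> 613-4
```

Because every layer sees the sum of all earlier outputs, low-level features from layer 612 remain available at each stage of the block.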

The squeeze and spatial excitation (sSE) module 614 can enhance the use of the feature maps by compressing and re-calibrating them. Specifically, the sSE module 614 performs a squeeze operation that summarizes the overall information of the feature maps and an excitation operation that scales the importance of each feature map. The sSE module 614 performs the squeeze operation along the channel axis and the excitation operation spatially, which makes effective use of per-pixel spatial information. That is, the sSE module 614 extracts a single feature map that spans all channels and multiplies it with the feature map of each channel. The feature maps input to the sSE module 614 have a three-dimensional H×W×C structure, the extracted feature map has a two-dimensional H×W×1 structure, and the output of the sSE module 614 again has a three-dimensional H×W×C structure, where C is the number of channels. To obtain the squeezed information, the sSE module 614 processes its input in a 1×1 convolution layer with sigmoid activation, and it obtains the spatially re-calibrated three-dimensional feature maps by multiplying the squeezed information with the original input.
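The squeeze and excitation steps can be sketched in a few lines of NumPy. The sketch treats the 1×1 convolution as a per-pixel dot product over channels; the function names and the `weight` and `bias` parameters are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sse(features, weight, bias=0.0):
    """Squeeze along channels, then excite spatially.

    features: (H, W, C) feature maps
    weight:   (C,) weights of a 1x1 convolution with one output channel
    """
    # Squeeze: a 1x1 convolution is a per-pixel dot product over channels.
    attention = sigmoid(features @ weight + bias)   # (H, W)
    attention = attention[..., np.newaxis]          # (H, W, 1)
    # Excite: the single map rescales every channel at each pixel.
    return features * attention                     # (H, W, C)
```

The H×W×1 map takes values between 0 and 1, so each pixel position is attenuated or preserved across all C channels at once.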

In FIG. 6, the output of the sSE module 614 and the output of the convolution layer 613-4 are shown as being multiplied by the multiplier 618, whereas the operation of the sSE module 614 described above already includes that multiplication. Accordingly, the multiplier 618 of FIG. 6 may be understood as being part of the sSE module 614, or the multiplication may be understood as being excluded from the operation of the sSE module 614 described above.

The dropout layer 615 performs a dropout operation at a set rate. The input to the dropout layer 615 is the sum, produced by the summer 617d, of the output of the sSE module 614 (e.g., the output of the multiplier 618) and the output of the convolution layer 612. By excluding particular neurons, that is, nodes, from training with a certain probability, the dropout layer 615 induces balanced training among the nodes of the artificial intelligence model and helps prevent overfitting. In other words, during training the dropout layer 615 excludes at least one node and performs forward and backward propagation using the remaining nodes. The pattern of nodes excluded by the dropout operation (hereinafter, the 'dropout pattern') may be changed from epoch to epoch. Excluding a node may also be described as turning the node off. According to an embodiment, the exclusion probability applied to each node may be defined as a fixed value or adjusted adaptively according to the training situation. For example, the exclusion probability may be set to 0.5. Determining the exclusion probability, controlling the dropout pattern, and similar operations may be performed by a separate control module (e.g., the control unit 230 of FIG. 2).
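A dropout operation with an exclusion probability of 0.5 can be sketched as follows. Rescaling the surviving activations by 1/(1 - rate), often called inverted dropout, is an assumption of this sketch and is not stated in the description; the function name and return values are likewise illustrative.

```python
import numpy as np

def dropout(features, rate=0.5, rng=None):
    """Turn each node off with probability `rate` and rescale the survivors.

    Returns the masked features and the boolean keep mask, which plays the
    role of one randomly drawn dropout pattern.
    """
    rng = np.random.default_rng() if rng is None else rng
    keep_mask = rng.random(features.shape) >= rate   # True where the node stays on
    return features * keep_mask / (1.0 - rate), keep_mask
```

Calling the function again draws a new mask, which corresponds to changing the dropout pattern.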

The convolution layers other than the last convolution layer 616, that is, the convolution layers 612 and 613-1 to 613-4, include 3×3 convolution kernels. The weights and biases of the convolution kernels may be initialized by He normal initialization. The convolution kernel of the last convolution layer 616 may have a size of 9×9 to enable fast learning. The last convolution layer 616 converts the feature maps output from the dropout layer 615 into a form with the desired number of channels. For example, the feature maps output from the DRBD block have the shape [batch size, 512, 512, 16], and the last convolution layer 616 converts them into the final output shape [batch size, 512, 512, 1]. In other words, the final output is a single-channel image, which the last convolution layer 616 generates by rearranging and compressing the preceding feature maps. To preserve the spatial dimensions of the feature maps, zero padding may be applied to all of the convolution layers 612, 613-1 to 613-4, and 616.
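The amount of zero padding needed to keep a 512×512 feature map at the same size follows from the standard convolution size formula. The short sketch below, with hypothetical helper names, applies it to the 3×3 and 9×9 kernels at stride 1.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1

def same_padding(kernel):
    """Zero padding per side that preserves the size for an odd kernel at stride 1."""
    return (kernel - 1) // 2

for kernel in (3, 9):
    pad = same_padding(kernel)
    print(kernel, pad, conv_output_size(512, kernel, padding=pad))
```

A 3×3 kernel therefore needs one pixel of padding per side and a 9×9 kernel needs four, and in both cases the 512×512 size is preserved.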

The discriminator network 620 is trained to distinguish local patches from the standard-dose CT and the fake standard-dose CT. For example, local patches of a given size (e.g., 64×64) may be randomly sampled from the standard-dose CT a set number of times (e.g., 5 times), and local patches may be sampled at the corresponding positions of the fake standard-dose CT. The discriminator network 620 includes four convolution blocks 622-1 to 622-4 and one convolution layer 623. In each of the convolution blocks 622-1 to 622-4, the first layer includes 3×3 convolution kernels with a stride of 1, and the second layer includes 3×3 convolution kernels with a stride of 2. Here, the stride is the step, in pixels, by which the kernel moves across the image. The two convolution layers, that is, the first layer and the second layer, do not use zero padding and are activated using ReLU. The number of convolution feature maps doubles each time the data passes to the next convolution block. The last convolution layer 623 includes a 4×4 convolution kernel with a stride of 1. The output of the convolution layer 623 may include a single probability value.
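Sampling patches at corresponding positions of the two images can be sketched as follows. The defaults mirror the example values in the text (64×64 patches, 5 samples); the function name and signature are hypothetical.

```python
import numpy as np

def sample_paired_patches(real, fake, patch_size=64, num_patches=5, rng=None):
    """Crop patches at the same random positions from two equally sized images."""
    if real.shape != fake.shape:
        raise ValueError("real and fake images must have the same shape")
    rng = np.random.default_rng() if rng is None else rng
    height, width = real.shape[:2]
    real_patches, fake_patches = [], []
    for _ in range(num_patches):
        # The upper bound of integers() is exclusive, so the patch always fits.
        top = rng.integers(0, height - patch_size + 1)
        left = rng.integers(0, width - patch_size + 1)
        window = (slice(top, top + patch_size), slice(left, left + patch_size))
        real_patches.append(real[window])
        fake_patches.append(fake[window])
    return np.stack(real_patches), np.stack(fake_patches)
```

Using one set of coordinates for both images ensures that the discriminator compares the same anatomical region in the real and fake patches.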

FIG. 7 shows an example of a procedure for adjusting the quality of a medical image according to an embodiment of the present invention. FIG. 7 illustrates a method of operating a device with computing capability (e.g., the service server 110 of FIG. 1).

Referring to FIG. 7, in step S701, the device acquires an input image. The input image is an image of relatively low quality. For example, the input image may be a low-dose CT image. The input image may be provided in real time from medical imaging equipment as it is captured, or it may be provided from a database that stores medical images.

In step S703, the device extracts feature maps from the input image. To extract the feature maps, the device may use at least one convolution layer included in the artificial intelligence model. The device may extract feature maps corresponding to a plurality of channels, and each feature map may have a two-dimensional structure.

In step S705, the device performs a squeeze operation along the channels and a spatial excitation operation. That is, the device extracts key information from the extracted feature maps and re-calibrates it spatially. In other words, the device summarizes the information of the extracted feature maps and scales the summarized information. Specifically, the device may generate a summarizing filter corresponding to all of the feature maps and multiply the feature maps by the filter. Here, the filter may be understood as one more feature map. In this way, the device can generate compressed and re-calibrated feature maps.

In step S707, the device generates an output image with adjusted quality. The device may generate the image with adjusted quality based on the summarized and scaled feature maps, that is, the compressed and re-calibrated feature maps. Adjusted quality may include enhanced or improved quality. Here, improved quality means higher quality than that of the image acquired in step S701. For example, an image with improved quality may be sharper than the input image, contain more feature information, or have a higher resolution.

Through the procedure described with reference to FIG. 7, the quality of the input image can be improved. Although not shown in FIG. 7, according to an embodiment, the device may apply a dropout pattern. When dropout is performed, nodes are removed in each epoch according to a preset probability, so that weights can be updated without relying on other nodes. With dropout, a prediction operation may be performed with at least one node of the artificial intelligence model excluded. The dropout pattern may be defined randomly, in various ways, based on a probability. The procedure for applying dropout patterns is shown in FIG. 8.

FIG. 8 shows an example of a procedure for training and operating an artificial intelligence model according to an embodiment of the present invention. FIG. 8 illustrates a method of operating a device with computing capability (e.g., the service server 110 of FIG. 1).

Referring to FIG. 8, in step S801, the device performs training while changing the dropout pattern for each epoch. That is, during training, the device randomly determines and applies dropout patterns, thereby excluding at least one node, and performs forward and backward propagation using the remaining nodes. The device calculates the generator loss and the discriminator loss based on the result of forward propagation and performs backward propagation by updating the weights of the generator network and the discriminator network so as to minimize the two losses. In this case, only the weights related to the nodes that remain after the exclusion are updated. The dropout pattern is changed on a per-epoch basis. Because nodes are removed in each epoch according to a preset probability, weights can be updated without relying on other nodes.

In step S803, the device performs a performance evaluation for each epoch. The device may evaluate the performance of the artificial intelligence model for each epoch using training data selected for performance evaluation. The performance evaluation may be performed in various ways. According to an embodiment, the performance evaluation may be based on radiomics. Specifically, the device extracts patches from the fake image generated by the generator and from the image corresponding to the label, and it generates radiomics features for each of the extracted patches. For example, the radiomics features may include a gray-level co-occurrence matrix (GLCM), a gray-level run length matrix (GLRLM), a gray-level size zone matrix (GLSZM), a gray-level dependence matrix (GLDM), a neighboring gray tone difference matrix (NGTDM), or information derived from them. The device may determine a comparison metric (e.g., the concordance correlation coefficient (CCC)) based on the generated radiomics features and evaluate performance based on that metric. For example, if the comparison metric is greater than a threshold, the model may be evaluated as having sufficient performance. That is, the comparison metric itself may be interpreted as the performance.
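As an illustration of the comparison metric, the concordance correlation coefficient can be computed with a few lines of NumPy. The sketch assumes that `a` and `b` hold the values of one radiomics feature measured on corresponding patches of the fake image and the label image; that pairing and the function name are assumptions.

```python
import numpy as np

def concordance_correlation(a, b):
    """Lin's concordance correlation coefficient between two 1-D arrays."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    covariance = np.mean((a - a.mean()) * (b - b.mean()))
    # Population variances (ddof=0) match the population covariance above.
    return 2.0 * covariance / (a.var() + b.var() + (a.mean() - b.mean()) ** 2)
```

Unlike the Pearson correlation, the coefficient is reduced by any offset between the two sets of values, so it reaches 1 only when the feature values agree exactly.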

In step S805, the device operates the artificial intelligence model by applying a plurality of dropout patterns. That is, the device uses an artificial intelligence model whose nodes have weights updated independently over a plurality of epochs, generates a plurality of prediction results by applying a plurality of dropout patterns determined based on a probability, and then combines the prediction results into a single prediction result. According to an embodiment, the device may combine the prediction results by averaging them. According to another embodiment, the device may combine the prediction results by computing their weighted average.

An embodiment related to dropout patterns has been described with reference to FIG. 8. Here, a dropout pattern may be a combination of a plurality of dropout results. For example, a dropout operation with a probability of 0.5 excludes some of the nodes in the artificial intelligence model, which selects a subset A of the nodes. When the dropout operation is performed again, a different part of the nodes is excluded, which selects a subset B of the nodes. The subset of nodes selected may differ for each dropout operation, and one dropout pattern may be defined by grouping a plurality of such subsets. In this case, applying one dropout pattern may be understood as averaging the images produced by a plurality of dropout operations that yield different subsets of nodes. As a specific example, one dropout pattern may include 20 node subsets, each obtained from one dropout operation.
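Combining predictions from several dropout operations can be sketched as repeated stochastic forward passes followed by a plain or weighted average. In this sketch, `stochastic_model` is a hypothetical callable that draws a new dropout mask on every call, and reporting the per-pixel standard deviation as an uncertainty estimate is an assumption about how the Bayesian approximation might be read out.

```python
import numpy as np

def mc_dropout_predict(stochastic_model, image, num_passes=20, weights=None):
    """Average several stochastic forward passes and report their spread.

    stochastic_model: callable that applies a fresh dropout mask per call
    weights: optional per-pass weights for a weighted average
    """
    predictions = np.stack([stochastic_model(image) for _ in range(num_passes)])
    mean = np.average(predictions, axis=0, weights=weights)
    spread = predictions.std(axis=0)   # per-pixel disagreement across passes
    return mean, spread
```

Pixels where the passes disagree strongly receive a large spread, which can flag regions where the denoised output is less reliable.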

In the training process of the artificial intelligence model described above, the objective functions may be defined as minimizing the discriminator loss and the generator loss. The discriminator loss and the generator loss may be defined as follows.

First, the discriminator loss may be defined as [Equation 3] below.

In [Equation 3], L_D is the discriminator loss, E[] is the expected value operator, D(x) is the output of the discriminator, G(x) is the output of the generator, x̂ is a 2D Gaussian point randomly sampled on the straight line between the input image and the output of the generator, ∇_x̂ is the gradient with respect to x̂, and D(x̂) is the result determined by the discriminator for x̂.

Here, the first two terms are the discriminator losses based on the Wasserstein distance, and the last term is the gradient penalty.
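A discriminator loss with two Wasserstein terms and a gradient penalty is commonly written in the following form. This is a sketch of the standard WGAN gradient-penalty formulation, not a verbatim copy of [Equation 3]; the penalty weight \lambda and the use of y for the real standard-dose image and x for the low-dose input are assumptions, and \hat{x} denotes the point sampled between the real image and the generator output.

```latex
L_D = -\mathbb{E}\left[ D(y) \right]
      + \mathbb{E}\left[ D(G(x)) \right]
      + \lambda \, \mathbb{E}\left[ \left( \left\lVert \nabla_{\hat{x}} D(\hat{x}) \right\rVert_2 - 1 \right)^2 \right]
```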

The adversarial loss for the generator may be defined as [Equation 4] below.

In [Equation 4], L_adv is the adversarial loss, E[] is the expected value operator, D(x) is the output of the discriminator, and G(x) is the output of the generator.
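In a Wasserstein GAN, the adversarial loss of the generator is usually the negated discriminator score of the generated image. The following is that standard form, given as an assumed reading of [Equation 4] with x denoting the low-dose input.

```latex
L_{adv} = -\mathbb{E}\left[ D(G(x)) \right]
```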

According to an embodiment, multi-scale structural similarity (MS-SSIM), which allows flexible analysis, may be used as the similarity loss. MS-SSIM may be defined as [Equation 5] below.

In [Equation 5], SSIM(x,y) is the structural similarity between image x and image y, C₁ and C₂ are constants, μ_x is the mean of image x, μ_y is the mean of image y, σ_x is the standard deviation of image x, σ_y is the standard deviation of image y, σ_xy is the cross-covariance of image x and image y, x_j is the local image content of image x at the j-th level, y_j is the local image content of image y at the j-th level, and M is the number of scale levels.
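With these symbols, the structural similarity takes its standard form, and a multi-scale variant can be written as a product over the M levels. The SSIM expression below is the standard definition; the product form of MS-SSIM is an assumed reading that is consistent with the symbols x_j, y_j, and M, and other MS-SSIM variants weight the scales differently.

```latex
\mathrm{SSIM}(x, y) =
  \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}
       {(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)},
\qquad
\mathrm{MS\text{-}SSIM}(x, y) = \prod_{j=1}^{M} \mathrm{SSIM}(x_j, y_j)
```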

Based on the MS-SSIM defined above, the similarity loss may be defined as [Equation 6] below.

In [Equation 6], L_similarity is the similarity loss, and MS_SSIM(x,y) is the MS-SSIM of image x and image y.

The generator loss according to an embodiment may be defined based on the adversarial loss and the similarity loss described above. For example, as in [Equation 7] below, the generator loss may be defined as the sum of the adversarial loss and the similarity loss.

In [Equation 7], L_G is the generator loss, L_adv is the adversarial loss, L_similarity is the similarity loss, and α is the weight of the similarity loss. As in [Equation 7], the adversarial loss and the similarity loss may be combined with equal weights or with different weights.
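The two remaining losses can be written compactly as follows. Defining the similarity loss as one minus the MS-SSIM is an assumption, chosen so that the loss falls as similarity rises; the weighted sum follows the description of the generator loss, with α = 1 giving equal weights.

```latex
L_{similarity} = 1 - \mathrm{MS\_SSIM}(x, y),
\qquad
L_G = L_{adv} + \alpha \, L_{similarity}
```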

FIG. 9 shows an example of loss values for training an artificial intelligence model according to an embodiment of the present invention. FIG. 9 shows the relationship between the similarity loss and the adversarial loss. Referring to FIG. 9, a low-quality image 902 is input to the generator 910, and the generator 910 generates a fake high-quality image 904. A real high-quality image 906 corresponding to the label is also provided. The similarity loss (L_similarity) may be determined based on the fake high-quality image 904 and the real high-quality image 906. A plurality of patches 954 and 956 are randomly extracted from the fake high-quality image 904 and the real high-quality image 906, respectively. The extracted patches 954 and 956 are input to the discriminator 920, and the adversarial loss (L_adv) may be determined from the discrimination result of the discriminator 920. That is, the adversarial loss (L_adv) is determined based on the patches 954 extracted from the generated fake high-quality image 904 and the patches 956 of the real high-quality image 906, whereas the similarity loss (L_similarity) is determined based on the whole of the fake high-quality image 904 and the real high-quality image 906.

The exemplary methods of the present invention are expressed as a series of operations for clarity of explanation, but this is not intended to limit the order in which the steps are performed, and the steps may be performed simultaneously or in a different order where necessary. To implement a method according to the present invention, other steps may be added to the illustrated steps, some steps may be omitted while the remaining steps are kept, or some steps may be omitted while other steps are added.

The various embodiments of the present invention do not list all possible combinations but are intended to explain representative aspects of the present invention, and the matters described in the various embodiments may be applied independently or in combinations of two or more.

Additionally, various embodiments of the present invention may be implemented by hardware, firmware, software, or a combination thereof. In the case of a hardware implementation, they may be implemented by one or more ASICs (Application Specific Integrated Circuits), DSPs (Digital Signal Processors), DSPDs (Digital Signal Processing Devices), PLDs (Programmable Logic Devices), FPGAs (Field Programmable Gate Arrays), general-purpose processors, controllers, microcontrollers, microprocessors, and the like.

The scope of the present invention includes software or machine-executable instructions (e.g., operating systems, applications, firmware, programs, etc.) that cause operations according to the methods of the various embodiments to be executed on a device or computer, and a non-transitory computer-readable medium in which such software or instructions are stored and from which they can be executed on a device or computer.

Claims

A method for adjusting the quality of a medical image, the method comprising:
obtaining an input image as an input to an artificial intelligence model;
extracting feature maps from the input image;
generating compressed and re-calibrated feature maps by performing a channel-wise squeeze operation and a spatial excitation operation on the feature maps; and
generating a quality-adjusted output image based on the compressed and re-calibrated feature maps.

The method of claim 1,
further comprising applying at least one dropout pattern to the artificial intelligence model.

The method of claim 2,
wherein the at least one dropout pattern is selected independently from dropout patterns that are applied differently for each epoch during training of the artificial intelligence model.

The method of claim 3,
wherein the performance is evaluated based on at least one of radiomics, peak signal-to-noise ratio (PSNR), and structural similarity index method (SSIM) for the output image generated during the training.

The method of claim 1,
wherein the artificial intelligence model includes a generator network that generates the output image, and a discriminator network that distinguishes between a fake image generated by the generator network and a real image.

The method of claim 5,
wherein the generator network and the discriminator network are trained to minimize a discriminator loss and a generator loss,
the discriminator loss is defined based on the Wasserstein distance, and
the generator loss is defined based on an adversarial loss and a similarity loss.

The method of claim 6,
wherein the adversarial loss is determined based on patches extracted from the fake image generated by the generator network, and
the similarity loss is determined based on the entirety of the fake image generated by the generator network.

The method of claim 1,
wherein the input image includes a low-dose computed tomography (CT) image, and
the output image includes a virtual standard-dose or high-dose CT image.

A device for adjusting the quality of a medical image, the device comprising:
a storage unit that stores a set of instructions for operating the device; and
at least one processor connected to the storage unit,
wherein the at least one processor is configured to control the device to:
obtain an input image as an input to an artificial intelligence model,
extract feature maps from the input image,
generate compressed and re-calibrated feature maps by performing a channel-wise squeeze operation and a spatial excitation operation on the feature maps, and
generate a quality-adjusted output image based on the compressed and re-calibrated feature maps.

A program stored in a computer-readable medium that, when run by a processor, executes the method according to any one of claims 1 to 8.