KR20210152249A

KR20210152249A - Apparatus and method for artificial intelligence based automatic analysis of video fluoroscopic swallowing study

Info

Publication number: KR20210152249A
Application number: KR1020200069108A
Authority: KR
Inventors: 편성범; 이기선; 이은영
Original assignee: 고려대학교 산학협력단
Priority date: 2020-06-08
Filing date: 2020-06-08
Publication date: 2021-12-15
Also published as: KR102418073B1

Abstract

Disclosed are an apparatus, a method, and a recording medium for artificial intelligence based automatic analysis of a video fluoroscopic swallowing study (VFSS), which automate VFSS analysis using artificial intelligence technology. According to an embodiment of the present invention, the apparatus includes: a data input unit configured to receive image frames of a VFSS video acquired of an analysis subject by a VFSS examination device; and a swallowing stage classification unit that uses an artificial intelligence model to classify the image frames of the VFSS video into swallowing stages including an oral stage, a pharyngeal stage, and an esophageal stage.

Description

Apparatus and method for artificial intelligence based automatic analysis of video fluoroscopic swallowing study

The present invention relates to an artificial intelligence-based apparatus and method for automatic analysis of a video fluoroscopic swallowing study (VFSS), which use artificial intelligence technology to automate swallowing stage classification and determination of the dysphagia level in the VFSS.

Dysphagia refers to difficulty swallowing caused by abnormalities of the neuromuscular system or structural abnormalities of the upper esophagus. Dysphagia is common not only in elderly patients but also in stroke patients and patients with head and neck tumors, and its typical complications include aspiration pneumonia, airway obstruction, malnutrition, and dehydration. In most cases, however, the underlying disease is difficult to treat, and the same underlying disease can cause various forms of dysphagia. Analyzing the pattern of dysphagia and treating the dysphagia itself (a rehabilitation medicine approach) is therefore more helpful.

The test currently recognized as the gold standard for diagnosing dysphagia is the video fluoroscopic swallowing study (VFSS). With VFSS, structural abnormalities and movements of the oral cavity, pharynx, and esophagus can be evaluated using, for example, a barium contrast agent, and aspiration into the airway can be confirmed directly.

In conventional VFSS, however, the analysis that follows acquisition of the digital VFSS video is manual: the analyst goes through the video frame by frame to analyze the movement path of the food bolus or to measure the swallowing time of each swallowing stage, and finally classifies the dysphagia. Because this post-examination image analysis is manual work that depends on the analyst's experience and skill, the results are prone to bias, have low reproducibility and objectivity, and can differ from analyst to analyst, so their consistency is low. The analysis is also labor-intensive, demanding much of the analyst's effort and time, so results become available only after a considerable delay.

An object of the present invention is to provide an artificial intelligence-based apparatus, method, and recording medium for automatic analysis of a video fluoroscopic swallowing study (VFSS), which use artificial intelligence technology to automate swallowing stage classification and determination of the dysphagia level in the VFSS.

An artificial intelligence-based apparatus for automatic analysis of VFSS according to an embodiment of the present invention includes: a data input unit configured to receive image frames of a VFSS video acquired of an analysis subject by a VFSS examination device; and a swallowing stage classification unit that uses an artificial intelligence model to classify the image frames of the VFSS video into swallowing stages including an oral stage, a pharyngeal stage, and an esophageal stage.

The apparatus according to an embodiment of the present invention may further include a data learning unit that generates the artificial intelligence model by learning from training data that includes VFSS training images for each swallowing stage and examination data for a plurality of dysphagia patients.

The apparatus according to an embodiment of the present invention may further include a swallowing time calculation unit configured to extract, from the VFSS video, the video segment corresponding to each swallowing stage classified by the artificial intelligence model, and to calculate the swallowing time of each swallowing stage from the corresponding video segment.

The swallowing stage classification unit may be configured to classify dysphagia according to a dysphagia classification criterion using the artificial intelligence model. The dysphagia classification criterion may include the Penetration-Aspiration Scale (PAS), which is defined to classify dysphagia into one of a plurality of penetration-aspiration scale levels.

The artificial intelligence model may be configured to classify the dysphagia of the analysis subject according to the penetration-aspiration scale based on the output values of a plurality of sub-AI models, each trained for one of the plurality of penetration-aspiration scale levels defined in the scale.

The sub-AI model trained for a first penetration-aspiration scale level among the plurality of penetration-aspiration scale levels may include: an image feature extraction unit configured to extract a plurality of pieces of image feature information from each image frame of the VFSS video; and a sub artificial neural network configured to output a probability for the first penetration-aspiration scale level based on the plurality of pieces of image feature information.

The image feature extraction unit may include: one or more convolution processing units that generate a convolution image by convolving each image frame of the VFSS video; and one or more subsampling processing units that subsample the convolution image to extract the plurality of pieces of image feature information.
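The convolution and subsampling steps can be illustrated with a minimal sketch in plain Python. The source does not specify the kernel, the number of layers, or the subsampling method, so the 2x2 edge kernel, the use of max pooling as the subsampling step, and the names `convolve2d` and `max_pool` are all assumptions made for illustration.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(image, size=2):
    """Subsample by keeping the maximum of each non-overlapping block.

    Max pooling is an assumed choice; the source only says "subsampling".
    """
    out_h = len(image) // size
    out_w = len(image[0]) // size
    return [
        [
            max(image[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 5x5 frame with a bright vertical band (e.g. contrast agent).
frame = [[0, 0, 1, 1, 0] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]  # responds to dark-to-bright vertical edges

conv = convolve2d(frame, edge_kernel)  # 4x4 convolution image
features = max_pool(conv, size=2)      # 2x2 feature map after subsampling
```

In a trained network the kernel values would be learned rather than fixed by hand, and several convolution and subsampling stages could be stacked.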

The sub artificial neural network may include: an input layer including input nodes configured to receive the plurality of pieces of image feature information; an output layer including an output node configured to output the probability for the first penetration-aspiration scale level; and a hidden layer including hidden nodes connected between the input nodes and the output node in a fully connected layer structure.
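A forward pass through such a fully connected network might look like the sketch below. The layer sizes, the ReLU hidden activation, the sigmoid output, the weight values, and the function names are assumptions for illustration; the source only states that the structure is fully connected and that the output node yields a probability.

```python
import math


def sigmoid(x):
    """Squash a real value into the (0, 1) range so it reads as a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


def dense(inputs, weights, biases, activation):
    """Fully connected layer: every output node is connected to every input."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def pas_level_probability(features, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> one hidden layer -> single output node (probability)."""
    hidden = dense(features, hidden_w, hidden_b, relu)
    return dense(hidden, [out_w], [out_b], sigmoid)[0]


# Hypothetical weights; a real model would learn these from training data.
p = pas_level_probability(
    features=[1.0, 2.0],
    hidden_w=[[0.5, -0.5], [1.0, 0.0]],
    hidden_b=[0.0, 0.0],
    out_w=[1.0, -1.0],
    out_b=0.0,
)
```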

The apparatus according to an embodiment of the present invention may further include a moving object tracking unit configured to detect, in the VFSS video, moving objects in which motion has occurred, excluding stationary objects, and track their movement paths, and to track the movement path of the food bolus ingested by the analysis subject in the VFSS video.

An artificial intelligence-based method for automatic analysis of VFSS according to an embodiment of the present invention includes: receiving, by a data input unit, image frames of a VFSS video acquired of an analysis subject by a VFSS examination device; and classifying, by a swallowing stage classification unit using an artificial intelligence model, the image frames of the VFSS video into swallowing stages including an oral stage, a pharyngeal stage, and an esophageal stage.

The method according to an embodiment of the present invention may further include generating, by a data learning unit, the artificial intelligence model by learning from training data that includes VFSS training images for each swallowing stage and examination data for a plurality of dysphagia patients.

The method according to an embodiment of the present invention may further include extracting, by a swallowing time calculation unit, the video segment of the VFSS video corresponding to each swallowing stage classified by the artificial intelligence model, and calculating the swallowing time of each swallowing stage from the corresponding video segment.

The method according to an embodiment of the present invention may further include classifying, by the artificial intelligence model, the dysphagia of the analysis subject according to the Penetration-Aspiration Scale, which is defined to classify dysphagia into one of a plurality of penetration-aspiration scale levels.

Classifying the dysphagia of the analysis subject may include classifying the dysphagia according to the penetration-aspiration scale based on the output values of a plurality of sub-AI models, each trained for one of the plurality of penetration-aspiration scale levels.

Classifying the dysphagia of the analysis subject according to the penetration-aspiration scale may include predicting the probability of a first penetration-aspiration scale level, among the plurality of penetration-aspiration scale levels, using a first sub-AI model trained for that level.

Predicting the probability of the first penetration-aspiration scale level may include: generating, by the image feature extraction unit of the first sub-AI model, a convolution image by convolving each image frame of the VFSS video, and subsampling the convolution image to extract a plurality of pieces of image feature information from each image frame; and outputting, by a sub artificial neural network of the first sub-AI model having a fully connected layer structure, a probability for the first penetration-aspiration scale level based on the plurality of pieces of image feature information.

The method according to an embodiment of the present invention may further include, by a moving object tracking unit, detecting moving objects in which motion has occurred in the VFSS video, excluding stationary objects, tracking their movement paths, and tracking the movement path of the food bolus ingested by the analysis subject in the VFSS video.

According to an embodiment of the present invention, a computer-readable recording medium is provided on which a program for executing the artificial intelligence-based method for automatic analysis of VFSS is recorded.

According to an embodiment of the present invention, an artificial intelligence-based apparatus, method, and recording medium for automatic analysis of VFSS are provided, which use artificial intelligence technology to automate swallowing stage classification and determination of the dysphagia level in the video fluoroscopic swallowing study (VFSS).

FIG. 1 is a block diagram of an artificial intelligence-based apparatus for automatic analysis of VFSS according to an embodiment of the present invention.
FIG. 2 is a flowchart of an artificial intelligence-based method for automatic analysis of VFSS according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating the oral stage, the pharyngeal stage, and the esophageal stage.
FIG. 4 is a block diagram of the artificial intelligence model of the apparatus according to an embodiment of the present invention.
FIG. 5 is an example of the penetration-aspiration scale used to evaluate dysphagia.
FIG. 6 is a block diagram of a sub-AI model of the apparatus according to an embodiment of the present invention.
FIG. 7 is a flowchart of step S150 of FIG. 2.
FIGS. 8 and 9 are conceptual diagrams illustrating the artificial intelligence model of the apparatus according to an embodiment of the present invention.
FIG. 10 is an example of a VFSS video.

The advantages and features of the present invention, and the methods of achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. The present invention is not, however, limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the invention pertains are fully informed of the scope of the invention, which is defined only by the scope of the claims. Like reference numerals refer to like elements throughout the specification.

In this specification, when a part is said to "include" a component, this means that it may further include other components, not that it excludes them, unless specifically stated otherwise. A 'unit' or 'module' as used herein is a unit that processes at least one function or operation, and may refer to, for example, software, an FPGA, or a hardware component. The functions provided by a 'unit' or 'module' may be divided among a plurality of components or integrated with additional components. A 'unit' or 'module' is not necessarily limited to software or hardware; it may be configured to reside in an addressable storage medium, or configured to run on one or more processors. Embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 1 is a block diagram of an artificial intelligence-based apparatus for automatic analysis of VFSS according to an embodiment of the present invention. Referring to FIG. 1, the apparatus 100 according to an embodiment of the present invention may include a swallowing examination unit 100a configured to automatically perform, based on artificial intelligence, a video fluoroscopic swallowing study (VFSS) on an analysis subject such as a dysphagia patient.

The apparatus 100 according to an embodiment of the present invention may also include: a data input unit 120 that receives a VFSS video captured of the analysis subject by a VFSS examination device and outputs it to the swallowing examination unit 100a; a control unit 180 that controls the swallowing examination unit 100a to execute the swallowing examination function; a memory 190 that stores programs and various information for the swallowing examination function of the swallowing examination unit 100a; and an output unit 170 that outputs the various swallowing examination results provided by the swallowing examination unit 100a (swallowing stage classification results, the swallowing time of each stage, the video segment of each stage, the dysphagia level, the bolus movement trajectory, and so on).

FIG. 2 is a flowchart of an artificial intelligence-based method for automatic analysis of VFSS according to an embodiment of the present invention. Referring to FIGS. 1 and 2, the swallowing examination unit 100a may include a data learning unit 110, a swallowing stage classification unit 130, a swallowing time calculation unit 150, and a moving object tracking unit 160.

The data learning unit 110 may generate the artificial intelligence model (an artificial intelligence probability model) 140 by learning from training data that includes VFSS training images for each swallowing stage and examination result data for various dysphagia patients (or people without dysphagia) (S110). The examination result data may include, for example, the time spent in each swallowing stage during the patient's VFSS examination and the quantitative ratings assigned by a physician or the like according to the dysphagia classification criterion.

FIG. 3 is a diagram illustrating the oral stage (a), the pharyngeal stage (b), and the esophageal stage (c). The swallowing stages may include an oral stage, a pharyngeal stage, and an esophageal stage. The oral stage is the stage in which the food bolus ingested by the analysis subject stays in the oral cavity 1, the pharyngeal stage is the stage in which the bolus passes through the pharynx 2, and the esophageal stage is the stage in which the bolus passes through the esophagus 3.

The data learning unit 110 may train an artificial intelligence network on training data divided by swallowing stage and by dysphagia classification criterion, for example by level of the Penetration-Aspiration Scale (PAS) used to classify dysphagia. This produces an artificial intelligence model for classifying and extracting the video segment of each swallowing stage in a VFSS video and for classifying dysphagia.

The data input unit 120 may receive the image frames of the VFSS video acquired of the analysis subject by the VFSS examination device and pass them to the swallowing examination unit 100a (S120). The VFSS video information may be digital data, and various data preprocessing, such as a normalization algorithm, may be applied to the collected data.
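As one possible reading of the normalization step, the sketch below rescales the pixel values of a frame to the range 0 to 1. The source names no specific algorithm, so min-max scaling and the function name `min_max_normalize` are assumptions.

```python
def min_max_normalize(frame):
    """Rescale pixel intensities of one frame to [0, 1] (assumed preprocessing)."""
    flat = [p for row in frame for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A uniform frame carries no contrast; map every pixel to 0.0.
        return [[0.0 for _ in row] for row in frame]
    scale = hi - lo
    return [[(p - lo) / scale for p in row] for row in frame]


# Hypothetical 2x2 grayscale frame.
normalized = min_max_normalize([[50, 100], [150, 250]])
```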

Using the artificial intelligence model 140 generated through training by the data learning unit 110, the swallowing stage classification unit 130 may classify the image frames of the VFSS video into three swallowing stages (the oral phase, the pharyngeal phase, and the esophageal phase) or into four or more swallowing stages (S130).

The swallowing time calculation unit 150 may extract, from the VFSS video, the video segment corresponding to each swallowing stage classified by the artificial intelligence model 140, and calculate the swallowing time of each swallowing stage from the corresponding video segment (S140). For example, the swallowing time calculation unit 150 may calculate the swallowing time of a stage by multiplying the number of consecutive image frames of the VFSS video that belong to that stage by the time interval between two adjacent image frames.
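The calculation described above (frame count multiplied by frame interval) can be sketched as follows. The per-frame stage labels, the 30 frames-per-second rate, and the function names are hypothetical; summing the runs when a stage label appears more than once is also an assumption.

```python
from itertools import groupby


def stage_segments(labels):
    """Group per-frame stage labels into (stage, first_frame_index, frame_count) runs."""
    segments, start = [], 0
    for stage, run in groupby(labels):
        count = sum(1 for _ in run)
        segments.append((stage, start, count))
        start += count
    return segments


def swallowing_times(labels, frame_interval_s):
    """Swallowing time per stage = number of frames in the stage * frame interval."""
    times = {}
    for stage, _start, count in stage_segments(labels):
        times[stage] = times.get(stage, 0.0) + count * frame_interval_s
    return times


# Hypothetical classifier output for a 30 frames-per-second video.
labels = ["oral"] * 30 + ["pharyngeal"] * 15 + ["esophageal"] * 60
segments = stage_segments(labels)
times = swallowing_times(labels, frame_interval_s=1.0 / 30.0)
```

With these labels the oral, pharyngeal, and esophageal stages last about 1.0 s, 0.5 s, and 2.0 s respectively.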

The swallowing stage classification unit 130 may also use the artificial intelligence model 140 to classify the dysphagia of the analysis subject according to a dysphagia classification criterion (S150). In an embodiment, the dysphagia classification criterion may include the Penetration-Aspiration Scale (PAS). The penetration-aspiration scale may be defined to classify dysphagia into one of a plurality of penetration-aspiration scale levels.

The moving object tracking unit 160 may detect moving objects in which motion has occurred in the VFSS video (for example, the hyoid bone or the food bolus), excluding stationary objects, track their movement paths, and track the movement path of the food bolus ingested by the analysis subject in the VFSS video (S160). This makes it possible to identify the cause of the dysphagia or to determine whether airway aspiration has occurred.

For example, the moving object tracking unit 160 may extract objects from the sequentially input image frames, generate motion vectors from the change in the objects' coordinates between adjacent image frames, and classify the objects as stationary or moving. Moving objects may also be tracked in other ways.
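A minimal sketch of the motion-vector approach is shown below. It assumes that object extraction has already produced a centroid for each object in each frame and that objects carry stable identifiers across frames; the identifiers, the one-pixel threshold, and the function name are hypothetical.

```python
import math


def classify_objects(prev_positions, curr_positions, threshold=1.0):
    """Compute a motion vector per object between two adjacent frames.

    An object is labeled "moving" when the vector's length exceeds the
    threshold (in pixels), otherwise "stationary".
    """
    result = {}
    for obj_id, (x0, y0) in prev_positions.items():
        if obj_id not in curr_positions:
            continue  # object not detected in the current frame
        x1, y1 = curr_positions[obj_id]
        vector = (x1 - x0, y1 - y0)
        moving = math.hypot(*vector) > threshold
        result[obj_id] = (vector, "moving" if moving else "stationary")
    return result


# Hypothetical centroids (x, y) in two adjacent frames.
prev_frame = {"bolus": (10, 20), "spine": (50, 50)}
curr_frame = {"bolus": (10, 26), "spine": (50, 50)}
motion = classify_objects(prev_frame, curr_frame)
```

Accumulating the positions of an object labeled "moving" over successive frames would give its movement path.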

FIG. 4 is a block diagram of the artificial intelligence model of the apparatus according to an embodiment of the present invention. Referring to FIGS. 1 to 4, the artificial intelligence model 140 may include a plurality of sub-AI models 142, 144, and 146 and a penetration-aspiration scale determination unit 148.

FIG. 5 is an example of the penetration-aspiration scale used to evaluate dysphagia. FIG. 6 is a block diagram of a sub-AI model of the apparatus according to an embodiment of the present invention. FIG. 7 is a flowchart of step S150 of FIG. 2.

Referring to FIGS. 4 to 7, the artificial intelligence model 140 may classify the dysphagia of the analysis subject according to the penetration-aspiration scale based on the output values (probability values) of the plurality of sub-AI models 142, 144, and 146, each trained for one of the penetration-aspiration scale levels defined in the scale (for example, PAS1 to PAS8 shown in FIG. 5).
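One simple way to combine the sub-model outputs is to pick the level whose sub-model reports the highest probability. The source says only that classification is based on the output values, so this argmax rule, the probability values, and the function name are assumptions.

```python
def determine_pas_level(level_probabilities):
    """Return (level, probability) for the sub-model with the highest output."""
    if not level_probabilities:
        raise ValueError("at least one sub-model output is required")
    level = max(level_probabilities, key=level_probabilities.get)
    return level, level_probabilities[level]


# Hypothetical outputs of eight sub-AI models, keyed by PAS level 1 to 8.
sub_model_outputs = {1: 0.05, 2: 0.10, 3: 0.62, 4: 0.08,
                     5: 0.06, 6: 0.04, 7: 0.03, 8: 0.02}
level, probability = determine_pas_level(sub_model_outputs)
```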

The penetration-aspiration scale (PAS) can be broadly divided into three categories (normal, penetration, and aspiration), which are further divided into eight PAS levels (PAS1 to PAS8) as shown in FIG. 5.

In FIG. 5, PAS1 is the normal level, in which food does not enter the airway; PAS2 is a first penetration level, in which food enters the airway and stays above the vocal cords, leaving no residue; PAS3 is a second penetration level, in which food enters the airway and stays above the vocal cords, leaving residue; PAS4 is a third penetration level, in which food enters the airway and contacts the vocal cords, leaving no residue; and PAS5 is a fourth penetration level, in which food enters the airway and contacts the vocal cords, leaving residue.

PAS6 is a first aspiration level, in which food enters the airway and passes through the glottis, leaving no residue below the glottis; PAS7 is a second aspiration level, in which food enters the airway and passes through the glottis, and residue remains below the glottis while the patient shows a response; and PAS8 is a third aspiration level, in which food enters the airway and passes through the glottis, residue remains below the glottis, and the patient shows no response.
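The grouping of the eight levels into three categories, as described for FIG. 5 (PAS1 normal, PAS2 to PAS5 penetration, PAS6 to PAS8 aspiration), can be written as a small lookup. The function name `pas_category` is hypothetical.

```python
def pas_category(level):
    """Map a PAS level (1 to 8) to its broad category."""
    if not 1 <= level <= 8:
        raise ValueError("PAS level must be between 1 and 8")
    if level == 1:
        return "normal"
    if level <= 5:
        return "penetration"
    return "aspiration"


categories = {level: pas_category(level) for level in range(1, 9)}
```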

To predict a probability for each PAS level, each of the sub-AI models 142, 144, and 146 of the artificial intelligence model 140 may include an image feature extraction unit 142a and a sub artificial neural network 142b. The image feature extraction unit 142a may extract a plurality of pieces of image feature information (a plurality of feature maps) from each image frame of the VFSS video.

FIGS. 8 and 9 are conceptual diagrams illustrating the artificial intelligence model that constitutes the artificial intelligence-based apparatus for automatic analysis of a video fluoroscopic swallowing study according to an embodiment of the present invention. FIG. 8 illustrates the process of training the artificial intelligence model.

The training process is described with reference to FIGS. 1, 4, and 6 to 8. The data learning unit 110 trains the artificial intelligence model using training data 10.

The training data 10 may include positive training data 20, which corresponds to VFSS videos and examination data of people without a swallowing disorder, and negative training data 30, which corresponds to VFSS videos and examination data of people with a swallowing disorder.

The negative training data 30 may be divided by swallowing stage and by swallowing disorder classification criterion, and may include, for example, sub-training data 32, 34, and 36 classified by penetration-aspiration scale (PAS) level.

Each set of sub-training data 32, 34, and 36 may be used to train the corresponding sub-AI model 142, 144, or 146. The sub-AI models 142, 144, and 146 may be provided in a number equal to the number of PAS levels.

In this way, various VFSS videos are grouped by swallowing disorder criterion (for example, PAS), the image frames of each VFSS video are divided into runs of consecutive frames for each swallowing stage, and AI-based training on these data produces the artificial intelligence model 140 for distinguishing the swallowing disorder level and the swallowing stage.

FIG. 9 illustrates the process of automatically performing a swallowing examination with the trained artificial intelligence model. FIG. 10 shows an example of a VFSS video. Referring to FIGS. 1, 4, 6, 7, 9, and 10, the image feature extraction unit 142a may include one or more convolution processing units 1422 and one or more sub-sampling processing units 1424 for extracting a plurality of pieces of image feature information (for example, feature maps) from the image frames 40 of the VFSS video 50.

The convolution processing unit 1422 may generate a plurality of convolution images (first feature maps) by applying convolution to each image frame 40 of the VFSS video (S152).

The convolution processing unit 1422 may generate a feature map by mapping a reference image of a set pixel size (for example, a 3×3, 2×2, or 7×7 reference image) onto the VFSS image (or onto a previously generated convolution image).

To extract a plurality of pieces of image feature information (second feature maps) from the plurality of convolution images (first feature maps), the sub-sampling processing unit 1424 may apply sub-sampling, such as pooling, to the convolution images generated by the convolution processing unit 1422 (S154).

The convolution by the convolution processing unit 1422 and the sub-sampling by the sub-sampling processing unit 1424 may be repeated in sequence multiple times, and the final convolution or sub-sampling step produces the image feature information that is input to the sub artificial neural network 142b.
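The repeated convolution and sub-sampling steps (S152, S154) can be sketched in plain Python as follows. This is a minimal illustration rather than the disclosed implementation: the function names `convolve2d`, `max_pool`, and `extract_features`, the choice of max pooling, the border handling, and any kernel values are assumptions. The description states only that a reference image of a set pixel size is mapped onto the frame and that pooling or similar sub-sampling follows.

```python
from typing import List

Matrix = List[List[float]]


def convolve2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image without padding and sum the products.

    Like most CNN libraries, this does not flip the kernel (cross-correlation).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(fmap: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(fmap) // size
    out_w = len(fmap[0]) // size
    return [
        [
            max(fmap[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def extract_features(frame: Matrix, kernels: List[Matrix], rounds: int = 1) -> List[Matrix]:
    """Apply convolution followed by pooling, repeated `rounds` times.

    Every kernel is applied to every map of the previous round, so the number
    of feature maps is len(kernels) ** rounds (a simplification for this sketch).
    """
    maps = [frame]
    for _ in range(rounds):
        maps = [max_pool(convolve2d(m, k)) for m in maps for k in kernels]
    return maps
```

The feature maps returned by the last round play the role of the image feature information passed to the sub artificial neural network 142b.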

Based on the plurality of pieces of image feature information (for example, feature maps) extracted by the image feature extraction unit 142a, the sub artificial neural network 142b may output a probability for a first penetration-aspiration scale level, for example by means of a fully connected artificial neural network (S156).

The sub artificial neural network 142b may include an input layer (IL) including input nodes that receive the plurality of pieces of image feature information, an output layer (OL) including one or more output nodes that output the probability for the first penetration-aspiration scale level, and a hidden layer (HL) including hidden nodes connected between the input nodes and the output node in a fully connected layer structure.
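A forward pass through such an input layer, hidden layer, and single-node output layer can be sketched as below. The disclosure does not specify activation functions or layer sizes, so the ReLU hidden activation, the sigmoid output, and the names `dense`, `flatten`, and `sub_network_probability` are assumptions made only for illustration.

```python
import math
from typing import List, Sequence


def dense(inputs: Sequence[float],
          weights: Sequence[Sequence[float]],
          biases: Sequence[float]) -> List[float]:
    """Fully connected layer: one weight row and one bias per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def flatten(feature_maps: Sequence[Sequence[Sequence[float]]]) -> List[float]:
    """Turn a list of 2-D feature maps into one input vector."""
    return [v for fmap in feature_maps for row in fmap for v in row]


def sub_network_probability(feature_maps, hidden_w, hidden_b, out_w, out_b) -> float:
    """Return a value in (0, 1) interpreted as the probability of one PAS level."""
    x = flatten(feature_maps)                               # input layer (IL)
    h = [max(0.0, v) for v in dense(x, hidden_w, hidden_b)]  # hidden layer (HL), ReLU assumed
    (logit,) = dense(h, [out_w], [out_b])                   # output layer (OL), one node
    return 1.0 / (1.0 + math.exp(-logit))                   # sigmoid assumed
```

In practice the weights would come from training on the sub-training data for the corresponding PAS level; here they are plain arguments.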

The penetration-aspiration scale determination unit 148 may classify the swallowing disorder of the analysis target according to the penetration-aspiration scale, based on the output values (probability values) of the plurality of sub-AI models 142, 144, and 146 trained for the respective penetration-aspiration scale levels (S158).

Each output value (probability value) of the sub-AI models 142, 144, and 146 may be output as a binary value of 0 or 1, or as a probability or numerical value within a preset range between 0 and 1.

For example, when the probability value output by a first sub-AI model, which corresponds to a first PAS level, is 1 and the probability values output by the remaining sub-AI models are 0, the penetration-aspiration scale determination unit 148 may determine the PAS of the analysis target to be the first PAS level.

The penetration-aspiration scale determination unit 148 may also determine the PAS level of the analysis target based on a statistic of the probability values output by the plurality of sub-AI models 142, 144, and 146 (for example, an average, or a weighted average that reflects weights set or learned for each PAS level).
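One possible reading of this decision rule is sketched below: each sub-model's probabilities are averaged, optionally multiplied by a per-level weight, and the level with the highest score is selected. The description leaves the exact statistic open, so the function name `decide_pas_level`, the averaging over per-frame outputs, and the tie-breaking toward the higher level are assumptions.

```python
from typing import Dict, Optional, Sequence


def decide_pas_level(level_probs: Dict[int, Sequence[float]],
                     weights: Optional[Dict[int, float]] = None) -> int:
    """Pick a PAS level from sub-model outputs.

    level_probs maps a PAS level to the probability values produced by the
    sub-model trained for that level (for example, one value per frame).
    """
    if not level_probs:
        raise ValueError("no sub-model outputs given")
    scores = {}
    for level, probs in level_probs.items():
        if not probs:
            raise ValueError(f"no probabilities for PAS level {level}")
        weight = 1.0 if weights is None else weights.get(level, 1.0)
        scores[level] = weight * sum(probs) / len(probs)
    # Ties go to the higher (more severe) level; this is an assumption.
    return max(scores, key=lambda lvl: (scores[lvl], lvl))
```

With binary outputs such as `{1: [1.0], 2: [0.0], 3: [0.0]}` the rule reduces to the one-hot case described above and returns level 1.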

In this way, when a patient's VFSS video 50 is input, the apparatus can divide the video into runs of consecutive frames for each swallowing stage for the user, estimate the swallowing time of each stage, and finally predict the swallowing disorder level (PAS level).
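Given one stage label per frame, the per-stage segments and swallowing times follow from the frame rate. The sketch below assumes that the classifier output is a list of stage names and that the frame rate is known; the names `stage_segments` and `stage_durations` are hypothetical.

```python
from itertools import groupby
from typing import Dict, List, Sequence, Tuple


def stage_segments(labels: Sequence[str]) -> List[Tuple[str, int, int]]:
    """Group consecutive identical labels into (stage, start, end_exclusive) runs."""
    segments = []
    start = 0
    for stage, run in groupby(labels):
        length = sum(1 for _ in run)
        segments.append((stage, start, start + length))
        start += length
    return segments


def stage_durations(labels: Sequence[str], fps: float) -> Dict[str, float]:
    """Total time in seconds spent in each stage, summed over all of its runs."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    durations: Dict[str, float] = {}
    for stage, start, end in stage_segments(labels):
        durations[stage] = durations.get(stage, 0.0) + (end - start) / fps
    return durations
```

For a 30 frames-per-second video with 30 oral, 15 pharyngeal, and 45 esophageal frames, the durations come out as 1.0, 0.5, and 1.5 seconds respectively.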

According to an embodiment of the present invention, an AI probability model for classifying swallowing stages and swallowing disorder levels can be generated through AI training based on the results of data mining a large volume of VFSS examination videos and examination result information. Based on this model, the video frames of each swallowing stage can be automatically recognized and extracted from an arbitrary VFSS examination video, so that the time taken by each swallowing stage and the swallowing disorder classification are predicted automatically, and the swallowing examination result can be provided as shown in Table 1.

Accordingly, the embodiment reduces the labor-intensive work of VFSS analysts, who otherwise have to analyze each frame image of a VFSS video, and presents the per-stage time measurements and the predicted degree of swallowing disorder as accurate and consistent results, thereby improving the efficiency and accuracy of diagnosing patients with swallowing disorders.

In addition, according to an embodiment of the present invention, moving objects in the VFSS video (for example, the hyoid bone or a food bolus), as opposed to still objects, can be detected and their movement paths tracked.
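The disclosure does not state how moving objects are separated from still ones. One simple possibility, shown purely as an assumption, is frame differencing: pixels whose intensity changes by more than a threshold between consecutive frames are treated as moving, and the centroid of those pixels is recorded as one point of the movement path. The names `moving_mask`, `centroid`, and `track_moving_object` and the threshold value are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Frame = Sequence[Sequence[float]]


def moving_mask(prev: Frame, curr: Frame, threshold: float) -> List[List[bool]]:
    """True where the pixel changed by more than `threshold` between two frames."""
    return [[abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]


def centroid(mask: Sequence[Sequence[bool]]) -> Optional[Tuple[float, float]]:
    """Mean (x, y) of the True pixels, or None if nothing moved."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        return None
    return (sum(x for x, _ in points) / len(points),
            sum(y for _, y in points) / len(points))


def track_moving_object(frames: Sequence[Frame],
                        threshold: float = 10) -> List[Tuple[float, float]]:
    """One centroid per consecutive frame pair in which motion was detected.

    Differencing marks both the old and the new position of an object, so each
    centroid lies between the two positions.
    """
    path = []
    for prev, curr in zip(frames, frames[1:]):
        point = centroid(moving_mask(prev, curr, threshold))
        if point is not None:
            path.append(point)
    return path
```

Still structures produce no difference between frames and therefore do not contribute to the path. A single centroid cannot separate several objects that move at the same time, so a practical tracker would need a more specific detector for the hyoid bone or bolus.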

That is, for the section corresponding to each swallowing stage, the movement state (range of motion, movement trajectory, and so on) of organs involved in swallowing, such as the hyoid bone, can be calculated and compared with a preset normal or abnormal movement state to help identify the cause of the swallowing disorder.
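Once a per-frame position track is available for a structure such as the hyoid bone, simple movement measures can be computed and compared with a reference range. The sketch below assumes the track is a list of (x, y) positions for one swallowing stage; the measures chosen, the names `movement_stats` and `is_within_normal`, and the reference range are illustrative assumptions, since the disclosure does not define them.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]


def movement_stats(track: Sequence[Point]) -> Dict[str, float]:
    """Path length and maximum displacement from the first position."""
    if not track:
        raise ValueError("track is empty")
    origin = track[0]
    path_length = sum(math.dist(a, b) for a, b in zip(track, track[1:]))
    max_displacement = max(math.dist(origin, p) for p in track)
    return {"path_length": path_length, "max_displacement": max_displacement}


def is_within_normal(stats: Dict[str, float], normal_range: Tuple[float, float]) -> bool:
    """Compare the maximum displacement with a preset (low, high) reference range."""
    low, high = normal_range
    return low <= stats["max_displacement"] <= high
```

For a track that moves from (0, 0) to (3, 4) and back, the path length is 10 and the maximum displacement is 5, in whatever unit the positions use.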

In addition, according to an embodiment of the present invention, the movement path of a food bolus ingested by the analysis target can be tracked in the VFSS video, so that whether the bolus has penetrated the airway, whether aspiration has occurred, and the like can be checked from the movement trajectory of the moving object.

The embodiments described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the apparatuses, methods, and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions.

The processing device may run an operating system and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the description sometimes refers to a single processing device, but a person of ordinary skill in the art will understand that the processing device may include a plurality of processing elements and/or multiple types of processing elements.

For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible. The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively.

The software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, so as to be interpreted by the processing device or to provide instructions or data to the processing device. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.

The method according to the embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software.

Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the embodiment, and vice versa.

Although the embodiments have been described above with reference to limited embodiments and drawings, a person of ordinary skill in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents. Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

100: artificial intelligence-based apparatus for automatic analysis of video fluoroscopic swallowing study
100a: swallowing examination unit
110: data learning unit
120: data input unit
130: swallowing stage classification unit
140: artificial intelligence model
142, 144, 146: sub-AI models
142a: image feature extraction unit
142b: sub artificial neural network
148: penetration-aspiration scale determination unit
150: swallowing time calculation unit
160: moving object tracking unit
170: output unit
180: control unit
190: memory

Claims

1. An artificial intelligence-based apparatus for automatic analysis of a video fluoroscopic swallowing study (VFSS), the apparatus comprising:
a data input unit configured to receive image frames of a VFSS video acquired for an analysis target by a VFSS examination device; and
a swallowing stage classification unit configured to classify, by an artificial intelligence model, the image frames of the VFSS video into swallowing stages including an oral stage, a pharyngeal stage, and an esophageal stage.

2. The apparatus of claim 1, further comprising a data learning unit configured to generate the artificial intelligence model by learning training data that includes VFSS training images and examination data for each of a plurality of patients with swallowing disorders.

3. The apparatus of claim 1, further comprising a swallowing time calculation unit configured to extract, from the VFSS video, the video section corresponding to each swallowing stage classified by the artificial intelligence model, and to calculate the swallowing time of each swallowing stage from the corresponding video section.

4. The apparatus of claim 1, wherein the swallowing stage classification unit is configured to classify a swallowing disorder, by the artificial intelligence model, according to a swallowing disorder classification criterion, and
wherein the swallowing disorder classification criterion includes a penetration-aspiration scale defined to classify a swallowing disorder into any one of a plurality of penetration-aspiration scale levels.

5. The apparatus of claim 4, wherein the artificial intelligence model is configured to classify the swallowing disorder of the analysis target according to the penetration-aspiration scale based on output values of a plurality of sub-AI models, each trained for one of the penetration-aspiration scale levels defined in the penetration-aspiration scale.

6. The apparatus of claim 5, wherein the sub-AI model trained for a first penetration-aspiration scale level among the plurality of penetration-aspiration scale levels comprises:
an image feature extraction unit configured to extract a plurality of pieces of image feature information from each image frame of the VFSS video; and
a sub artificial neural network configured to output a probability for the first penetration-aspiration scale level based on the plurality of pieces of image feature information.

7. The apparatus of claim 6, wherein the image feature extraction unit comprises:
one or more convolution processing units configured to generate a convolution image by applying convolution to each image frame of the VFSS video; and
one or more sub-sampling processing units configured to sub-sample the convolution image in order to extract the plurality of pieces of image feature information,
and wherein the sub artificial neural network comprises:
an input layer including input nodes configured to receive the plurality of pieces of image feature information;
an output layer including an output node configured to output the probability for the first penetration-aspiration scale level; and
a hidden layer including hidden nodes connected between the input nodes and the output node in a fully connected layer structure.

8. The apparatus of any one of claims 1 to 7, further comprising a moving object tracking unit configured to detect, in the VFSS video, a moving object in which motion has occurred, as opposed to still objects, and track its movement path, and to track the movement path of a food bolus ingested by the analysis target in the VFSS video.

9. An artificial intelligence-based method for automatic analysis of a video fluoroscopic swallowing study (VFSS), the method comprising:
receiving, by a data input unit, image frames of a VFSS video acquired for an analysis target by a VFSS examination device; and
classifying, by a swallowing stage classification unit using an artificial intelligence model, the image frames of the VFSS video into swallowing stages including an oral stage, a pharyngeal stage, and an esophageal stage.

10. The method of claim 9, further comprising generating, by a data learning unit, the artificial intelligence model by learning training data that includes VFSS training images and examination data for each swallowing stage for a plurality of patients with swallowing disorders.

11. The method of claim 9, further comprising extracting, by a swallowing time calculation unit, the video section corresponding to each swallowing stage classified by the artificial intelligence model from the VFSS video, and calculating the swallowing time of each swallowing stage from the corresponding video section.

12. The method of claim 9, further comprising classifying, by the artificial intelligence model, the swallowing disorder of the analysis target according to a penetration-aspiration scale defined to classify a swallowing disorder into any one of a plurality of penetration-aspiration scale levels,
wherein the classifying of the swallowing disorder of the analysis target comprises classifying the swallowing disorder according to the penetration-aspiration scale based on output values of a plurality of sub-AI models, each trained for one of the plurality of penetration-aspiration scale levels.

13. The method of claim 12, wherein the classifying of the swallowing disorder of the analysis target according to the penetration-aspiration scale comprises predicting a probability of a first penetration-aspiration scale level among the plurality of penetration-aspiration scale levels by a first sub-AI model trained for the first penetration-aspiration scale level,
and wherein the predicting of the probability of the first penetration-aspiration scale level comprises:
generating, by an image feature extraction unit of the first sub-AI model, a convolution image by applying convolution to each image frame of the VFSS video, and sub-sampling the convolution image to extract a plurality of pieces of image feature information from each image frame; and
outputting, by a sub artificial neural network of the first sub-AI model having a fully connected layer structure, a probability for the first penetration-aspiration scale level based on the plurality of pieces of image feature information.

14. The method of claim 9, further comprising detecting, by a moving object tracking unit, a moving object in which motion has occurred in the VFSS video, as opposed to still objects, tracking its movement path, and tracking the movement path of a food bolus ingested by the analysis target in the VFSS video.

15. A computer-readable recording medium on which is recorded a program for executing the artificial intelligence-based method for automatic analysis of a video fluoroscopic swallowing study of any one of claims 9 to 14.