WO2024058585A1 - Method and analysis device for classifying severity of lung disease of subject by using voice data and clinical information - Google Patents


Info

Publication number
WO2024058585A1
Authority
WO
WIPO (PCT)
Prior art keywords
exercise
voice data
subject
clinical information
severity
Prior art date
Application number
PCT/KR2023/013863
Other languages
French (fr)
Korean (ko)
Inventor
김태영
이수정
정명진
김재호
박혜윤
조주희
강단비
공성아
방가람
신선혜
류혜인
Original Assignee
사회복지법인 삼성생명공익재단
Priority date
Filing date
Publication date
Application filed by 사회복지법인 삼성생명공익재단 filed Critical 사회복지법인 삼성생명공익재단
Priority claimed from KR1020230122823A external-priority patent/KR20240038622A/en
Publication of WO2024058585A1 publication Critical patent/WO2024058585A1/en


Classifications

    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/66Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/20ICT specially adapted for the handling or processing of patient-related medical or healthcare data for electronic clinical trials or questionnaires
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/50ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Definitions

  • the technology described below relates to a technique for predicting the degree of lung disease using the subject's voice.
  • COPD (chronic obstructive pulmonary disease) is a representative example of such a lung disease.
  • the technology described below seeks to provide a technique for predicting the degree of lung disease such as COPD based on the subject's voice and clinical information.
  • a method of classifying the severity of a subject's lung disease using voice data and clinical information includes the steps of: an analysis device receiving the subject's voice data and clinical information; the analysis device preprocessing the voice data and clinical information; the analysis device inputting the preprocessed voice data and clinical information into a pre-trained learning model; and the analysis device classifying the severity of the subject's lung disease based on the output value of the learning model.
  • the analysis device that classifies the severity of the subject's lung disease includes an interface device that receives the subject's voice data and clinical information; a storage device that stores a learning model that receives voice data and clinical information and classifies the severity of lung disease; and a computing device that preprocesses the input voice data and clinical information, inputs the preprocessed voice data and clinical information into the learning model, and classifies the severity of the subject's lung disease based on the output value of the learning model.
  • the technology described below can predict the degree of lung disease by analyzing the user's voice and clinical information that can be obtained relatively easily.
  • the technology described below can diagnose the severity of lung disease through voice recording and self-diagnosis without the patient having to visit a medical institution.
  • Figure 1 is an example of a lung disease severity classification system using voice and clinical information.
  • Figure 2 is an example of the learning process of a learning model for lung disease severity classification.
  • Figure 3 shows the results of verifying the performance of a learning model that classifies lung disease severity.
  • Figure 4 is an example of an analysis device that classifies lung disease severity.
  • terms such as first, second, A, and B may be used to describe various components, but the components are not limited by these terms; they are used only to distinguish one component from another. For example, a first component may be renamed a second component without departing from the scope of the technology described below, and similarly, the second component may be renamed the first component.
  • the term "and/or" includes any one of, or any combination of, a plurality of related listed items.
  • the components described below are divided according to the main function each is responsible for. That is, two or more of the components described below may be combined into one component, or one component may be divided into two or more components with more detailed functions.
  • each of the components described below may additionally perform some or all of the functions handled by other components, and some of the main functions handled by a component may, of course, be carried out exclusively by another component.
  • each process forming the method may occur in a different order from the specified order unless a specific order is clearly stated in the context. That is, each process may occur in the same order as specified, may be performed substantially simultaneously, or may be performed in the opposite order.
  • the technology described below is a technique for predicting or classifying the severity of lung diseases such as COPD based on the subject's voice and clinical information. For convenience of explanation, the following explanation will focus on COPD. However, the technology described below can be used to predict or classify the severity of various lung diseases other than COPD.
  • User data used for analysis includes the user's voice and clinical information.
  • user data is collected from a specific subject, and can be collected before and after exercise for a subject performing a certain exercise.
  • the user's voice is collected before and after exercise, and input variables include features extracted from the voice.
  • Some of the clinical information may be collected separately before and after exercise.
  • clinical information may include questionnaire information collected from the subject. A detailed description of user data will be provided later.
  • the analysis device classifies or predicts the degree of lung disease based on the user's voice and clinical information.
  • the analysis device can be implemented as a variety of devices capable of processing data.
  • an analysis device can be implemented as a PC, a server on a network, a smart device, a wearable device, or a chipset with a dedicated program embedded therein.
  • analysis devices may be built into various devices such as exercise equipment, vehicles, smart speakers, etc.
  • the analysis device can classify lung disease using a machine learning model.
  • Machine learning models include decision trees, random forests, K-nearest neighbors (KNN), naive Bayes, support vector machines (SVM), and artificial neural networks (ANN). The following description focuses on a deep neural network (DNN) as the learning model. However, the learning model for lung disease classification can be implemented as various types of models.
  • Figure 1 is an example of a lung disease severity classification system 100 using voice and clinical information.
  • in Figure 1, the analysis device may be a user terminal 130, a computer terminal 140, or a server 150.
  • Subject A performs a certain exercise for a certain amount of time.
  • Patients with lung disease may have different vocal characteristics before and after exercise. Accordingly, user data can be collected from subject A before and after exercise, respectively.
  • the user data may include the subject's voice data and clinical information.
  • Voice data consists of voice data before exercise and voice data after exercise.
  • the voice data before exercise and the voice data after exercise are composed of data in which the same subject A uttered the same words or sentences (text) before and after exercise, respectively.
  • Clinical information may consist of various items. Some of the items included in clinical information correspond to data collected before and after exercise.
  • the database may store the subject's voice data and clinical information.
  • the database 110 may be a device such as an Electronic Medical Record (EMR).
  • the user terminal 120 may receive user data from subject A.
  • the user terminal 120 illustrates a device such as a smart device.
  • the user terminal 120 corresponds to a device that can collect user voice through a microphone and receive clinical information through a certain interface device.
  • the user terminal 120 may be any one of various types of devices, such as a smart device, PC, wearable device, smart speaker, etc.
  • the user terminal 130 may receive user data from the database 110. Furthermore, the user terminal 120 and the user terminal 130 may be the same device. In this case, the user terminal 130 may be a device that collects and analyzes user data at the same time.
  • the user terminal 130 may perform certain preprocessing on the subject's user data. For example, the user terminal 130 may remove noise from the subject's voice data. Additionally, the user terminal 130 may convert voice data into one of the following representations: a chromagram, Mel-frequency cepstral coefficients (MFCC), or a Mel spectrogram. Additionally, the user terminal 130 may perform preprocessing to normalize clinical information of different categories to a certain range.
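As a concrete illustration of the conversion step, the sketch below computes a naive short-time magnitude spectrogram in pure Python. It is a stand-in for the chromagram, MFCC, or Mel-spectrogram representations mentioned above, which would normally be produced by an audio library such as librosa; the frame length, hop size, and Hann window here are illustrative assumptions.

```python
import cmath
import math

def magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Naive short-time magnitude spectrum: frame the signal, apply a
    Hann window, and take the magnitude of a direct DFT per frame.
    (A simplified stand-in for the chromagram / MFCC / Mel-spectrogram
    conversions; real pipelines would use an audio library.)"""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        windowed = [x * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_len - 1)))
                    for n, x in enumerate(frame)]
        spectrum = []
        for k in range(frame_len // 2 + 1):  # keep non-negative frequencies
            s = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n, x in enumerate(windowed))
            spectrum.append(abs(s))
        frames.append(spectrum)
    return frames  # shape: (num_frames, frame_len // 2 + 1)
```

For a pure tone whose frequency falls exactly on DFT bin 8, the per-frame spectrum peaks at index 8, as expected.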
  • the user terminal 130 may classify the severity of the subject's lung disease by inputting user data into a pre-built learning model. User A can check the degree of the subject's lung disease through the user terminal 130.
  • the computer terminal 140 receives user data from the database 110 or the user terminal 120.
  • the computer terminal 140 may perform certain preprocessing on the user data.
  • the computer terminal 140 may classify the severity of the subject's lung disease by inputting user data into a pre-built learning model.
  • User B can check the degree of the subject's lung disease through the computer terminal 140.
  • the server 150 receives user data from the database 110 or the user terminal 120.
  • the server 150 may perform certain preprocessing on the subject's user data.
  • the server 150 may classify the severity of the subject's lung disease by inputting user data into a pre-built learning model.
  • User A can access the server 150 through the user terminal to check the degree of the subject's lung disease.
  • Figure 2 is an example of a learning process 200 of a learning model for lung disease severity classification.
  • a learning model may be one of various types.
  • the learning model shows a deep learning model as an example.
  • a learning model that classifies lung disease severity can be named a classification model.
  • Classification models are built using training data.
  • the learning process of the classification model can be performed by a learning device.
  • a learning device refers to a computing device that controls digital data processing and the learning process of deep learning models.
  • the learning device constructs learning data (210).
  • Training data can be collected from various groups depending on the severity of lung disease. For example, learning data may be collected from the normal group, severity 1 group, ..., and severity n group, respectively.
  • Lung disease severity can be determined based on FEV1 (forced expiratory volume in one second).
  • FEV1 refers to the amount of air expelled from the lungs during the first second of forced exhalation. If the patient's FEV1 is lower than a threshold (e.g., the average of the entire population), the patient can be classified as a COPD patient. If a patient's FEV1 is above the threshold, the patient can be classified as a patient with low severity.
  • subjects can be classified into normal, low-severity lung disease patients, and high-severity lung disease patients.
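The grouping above can be sketched as a simple threshold rule. Since the text only specifies "a threshold" on FEV1, the concrete 80% and 50% cut-offs below (loosely mirroring common GOLD-style grading on FEV1 as a percentage of the predicted value) are illustrative assumptions, not the patent's actual criteria:

```python
def severity_group(fev1_percent_predicted):
    """Map FEV1 (% of predicted value) to a severity group.
    The three-way split mirrors the normal / low-severity / high-severity
    grouping described above; the 80% and 50% cut-offs are illustrative
    assumptions (the text itself only specifies 'a threshold')."""
    if fev1_percent_predicted >= 80:
        return "normal"
    if fev1_percent_predicted >= 50:
        return "low severity"
    return "high severity"
```

These labels would then serve as the label values attached to each subject's training data.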
  • Learning data includes clinical information and voice data for each group.
  • the training data also includes the label value of each training data.
  • Voice data is collected separately before and after performing certain exercises. Voice data can be collected as subjects utter the same sentence.
  • Voice data may consist of items as shown in Table 1 below.
  • the learning device can extract 32 features as shown in Table 1 below from voice signals.
  • voice data may consist of any number of items among the items in Table 1 below.
  • the learning device can extract silence sections and conversation sections from the entire file using a voice recognition tool.
  • the silent section is defined as a section in which a signal with an amplitude level of -36dBFS (decibel full scale) or less lasts for more than 200ms.
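The silence-section rule above can be sketched directly: convert the -36 dBFS threshold to a linear amplitude limit and look for runs lasting at least 200 ms. Samples are assumed to be floats normalized to [-1.0, 1.0] (so 0 dBFS corresponds to amplitude 1.0); the function name and signature are illustrative.

```python
def silent_sections(samples, sample_rate, thresh_dbfs=-36.0, min_ms=200):
    """Return (start, end) sample indices of sections where the absolute
    amplitude stays at or below thresh_dbfs for at least min_ms ms."""
    limit = 10 ** (thresh_dbfs / 20.0)          # -36 dBFS -> ~0.0158
    min_len = int(sample_rate * min_ms / 1000)  # minimum run in samples
    sections, run_start = [], None
    for i, x in enumerate(samples):
        if abs(x) <= limit:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_len:
                sections.append((run_start, i))
            run_start = None
    if run_start is not None and len(samples) - run_start >= min_len:
        sections.append((run_start, len(samples)))
    return sections
```

Runs shorter than 200 ms are discarded, so brief pauses between words are not counted as silence.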
  • Jitter is a value that indicates how constant the period of vocal-fold vibration is; the more irregular the period, the larger the value.
  • Shimmer is a value that indicates how constant the amplitude of vibration is; the more irregular the amplitude, the larger the value.
  • Formant is a resonance that occurs in the vocal tract (the space that extends from the pharynx and oral cavity to the nasal cavity and lips).
  • HNR (harmonic-to-noise ratio) is the ratio of the periodic (harmonic) component of the voice signal to its noise component.
  • Speech rate refers to the number of words per minute in speech.
  • f0 (fundamental frequency) is the frequency of vocal cord vibration and perceptually corresponds to pitch.
  • Articulation rate is the number of syllables per second in speech.
  • Syllable duration refers to the duration of a syllable.
  • the learning device can extract jitter, shimmer, formants, HNR, speech rate, f0, articulation rate, and syllable length using publicly available software for speech analysis.
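As an illustration of two of these features, the sketch below computes local jitter and shimmer in the form commonly reported by speech-analysis tools such as Praat: the mean absolute difference between consecutive glottal periods (or peak amplitudes), divided by the mean. The function names are illustrative, and the extraction of the period/amplitude sequences themselves is assumed to have been done already.

```python
def local_jitter(periods):
    """Local jitter: mean absolute difference between consecutive
    glottal periods, divided by the mean period. Larger values mean
    a more irregular vibration period."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

def local_shimmer(amplitudes):
    """Local shimmer: the same ratio computed over the peak amplitudes
    of consecutive periods instead of the periods themselves."""
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    return (sum(diffs) / len(diffs)) / (sum(amplitudes) / len(amplitudes))
```

A perfectly regular voice yields 0 for both measures; irregularity pushes the values up.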
  • Clinical information can consist of 31 items as shown in Table 2 below.
  • the clinical information below includes self-administration variables. Some of the clinical information may be collected through wearable devices, sensor devices, etc. Furthermore, clinical information may consist of any number of items among the items in Table 2 below.
  • The items in Table 2 include, for example, BMI (body mass index), resting SpO2 (blood oxygen saturation), SpO2, resting heart rate, and heart rate after exercise (items 10 and 11 in Table 2).
  • the learning device can perform certain preprocessing on the initial learning data.
  • Preprocessing for voice data may include noise removal, data type conversion, etc.
  • Preprocessing of clinical information may include the process of adjusting values into certain categories.
  • the learning device can normalize clinical information using preprocessing techniques such as Min-Max Normalization and z-score normalization.
  • the learning device can encode categorical values of clinical information as one-hot vectors.
  • the learning device can input encoded clinical information into a learning model.
  • the learning device treats 32 voice variables and 31 types of clinical information as individual input variables and can construct a total of 63 input variables as learning data.
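The preprocessing steps above (min-max normalization, z-score normalization, and one-hot encoding) can be sketched as follows; concatenating the 32 normalized voice features with the 31 encoded clinical items then yields the 63 input variables. All function names here are illustrative.

```python
def min_max(values):
    """Min-max normalization: rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """z-score normalization: zero mean, unit (population) variance."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / std for v in values]

def one_hot(value, categories):
    """One-hot encode a categorical clinical item."""
    return [1.0 if value == c else 0.0 for c in categories]
```

A model input is then built as, e.g., `voice_features + clinical_features`, a single 63-element vector per subject.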
  • the learning device builds a classification model using the learning data (220).
  • the learning device extracts one input data from the collected learning data and inputs it into the classification model.
  • the classification model outputs a probability value for lung disease severity for the corresponding input data.
  • the learning device compares the value output by the classification model with the known correct answer (label value) and updates the weight of the classification model so that the classification model outputs a label corresponding to the correct answer.
  • the learning device repeats the learning process using multiple learning data.
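The training loop above (forward pass, compare the output with the label, update the weights) can be sketched with a single linear softmax layer standing in for the deep network; this is an illustrative simplification, not the patent's actual model architecture.

```python
import math

def softmax(logits):
    """Convert raw class scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_step(weights, biases, x, label, lr=0.1):
    """One supervised update: forward pass, compare the predicted
    severity distribution with the known correct label, and move the
    weights toward the correct answer (cross-entropy gradient).
    Returns the cross-entropy loss for this sample."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(logits)
    for c in range(len(weights)):
        err = probs[c] - (1.0 if c == label else 0.0)
        biases[c] -= lr * err
        weights[c] = [w - lr * err * xi for w, xi in zip(weights[c], x)]
    return -math.log(probs[label])
```

Repeating this step over many labeled samples drives the loss down, which is the learning process described above.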
  • Figure 3 shows the results of verifying the performance of the learning model that classifies lung disease severity. As shown in Figure 3, the built model achieved an average micro AUROC (area under the ROC curve) and an average macro AUROC of 0.87. The classification model therefore showed significantly high performance in classifying lung disease severity.
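For reference, AUROC for a single binary split can be computed with the rank-based (Mann-Whitney) formulation below; micro and macro multi-class AUROC, as reported for Figure 3, are averages of such binary scores over binarized labels. The implementation is an illustrative sketch, not the evaluation code used in the study.

```python
def auroc(labels, scores):
    """Binary AUROC: the probability that a randomly chosen positive
    example scores higher than a randomly chosen negative example
    (ties count as half a win)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A value of 0.5 corresponds to random guessing and 1.0 to perfect separation, so 0.87 indicates strong discrimination between severity groups.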
  • FIG. 4 is an example of an analysis device 300 that classifies the severity of lung disease.
  • the analysis device 300 corresponds to the above-described analysis device (130, 140, or 150 in FIG. 1).
  • the analysis device 300 may be physically implemented in various forms.
  • the analysis device 300 may take the form of a smart device, a computer device such as a PC, a network server, a wearable device, an exercise device, or a chipset dedicated to data processing.
  • the analysis device 300 may include a storage device 310, a memory 320, an arithmetic device 330, an interface device 340, a communication device 350, and an output device 360.
  • the storage device 310 may store the above-described classification model.
  • the classification model is a pre-trained model.
  • the classification model is a model that outputs lung disease severity based on input user data (voice data and clinical information).
  • the storage device 310 can store user data.
  • User data is the user's voice data and clinical information that are subject to analysis.
  • Voice data consists of data collected before exercise and data collected after exercise.
  • Voice data may consist of the items in Table 1.
  • Clinical information may consist of the items in Table 2.
  • the memory 320 may store data and information generated when the analysis device classifies the severity of lung disease using the subject's user data.
  • the interface device 340 is a device that receives certain commands and data from the outside.
  • the interface device 340 may receive the subject's voice data from a physically connected input device or an external storage device.
  • the input device may include a device such as a microphone.
  • Voice data consists of data measured before and after exercise.
  • the interface device 340 may receive the subject's clinical information from a physically connected input device or an external storage device.
  • the interface device 340 may transmit, to an external object, the result of analyzing the subject's user data and classifying the severity of lung disease.
  • the interface device 340 may receive data or information transmitted through the communication device 350 below.
  • the communication device 350 refers to a configuration that receives and transmits certain information through a wired or wireless network.
  • the communication device 350 may receive the subject's voice data from an external object (database, user terminal, microphone, etc.).
  • the communication device 350 may receive clinical information about a subject from an external object.
  • the communication device 350 may transmit the result of analyzing the subject's user data and classifying the severity of lung disease to an external object, such as a user terminal.
  • the output device 360 is a device that outputs certain information.
  • the output device 360 can output interfaces, classification results, etc. required for the data processing process.
  • the computing device 330 may perform certain preprocessing on the user data. For example, the computing device 330 may convert voice data into a certain type of data. Additionally, the computing device 330 may normalize each value of the clinical information to a certain range.
  • the computing device 330 inputs the preprocessed user data into a pre-trained learning model.
  • the computing device 330 may classify the severity of the subject's lung disease based on the probability value output by the learning model.
  • the computing device 330 may be a device such as a processor that processes data and performs certain operations, an AP, or a chip with an embedded program.
  • the method for classifying the severity of a subject's lung disease as described above may be implemented as a program (or application) including an executable algorithm that can be executed on a computer.
  • the program may be stored and provided in a transitory or non-transitory computer-readable medium.
  • a non-transitory readable medium refers to a medium that stores data semi-permanently and can be read by a device, rather than a medium, such as a register, cache, or memory, that stores data only for a short period.
  • the various applications or programs described above may be stored and provided in non-transitory readable media such as a CD, DVD, hard disk, Blu-ray disk, USB drive, memory card, ROM (read-only memory), PROM (programmable ROM), EPROM (erasable PROM), or EEPROM (electrically erasable PROM).
  • Transitory readable media refer to various types of RAM, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synclink DRAM (SLDRAM), and direct Rambus RAM (DRRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Animal Behavior & Ethology (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Veterinary Medicine (AREA)
  • Surgery (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

This method for classifying the severity of a lung disease of a subject by using voice data and clinical information comprises steps in which an analysis device: receives voice data and clinical information about the subject; pre-processes the voice data and the clinical information; inputs the pre-processed voice data and clinical information into a pre-trained learning model; and classifies the severity of the lung disease of the subject on the basis of an output value of the learning model.

Description

Method and analysis device for classifying the severity of a subject's lung disease using voice data and clinical information
The technology described below relates to a technique for predicting the degree of lung disease using the subject's voice.
Early diagnosis of lung diseases such as COPD (chronic obstructive pulmonary disease) is important to prevent worsening. COPD can be clinically diagnosed by performing pulmonary function tests on patients with coughing, sputum production, and shortness of breath. However, because early symptoms of COPD are difficult to identify, early detection is difficult with basic diagnosis alone.
Recently, studies have been conducted to diagnose COPD using deep learning models that analyze chest CT (computed tomography) images. However, such diagnostic techniques also require chest imaging of the patient, so it is difficult for them to contribute to the early detection of lung disease.
The technology described below seeks to provide a technique for predicting the degree of lung disease, such as COPD, based on the subject's voice and clinical information.
A method of classifying the severity of a subject's lung disease using voice data and clinical information includes the steps of: an analysis device receiving the subject's voice data and clinical information; the analysis device preprocessing the voice data and clinical information; the analysis device inputting the preprocessed voice data and clinical information into a pre-trained learning model; and the analysis device classifying the severity of the subject's lung disease based on the output value of the learning model.
An analysis device that classifies the severity of a subject's lung disease includes an interface device that receives the subject's voice data and clinical information; a storage device that stores a learning model that receives voice data and clinical information and classifies the severity of lung disease; and a computing device that preprocesses the input voice data and clinical information, inputs the preprocessed voice data and clinical information into the learning model, and classifies the severity of the subject's lung disease based on the output value of the learning model.
The technology described below can predict the degree of lung disease by analyzing the user's voice and clinical information, which can be obtained relatively easily. The technology described below can diagnose the severity of lung disease through voice recording and self-diagnosis without the patient having to visit a medical institution.
Figure 1 is an example of a lung disease severity classification system using voice and clinical information.
Figure 2 is an example of the learning process of a learning model for lung disease severity classification.
Figure 3 shows the results of verifying the performance of a learning model that classifies lung disease severity.
Figure 4 is an example of an analysis device that classifies lung disease severity.
The technology described below may be subject to various changes and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail. However, this is not intended to limit the technology described below to particular embodiments, and the technology should be understood to include all changes, equivalents, and substitutes falling within its spirit and technical scope.

Terms such as first, second, A, and B may be used to describe various components, but the components are not limited by these terms; the terms are used only to distinguish one component from another. For example, without departing from the scope of the technology described below, a first component may be named a second component, and similarly a second component may be named a first component. The term "and/or" includes any one of, or any combination of, a plurality of related listed items.

In the terms used in this specification, singular expressions should be understood to include plural expressions unless the context clearly indicates otherwise, and terms such as "comprises" mean that the described features, numbers, steps, operations, components, parts, or combinations thereof exist, without excluding the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Before describing the drawings in detail, it should be clarified that the division of components in this specification is merely a division by the main function each component performs. That is, two or more components described below may be combined into a single component, or one component may be divided into two or more components by more detailed function. Each component may additionally perform some or all of the functions of other components in addition to its own main functions, and some of the main functions of each component may be carried out entirely by another component.

In performing a method or operating method, the steps constituting the method may occur in an order different from the stated order unless a specific order is clearly required by the context. That is, the steps may occur in the stated order, be performed substantially simultaneously, or be performed in the reverse order.
The technology described below predicts or classifies the severity of lung diseases such as COPD (chronic obstructive pulmonary disease) based on a subject's voice and clinical information. For convenience, the following description focuses on COPD; however, the technology can also be used to predict or classify the severity of various lung diseases other than COPD.

The user data used for the analysis include the user's voice and clinical information. The user data are collected from a specific subject, and may be collected before and after the subject performs a certain exercise. The user's voice is recorded before and after exercise, and the input variables include features extracted from those recordings. Some of the clinical information may likewise be collected before and after exercise. Furthermore, the clinical information may include questionnaire information collected from the subject. The user data are described in detail later.

In the following, the analysis device is described as classifying or predicting the degree of lung disease based on the user's voice and clinical information. The analysis device can be implemented as any of a variety of devices capable of processing data; for example, it may be implemented as a PC, a server on a network, a smart device, a wearable device, or a chipset with an embedded dedicated program. Furthermore, the analysis device may be built into various devices such as exercise equipment, vehicles, and smart speakers.

The analysis device can classify lung disease using a machine learning model. Examples of machine learning models include decision trees, random forests, K-nearest neighbors (KNN), Naive Bayes, support vector machines (SVM), and artificial neural networks (ANN). The learning model is described below mainly with reference to a deep neural network (DNN); however, the learning model for lung disease classification can be implemented as various types of models.
Figure 1 is an example of a lung disease severity classification system 100 using voice and clinical information. Figure 1 shows an example in which the analysis device is a user terminal 130, a computer terminal 140, or a server 150.

Subject A performs a certain exercise for a certain amount of time. In patients with lung disease, vocal characteristics may differ between before and after exercise. Accordingly, user data can be collected from subject A both before and after exercise.

As described above, the user data may include the subject's voice data and clinical information. The voice data consist of pre-exercise voice data and post-exercise voice data, each recorded while the same subject A utters the same words or sentences (text) before and after exercise, respectively. The clinical information may consist of various items, some of which correspond to data collected separately before and after exercise.

The database (DB) 110 may store the subject's voice data and clinical information. The database 110 may be a system such as an electronic medical record (EMR).

The user terminal 120 may receive user data from subject A. In Figure 1, the user terminal 120 is illustrated as a smart device. The user terminal 120 is a device that can record the user's voice through a microphone and receive clinical information through a certain interface device. The user terminal 120 may be any one of various types of devices, such as a smart device, a PC, a wearable device, or a smart speaker.

The user terminal 130 may receive user data from the database 110. Furthermore, the user terminal 120 and the user terminal 130 may be the same device, in which case the user terminal 130 both collects and analyzes the user data. The user terminal 130 may preprocess the subject's user data in a consistent manner. For example, the user terminal 130 may remove noise from the subject's voice data. The user terminal 130 may also convert the voice data into a representation such as a chromagram, Mel-frequency cepstral coefficients (MFCC), or a Mel spectrogram. In addition, the user terminal 130 may preprocess the clinical information by normalizing items with different ranges to a common range. The user terminal 130 may classify the severity of the subject's lung disease by inputting the user data into a previously built learning model. User A can check the degree of the subject's lung disease through the user terminal 130.

The computer terminal 140 receives user data from the database 110 or the user terminal 120. The computer terminal 140 may preprocess the user data in a consistent manner and may classify the severity of the subject's lung disease by inputting the user data into a previously built learning model. User B can check the degree of the subject's lung disease through the computer terminal 140.

The server 150 receives user data from the database 110 or the user terminal 120. The server 150 may preprocess the subject's user data in a consistent manner and may classify the severity of the subject's lung disease by inputting the user data into a previously built learning model. User A can access the server 150 through a user terminal to check the degree of the subject's lung disease.
Figure 2 is an example of the training process 200 of a learning model for lung disease severity classification. The learning model may be any one of various types; in Figure 2, a deep learning model is shown as an example. The learning model that classifies lung disease severity may be called a classification model. The classification model is built using training data, and its training process can be performed by a training device, that is, a computing device that processes digital data and controls the training of the deep learning model.

The training device constructs the training data (210). Training data can be collected from groups of differing lung disease severity; for example, training data may be collected from a normal group, a severity-1 group, ..., and a severity-n group. Lung disease severity can be determined based on FEV1 (forced expiratory volume in one second), the volume of air expelled from the lungs during the first second of a forced exhalation. If a patient's FEV1 is lower than a threshold (for example, the average of the entire population), the patient can be classified as a COPD patient; if the patient's FEV1 is at or above the threshold, the patient can be classified as low severity. In this case, subjects can thus be classified as normal, low-severity lung disease patients, or high-severity lung disease patients. The training data include clinical information and voice data for each group, as well as a label value for each training sample. The voice data are collected separately before and after a certain exercise, and may be collected while the subjects utter the same sentence.
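As one illustration of the labeling rule just described, the grouping by FEV1 can be sketched in Python. The three-class scheme and the threshold value of 50 (matching the FEV1 < 50 / FEV1 ≥ 50 split used in the evaluation described later) are assumptions for illustration; the actual cutoff is a design choice.

```python
def label_severity(fev1_percent, is_patient, threshold=50):
    """Assign a severity label from FEV1 (% of predicted value).

    Returns 0 for a normal subject, 1 for low-severity lung disease
    (FEV1 at or above the threshold) and 2 for high-severity lung
    disease (FEV1 below the threshold).
    """
    if not is_patient:
        return 0  # normal group
    return 2 if fev1_percent < threshold else 1

# A patient with FEV1 = 42% predicted falls in the high-severity group.
print(label_severity(42, is_patient=True))   # 2
print(label_severity(65, is_patient=True))   # 1
print(label_severity(95, is_patient=False))  # 0
```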
The voice data may consist of the items shown in Table 1 below. The training device can extract the 32 features listed in Table 1 from the voice signal. Alternatively, the voice data may consist of any subset of a plurality of the items in Table 1.
[Table 1]
1. Number of silent sections before exercise
2. Number of silent sections after exercise
3. Difference in the number of silent sections before and after exercise
4. Ratio of the number of silent sections before and after exercise
5. Length of silent sections before exercise
6. Length of silent sections after exercise
7. Difference in the length of silent sections before and after exercise
8. Ratio of the length of silent sections before and after exercise
9. Total length of the pre-exercise recording
10. Total length of the post-exercise recording
11. Difference in total recording length before and after exercise
12. Ratio of total recording length before and after exercise
13. Ratio of silent-section length to total recording length before exercise
14. Ratio of silent-section length to total recording length after exercise
15. Difference, before versus after exercise, in the ratio of silent-section length to total recording length
16. Ratio, before versus after exercise, of the ratio of silent-section length to total recording length
17. Jitter before exercise
18. Jitter after exercise
19. Shimmer before exercise
20. Shimmer after exercise
21. HNR (harmonic-to-noise ratio) before exercise
22. HNR after exercise
23. Formant before exercise
24. Formant after exercise
25. Speech rate before exercise
26. Speech rate after exercise
27. f0 (fundamental frequency) before exercise
28. f0 (fundamental frequency) after exercise
29. Articulation rate before exercise
30. Articulation rate after exercise
31. Syllable duration before exercise
32. Syllable duration after exercise
Among the features of the voice signal, the number of silent sections, the length of the silent sections, the recording length, and the ratio of silent sections to recording length are expected to increase significantly after exercise, relative to before exercise, in the COPD patient group compared with the normal group. When the text that the user is to read is set in advance, the training device can extract the silent sections and speech sections from the recording using a speech recognition tool.

A silent section is defined as a section in which a signal with an amplitude level of -36 dBFS (decibels relative to full scale) or lower lasts for 200 ms or more.
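The silent-section definition above (amplitude at or below -36 dBFS, sustained for at least 200 ms) can be applied directly to a sequence of audio samples. The sample-by-sample scan below is a minimal illustrative sketch, not the patent's implementation:

```python
def count_silent_sections(samples, sample_rate, threshold_dbfs=-36.0, min_ms=200):
    """Count silent sections: runs of samples at or below the dBFS
    threshold lasting at least min_ms milliseconds.

    `samples` are floats in [-1.0, 1.0]; full scale (0 dBFS) is 1.0.
    """
    # Convert the dBFS threshold to a linear amplitude: 10^(dB/20).
    linear = 10.0 ** (threshold_dbfs / 20.0)
    min_samples = int(sample_rate * min_ms / 1000.0)

    sections, run = 0, 0
    for s in samples:
        if abs(s) <= linear:
            run += 1          # extend the current quiet run
        else:
            if run >= min_samples:
                sections += 1  # the run was long enough to count
            run = 0
    if run >= min_samples:     # recording may end in silence
        sections += 1
    return sections

# 1 s of silence, a loud burst, then 0.3 s of silence, at 1 kHz sampling:
audio = [0.0] * 1000 + [0.5] * 100 + [0.0] * 300
print(count_silent_sections(audio, sample_rate=1000))  # 2
```

The related features of Table 1 (silent-section length, and its ratio to the recording length) follow from the same run-length scan by accumulating durations instead of counts.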
Jitter is a measure of how constant the period of vocal-fold vibration is; the more irregular the period, the larger the value.

Shimmer is a measure of how constant the amplitude of vocal-fold vibration is; the more irregular the amplitude, the larger the value.
A formant is a resonance that occurs in the vocal tract (the space extending past the pharynx and oral cavity to the nasal cavity and lips).

HNR (harmonic-to-noise ratio) is the average ratio between the harmonics present between 70 and 4,500 Hz and the abnormal harmonics present between 1,500 and 4,500 Hz; here, the larger the value, the higher the proportion of noise.

Speech rate refers to the number of words per minute in the speech.

f0 (fundamental frequency) is the frequency of vocal-fold vibration and perceptually corresponds to pitch.

Articulation rate is the number of syllables per second in the speech.

Syllable duration refers to the duration of a syllable.
The training device can extract jitter, shimmer, formants, HNR, speech rate, f0, articulation rate, and syllable duration using publicly available speech analysis software.
The clinical information may consist of the 31 items shown in Table 2 below, which include self-reported questionnaire variables. Some of the clinical information may be collected through wearable devices, sensor devices, and the like. Alternatively, the clinical information may consist of any subset of a plurality of the items in Table 2.
[Table 2]
1. Gender
2. Age
3. Height
4. Weight
5. BMI (body mass index)
6. Number of sit-to-stand repetitions
7. Resting SpO2 (blood oxygen saturation)
8. Post-exercise SpO2 (blood oxygen saturation)
9. Resting heart rate
10. Post-exercise heart rate
11. Rating of perceived exertion (degree of dyspnea after the sit-to-stand test, on a scale of 1 to 3: 1 = fine, 2 = slightly out of breath, 3 = out of breath)
12. Degree of dyspnea (on a scale of 0 to 4: 0 = normal, 4 = most severe)
13. I never cough / I cough (on a scale of 0 to 5: 0 = not at all, 5 = severe)
14. I have no phlegm in my chest at all / I have phlegm in my chest
15. I do not feel any chest tightness at all / I feel chest tightness
16. I am not out of breath at all when climbing hills or stairs / I am out of breath
17. My activities at home are not limited at all / are limited
18. I am confident going out despite my lung disease / I am not confident
19. I sleep deeply / I do not sleep deeply
20. I have plenty of energy / I have no energy
21. Sum of the above eight items (0-40)
22. Total amount smoked over your lifetime (1 = never smoked or fewer than 5 packs in lifetime, 2 = 5 packs or more in lifetime)
23. Smoking cessation status (1 = quit (currently not smoking at all), 2 = currently smoking (marked as smoking if not yet completely quit, even if attempting to quit))
24. Period since quitting smoking (1 = quit within the last month, 2 = quit within the last year, 3 = quit more than a year ago)
25. Number of cigarettes smoked per day ((x) cigarettes per day)
26. If currently smoking, have you quit for at least one day (24 hours) in the past year in an attempt to stop? (1 = no, 2 = yes (attempted to quit for at least one day in the past year))
27. Age at which you started smoking (about x years old)
28. Number of cigarettes smoked per day while smoking (20 cigarettes = 1 pack; an average of x cigarettes per day)
29. Total period of smoking, including, for those who quit, the period smoked before quitting (a total of x months)
30. Smoking cessation period (quit for about x months)
31. Education level (1 = elementary school or lower, 2 = middle school, 3 = high school, 4 = junior college/4-year university, 5 = graduate school or higher, 6 = other)
The training device may preprocess the initial training data in a consistent manner. Preprocessing of the voice data may include noise removal, data type conversion, and the like. Preprocessing of the clinical information may include adjusting values into a certain range; for example, the training device can normalize the clinical information using preprocessing techniques such as min-max normalization or z-score normalization.
In addition, the training device can convert categorical clinical information values into fixed-length representations by one-hot encoding, and can input the encoded clinical information into the learning model.
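A one-hot encoding of a categorical clinical item, for example the education-level codes 1 to 6 of Table 2, could look like the following sketch; the category list is an assumption for illustration:

```python
def one_hot(value, categories):
    """Encode a categorical value as a one-hot vector over `categories`."""
    vec = [0] * len(categories)
    vec[categories.index(value)] = 1
    return vec

# Education level coded 1..6 (see Table 2): level 3 (high school)
# becomes a 6-dimensional vector with a single 1 in the third position.
print(one_hot(3, categories=[1, 2, 3, 4, 5, 6]))  # [0, 0, 1, 0, 0, 0]
```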
The training device treats the 32 voice variables and the 31 clinical information items as individual input variables, constructing a total of 63 input variables as the training data.
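Concatenating the two feature groups into a single 63-dimensional model input can be sketched as follows; the helper name is a placeholder, not the patent's terminology:

```python
def build_input_vector(voice_features, clinical_features):
    """Concatenate the 32 voice features (Table 1) and the 31 clinical
    features (Table 2) into one 63-dimensional input vector."""
    assert len(voice_features) == 32, "expected 32 voice variables (Table 1)"
    assert len(clinical_features) == 31, "expected 31 clinical variables (Table 2)"
    return list(voice_features) + list(clinical_features)

x = build_input_vector([0.0] * 32, [0.0] * 31)
print(len(x))  # 63
```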
The training device builds the classification model using the training data (220). The training device extracts one input sample from the collected training data and inputs it into the classification model, which outputs probability values for lung disease severity for that input. The training device compares the value output by the classification model with the known correct answer (label value) and updates the weights of the classification model so that it outputs the label corresponding to the correct answer. The training device repeats this process over a large number of training samples.
The researchers built and verified the classification model described above using data from 248 subjects collected at their institution. The data from the 248 subjects were split 4:1 and used as training data and validation data, respectively. The data consisted of 54 high-severity COPD cases (FEV1 < 50), 144 low-severity cases (FEV1 ≥ 50), and 50 normal cases. The researchers built several machine learning models: a multi-layer perceptron (MLP), a random forest, an extra-trees classifier, XGBoost, and LightGBM. Comparing the performance of these models, the random forest showed the highest performance. Figure 3 shows the results of verifying the performance of the learning model that classifies lung disease severity. As shown in Figure 3, the model achieved an average micro AUROC (area under the ROC curve) and an average macro AUROC of 0.87. The classification model therefore showed considerably high performance in classifying lung disease severity.
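The AUROC figures quoted above can be computed, for one class in a one-vs-rest setting, with the rank-based (Mann-Whitney) formulation sketched below; micro/macro averaging over the three severity classes then proceeds over such per-class scores. This is a generic sketch, not the researchers' evaluation code:

```python
def auroc(labels, scores):
    """Area under the ROC curve for binary labels (1 = positive),
    via the rank-sum (Mann-Whitney U) formulation, with tie handling."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):  # assign average ranks to tied scores
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

# Perfectly separated scores give AUROC = 1.0; chance level is 0.5.
print(auroc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9]))  # 1.0
```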
Figure 4 is an example of an analysis device 300 that classifies lung disease severity. The analysis device 300 corresponds to the analysis device described above (130, 140, or 150 in Figure 1). The analysis device 300 can be physically implemented in various forms; for example, it may take the form of a smart device, a computer device such as a PC, a network server, a wearable device, exercise equipment, or a chipset dedicated to data processing.

The analysis device 300 may include a storage device 310, a memory 320, a computing device 330, an interface device 340, a communication device 350, and an output device 360.
The storage device 310 may store the classification model described above. The classification model is a pre-trained model that outputs lung disease severity based on the input user data (voice data and clinical information).

The storage device 310 may also store the user data, that is, the voice data and clinical information of the user to be analyzed. The voice data consist of data collected before exercise and data collected after exercise, and may consist of the items in Table 1. The clinical information may consist of the items in Table 2.
The memory 320 may store data and information generated while the analysis device classifies lung disease severity using the subject's user data.

The interface device 340 receives certain commands and data from the outside.

The interface device 340 may receive the subject's voice data from a physically connected input device or an external storage device. The input device may include a device such as a microphone. The voice data consist of data measured before and after exercise, respectively.

The interface device 340 may receive the subject's clinical information from a physically connected input device or an external storage device.

The interface device 340 may also transmit, to an external object, the result of analyzing the subject's user data and classifying the severity of lung disease.

Meanwhile, the interface device 340 may also receive data or information transmitted via the communication device 350 described below.
The communication device 350 refers to a component that receives and transmits certain information through a wired or wireless network.

The communication device 350 may receive the subject's voice data from an external object (a database, a user terminal, a microphone, or the like).

The communication device 350 may receive the subject's clinical information from an external object.

The communication device 350 may also transmit, to an external object such as a user terminal, the result of analyzing the subject's user data and classifying the severity of lung disease.

The output device 360 is a device that outputs certain information, such as the interfaces required during data processing and the classification results.
The computing device 330 may preprocess the user data in a consistent manner. For example, the computing device 330 may convert the voice data into a certain type of data, and may normalize each value of the clinical information to a certain range.

The computing device 330 inputs the preprocessed user data into the pre-trained learning model and can classify the severity of the subject's lung disease based on the probability values output by the learning model.
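Mapping the model's output probabilities to a severity class can be as simple as taking the most probable class and reporting its probability as a confidence; a minimal sketch, with class names assumed for illustration:

```python
SEVERITY_CLASSES = ["normal", "low severity", "high severity"]  # assumed labels

def classify_severity(probabilities):
    """Pick the severity class with the highest model output probability."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return SEVERITY_CLASSES[best], probabilities[best]

label, confidence = classify_severity([0.10, 0.25, 0.65])
print(label)  # high severity
```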
The computing device 330 may be a device that processes data and performs computations, such as a processor, an AP (application processor), or a chip with an embedded program.
Furthermore, the method for classifying the severity of a subject's lung disease described above may be implemented as a program (or application) containing an executable algorithm that can run on a computer. The program may be stored and provided on a transitory or non-transitory computer-readable medium.
A non-transitory readable medium is a medium that stores data semi-permanently and can be read by a device, as opposed to a medium that stores data only briefly, such as a register, cache, or memory. Specifically, the various applications or programs described above may be stored and provided on a non-transitory readable medium such as a CD, DVD, hard disk, Blu-ray disc, USB drive, memory card, ROM (read-only memory), PROM (programmable read-only memory), EPROM (erasable PROM), EEPROM (electrically erasable PROM), or flash memory.
Transitory readable media refers to various kinds of RAM, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double-data-rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), SyncLink DRAM (SLDRAM), and direct Rambus RAM (DRRAM).
This embodiment and the drawings attached to this specification merely illustrate a portion of the technical ideas included in the technology described above, and it is self-evident that all variations and specific embodiments that a person skilled in the art could readily derive within the scope of the technical ideas contained in the specification and drawings fall within the scope of rights of the aforementioned technology.

Claims (8)

  1. A method of classifying the severity of a subject's lung disease using voice data and clinical information, the method comprising: receiving, by an analysis device, the subject's voice data and clinical information;
    preprocessing, by the analysis device, the voice data and the clinical information;
    inputting, by the analysis device, the preprocessed voice data and clinical information into a pre-trained learning model; and
    classifying, by the analysis device, the severity of the subject's lung disease based on an output value of the learning model.
  2. The method of claim 1,
    wherein the voice data includes voice data in which the subject utters a given text before exercise and voice data in which the subject utters the same text after the exercise.
  3. The method of claim 1,
    wherein the voice data includes a plurality of items among: the number of silent sections before exercise; the number of silent sections after exercise; the difference in the number of silent sections before and after exercise; the ratio of the number of silent sections before and after exercise; the length of silent sections before exercise; the length of silent sections after exercise; the difference in silent-section length before and after exercise; the ratio of silent-section length before and after exercise; the total length of the recording before exercise; the total length of the recording after exercise; the difference in total recording length before and after exercise; the ratio of total recording length before and after exercise; the ratio of silent-section length to total recording length before exercise; the ratio of silent-section length to total recording length after exercise; the difference between those ratios before and after exercise; the ratio between those ratios before and after exercise; jitter before exercise; jitter after exercise; shimmer before exercise; shimmer after exercise; HNR (harmonic-to-noise ratio) before exercise; HNR after exercise; formants before exercise; formants after exercise; speech rate before exercise; speech rate after exercise; f0 (fundamental frequency) before exercise; f0 after exercise; articulation rate before exercise; articulation rate after exercise; syllable duration before exercise; and syllable duration after exercise.
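The pre/post-exercise silence measures recited above can be sketched with a simple frame-energy threshold. This is only an illustrative sketch under assumed parameters (25 ms frames, a fixed energy threshold); the patent does not specify how silent sections are detected.

```python
import numpy as np

def silence_stats(signal, sr, frame_ms=25, thresh=0.01):
    """Count silent runs and total silence length via a frame-energy threshold."""
    frame = int(sr * frame_ms / 1000)
    n_frames = len(signal) // frame
    energies = np.array([np.mean(signal[i * frame:(i + 1) * frame] ** 2)
                         for i in range(n_frames)])
    silent = energies < thresh
    # A silent run starts at frame 0 if silent, or at a loud-to-silent transition.
    runs = int(np.sum(silent[1:] & ~silent[:-1])) + int(silent[0])
    total_silence = silent.sum() * frame / sr       # seconds of silence
    total_length = len(signal) / sr                 # seconds of recording
    return runs, total_silence, total_length

def pre_post_features(pre, post):
    """Differences and ratios of silence statistics before vs. after exercise."""
    return {
        "silence_count_diff": post[0] - pre[0],
        "silence_count_ratio": post[0] / pre[0] if pre[0] else float("inf"),
        "silence_len_diff": post[1] - pre[1],
        "total_len_ratio": post[2] / pre[2],
    }
```

Jitter, shimmer, HNR, and formants are conventionally extracted with an acoustic-analysis toolkit (e.g., Praat); the sketch above covers only the silence-based items.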
  4. The method of claim 1,
    wherein the clinical information includes gender, age, height, weight, BMI (body mass index), oxygen saturation, heart rate, rating of perceived exertion, degree of dyspnea, presence of cough, smoking-cessation status, smoking-cessation period, and smoking period.
  5. An analysis device for classifying the severity of a subject's lung disease, comprising: an interface device that receives the subject's voice data and clinical information;
    a storage device that stores a learning model that receives voice data and clinical information and classifies the severity of lung disease; and
    a computing device that preprocesses the input voice data and clinical information, inputs the preprocessed voice data and clinical information into the learning model, and classifies the severity of the subject's lung disease based on an output value of the learning model.
  6. The analysis device of claim 5,
    wherein the voice data includes voice data in which the subject utters a given text before exercise and voice data in which the subject utters the same text after the exercise.
  7. The analysis device of claim 5,
    wherein the voice data includes a plurality of items among: the number of silent sections before exercise; the number of silent sections after exercise; the difference in the number of silent sections before and after exercise; the ratio of the number of silent sections before and after exercise; the length of silent sections before exercise; the length of silent sections after exercise; the difference in silent-section length before and after exercise; the ratio of silent-section length before and after exercise; the total length of the recording before exercise; the total length of the recording after exercise; the difference in total recording length before and after exercise; the ratio of total recording length before and after exercise; the ratio of silent-section length to total recording length before exercise; the ratio of silent-section length to total recording length after exercise; the difference between those ratios before and after exercise; the ratio between those ratios before and after exercise; jitter before exercise; jitter after exercise; shimmer before exercise; shimmer after exercise; HNR (harmonic-to-noise ratio) before exercise; HNR after exercise; formants before exercise; formants after exercise; speech rate before exercise; speech rate after exercise; f0 (fundamental frequency) before exercise; f0 after exercise; articulation rate before exercise; articulation rate after exercise; syllable duration before exercise; and syllable duration after exercise.
  8. The analysis device of claim 5,
    wherein the clinical information includes gender, age, height, weight, BMI (body mass index), oxygen saturation, heart rate, rating of perceived exertion, degree of dyspnea, presence of cough, smoking-cessation status, smoking-cessation period, and smoking period.
PCT/KR2023/013863 2022-09-16 2023-09-15 Method and analysis device for classifying severity of lung disease of subject by using voice data and clinical information WO2024058585A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR10-2022-0117052 2022-09-16
KR20220117052 2022-09-16
KR10-2023-0122823 2023-09-15
KR1020230122823A KR20240038622A (en) 2022-09-16 2023-09-15 Classification method for severity of pulmonary disease based on vocal data and clinical information and analysis apparatus

Publications (1)

Publication Number Publication Date
WO2024058585A1 true WO2024058585A1 (en) 2024-03-21

Family

ID=90275303

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2023/013863 WO2024058585A1 (en) 2022-09-16 2023-09-15 Method and analysis device for classifying severity of lung disease of subject by using voice data and clinical information

Country Status (1)

Country Link
WO (1) WO2024058585A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20150127380A (en) * 2014-05-07 2015-11-17 한국 한의학 연구원 Apparatus and method for diagnosis of physical conditions using phonetic analysis
JP2018516616A (en) * 2015-04-16 2018-06-28 コーニンクレッカ フィリップス エヌ ヴェKoninklijke Philips N.V. Device, system and method for detecting heart and / or respiratory disease in a subject
JP2018534026A (en) * 2015-10-08 2018-11-22 コルディオ メディカル リミテッド Evaluation of lung diseases by speech analysis
US20210076977A1 (en) * 2017-12-21 2021-03-18 The University Of Queensland A method for analysis of cough sounds using disease signatures to diagnose respiratory diseases
JP2022100317A (en) * 2019-03-11 2022-07-05 株式会社RevComm Information processing apparatus

Similar Documents

Publication Publication Date Title
US11810670B2 (en) Intelligent health monitoring
US20200388287A1 (en) Intelligent health monitoring
US10223934B2 (en) Systems and methods for expressive language, developmental disorder, and emotion assessment, and contextual feedback
Alaie et al. Cry-based infant pathology classification using GMMs
Shi et al. Theory and application of audio-based assessment of cough
Muhammad et al. Convergence of artificial intelligence and internet of things in smart healthcare: a case study of voice pathology detection
Stasak et al. Automatic detection of COVID-19 based on short-duration acoustic smartphone speech analysis
Romero et al. Deep learning features for robust detection of acoustic events in sleep-disordered breathing
Vatanparvar et al. CoughMatch–subject verification using cough for personal passive health monitoring
Simply et al. Diagnosis of obstructive sleep apnea using speech signals from awake subjects
Usman et al. Heart rate detection and classification from speech spectral features using machine learning
Blanco et al. Improving automatic detection of obstructive sleep apnea through nonlinear analysis of sustained speech
Ding et al. Severity evaluation of obstructive sleep apnea based on speech features
Popadina et al. Voice analysis framework for asthma-COVID-19 early diagnosis and prediction: AI-based mobile cloud computing application
JP2023531464A (en) A method and system for screening for obstructive sleep apnea during wakefulness using anthropometric information and tracheal breath sounds
WO2024058585A1 (en) Method and analysis device for classifying severity of lung disease of subject by using voice data and clinical information
Romero et al. Snorer diarisation based on deep neural network embeddings
WO2023058946A1 (en) System and method for predicting respiratory disease prognosis through time-series measurements of cough sounds, respiratory sounds, recitation sounds and vocal sounds
Dubnov Signal analysis and classification of audio samples from individuals diagnosed with COVID-19
KR20230050208A (en) Respiratory disease prognosis prediction system and method through time-series cough sound, breathing sound, reading sound or vocal sound measurement
KR20240038622A (en) Classification method for severity of pulmonary disease based on vocal data and clinical information and analysis apparatus
Kim et al. Non-invasive way to diagnose dysphagia by training deep learning model with voice spectrograms
Dutta et al. A Fine-Tuned CatBoost-Based Speech Disorder Detection Model
Xu et al. A Review of Disorder Voice Processing Toward to Applications
Chudasama et al. Voice Based Pathology Detection from Respiratory Sounds using Optimized Classifiers

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23865881

Country of ref document: EP

Kind code of ref document: A1