TWI798111B - Cough identification method and system thereof

Info

Publication number: TWI798111B
Application number: TW111122515A
Authority: TW
Inventors: 盧沛怡; 洪淑惠; 芮嘉勇; 郭漢彬; 洪宗杰
Original assignee: 財團法人國家實驗研究院
Priority date: 2022-06-16
Filing date: 2022-06-16
Publication date: 2023-04-01
Also published as: TW202401456A

Abstract

The present invention relates to a cough identification method and a system thereof. The identification method comprises the following steps: inputting a plurality of training audios and storing them in a storage device; converting the plurality of training audios into a plurality of audio signals and inputting them into a personal audio feature extraction module, which performs convolutional neural network operations to establish a personal audio feature model; and inputting the personal audio features obtained from the personal audio feature model, together with cough audio, into a cough audio analysis and identification module, which performs convolutional neural network operations to establish a cough audio analysis and identification model. When performing disease identification, audio is first input into the personal audio feature model to obtain personal audio features; real-time cough audio and the personal audio features are then input into the cough audio analysis and identification model to identify the corresponding respiratory disease.

Description

Cough sound recognition method and system

The present invention relates to a cough recognition method and a system thereof, and in particular to a recognition method and system that use convolutional neural network (CNN) operations to analyze an individual's voice and cough, and thereby correctly identify the respiratory disease corresponding to the cough.

Cough is a common respiratory symptom caused by inflammation, foreign bodies, or physical or chemical irritation of the trachea, bronchial mucosa, or pleura. Cough is a physiological manifestation of various cough-related diseases, and different diseases exhibit different cough characteristics.

In medicine, an experienced doctor can diagnose cough-related diseases from the characteristics of a patient's cough sound. Common cough-related diseases and their characteristics include:

1. Purely dry or purely wet cough - postnasal drip syndrome.

2. Dry cough ending with a wheeze - asthma.

3. High-pitched, croupy cough - acute laryngitis.

4. Cough with a clicking sound - chronic obstructive pulmonary disease (COPD).

5. Weak but rapid dry cough - pneumonia.

6. Spasmodic dry cough - pertussis.

7. Single cough - upper respiratory tract inflammation.

Because judging the type of respiratory disease from cough audio requires accumulated experience, generally only experienced doctors can make an accurate judgment; people with little or no experience cannot determine the corresponding respiratory disease from cough audio.

In view of this, establishing a technology that requires no manual identification and can directly and correctly identify the respiratory disease corresponding to a cough sound is a goal the medical industry hopes to achieve. The inventors of the present invention have therefore conceived and designed a cough recognition method and a system thereof to remedy the deficiencies of the prior art and thereby enhance industrial applicability.

In view of the above problems of the prior art, an object of the present invention is to provide a cough recognition method and a system thereof, so as to solve the problems that conventional manual interpretation is insufficiently accurate and difficult to automate.

According to one object of the present invention, a cough recognition method is provided, comprising the following steps. Step S1: inputting a plurality of training audios and a corresponding plurality of training cough audios through an input device, and storing them in a storage device. Step S2: accessing the storage device with a processor and converting the plurality of training audios into a plurality of audio signals. Step S3: inputting, by the processor, the plurality of audio signals into a personal audio feature extraction module to perform convolutional neural network operations, so as to establish a personal audio feature model and obtain a plurality of personal audio features. Step S4: inputting, by the processor, the plurality of personal audio features together with the corresponding plurality of training cough audios into a cough audio analysis and identification module to perform convolutional neural network operations, so as to establish a cough audio analysis and identification model. Step S5: inputting a personal audio to be identified and the corresponding cough audio to be identified through the input device, and performing an interpretation procedure with the processor to determine the corresponding type of respiratory disease according to the personal audio feature model and the cough audio analysis and identification model. Step S6: accessing the storage device through an output device and outputting the interpreted type of respiratory disease.

According to another object of the present invention, a cough recognition system is provided, comprising an input device, a storage device, a processor, and an output device. The input device is used to input a plurality of training audios and a corresponding plurality of training cough audios, as well as a personal audio to be identified and the corresponding cough audio to be identified. The storage device is connected to the input device and the output device and stores the plurality of training audios and the corresponding plurality of training cough audios, the personal audio to be identified, and the cough audio to be identified. The output device is connected to the storage device and outputs the interpreted type of respiratory disease. The processor is connected to the storage device and executes a plurality of instructions to perform the following steps: converting the plurality of training audios into a plurality of audio signals and inputting the plurality of audio signals into a personal audio feature extraction module to perform convolutional neural network operations, so as to establish a personal audio feature model and obtain a plurality of personal audio features; inputting the plurality of personal audio features together with the corresponding plurality of training cough audios (likewise converted into a plurality of training cough audio signals) into a cough audio analysis and identification module to perform convolutional neural network operations, so as to establish a cough audio analysis and identification model; and interpreting the personal audio to be identified and the corresponding cough audio according to the personal audio feature model and the cough audio analysis and identification model, so as to determine the corresponding type of respiratory disease.

Preferably, the plurality of audio signals and the plurality of training cough audio signals may be Mel-frequency cepstral coefficients (MFCCs). MFCCs are the set of key coefficients that make up the Mel-frequency cepstrum: from a segment of a sound signal, a cepstrum sufficient to represent that signal can be obtained, and the MFCCs are the coefficients derived from that cepstrum. Unlike the ordinary cepstrum, the frequency bands of the Mel-frequency cepstrum are evenly spaced on the Mel scale, which approximates the nonlinear human auditory system more closely than the linearly spaced bands of the ordinary cepstrum. For example, the Mel-frequency cepstrum is often used in audio compression.
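The statement that Mel-frequency bands are evenly spaced on the Mel scale can be illustrated with a small sketch. The specification does not give a Hz-to-Mel formula; the code below assumes the widely used convention mel = 2595 * log10(1 + f / 700), and the function names, frequency range, and band count are hypothetical choices for illustration only.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert Hz to mels (assumed convention, not taken from the patent)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min: float, f_max: float, n_bands: int) -> list:
    """Return n_bands + 2 edge frequencies in Hz, evenly spaced in mels."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_bands + 1)
    return [mel_to_hz(m_min + i * step) for i in range(n_bands + 2)]


# Hypothetical example: 10 bands between 0 Hz and 8 kHz.
edges = mel_band_edges(0.0, 8000.0, 10)
# Gaps between neighbouring edges, in Hz.
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
```

Because the spacing is uniform in mels rather than in Hz, the entries of `widths` grow with frequency: the bands are narrow at low frequencies and wide at high frequencies, which is the nonlinear behaviour the paragraph above attributes to human hearing.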

As described above, the cough recognition method and system of the present invention make it possible to determine quickly and conveniently the type of respiratory disease a patient is suffering from. Identifying the type of respiratory disease in this way can assist clinical interpretation and improve the accuracy of subsequent diagnosis.

1: Personal audio to be identified

2: Cough audio to be identified

3: Personal audio feature model

4: Conversion into cough audio signal to be identified

5: Cough audio analysis and identification model

6: Type of respiratory disease

7: Training audio

8: Training cough audio

9: Conversion into training cough audio signal

10: Plurality of convolutional layers

11: Plurality of long short-term memory layers

12: Fully connected layer

13: Plurality of fully connected layers

20: Cough recognition system

21: Input device

22: Storage device

23: Processor

24: Output device

A: Audio input

B: Feature extraction and analysis and identification

C: Cough audio analysis and identification module

S1~S6: Steps

To make the technical features, content, advantages, and achievable effects of the present invention more apparent, the invention is described in detail below by way of embodiments with reference to the accompanying drawings: FIG. 1 is a flowchart of the steps of the cough recognition method according to an embodiment of the present invention; FIG. 2 is a block diagram of the cough recognition method according to an embodiment of the present invention; FIG. 3 is a schematic diagram of the personal audio feature extraction module training the personal audio feature model according to an embodiment of the present invention; FIG. 4 is a schematic diagram of the cough audio analysis and identification module training the cough audio analysis and identification model according to an embodiment of the present invention; FIG. 5 is a schematic diagram of the cough recognition system according to an embodiment of the present invention.

To help the Examiner understand the technical features, content, advantages, and achievable effects of the present invention, the invention is described in detail below by way of embodiments with reference to the accompanying drawings. The drawings are intended only for illustration and to support the specification; they do not necessarily reflect the true proportions and precise configuration of the invention as implemented. The proportions and configuration shown in the drawings should therefore not be used to interpret or limit the scope of the invention in actual implementation.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the meanings commonly understood by a person of ordinary skill in the art to which the present invention belongs. It will be further understood that terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the relevant art and the present invention, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

Please refer to FIG. 1, FIG. 2, and FIG. 4 together. FIG. 1 is a flowchart of the steps of the cough recognition method according to an embodiment of the present invention; FIG. 2 is a block diagram of the cough recognition method according to the embodiment; and FIG. 4 is a schematic diagram of the cough audio analysis and identification module training the cough audio analysis and identification model according to the embodiment. As shown in FIG. 1, the cough recognition method includes the following steps (S1 to S6):

Step S1: Input a plurality of training audios 7 and a corresponding plurality of training cough audios 8 through an input device, and store them in a storage device.

The plurality of training audios 7 and the corresponding plurality of training cough audios 8 are input through the input device into the storage device of the system. The input device here is an audio capture device such as a microphone, or an electronic device with an audio capture function such as a smartphone, tablet computer, notebook computer, or camera, but is not limited thereto; any device capable of capturing audio can serve as the input device.

Step S2: Access the storage device with a processor and convert the plurality of training audios 7 into a plurality of audio signals.

This step converts the training audios 7 into a specific form of audio signal, preferably Mel-frequency cepstral coefficients. Because the Mel-frequency cepstrum approximates the nonlinear human auditory system more closely, using it as the basis for subsequently establishing the personal audio feature model gives a more pronounced effect.

Step S3: Input, by the processor, the plurality of audio signals into the personal audio feature extraction module to perform convolutional neural network operations, so as to establish a personal audio feature model 3 and obtain a plurality of personal audio features.

The personal audio feature extraction module includes a plurality of convolutional layers, a plurality of long short-term memory (LSTM) layers, and a plurality of fully connected layers, and each layer includes an activation function.

Through the personal audio feature extraction module, audio from different people is input and subjected to convolutional neural network operations, so that the audio signals are mapped to a high-dimensional continuous feature space (latent space). This latent space is the personal audio feature model 3; it contains a plurality of high-dimensional vectors (latent vectors), which are the personal audio features of different people. When the model is trained through convolutional neural network operations, a distance measure such as, but not limited to, the Euclidean distance may be used to minimize the distance between the high-dimensional vectors obtained from different audios of the same person while simultaneously maximizing the distance between the audio features of different people.
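The distance-based training objective described above can be sketched as a pairwise loss. The specification only states that a measure such as the Euclidean distance may be used to pull same-person vectors together and push different-person vectors apart; the margin-based contrastive form below is one common way to express that idea and is an assumption, as are the function names, the `margin` parameter, and the toy two-dimensional vectors.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two latent vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, same_person, margin=1.0):
    """Assumed pairwise loss, not the patent's stated formula.

    Same-person pairs are penalised by their squared distance, so training
    pulls them together. Different-person pairs are penalised only while
    they are closer than `margin`, so training pushes them apart.
    """
    d = euclidean(u, v)
    if same_person:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical latent vectors: two audios from person A, one from person B.
a1, a2, b1 = [0.9, 0.1], [1.0, 0.0], [0.0, 1.0]
loss_same = contrastive_loss(a1, a2, same_person=True)   # small: already close
loss_diff = contrastive_loss(a1, b1, same_person=False)  # zero: beyond margin
```

In an actual training loop this quantity would be averaged over many pairs and minimised with respect to the network weights; the sketch only shows how the two goals combine into a single number.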

Please also refer to FIG. 3, which is a schematic diagram of the personal audio feature extraction module training the personal audio feature model 3 according to an embodiment of the present invention. As shown in the figure, the audios A1 to An produced by individual A are n independent audios whose content need not be the same (for example, different sentences or sounds), and the personal audio feature extraction module must map A1 to An to nearby regions of the high-dimensional continuous feature space. Similarly, the m audios B1 to Bm provided by another individual B must also be mapped to regions close to one another. In addition, during training of the personal audio feature extraction module, an additional loss function is used to maximize the difference between the audio of different people, giving higher discrimination than conventional recognition methods. The model is trained on audio from a plurality of different people, each of whom provides a plurality of audios covering different content.

Step S4: Input, by the processor, the plurality of personal audio features together with the corresponding plurality of training cough audios 8 into the cough audio analysis and identification module C to perform convolutional neural network operations, so as to establish a cough audio analysis and identification model 5.

The cough audio analysis and identification module C includes a plurality of convolutional layers 10, a plurality of long short-term memory layers 11, and a plurality of fully connected layers 13 (one of which stands alone as a single fully connected layer 12), and each layer likewise includes an activation function.

Referring again to FIG. 4, the cough audio analysis and identification module C takes two kinds of input data. One is the plurality of personal audio features obtained from the personal audio feature model 3 established in step S3. The other is the plurality of training cough audios 8 corresponding to those personal audio features; the training cough audios 8 are likewise converted into a plurality of training cough audio signals 9, preferably Mel-frequency cepstral coefficients. That is, the personal audio feature of individual A is paired with A's cough audio signal, the personal audio feature of individual B with B's cough audio signal, and so on. The training cough audio signals 9 pass through the plurality of convolutional layers 10, the plurality of long short-term memory layers 11, and the single fully connected layer 12, and then pass, together with the plurality of personal audio features, through the remaining plurality of fully connected layers 13 for training, finally establishing a cough audio analysis and identification model 5.
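The final stage described above, in which the output of the single fully connected layer 12 is combined with the personal audio feature and passed through the remaining fully connected layers 13, can be sketched in plain Python. This is a minimal illustration under stated assumptions: the specification does not say how the two inputs are combined (concatenation is assumed here), which activation functions are used (ReLU and softmax are assumed), or what the layer sizes are, and all names and toy weights are hypothetical. The convolutional and LSTM layers are omitted; `cough_embedding` stands in for their output.

```python
import math


def dense(x, weights, bias):
    """Fully connected layer: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Assumed hidden activation; the patent does not name one."""
    return [max(0.0, v) for v in x]


def softmax(x):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(cough_embedding, personal_feature,
                      hidden_w, hidden_b, out_w, out_b):
    """Concatenate the two inputs, then apply two fully connected layers."""
    fused = list(cough_embedding) + list(personal_feature)  # assumed fusion
    hidden = relu(dense(fused, hidden_w, hidden_b))
    return softmax(dense(hidden, out_w, out_b))


# Toy sizes: 2-dim cough embedding, 2-dim personal feature, 2 output classes.
probs = fuse_and_classify(
    cough_embedding=[0.5, -1.0],
    personal_feature=[0.2, 0.1],
    hidden_w=[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1]],
    hidden_b=[0.0, 0.0, 0.0],
    out_w=[[1, 0, 0], [0, 0, 1]],
    out_b=[0.0, 0.0],
)
```

The point of the sketch is the data flow: the personal feature enters only after the cough signal has been reduced to an embedding, so it acts as side information for the last layers rather than as part of the raw cough input.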

Step S5: Input the personal audio 1 to be identified and the corresponding cough audio 2 to be identified through the input device, and perform an interpretation procedure with the processor to determine the corresponding type of respiratory disease 6 according to the personal audio feature model 3 and the cough audio analysis and identification model 5.

The personal audio 1 to be identified is input into the personal audio feature model 3 to obtain a personal audio feature. The cough audio 2 to be identified (that is, cough audio of the same person, which may be recorded in real time or pre-recorded) is converted into a cough audio signal 4 to be identified, and the personal audio feature and the cough audio signal 4 are input together into the cough audio analysis and identification model 5 for interpretation, so as to obtain the corresponding type of respiratory disease 6. Using the personal audio feature as personalized calibration information for the subsequent identification of the respiratory disease type can greatly improve the accuracy of cough-based disease identification.
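The interpretation procedure of step S5 can be outlined as a short function that wires the two trained models together. This is a sketch only: the models are passed in as plain callables, the class labels are borrowed from the disease list in the background section in a hypothetical order, converting the personal audio with the same `to_signal` step as the cough audio is an assumption, and the stub models in the usage lines exist solely so the example runs.

```python
from typing import Callable, List, Sequence

# Hypothetical label set and ordering, taken from the background list.
DISEASES: List[str] = [
    "postnasal drip syndrome", "asthma", "acute laryngitis", "COPD",
    "pneumonia", "pertussis", "upper respiratory tract inflammation",
]


def identify_disease(
    personal_audio: Sequence[float],
    cough_audio: Sequence[float],
    to_signal: Callable[[Sequence[float]], List[float]],
    feature_model: Callable[[List[float]], List[float]],
    cough_model: Callable[[List[float], List[float]], List[float]],
) -> str:
    """Return the label with the highest score from the cough model."""
    # Personal audio 1 -> personal audio feature (via feature model 3).
    personal_feature = feature_model(to_signal(personal_audio))
    # Cough audio 2 -> cough audio signal 4 (e.g. MFCCs in the patent).
    cough_signal = to_signal(cough_audio)
    # Both inputs go into the cough audio analysis and identification model 5.
    scores = cough_model(cough_signal, personal_feature)
    best = max(range(len(scores)), key=lambda i: scores[i])
    return DISEASES[best]


# Stub components for illustration only; real models would be trained networks.
result = identify_disease(
    personal_audio=[0.1, 0.2, 0.3],
    cough_audio=[0.4, 0.5],
    to_signal=lambda audio: list(audio),
    feature_model=lambda signal: [sum(signal) / len(signal)],
    cough_model=lambda signal, feat: [0.1, 0.7, 0.05, 0.05, 0.05, 0.03, 0.02],
)
```

Keeping the feature model and the cough model as separate arguments mirrors the two-stage structure of the method: the personal audio feature is computed first and then supplied to the classifier as calibration information.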

Step S6: Access the storage device through an output device and output the interpreted type of respiratory disease.

The respiratory disease type identification result 6 produced by the cough audio analysis and identification model 5 is read from the storage device by the output device, which displays the corresponding type of respiratory disease. The output device may include various display interfaces, such as a computer screen, a monitor, or the display of a handheld device.

Please refer to FIG. 5, which is a schematic diagram of the cough recognition system according to an embodiment of the present invention. As shown in the figure, the cough recognition system 20 may include an input device 21, a storage device 22, a processor 23, and an output device 24. The input device 21 may include various audio capture devices, such as a microphone, or electronic devices with an audio capture function, such as a smartphone, tablet computer, notebook computer, or camera. The plurality of training audios and the corresponding plurality of training cough audios, the personal audio to be identified, and the cough audio to be identified are transferred as files to the memory of the storage device 22 for storage. The memory may include read-only memory, flash memory, a magnetic disk, a cloud database, or the like.

Next, the cough recognition system 20 accesses the storage device 22 through the processor 23. The processor 23 may include a central processing unit, a graphics processing unit, a microprocessor, or the like in a computer or server, and may comprise a multi-core processing unit or a combination of multiple processing units. The processor 23 executes instructions to access the plurality of training audios 7 and the corresponding plurality of training cough audios 8 in the storage device 22 and performs convolutional neural network operations to obtain the personal audio feature model 3 and the cough audio analysis and identification model 5. The processor 23 then executes instructions to access the personal audio 1 to be identified and the corresponding cough audio 2 to be identified in the storage device 22, and uses the personal audio feature model 3 and the cough audio analysis and identification model 5 to perform the interpretation procedure on them, obtaining a corresponding respiratory disease type identification result 6. Finally, the output device 24 accesses the storage device 22 and outputs the respiratory disease type identification result 6. The output device 24 may include various display interfaces, such as, but not limited to, a computer screen, a monitor, or the display of a handheld device.

In summary, the cough recognition method and system of the present invention make it possible to determine quickly and conveniently the type of respiratory disease a patient is suffering from. Identifying the type of respiratory disease in this way can assist clinical interpretation and improve the accuracy of subsequent diagnosis.

The above description is illustrative only and not restrictive. Any equivalent modification or change made without departing from the spirit and scope of the present invention shall be included in the scope of the appended claims.

Claims

A cough recognition method, comprising the following steps: Step S1: inputting a plurality of training audios and a corresponding plurality of training cough audios through an input device, and storing them in a storage device; Step S2: accessing the storage device with a processor and converting the plurality of training audios into a plurality of audio signals; Step S3: inputting, by the processor, the plurality of audio signals into a personal audio feature extraction module to perform convolutional neural network operations, so as to establish a personal audio feature model and obtain a plurality of personal audio features; Step S4: inputting, by the processor, the plurality of personal audio features and the plurality of training cough audios into a cough audio analysis and identification module to perform convolutional neural network operations, so as to establish a cough audio analysis and identification model; Step S5: inputting a personal audio to be identified and a cough audio to be identified through the input device, and performing an interpretation procedure with the processor to determine a corresponding type of respiratory disease according to the personal audio feature model and the cough audio analysis and identification model; and Step S6: accessing the storage device through an output device and outputting the interpreted type of respiratory disease, wherein the personal audio feature extraction module in step S3 includes a plurality of convolutional layers, a plurality of long short-term memory layers, and a plurality of fully connected layers, each layer including an activation function; the personal audio feature extraction module performs convolutional neural network operations on the plurality of audio signals so that the plurality of audio signals are mapped to a high-dimensional continuous feature space (latent space); the high-dimensional continuous feature space is the personal audio feature model and has a plurality of high-dimensional vectors (latent vectors); and the plurality of high-dimensional vectors are the plurality of personal audio features.

The cough recognition method according to claim 1, wherein the plurality of training cough audios in step S4 are converted into a plurality of training cough audio signals and then input, together with the plurality of personal audio features, into the cough audio analysis and identification module to perform convolutional neural network operations, so as to establish the cough audio analysis and identification model.

The cough recognition method according to claim 1, wherein the plurality of audio signals in step S2 are Mel cepstral coefficients.

The cough recognition method according to claim 2, wherein the plurality of training cough audio signals in step S4 are Mel cepstral coefficients.

The cough recognition method according to any one of claims 1 to 4, wherein the cough audio to be identified in step S5 is recorded in real time or pre-recorded.

A cough recognition system, comprising: an input device for inputting a plurality of training audios and a corresponding plurality of training cough audios, a personal audio to be identified, and a cough audio to be identified; a storage device connected to the input device for storing the plurality of training audios and the corresponding plurality of training cough audios, the personal audio to be identified, and the cough audio to be identified; an output device connected to the storage device for outputting an interpreted type of respiratory disease; and a processor connected to the storage device for executing a plurality of instructions to perform the following steps: converting the plurality of training audios into a plurality of audio signals, and inputting the plurality of audio signals into a personal audio feature extraction module to perform convolutional neural network operations, so as to establish a personal audio feature model and obtain a plurality of personal audio features; inputting the plurality of personal audio features and the plurality of training cough audios into a cough audio analysis and identification module to perform convolutional neural network operations, so as to establish a cough audio analysis and identification model; and interpreting the personal audio to be identified and the cough audio to be identified according to the personal audio feature model and the cough audio analysis and identification model, so as to determine the corresponding type of respiratory disease, wherein the personal audio feature extraction module includes a plurality of convolutional layers, a plurality of long short-term memory layers, and a plurality of fully connected layers, each layer including an activation function; the personal audio feature extraction module performs convolutional neural network operations on the plurality of audio signals so that the plurality of audio signals are mapped to a high-dimensional continuous feature space (latent space); the high-dimensional continuous feature space is the personal audio feature model and has a plurality of high-dimensional vectors (latent vectors); and the plurality of high-dimensional vectors are the plurality of personal audio features.

The cough recognition system according to claim 6, wherein the plurality of training cough audios are converted into a plurality of training cough audio signals and then input, together with the plurality of personal audio features, into the cough audio analysis and identification module to perform convolutional neural network operations, so as to establish the cough audio analysis and identification model.

The cough recognition system as claimed in claim 6, wherein the plurality of audio signals are Mel cepstral coefficients.

The cough recognition system according to claim 7, wherein the plurality of training cough audio signals are Mel cepstral coefficients.

The cough recognition system according to any one of claims 6 to 9, wherein the cough audio to be identified is recorded in real time or pre-recorded.