WO2021114761A1 - Lung rale artificial intelligence real-time classification method, system and device of electronic stethoscope, and readable storage medium - Google Patents


Info

Publication number
WO2021114761A1
Authority
WO
WIPO (PCT)
Prior art keywords
lung
data
rales
electronic stethoscope
artificial intelligence
Prior art date
Application number
PCT/CN2020/113511
Other languages
French (fr)
Chinese (zh)
Inventor
蔡盛盛
胡南
刘仁雨
徐兴国
Original Assignee
苏州美糯爱医疗科技有限公司
Priority date
Filing date
Publication date
Application filed by 苏州美糯爱医疗科技有限公司
Publication of WO2021114761A1


Classifications

    • G10L 19/26: Pre-filtering or post-filtering (speech or audio analysis-synthesis techniques for redundancy reduction, G10L 19/00, using predictive techniques)
    • G10L 19/0212: Spectral analysis using orthogonal transformation (transform or subband vocoders, G10L 19/02)
    • G10L 25/24: Extracted parameters being the cepstrum (G10L 25/03)
    • G10L 25/30: Analysis technique using neural networks (G10L 25/27)
    • G10L 25/66: Comparison or discrimination for extracting parameters related to health condition (G10L 25/51)
    • A61B 7/003: Detecting lung or respiration noise (instruments for auscultation, A61B 7/00)
    • A61B 7/04: Electric stethoscopes (A61B 7/02)
    • G06N 3/045: Combinations of networks (neural network architectures, G06N 3/04)
    • Y02A 90/10: Information and communication technologies [ICT] supporting adaptation to climate change

Definitions

  • Step 4: Use the logarithmic Mel filter bank transform result matrix F_i to calculate the data matrices Δ_{i,0}, Δ_{i,1} and Δ_{i,2} of the three channels;
  • the data matrix of the three channels calculated in step 4 includes:
  • an artificial intelligence real-time classification system for lung rales of an electronic stethoscope including:
  • the logarithmic Mel filter bank transform result matrix of the data vector is used to calculate the data matrices of the three channels
  • the present invention extracts three-channel logarithmic Mel filter bank transform features as the input of the convolutional neural network
  • the present invention sets out a specific and effective convolutional neural network structure for lung rale classification, in which convolutional layers are used to discover deeper features of the input data, and pooling layers are added after the convolutional layers to improve the fault tolerance of the network;
  • in training the convolutional neural network, the present invention uses a truncated normal distribution with a standard deviation of 0.1 for parameter weight initialization, together with Adam optimization, Dropout learning and L2 regularization to prevent overfitting and improve the robustness of the method;
  • the present invention can realize multi-class classification of the four cases: wet rales only, wheezes only, both, and neither.
  • Figure 1 is a schematic diagram of breath sounds with different additional rales in the prior art
  • FIG. 4 shows the preprocessing and feature extraction of the present invention: (a) an example waveform of the originally acquired signal; (b) an example waveform after preprocessing of a 2-second data block;
  • FIG. 5 is a schematic diagram of the structure of the lung rale artificial intelligence real-time classification system of the electronic stethoscope of the present invention.
  • the lung sound signal is acquired in real time through the electronic stethoscope, a buffer space is allocated for the acquired data, and the data continuously enters the buffer.
  • the preprocessed data block is denoted as the vector x_i, and the logarithmic Mel filter bank transform of the data vector x_i is denoted as the matrix F_i.
  • each piece of data has a variable length, with a duration of 10 to 90 seconds
  • 508 pieces of lung sound data collected by the applicant's team from the pediatric departments of several domestic hospitals, also covering the four lung sound situations involved in the present invention, each more than 30 seconds long
  • a total of 1,428 pieces of data are used as a lung sound database for neural network training and verification of the classification effect.
  • through the above method, the present invention (1) provides a unified real-time rale classification method when the total length of the actual lung sound acquisition is uncertain; (2) realizes multi-class classification of the four cases of wet rales, wheezes, both, and neither; (3) effectively improves the robustness of rale detection and classification results.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Public Health (AREA)
  • Molecular Biology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Surgery (AREA)
  • Medical Informatics (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Veterinary Medicine (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Epidemiology (AREA)
  • Pulmonology (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

A method for real-time artificial-intelligence classification of lung rales from an electronic stethoscope. The method comprises: acquiring a lung sound signal in real time by means of an electronic stethoscope and automatically classifying lung rales; slidingly extracting the acquired data as 2-second data blocks and filtering and normalizing each block by means of a band-pass filter; computing three-channel data matrices by means of a logarithmic Mel filter bank transformation and inputting them into a pre-built, trained convolutional neural network, whose output is the probability of each of four lung sound conditions; and combining the results of multiple data blocks to give final probability values for the four conditions. The robustness of rale detection and classification results can be effectively improved. The present invention further provides a real-time artificial-intelligence lung rale classification system and device of the electronic stethoscope and a computer-readable storage medium, which have the same beneficial effects.

Description

Method, system, device and readable storage medium for real-time artificial-intelligence classification of lung rales from an electronic stethoscope

Technical Field
The present invention relates to the technical fields of computer audition and artificial intelligence, and in particular to a method, system, device and readable storage medium for real-time artificial-intelligence classification of lung rales from an electronic stethoscope.
Background Art
Owing to environmental pollution and deteriorating air quality, the incidence of respiratory diseases such as asthma, pneumonia and bronchitis is rising year by year. Nearly 1 million children under the age of five die of acute lower respiratory infections every year, more than die of HIV, malaria and tuberculosis combined. Since diseases of the respiratory system have become a serious threat to human health, accurate diagnosis and effective treatment of respiratory diseases are an effective way to ensure that patients recover as early as possible.
The methods currently used by hospitals to examine and identify respiratory diseases include: (1) chest X-ray, which can record gross lung lesions such as inflammation, masses and tuberculosis; (2) lung CT, which helps to make a qualitative diagnosis of problems found on a chest X-ray, such as the type and location of a mass; and (3) bronchoscopy, which is used to confirm most lung and airway diseases. These methods, however, are not only expensive but also place a certain burden on the body, and, owing to geographical constraints, some people may have no access to them.
Auscultation is one of the earliest and most direct means of examination for respiratory diseases. Medical staff mainly use a stethoscope to listen for rales in the patient's breath sounds, chiefly wet rales and wheezes. Figure 1 shows breath sounds with different additional rales: (a) contains wet rales, (b) contains wheezes, (c) contains both wet rales and wheezes, and (d) is a normal breath sound. This method, however, has always been limited by factors such as the auscultation environment and the skill level of the practitioner.
In the prior art, for example, publication CN106022258A discloses a digital stethoscope and a method of filtering out heart sounds to extract lung sounds. Discrete entropy is first used to screen out a subset of valid frames, and the average amplitude of the screened frames is extracted as a threshold, which is used to obtain the lung sound frames containing heart sounds. A wavelet transform is then performed and thresholding is used to filter out the relevant wavelet coefficients, yielding relatively pure lung sound frames. An MFCC feature parameter matrix is extracted from the lung sound frames and fed to a traditional back-propagation (BP) network for class judgment. This method requires two threshold judgments, in which relevant useful information is lost, reducing the effectiveness of the MFCC feature parameter matrix.
As another example, CN107704885A discloses a method of classifying heart sounds and lung sounds on an intelligent platform. The received data is first resampled (5-point resampling at a sampling rate of 2205 Hz) and then filtered, with the maximum pass-band attenuation set to 3 dB and the minimum stop-band attenuation to 18 dB. The signal is then denoised with a dmey wavelet, and the denoised signal is segmented using the autocorrelation coefficient. An MFCC feature parameter matrix is extracted for each segment and fed to a support vector machine (SVM) classifier. However, an SVM classifier is not very efficient at handling high-dimensional data such as an MFCC feature parameter matrix, and this method provides no way of classifying in real time.
The paper "Pattern recognition methods applied to respiratory sounds classification into normal and wheeze classes" by B. Mohammed combines MFCC features with a Gaussian mixture model (GMM) to classify normal lung sounds and lung sounds containing wheezes; the paper "Acoustics based assessment of respiratory diseases using GMM classification" by P. Mayorga likewise uses a GMM to classify rales in lung sounds; and the paper "Design of a DSP-based instrument for real-time classification of pulmonary sounds" by S. Alsmadi et al. uses K-nearest neighbors (K-NN) and a minimum-distance criterion to judge whether the lung sounds as a whole are abnormal.
The methods proposed in the above papers can classify one particular kind of rale, or the overall condition of the data, but they cannot make a comprehensive judgment among wet rales, wheezes, both, and neither.
As a further example, publication CN107818366A discloses a lung sound classification method, system and use based on a convolutional neural network: the lung sound signal is first band-pass filtered, the lung sound time series is then converted into a two-dimensional spectrogram via the short-time Fourier transform, and the spectrogram is finally used as the input feature for classifying the lung sound signal. That patent is merely a simple application of a convolutional neural network: a signal of fixed length must be input, and a simple binary normal/abnormal conclusion is obtained. The method cannot run in real time, is easily misled by short-term interference, and its classification result is too coarse.
Existing techniques for classifying lung rale signals mainly focus on traditional machine learning and pattern recognition, with a small number of relatively simple applications of deep learning. In general, these existing techniques have the following shortcomings:
(1) The inputs of the above methods must have a fixed length so that fixed-length feature parameters can be extracted; in real application scenarios, however, lung sound signals of variable length are obtained, and real-time signal acquisition and diagnosis are very important.

(2) There are many kinds of rales, and different kinds of rales correspond to different conditions; being able to identify the different kinds of rales is therefore essential, but the above methods give no multi-class scheme for different kinds of rales.

(3) The lung lesions of each patient differ, so even the same kind of rale may present different lung sounds at different times; the robustness of existing rale detection and classification results is still poor.
With the vigorous development of Internet of Things (IoT) and artificial intelligence (AI) technology in recent years, real-time classification of lung rales based on artificial-intelligence methods has become possible. A real-time lung rale classification method for an electronic stethoscope is therefore urgently needed.
Summary of the Invention
The purpose of the present invention is to provide a method, system, device and readable storage medium for real-time artificial-intelligence classification of lung rales from an electronic stethoscope, so as to solve the problems raised in the background art.
To achieve the above purpose, the present invention provides the following technical solutions:
A method for real-time artificial-intelligence classification of lung rales from an electronic stethoscope, comprising:

Step 1. From the moment the electronic stethoscope starts lung sound acquisition, read the data in the acquisition channel in real time into a buffer space; when 2 seconds of data have accumulated, start the automatic lung rale classification procedure.

Step 2. Downsample the 2-second data block to f_s = 8 kHz, pass it through a band-pass filter, and normalize it; if this is the i-th data block, denote the preprocessed block as the vector x_i.

Step 3. Compute the logarithmic Mel filter bank transform of the data vector x_i, expressed as the matrix F_i.

Step 4. From the logarithmic Mel filter bank transform result matrix F_i, compute the data matrices Δ_{i,0}, Δ_{i,1} and Δ_{i,2} of the three channels.

Step 5. Normalize each of the three channel data matrices Δ_0, Δ_1 and Δ_2 and input them into a pre-built, trained convolutional neural network. The network outputs four probability values: the probability p_{i,c} that the block contains only wet rales, the probability p_{i,w} that it contains only wheezes, the probability p_{i,cw} that it contains both wet rales and wheezes, and the probability p_{i,Null} that it contains neither. Save these four values as p_i = [p_{i,c}, p_{i,w}, p_{i,cw}, p_{i,Null}]^T.

Step 6. When the data stored in the buffer reaches 3.9 seconds in length, discard the first 1.9 seconds and take the remaining 2 seconds as the (i+1)-th data block, returning to step 2. If lung sound acquisition ends before the buffered data reaches 3.9 seconds, go to step 7.

Step 7. If no probability values were saved for any data block, output "unable to determine whether rales are present". If probability values p_1, p_2, ..., p_N were saved for N data blocks, use them to output one of the four states ("only wet rales in the lung sounds", "only wheezes in the lung sounds", "both wet rales and wheezes in the lung sounds", or "no rales in the lung sounds"), together with the probability of that state.
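The buffering rule of steps 1 and 6 amounts to a sliding window: a 2-second block is emitted once the buffer first fills, and thereafter, each time the buffer reaches 3.9 s, the oldest 1.9 s are discarded and the newest 2 s form the next block. A minimal sketch in Python, assuming an already-downsampled sample stream (the chunk size and the synthetic stream are illustrative, not part of the patent):

```python
import numpy as np

FS = 8000               # assumed post-downsampling rate (patent: f_s = 8 kHz)
BLOCK = 2 * FS          # 2-second analysis block
TRIM = int(1.9 * FS)    # samples discarded once the buffer reaches 3.9 s
LIMIT = int(3.9 * FS)

def stream_blocks(stream, chunk=800):
    """Yield successive 2-second blocks from a sample stream, following the
    2.0 s / 3.9 s buffering rule of steps 1 and 6."""
    buf = np.empty(0)
    got_first = False
    for start in range(0, len(stream), chunk):
        buf = np.concatenate([buf, stream[start:start + chunk]])
        if not got_first and len(buf) >= BLOCK:
            yield buf[:BLOCK].copy()          # first block after 2 s accumulate
            got_first = True
        while len(buf) >= LIMIT:
            buf = buf[TRIM:]                  # keep only the newest samples
            yield buf[:BLOCK].copy()          # next block, overlapping 0.1 s
    # acquisition ended; any remaining partial data is not classified (step 7)

blocks = list(stream_blocks(np.arange(8 * FS, dtype=float)))
```

For an 8-second stream this produces four blocks, the second starting 1.9 s into the signal, matching the overlap implied by step 6.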
Preferably, the filter used in step 2 is a Butterworth band-pass filter with a pass band of 100 Hz to 1000 Hz.
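A sketch of this preprocessing step using SciPy. The patent fixes only the filter type and the 100-1000 Hz pass band; the filter order (4) and peak normalization are assumptions:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 8000  # f_s after downsampling (step 2)

# Butterworth band-pass, 100-1000 Hz; the order (4) is an assumption.
SOS = butter(4, [100, 1000], btype="bandpass", fs=FS, output="sos")

def preprocess(block):
    """Band-pass filter a 2-second block and normalize it to unit peak."""
    y = sosfiltfilt(SOS, block)
    return y / np.max(np.abs(y))

t = np.arange(2 * FS) / FS
x = np.sin(2 * np.pi * 400 * t) + np.sin(2 * np.pi * 50 * t)  # 50 Hz is out of band
y = preprocess(x)
```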
Preferably, computing the logarithmic Mel filter bank transform matrix F_i of the data vector x_i in step 3 comprises:

First, compute the short-time Fourier spectrum of x_i: divide x_i into M = 31 segments, each containing N_FFT = 1024 samples, with 50% overlap between segments. Denoting the m-th segment by x_{i,m}(n), n = 0, 1, ..., N_FFT - 1, the fast Fourier transform of this segment is computed as

Y_{i,m}(k) = Σ_{n=0}^{N_FFT-1} h(n) x_{i,m}(n) e^{-j2πkn/N_FFT},

where h(n) is a Hamming window.

Then, |Y_{i,m}(k)|² is filtered by a Mel filter bank consisting of Q = 29 triangular filters Ψ_q, q = 1, 2, ..., Q, uniformly spaced with 50% overlap on the Mel frequency scale f_Mel(f) = 2595 × log10(1 + f/700) over f ∈ [0, f_s/2]. The filter bank output is

y_{i,m}(q) = Σ_k Ψ_q(k) |Y_{i,m}(k)|².

Finally, compute the logarithmic Mel filter bank transform matrix F_i of x_i, whose element in row q and column m is given by F_i[q, m] = log[y_{i,m}(q)].
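The computation of F_i can be sketched with NumPy alone. The Mel constant 2595 and the edge handling (zero-padding the block so that exactly M = 31 Hamming-windowed, half-overlapping frames fit) are standard choices; treat the details as an illustrative reading of step 3 rather than the patent's exact implementation:

```python
import numpy as np

FS, NFFT, Q, M = 8000, 1024, 29, 31

def mel(f):          # Mel scale used in step 3 (standard constant 2595)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_inv(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank():
    """Q triangular filters, equally spaced in Mel with 50% overlap, on [0, fs/2]."""
    edges = mel_inv(np.linspace(mel(0.0), mel(FS / 2), Q + 2))   # Q+2 edge freqs
    bins = np.floor((NFFT + 1) * edges / FS).astype(int)
    fb = np.zeros((Q, NFFT // 2 + 1))
    for q in range(Q):
        lo, mid, hi = bins[q], bins[q + 1], bins[q + 2]
        fb[q, lo:mid] = (np.arange(lo, mid) - lo) / max(mid - lo, 1)
        fb[q, mid:hi] = (hi - np.arange(mid, hi)) / max(hi - mid, 1)
    return fb

def log_mel(x):
    """F_i: Q x M log-Mel matrix of one 2-s block (Hamming window, 50% hop).
    The block is zero-padded at the end so exactly M = 31 frames fit."""
    hop = NFFT // 2
    need = (M - 1) * hop + NFFT
    x = np.pad(x, (0, max(0, need - len(x))))
    win = np.hamming(NFFT)
    frames = np.stack([x[m * hop:m * hop + NFFT] * win for m in range(M)])
    power = np.abs(np.fft.rfft(frames, NFFT)) ** 2               # |Y_{i,m}(k)|^2
    return np.log(mel_filterbank() @ power.T + 1e-10)            # Q x M

F = log_mel(np.random.default_rng(0).standard_normal(2 * FS))
```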
Preferably, the three-channel data matrices computed in step 4 are:

first, the 29×29 data matrix on the first channel, Δ_0 = F[:, 1:M-2];

then, the 29×29 data matrix on the second channel, Δ_1 = F[:, 2:M-1] - F[:, 1:M-2];

finally, the 29×29 data matrix on the third channel, Δ_2 = (F[:, 3:M] - F[:, 2:M-1]) - Δ_1.
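In 0-based NumPy indexing, the 1-based column slices above become F[:, :-2], F[:, 1:-1] and F[:, 2:], so Δ_1 and Δ_2 are simply first and second column-wise differences of the log-Mel matrix. A sketch, with the per-channel normalization of step 5 assumed to be zero-mean/unit-variance (the patent says only that each channel is normalized):

```python
import numpy as np

def three_channels(F):
    """Δ_0, Δ_1, Δ_2 of step 4: the log-Mel matrix and its first and second
    column-wise differences, each cropped to Q x (M-2), then normalized."""
    d0 = F[:, :-2]
    d1 = F[:, 1:-1] - F[:, :-2]
    d2 = (F[:, 2:] - F[:, 1:-1]) - d1
    # zero-mean/unit-std per channel is an assumption, not stated in the patent
    norm = lambda a: (a - a.mean()) / (a.std() + 1e-10)
    return np.stack([norm(d0), norm(d1), norm(d2)])

F = np.random.default_rng(1).standard_normal((29, 31))
X = three_channels(F)
```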
Preferably, the convolutional neural network in step 5 is trained on a large labeled data set; its specific structure is shown in Figure 3. The network has four convolutional layers with kernel sizes 5×5, 3×3, 3×3 and 3×3; the convolutional layers use ReLU as the activation function; the pooling layers use max pooling; and the output layer outputs the four probabilities p_{i,c}, p_{i,w}, p_{i,cw} and p_{i,Null} through a softmax. In training the network, a truncated normal distribution with a standard deviation of 0.1 is used for the initial parameter weights, together with Adam optimization, Dropout learning and L2 regularization.
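A PyTorch sketch of such a network. The patent fixes the kernel sizes (5×5, 3×3, 3×3, 3×3), ReLU activations, max pooling, the 4-way softmax output, the truncated-normal initialization with std 0.1, and the use of Adam, Dropout and L2 regularization; the channel counts, pooling placement, dropout rate and weight-decay value below are assumptions:

```python
import torch
import torch.nn as nn

class RaleNet(nn.Module):
    """Sketch of the 4-conv-layer classifier of step 5 for 3 x 29 x 29 inputs."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),   # 29 -> 14
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 14 -> 7
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 7 -> 3
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Dropout(0.5),
            nn.Linear(64 * 3 * 3, 4),  # logits for {crackles, wheezes, both, none}
        )
        for m in self.modules():       # truncated-normal init, std 0.1
            if isinstance(m, (nn.Conv2d, nn.Linear)):
                nn.init.trunc_normal_(m.weight, std=0.1)
                nn.init.zeros_(m.bias)

    def forward(self, x):
        return torch.softmax(self.classifier(self.features(x)), dim=1)

net = RaleNet()
opt = torch.optim.Adam(net.parameters(), weight_decay=1e-4)  # Adam + L2 penalty
p = net(torch.randn(1, 3, 29, 29))  # one 3-channel block -> 4 probabilities
```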
Preferably, the probability values of the four states that may finally be output in step 7 are:

the probability of "only wet rales in the lung sounds", p_c = (1/N) Σ_{i=1}^{N} p_{i,c};

the probability of "only wheezes in the lung sounds", p_w = (1/N) Σ_{i=1}^{N} p_{i,w};

the probability of "no rales in the lung sounds", p_Null = (1/N) Σ_{i=1}^{N} p_{i,Null};

the probability of "both wet rales and wheezes in the lung sounds", p_cw = 1 - p_c - p_w - p_Null.

The final output is the state with the largest of these four probabilities, together with its probability.
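A sketch of the final decision rule, combining the per-block probability vectors p_i by averaging (which is consistent with p_cw = 1 - p_c - p_w - p_Null, since each block's four probabilities sum to 1) and reporting the most probable of the four states. The English label strings are illustrative:

```python
import numpy as np

LABELS = ["only wet rales", "only wheezes",
          "both wet rales and wheezes", "no rales"]

def combine(P):
    """Step 7: combine the per-block vectors p_i = [p_c, p_w, p_cw, p_Null]."""
    if len(P) == 0:                      # no block was ever classified
        return "unable to determine whether rales are present", None
    p = np.asarray(P).mean(axis=0)       # average across the N blocks
    k = int(np.argmax(p))
    return LABELS[k], float(p[k])

state, prob = combine([[0.7, 0.1, 0.1, 0.1],
                       [0.6, 0.2, 0.1, 0.1],
                       [0.5, 0.1, 0.2, 0.2]])
```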
To solve the above technical problem, the present invention further provides a real-time artificial-intelligence classification system for lung rales of an electronic stethoscope, comprising:

an electronic stethoscope, which acquires lung sounds, allocates a buffer space for the acquired data and continuously writes into the buffer; when 2 seconds of data have accumulated, the automatic lung rale classification procedure is started;

a band-pass filter, which filters and normalizes the acquired data;

a logarithmic Mel filter bank, which transforms the data vector into a result matrix from which the three-channel data matrices are computed;

a convolutional neural network, which takes the three-channel data matrices as input and outputs and saves four probability values;

wherein the electronic stethoscope, the band-pass filter, the logarithmic Mel filter bank and the convolutional neural network are connected in sequence.

Preferably, the band-pass filter is a Butterworth band-pass filter with a pass band of 100 Hz to 1000 Hz; the convolutional neural network has four convolutional layers with kernel sizes 5×5, 3×3, 3×3 and 3×3; the convolutional layers use ReLU as the activation function; the pooling layers use max pooling; and the output layer outputs through a softmax.
To solve the above technical problem, the present invention further provides a real-time artificial-intelligence classification device for lung rales of an electronic stethoscope, characterized by comprising:

a memory for storing a computer program; and

a processor which, when executing the computer program, implements the steps of any of the above methods for real-time artificial-intelligence classification of lung rales of an electronic stethoscope.

To solve the above technical problem, the present invention further provides a computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of any of the above methods for real-time artificial-intelligence classification of lung rales of an electronic stethoscope.
与现有技术相比,本发明的有益效果是:Compared with the prior art, the beneficial effects of the present invention are:
(1)本发明随时间滑动选取数据块输入特定的卷积神经网络进行分类,并最终联合所有的数据块的分类结果得到最终总的啰音分类结果,不需要预设输入数据的长度,可实现啰音实时自动分类,且利用多时间段联合啰音分类可提高分类结果的鲁棒性;(1) The present invention slides over time to select data blocks, inputs them into a specific convolutional neural network for classification, and finally combines the classification results of all data blocks into the final overall rale classification result. No preset input data length is required, real-time automatic classification of rales is achieved, and joint rale classification over multiple time periods improves the robustness of the classification results;
(2)本发明提取三通道的对数梅尔滤波器组变换特征作为卷积神经网络的输入;(2) The present invention extracts the three-channel logarithmic mel filter bank transform feature as the input of the convolutional neural network;
(3)本发明明确给出了一种具体且有效的卷积神经网络结构用于肺部啰音分类,其中利用卷积层来发现输入数据更深层的特征,并在卷积层之后加入池化层来提高网络的容错能力;(3) The present invention explicitly provides a specific and effective convolutional neural network structure for lung rale classification, in which convolutional layers are used to discover deeper features of the input data, and pooling layers are added after the convolutional layers to improve the fault tolerance of the network;
(4)本发明在训练卷积神经网络过程中加入了标准差为0.1的截断正态分布用于参数权重初始化,同时使用Adam优化、Dropout学习以及L2正则化来防止过拟合,提高了本方法的鲁棒性;(4) In training the convolutional neural network, the present invention uses a truncated normal distribution with a standard deviation of 0.1 for parameter weight initialization, together with Adam optimization, Dropout learning and L2 regularization to prevent overfitting, improving the robustness of the method;
(5)本发明可实现湿啰音、喘鸣音、两者都包含和两者都不包含这四种情况的多分类。(5) The present invention achieves multi-class classification of the four cases: wet rales only, wheezing only, both, and neither.
附图说明Description of the drawings
图1为现有技术中含有不同附加啰音的呼吸音示意图;Figure 1 is a schematic diagram of breath sounds with different additional rales in the prior art;
图2为本发明所提出电子听诊器的肺部啰音人工智能实时分类方法的流程图;Fig. 2 is a flowchart of a method for real-time artificial intelligence classification of lung rales of an electronic stethoscope according to the present invention;
图3为本发明单个数据块做四分类所用的卷积神经网络结构图;Fig. 3 is a structure diagram of a convolutional neural network used for four classifications of a single data block according to the present invention;
图4为本发明预处理与提取的特征图:其中(a)为原始采集的信号波形示例图;(b)为对其中某段2秒数据块预处理之后的信号波形示例图;Figure 4 shows the preprocessing and extracted features of the present invention: (a) an example waveform of the originally acquired signal; (b) an example waveform after preprocessing of one 2-second data block;
图5为本发明电子听诊器的肺部啰音人工智能实时分类系统结构示意图;5 is a schematic diagram of the structure of the lung rale artificial intelligence real-time classification system of the electronic stethoscope of the present invention;
图6为本发明电子听诊器的肺部啰音人工智能实时分类装置结构示意图。Fig. 6 is a schematic diagram of the structure of the lung rale artificial intelligence real-time classification device of the electronic stethoscope of the present invention.
具体实施方式Detailed Description of the Embodiments
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
请参阅图1~6,本发明提供一种技术方案:Please refer to Figures 1 to 6, the present invention provides a technical solution:
通过电子听诊器实时采集肺音信号,为采集得到的数据分配一个缓存空间并持续进入缓存。当数据累积到2秒时长时,启动肺部啰音自动分类程序(数据波形示例如图4(a)所示)。The lung sound signal is collected in real time by the electronic stethoscope; a buffer space is allocated for the collected data, which continuously enters the buffer. When 2 seconds of data have accumulated, the automatic lung rale classification program is started (an example data waveform is shown in Figure 4(a)).
对该2秒时长的数据块降采样到f s=8kHz,通过1个通带为100Hz~1000Hz的Butterworth带通滤波器进行滤波,并作归一化。图4(b)所展示的是其中一段预处理之后的2秒数据块,若该数据块为第i个数据块,计该预处理后的数据块为向量x i,并计算此数据向量x i的对数梅尔滤波器组变换,表示为矩阵F i,具体过程为:(1)首先,计算x i的短时傅里叶变换谱:将x i分为M=31段,每段包含N FFT=1024个采样点,段间交迭50%;令第m段数据表示为x i,m(n),n=0,1,...,N FFT-1,则该段的快速傅里叶变换计算为
Y_i,m(k) = Σ_{n=0}^{N_FFT-1} x_i,m(n)h(n)e^{-j2πkn/N_FFT},  k = 0,1,...,N_FFT/2-1
其中h(n)为汉明窗;(2)然后,|Y i,m(k)| 2经由一个梅尔滤波器组滤波,该梅尔滤波器组包含Q=29个梅尔频率域范围f Mel(f)=2595×log 10(1+f/700),f~[0,f s/2]上均匀间距且50%交迭的三角形滤波器Ψ q,q=1,2,...,Q,梅尔滤波器组滤波后的结果为
y_i,m(q) = Σ_{k=0}^{N_FFT/2-1} Ψ_q(k)|Y_i,m(k)|^2,  q = 1,2,...,Q
(3)最后,计算x i的对数梅尔滤波器组变换矩阵F i,其第q行m列的元素由下式给出:F i[q,m]=log[y i,m(q)]。在得到对数梅尔滤波器组变换结果矩阵F i之后,通过
Δ_i,0 = F_i[:,1:M-2],  Δ_i,1 = F_i[:,2:M-1] - F_i[:,1:M-2],  Δ_i,2 = (F_i[:,3:M] - F_i[:,2:M-1]) - Δ_i,1
计算出三个通道的数据矩阵Δ i,0、Δ i,1和Δ i,2。将这三个通道的数据矩阵Δ 0、Δ 1和Δ 2各自归一化,输入一个预先搭建并训练好的卷积神经网络(如图3所展示),该卷积神经网络的输出为四个概率值:该数据块中只存在湿啰音的概率p i,c、该数据块中只存在喘鸣音的概率p i,w、该数据块同时存在湿啰音与喘鸣音的概率p i,cw、该数据块既不存在湿啰音与喘鸣音的概率p i,Null,保存这四个概率值p i=[p i,c,p i,w,p i,cw,p i,Null] T
The 2-second data block is down-sampled to f_s = 8 kHz, filtered by a Butterworth band-pass filter with a 100 Hz to 1000 Hz pass band, and normalized. Figure 4(b) shows one such 2-second data block after preprocessing. If this is the i-th data block, denote the preprocessed block as the vector x_i, and compute the logarithmic mel filter bank transform of x_i, denoted as the matrix F_i. The procedure is: (1) First, compute the short-time Fourier transform spectrum of x_i: divide x_i into M = 31 segments, each containing N_FFT = 1024 sampling points, with 50% overlap between segments; letting the m-th segment be x_i,m(n), n = 0,1,...,N_FFT-1, the fast Fourier transform of this segment is computed as
Y_i,m(k) = Σ_{n=0}^{N_FFT-1} x_i,m(n)h(n)e^{-j2πkn/N_FFT},  k = 0,1,...,N_FFT/2-1
where h(n) is the Hamming window; (2) Then, |Y_i,m(k)|^2 is filtered by a mel filter bank containing Q = 29 triangular filters Ψ_q, q = 1,2,...,Q, uniformly spaced and 50% overlapping on the mel frequency scale f_Mel(f) = 2595×log_10(1+f/700) over f ∈ [0, f_s/2]; the output of the mel filter bank is
y_i,m(q) = Σ_{k=0}^{N_FFT/2-1} Ψ_q(k)|Y_i,m(k)|^2,  q = 1,2,...,Q
(3) Finally, compute the logarithmic mel filter bank transform matrix F_i of x_i, whose element in row q, column m is given by F_i[q,m] = log[y_i,m(q)]. After obtaining the transform result matrix F_i, compute
Δ_i,0 = F_i[:,1:M-2],  Δ_i,1 = F_i[:,2:M-1] - F_i[:,1:M-2],  Δ_i,2 = (F_i[:,3:M] - F_i[:,2:M-1]) - Δ_i,1
the data matrices Δ_i,0, Δ_i,1 and Δ_i,2 of the three channels. The data matrices of the three channels are each normalized and input into a pre-built and trained convolutional neural network (shown in Figure 3). The output of the network is four probability values: the probability p_i,c that only wet rales are present in the data block, the probability p_i,w that only wheezing sounds are present, the probability p_i,cw that both wet rales and wheezing sounds are present, and the probability p_i,Null that neither is present. These four probability values are saved as p_i = [p_i,c, p_i,w, p_i,cw, p_i,Null]^T.
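The feature extraction just described can be sketched as follows. The framing that yields exactly M = 31 segments from a 16000-sample block is not fully specified, so the block is zero-padded here (an assumption), and the standard mel constant 2595 is used:

```python
import numpy as np

FS, N_FFT, HOP, M, Q = 8000, 1024, 512, 31, 29  # 50% overlap -> hop of 512

def mel_filterbank() -> np.ndarray:
    """Q triangular filters, uniformly spaced on the mel scale over [0, f_s/2],
    each 50% overlapping with its neighbours."""
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    imel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    edges = imel(np.linspace(0.0, mel(FS / 2.0), Q + 2))  # Q+2 edge frequencies
    freqs = np.arange(N_FFT // 2) * FS / N_FFT            # FFT bin frequencies
    fb = np.zeros((Q, N_FFT // 2))
    for q in range(Q):
        lo, mid, hi = edges[q], edges[q + 1], edges[q + 2]
        rise = (freqs - lo) / (mid - lo)
        fall = (hi - freqs) / (hi - mid)
        fb[q] = np.clip(np.minimum(rise, fall), 0.0, None)
    return fb

def three_channel_log_mel(x: np.ndarray):
    """F_i[q, m] = log(y_i,m(q)) plus the three channel matrices of claim 4."""
    x = np.pad(x, (0, (M - 1) * HOP + N_FFT - len(x)))  # pad to exactly M frames
    win = np.hamming(N_FFT)
    frames = np.stack([x[m * HOP : m * HOP + N_FFT] * win for m in range(M)])
    power = np.abs(np.fft.rfft(frames, N_FFT))[:, : N_FFT // 2] ** 2  # |Y|^2
    F = np.log(power @ mel_filterbank().T + 1e-12).T  # Q x M log-mel matrix
    d0 = F[:, 0 : M - 2]                              # 29 x 29 channel 0
    d1 = F[:, 1 : M - 1] - F[:, 0 : M - 2]            # first difference
    d2 = (F[:, 2 : M] - F[:, 1 : M - 1]) - d1         # second difference
    return d0, d1, d2
```

Each of the three matrices is 29×29 and would be normalized before being fed to the network.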
当缓存空间中保存的数据时长达到3.9秒时,剔除前1.9秒数据,将剩下的2秒数据作为第i+1个数据块,并重复上述的过程。当缓存空间中保存的数据时长未达到3.9秒肺音采集就结束时,则进行判断:若最终未保存任何数据块的概率值,输出为“无法判断是否存在啰音”;若最终共保存了N个数据块上的概率值p 1,p 2,...,p N,计算“肺音中只存在湿啰音”的概率
p_c = (1/N) Σ_{i=1}^{N} p_i,c
“肺音中只存在喘鸣音”的概率
p_w = (1/N) Σ_{i=1}^{N} p_i,w
“肺音中同时存在湿啰音与喘鸣音”的概率p cw=1-p c-p w-p Null和“肺音中无啰音”的概率
p_Null = (1/N) Σ_{i=1}^{N} p_i,Null
比较这四个概率的大小,概率最大的状态即为识别出的状态,输出“肺音中只存在湿啰音”、“肺音中只存在喘鸣音”、“肺音中同时存在湿啰音与喘鸣音”与“肺音中无啰音”四种状态中的一种,并给出该状态的概率值。
When the data stored in the buffer space reaches 3.9 seconds in length, the first 1.9 seconds of data are discarded, the remaining 2 seconds of data are taken as the (i+1)-th data block, and the above process is repeated. If lung sound acquisition ends before the buffered data reaches 3.9 seconds, a decision is made: if no probability values were saved for any data block, the output is "cannot determine whether rales are present"; if probability values p_1, p_2, ..., p_N were finally saved on N data blocks, compute the probability of "only wet rales in the lung sounds"
p_c = (1/N) Σ_{i=1}^{N} p_i,c
the probability of "only wheezing sounds in the lung sounds"
p_w = (1/N) Σ_{i=1}^{N} p_i,w
the probability of "both wet rales and wheezing sounds in the lung sounds", p_cw = 1 - p_c - p_w - p_Null, and the probability of "no rales in the lung sounds"
p_Null = (1/N) Σ_{i=1}^{N} p_i,Null
Comparing the magnitudes of these four probabilities, the state with the highest probability is taken as the recognized state. One of the four states, "only wet rales in the lung sounds", "only wheezing sounds in the lung sounds", "both wet rales and wheezing sounds in the lung sounds", or "no rales in the lung sounds", is output together with the probability value of that state.
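The sliding block selection and the joint decision over all saved probability vectors can be sketched as below. `classify_block` stands in for the trained network (an assumption), and averaging the per-block probability vectors is one reading of the joint decision that is consistent with p_cw = 1 - p_c - p_w - p_Null:

```python
import numpy as np

FS = 8000
BLOCK = 2 * FS               # 2-second analysis block (16000 samples)
DROP = int(1.9 * FS)         # first 1.9 s discarded once 3.9 s have accumulated
STATES = ["only wet rales", "only wheezing sounds",
          "both wet rales and wheezing sounds", "no rales"]

def classify_stream(chunks, classify_block):
    """Feed incoming audio chunks through the buffering scheme described above
    and combine the per-block probability vectors into one final decision."""
    buf = np.empty(0)
    probs = []
    for chunk in chunks:
        buf = np.concatenate([buf, chunk])
        if not probs and len(buf) >= BLOCK:          # first 2-s block is ready
            probs.append(classify_block(buf[:BLOCK]))
        while probs and len(buf) >= DROP + BLOCK:    # buffer reached 3.9 s
            buf = buf[DROP:]                         # drop the first 1.9 s
            probs.append(classify_block(buf[:BLOCK]))
    if not probs:
        return "cannot determine whether rales are present", None
    p = np.mean(probs, axis=0)                       # joint decision over blocks
    return STATES[int(np.argmax(p))], p
```

Successive blocks overlap by 0.1 second (2.0 s kept, 1.9 s dropped), so the decision is refreshed roughly every 1.9 seconds of incoming audio.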
利用生物医学与健康信息国际会议提供的920段肺音数据(涵盖了本发明涉及的四种肺音情况,每段数据长度非定长,持续时间在10秒~90秒)与申请人团队在国内几家医院儿科采集的508段肺音数据(同样涵盖了本发明涉及的四种肺音情况,每段数据长度在30秒以上),共1428段数据作为肺音数据库,进行神经网络的训练和分类效果的验证。将其中1071段数据作为训练集,将其按本发明的数据块滑动选取方式切分出共14524个数据块,按前述方法提取其各自的三通道对数梅尔滤波器组变换特征并打上标记,进行卷积神经网络的训练。该网络的具体结构如图3所示;该卷积神经网络共有4个卷积层,其卷积核大小分别为5×5、3×3、3×3和3×3;卷积层使用ReLU作为激活函数;池化层使用最大池化;输出层通过softmax输出4个概率p i,c、p i,w、p i,cw和p i,Null;在训练该卷积神经网络过程中,标准差为0.1的截断正态分布用于参数初始权重,同时使用了Adam优化、Dropout学习以及L 2正则化。最后,利用余下的357段肺音数据作为测试集,得到最终测试集肺音数据段的啰音分类准确率为95.80%。Using 920 lung sound recordings provided by the International Conference on Biomedical and Health Informatics (covering the four lung sound cases addressed by the present invention; the recordings have variable lengths with durations of 10 to 90 seconds) together with 508 lung sound recordings collected by the applicant's team in the pediatric departments of several domestic hospitals (also covering the four cases, each recording longer than 30 seconds), a total of 1428 recordings form the lung sound database used for training the neural network and validating its classification performance. 1071 recordings are used as the training set and split into a total of 14524 data blocks by the sliding block selection of the present invention; their respective three-channel logarithmic mel filter bank transform features are extracted as described above and labeled, and the convolutional neural network is trained on them. The specific structure of the network is shown in Figure 3: it has 4 convolutional layers with kernel sizes 5×5, 3×3, 3×3 and 3×3; the convolutional layers use ReLU as the activation function; the pooling layers use max pooling; the output layer produces the four probabilities p_i,c, p_i,w, p_i,cw and p_i,Null through softmax. During training, a truncated normal distribution with a standard deviation of 0.1 is used for the initial parameter weights, together with Adam optimization, Dropout learning and L2 regularization. Finally, the remaining 357 lung sound recordings are used as the test set, on which the rale classification accuracy is 95.80%.
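The weight initialization and output layer mentioned above can be sketched as follows. Truncation at plus or minus 2 standard deviations is an assumption (the text fixes only the standard deviation of 0.1), and the kernel-bank shape is illustrative:

```python
import numpy as np
from scipy import stats

def truncated_normal_init(shape, std=0.1, seed=0):
    """Initial weights from a normal(0, std) truncated to +/-2 std, as is
    conventional for 'truncated normal' initializers; the +/-2 std bounds
    are an assumption, the std of 0.1 comes from the text."""
    return stats.truncnorm.rvs(-2.0, 2.0, loc=0.0, scale=std, size=shape,
                               random_state=seed)

def softmax(z: np.ndarray) -> np.ndarray:
    """Numerically stable softmax for the 4-way output (p_c, p_w, p_cw, p_Null)."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

w = truncated_normal_init((5, 5, 1, 16))       # e.g. a bank of 5x5 kernels
p = softmax(np.array([1.0, 0.5, -0.3, 0.1]))   # four class probabilities
```

Every drawn weight then lies within 0.2 of zero, and the softmax output always sums to one, which matches treating the four network outputs as a probability vector.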
本发明提出的一种电子听诊器的肺部啰音人工智能实时分类方法,主要解决的技术问题包括:The present invention proposes a real-time artificial intelligence classification method for pulmonary rales of an electronic stethoscope. The main technical problems to be solved include:
(1)如何在实际肺音采集总时长不确定的条件下给出一种统一的啰音实时分类方法;(2)由于不同的啰音与不同的病症有关,因此如何实现啰音的多分类;(3)如何提高啰音检测与分类结果的鲁棒性。(1) how to provide a unified real-time rale classification method when the total duration of actual lung sound acquisition is uncertain; (2) since different rales are associated with different diseases, how to achieve multi-class classification of rales; (3) how to improve the robustness of rale detection and classification results.
本发明,通过上述方法,(1)可以在实际肺音采集总时长不确定的条件下给出一种统一的啰音实时分类方法;(2)本发明可以实现湿啰音、喘鸣音、两者都包含和两者都不包含这四种情况的多分类;(3)本发明可以有效的提高啰音检测与分类结果的鲁棒性。Through the above method, the present invention (1) provides a unified real-time rale classification method when the total duration of actual lung sound acquisition is uncertain; (2) achieves multi-class classification of the four cases of wet rales only, wheezing only, both, and neither; (3) effectively improves the robustness of rale detection and classification results.
具体来说:Specifically:
(1)本发明随时间滑动选取数据块输入特定的卷积神经网络进行分类,并最终联合所有的数据块的分类结果得到最终总的啰音分类结果,不需要预设输入数据的长度,可实现啰音实时自动分类,且利用多时间段联合啰音分类可提高分类结果的鲁棒性;(1) The present invention slides over time to select data blocks, inputs them into a specific convolutional neural network for classification, and finally combines the classification results of all data blocks into the final overall rale classification result. No preset input data length is required, real-time automatic classification of rales is achieved, and joint rale classification over multiple time periods improves the robustness of the classification results;
(2)本发明提取三通道的对数梅尔滤波器组变换特征作为卷积神经网络的输入;(2) The present invention extracts the three-channel logarithmic mel filter bank transform feature as the input of the convolutional neural network;
(3)本发明明确给出了一种具体且有效的卷积神经网络结构用于肺部啰音分类,其中利用卷积层来发现输入数据更深层的特征,并在卷积层之后加入池化层来提高网络的容错能力;(3) The present invention explicitly provides a specific and effective convolutional neural network structure for lung rale classification, in which convolutional layers are used to discover deeper features of the input data, and pooling layers are added after the convolutional layers to improve the fault tolerance of the network;
(4)本发明在训练卷积神经网络过程中加入了标准差为0.1的截断正态分布用于参数权重初始化,同时使用Adam优化、Dropout学习以及L2正则化来防止过拟合,提高了本方法的鲁棒性;(4) In training the convolutional neural network, the present invention uses a truncated normal distribution with a standard deviation of 0.1 for parameter weight initialization, together with Adam optimization, Dropout learning and L2 regularization to prevent overfitting, improving the robustness of the method;
(5)本发明可实现湿啰音、喘鸣音、两者都包含和两者都不包含这四种情况的多分类。(5) The present invention achieves multi-class classification of the four cases: wet rales only, wheezing only, both, and neither.
尽管已经示出和描述了本发明的实施例,对于本领域的普通技术人员而言,可以理解在不脱离本发明的原理和精神的情况下可以对这些实施例进行多种变化、修改、替换和变型,本发明的范围由所附权利要求及其等同物限定。Although the embodiments of the present invention have been shown and described, those of ordinary skill in the art can understand that various changes, modifications, and substitutions can be made to these embodiments without departing from the principle and spirit of the present invention. And variations, the scope of the present invention is defined by the appended claims and their equivalents.

Claims (10)

  1. 一种电子听诊器的肺部啰音人工智能实时分类方法,其特征在于,包括:An artificial intelligence real-time classification method for lung rales of an electronic stethoscope, which is characterized in that it comprises:
    步骤1.从电子听诊器启动肺音采集开始,实时读取采集通道中的数据到某缓存空间,当数据累积到2秒时长时,启动肺部啰音自动分类程序;Step 1. From the start of lung sound acquisition by the electronic stethoscope, the data in the acquisition channel is read in real time to a certain buffer space. When the data is accumulated to 2 seconds, the automatic classification program of lung rales is started;
步骤2.对该2秒时长的数据块降采样到f s=8kHz,通过1个带通滤波器,并作归一化;若该数据块为第i个数据块,计该预处理后的数据块为向量x i;Step 2. Down-sample the 2-second data block to f_s = 8 kHz, filter it with a band-pass filter, and normalize it; if this data block is the i-th data block, denote the preprocessed data block as the vector x_i;
    步骤3.计算数据向量x i的对数梅尔滤波器组变换,表示为矩阵F iStep 3. Calculate the logarithmic Mel filter bank transformation of the data vector x i , expressed as a matrix F i ;
    步骤4.利用对数梅尔滤波器组变换结果矩阵F i,计算出三个通道的数据矩阵Δ i,0、Δ i,1和Δ i,2Step 4. Use the logarithmic Mel filter bank transformation result matrix F i to calculate the data matrix Δ i,0 , Δ i,1 and Δ i,2 of the three channels;
步骤5.将这三个通道的数据矩阵Δ 0、Δ 1和Δ 2各自归一化,输入一个预先搭建并训练好的卷积神经网络,该卷积神经网络的输出为四个概率值:该数据块中只存在湿啰音的概率p i,c、该数据块中只存在喘鸣音的概率p i,w、该数据块同时存在湿啰音与喘鸣音的概率p i,cw、该数据块既不存在湿啰音与喘鸣音的概率p i,Null,保存这四个概率值p i=[p i,c,p i,w,p i,cw,p i,Null] T;Step 5. Normalize the data matrices Δ_0, Δ_1 and Δ_2 of the three channels and input them into a pre-built and trained convolutional neural network, whose output is four probability values: the probability p_i,c that only wet rales are present in the data block, the probability p_i,w that only wheezing sounds are present, the probability p_i,cw that both wet rales and wheezing sounds are present, and the probability p_i,Null that neither is present; save these four probability values as p_i = [p_i,c, p_i,w, p_i,cw, p_i,Null]^T;
步骤6.当缓存空间中保存的数据时长达到3.9秒时,剔除前1.9秒数据,将剩下的2秒数据作为第i+1个数据块,回到步骤2;当缓存空间中保存的数据时长未达到3.9秒肺音采集就结束时,进入步骤7;Step 6. When the data stored in the buffer space reaches 3.9 seconds in length, discard the first 1.9 seconds of data, take the remaining 2 seconds of data as the (i+1)-th data block, and return to step 2; if lung sound acquisition ends before the buffered data reaches 3.9 seconds, go to step 7;
步骤7.若最终未保存任何数据块的概率值,输出为“无法判断是否存在啰音”;若最终共保存了N个数据块上的概率值p 1,p 2,...,p N,利用这些概率值,输出“肺音中只存在湿啰音”、“肺音中只存在喘鸣音”、“肺音中同时存在湿啰音与喘鸣音”与“肺音中无啰音”四种状态中的一种,并给出该状态的概率值。Step 7. If no probability values were saved for any data block, the output is "cannot determine whether rales are present"; if probability values p_1, p_2, ..., p_N on N data blocks were finally saved, use these probability values to output one of the four states "only wet rales in the lung sounds", "only wheezing sounds in the lung sounds", "both wet rales and wheezing sounds in the lung sounds", or "no rales in the lung sounds", together with the probability value of that state.
  2. 根据权利要求1所述的一种电子听诊器的肺部啰音人工智能实时分类方法,其特征在于,步骤2中所用的滤波器为Butterworth带通滤波器,通带为100Hz~1000Hz。An artificial intelligence real-time classification method for lung rales of an electronic stethoscope according to claim 1, wherein the filter used in step 2 is a Butterworth band pass filter with a pass band of 100 Hz to 1000 Hz.
  3. 根据权利要求1所述的一种电子听诊器的肺部啰音人工智能实时分类方法,其特征在于,An artificial intelligence real-time classification method for lung rales of an electronic stethoscope according to claim 1, characterized in that,
    步骤3中计算数据向量x i的对数梅尔滤波器组变换矩阵F i包括: The calculation of the logarithmic Mel filter bank transformation matrix F i of the data vector x i in step 3 includes:
    首先,计算x i的短时傅里叶变换谱:将x i分为M=31段,每段包含N FFT=1024个采样点,段间交迭50%;令第m段数据表示为x i,m(n),n=0,1,...,N FFT-1,则该段的快速傅里叶变换计算为
Y_i,m(k) = Σ_{n=0}^{N_FFT-1} x_i,m(n)h(n)e^{-j2πkn/N_FFT},
    k=0,1,...,N FFT/2-1,其中h(n)为汉明窗;
First, compute the short-time Fourier transform spectrum of x_i: divide x_i into M = 31 segments, each containing N_FFT = 1024 sampling points, with 50% overlap between segments; letting the m-th segment of data be x_i,m(n), n = 0,1,...,N_FFT-1, the fast Fourier transform of this segment is computed as
Y_i,m(k) = Σ_{n=0}^{N_FFT-1} x_i,m(n)h(n)e^{-j2πkn/N_FFT},
    k=0,1,...,N FFT /2-1, where h(n) is the Hamming window;
然后,|Y i,m(k)| 2经由一个梅尔滤波器组滤波;该梅尔滤波器组包含Q=29个梅尔频率域范围f Mel(f)=2595×log 10(1+f/700),f~[0,f s/2]上均匀间距且50%交迭的三角形滤波器Ψ q,q=1,2,...,Q;梅尔滤波器组滤波后的结果为
y_i,m(q) = Σ_{k=0}^{N_FFT/2-1} Ψ_q(k)|Y_i,m(k)|^2,
    q=1,2,...,Q;
Then, |Y_i,m(k)|^2 is filtered by a mel filter bank; the mel filter bank contains Q = 29 triangular filters Ψ_q, q = 1,2,...,Q, uniformly spaced and 50% overlapping on the mel frequency scale f_Mel(f) = 2595×log_10(1+f/700) over f ∈ [0, f_s/2]; the output of the mel filter bank is
y_i,m(q) = Σ_{k=0}^{N_FFT/2-1} Ψ_q(k)|Y_i,m(k)|^2,
    q=1,2,...,Q;
    最后,计算x i的对数梅尔滤波器组变换矩阵F i,其第q行m列的元素由下式给出:F i[q,m]=log[y i,m(q)]。 Finally, the logarithmic Mel filter bank transformation matrix F i of x i is calculated, and the elements in the qth row and m column are given by the following formula: F i [q,m]=log[y i,m (q)] .
  4. 根据权利要求1所述的一种电子听诊器的肺部啰音人工智能实时分类方法,其特征在于,An artificial intelligence real-time classification method for lung rales of an electronic stethoscope according to claim 1, characterized in that,
    步骤4中计算出三个通道的数据矩阵包括:The data matrix of the three channels calculated in step 4 includes:
    首先,第一个通道上的29×29维数据矩阵Δ 0=F[:,1:M-2]; First, the 29×29-dimensional data matrix on the first channel Δ 0 =F[:,1:M-2];
    然后,第二个通道上的29×29维数据矩阵Δ 1=F[:,2:M-1]-F[:,1:M-2]; Then, the 29×29-dimensional data matrix on the second channel Δ 1 =F[:,2:M-1]-F[:,1:M-2];
最后,第三个通道上的29×29维数据矩阵Δ 2=(F[:,3:M]-F[:,2:M-1])-Δ 1。Finally, the 29×29-dimensional data matrix on the third channel is Δ_2 = (F[:,3:M] - F[:,2:M-1]) - Δ_1.
  5. 根据权利要求1所述的一种电子听诊器的肺部啰音人工智能实时分类方法,其特征在于,An artificial intelligence real-time classification method for lung rales of an electronic stethoscope according to claim 1, characterized in that,
    步骤5中的卷积神经网络由一个大样本有标注的数据集训练得到,该网络的具体结构如图3所示;该卷积神经网络共有4个卷积层,其卷积核大小分别为5×5、3×3、3×3和3×3;卷积层使用ReLU作为激活函数;池化层使用最大池化;输出层通过softmax输出4个概率p i,c、p i,w、p i,cw和p i,Null;在训练该卷积神经网络过程中,标准差为0.1的截断正态分布用于参数初始权重,同时使用了Adam优化、Dropout学习以及L 2正则化。 The convolutional neural network in step 5 is trained from a large-sample labeled data set. The specific structure of the network is shown in Figure 3. The convolutional neural network has 4 convolutional layers, and the convolution kernel sizes are respectively 5×5, 3×3, 3×3 and 3×3; the convolutional layer uses ReLU as the activation function; the pooling layer uses maximum pooling; the output layer outputs 4 probabilities p i,c , p i,w through softmax , P i, cw and p i, Null ; in the process of training the convolutional neural network, a truncated normal distribution with a standard deviation of 0.1 is used for the initial weight of the parameters, and Adam optimization, dropout learning and L 2 regularization are used at the same time.
  6. 根据权利要求1所述的一种电子听诊器的肺部啰音人工智能实时分类方法,其特征在于,An artificial intelligence real-time classification method for lung rales of an electronic stethoscope according to claim 1, characterized in that,
    步骤7中最终可能输出的四种状态所对应的概率值分别为:The probability values corresponding to the four possible output states in step 7 are:
“肺音中只存在湿啰音”的概率Probability of "only wet rales in the lung sounds"
p_c = (1/N) Σ_{i=1}^{N} p_i,c
“肺音中只存在喘鸣音”的概率Probability of "only wheezing sounds in the lung sounds"
p_w = (1/N) Σ_{i=1}^{N} p_i,w
    “肺音中无啰音”的概率
p_Null = (1/N) Σ_{i=1}^{N} p_i,Null
    Probability of "no rales in lung sounds"
p_Null = (1/N) Σ_{i=1}^{N} p_i,Null
    “肺音中同时存在湿啰音与喘鸣音”的概率p cw=1-p c-p w-p NullThe probability that "wet rales and wheezing sounds exist in lung sounds at the same time" is p cw = 1-p c- p w- p Null .
  7. 一种电子听诊器的肺部啰音人工智能实时分类系统,其特征在于,包括:An artificial intelligence real-time classification system for pulmonary rales of an electronic stethoscope, which is characterized in that it comprises:
电子听诊器,对肺音采集,为采集得到的数据分配一个缓存空间并持续进入缓存,当数据累积到2秒时长时,启动肺部啰音自动分类程序;The electronic stethoscope collects lung sounds; a buffer space is allocated for the collected data, which continuously enters the buffer; when 2 seconds of data have accumulated, the automatic lung rale classification program is started;
    带通滤波器,对采集的数据进行滤波,并作归一化;Band pass filter, filter the collected data and normalize it;
对数梅尔滤波器组,对数据向量变换结果矩阵,计算出三个通道的数据矩阵;Logarithmic mel filter bank, which computes the three-channel data matrices from the transform result matrix of the data vector;
卷积神经网络,用于三个通道的数据矩阵输入,输出并保存四个概率值;Convolutional neural network, which takes the three-channel data matrices as input and outputs and saves four probability values;
    其中:电子听诊器、带通滤波器、对数梅尔滤波器组以及卷积神经网络顺次连接。Among them: electronic stethoscope, band pass filter, logarithmic mel filter bank and convolutional neural network are connected in sequence.
  8. 根据权利要求1所述的一种电子听诊器的肺部啰音人工智能实时分类系统,其特征在于,所述带通滤波器采用Butterworth带通滤波器,通带为100Hz~1000Hz,卷积神经网络共有4个卷积层,其卷积核大小分别为5×5、3×3、3×3和3×3;卷积层使用ReLU作为激活函数;池化层使用最大池化;输出层通过softmax输出。The lung rale artificial intelligence real-time classification system of an electronic stethoscope according to claim 1, characterized in that the band-pass filter is a Butterworth band-pass filter with a pass band of 100 Hz to 1000 Hz; the convolutional neural network has 4 convolutional layers with kernel sizes 5×5, 3×3, 3×3 and 3×3; the convolutional layers use ReLU as the activation function; the pooling layers use max pooling; the output layer produces its output through softmax.
  9. 一种电子听诊器的肺部啰音人工智能实时分类装置,其特征在于,包括:An artificial intelligence real-time classification device for pulmonary rales of an electronic stethoscope, which is characterized in that it comprises:
    存储器,用于存储计算机程序;Memory, used to store computer programs;
    处理器,用于执行所述计算机程序时实现如权利要求1~6任一项所述的电子听诊器的肺部啰音人工智能实时分类方法的步骤。The processor is configured to implement the steps of the method for real-time artificial intelligence classification of lung rales of the electronic stethoscope according to any one of claims 1 to 6 when the computer program is executed.
  10. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质上存储有计算机程序,所述计算机程序被处理器执行时实现如权利要求1~6任一项所述的电子听诊器的肺部啰音人工智能实时分类方法的步骤。A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the electronic stethoscope according to any one of claims 1 to 6 is The steps of an artificial intelligence real-time classification method of lung rales.
PCT/CN2020/113511 2019-12-13 2020-09-04 Lung rale artificial intelligence real-time classification method, system and device of electronic stethoscope, and readable storage medium WO2021114761A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911280663.2A CN110970042B (en) 2019-12-13 2019-12-13 Pulmonary ralated artificial intelligence real-time classification method, system and device of electronic stethoscope and readable storage medium
CN201911280663.2 2019-12-13

Publications (1)

Publication Number Publication Date
WO2021114761A1 true WO2021114761A1 (en) 2021-06-17

Family

ID=70034137

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/113511 WO2021114761A1 (en) 2019-12-13 2020-09-04 Lung rale artificial intelligence real-time classification method, system and device of electronic stethoscope, and readable storage medium

Country Status (2)

Country Link
CN (1) CN110970042B (en)
WO (1) WO2021114761A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114663307A (en) * 2022-03-22 2022-06-24 哈尔滨工业大学 Integrated image denoising system based on uncertainty network

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110970042B (en) * 2019-12-13 2023-04-18 苏州美糯爱医疗科技有限公司 Pulmonary ralated artificial intelligence real-time classification method, system and device of electronic stethoscope and readable storage medium
CN111466947A (en) * 2020-04-15 2020-07-31 哈尔滨工业大学 Electronic auscultation lung sound signal processing method
TWI769449B (en) * 2020-04-21 2022-07-01 廣達電腦股份有限公司 Filtering system and filtering method
CN111933185A (en) * 2020-10-09 2020-11-13 深圳大学 Lung sound classification method, system, terminal and storage medium based on knowledge distillation
CN112932525B (en) * 2021-01-27 2023-04-07 山东大学 Lung sound abnormity intelligent diagnosis system, medium and equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130259240A1 (en) * 2012-03-29 2013-10-03 Gwo-Ching CHANG Device and method for detecting occurrence of wheeze
CN106073709A (en) * 2016-06-03 2016-11-09 中国科学院声学研究所 A kind of method and apparatus of rale detection
CN107545906A (en) * 2017-08-23 2018-01-05 京东方科技集团股份有限公司 Lung Sounds processing method, processing equipment and readable storage medium storing program for executing
CN107818366A (en) * 2017-10-25 2018-03-20 成都力创昆仑网络科技有限公司 A kind of lungs sound sorting technique, system and purposes based on convolutional neural networks
US20190088367A1 (en) * 2012-06-18 2019-03-21 Breathresearch Inc. Method and apparatus for training and evaluating artificial neural networks used to determine lung pathology
CN110532424A (en) * 2019-09-26 2019-12-03 西南科技大学 A kind of lungs sound tagsort system and method based on deep learning and cloud platform
CN110970042A (en) * 2019-12-13 2020-04-07 苏州美糯爱医疗科技有限公司 Artificial intelligent real-time classification method, system and device for pulmonary rales of electronic stethoscope and readable storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9687208B2 (en) * 2015-06-03 2017-06-27 iMEDI PLUS Inc. Method and system for recognizing physiological sound
TWI646942B (en) * 2018-02-06 2019-01-11 財團法人工業技術研究院 Lung sound monitoring device and lung sound monitoring method
CN109273085B (en) * 2018-11-23 2021-11-02 南京清科信息科技有限公司 Pathological respiratory sound library establishing method, respiratory disease detection system and respiratory sound processing method
CN109805954B (en) * 2019-01-23 2021-09-14 苏州美糯爱医疗科技有限公司 Method for automatically eliminating friction sound interference of electronic stethoscope

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114663307A (en) * 2022-03-22 2022-06-24 哈尔滨工业大学 Integrated image denoising system based on uncertainty network
CN114663307B (en) * 2022-03-22 2023-07-04 哈尔滨工业大学 Integrated image denoising system based on uncertainty network

Also Published As

Publication number Publication date
CN110970042A (en) 2020-04-07
CN110970042B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
WO2021114761A1 (en) Lung rale artificial intelligence real-time classification method, system and device of electronic stethoscope, and readable storage medium
CN108670200B (en) Sleep snore classification detection method and system based on deep learning
Ma et al. Lungbrn: A smart digital stethoscope for detecting respiratory disease using bi-resnet deep learning algorithm
Mendonca et al. A review of obstructive sleep apnea detection approaches
Amrulloh et al. Automatic cough segmentation from non-contact sound recordings in pediatric wards
Belkacem et al. End-to-end AI-based point-of-care diagnosis system for classifying respiratory illnesses and early detection of COVID-19: a theoretical framework
JP6435257B2 (en) Method and apparatus for processing patient sounds
US11712198B2 (en) Estimation of sleep quality parameters from whole night audio analysis
CN110570880B (en) Snore signal identification method
JP7197922B2 (en) Machine learning device, analysis device, machine learning method and analysis method
Swarnkar et al. Neural network based algorithm for automatic identification of cough sounds
CN111696575A (en) Low ventilation and apnea detection and identification system based on hybrid neural network model
Shen et al. Detection of snore from OSAHS patients based on deep learning
Wu et al. A novel approach to diagnose sleep apnea using enhanced frequency extraction network
CN113974607B (en) Sleep snore detecting system based on pulse neural network
CN111312293A (en) Method and system for identifying apnea patient based on deep learning
Amrulloh et al. A novel method for wet/dry cough classification in pediatric population
JP2023531464A (en) A method and system for screening for obstructive sleep apnea during wakefulness using anthropometric information and tracheal breath sounds
Luo et al. Design of embedded real-time system for snoring and OSA detection based on machine learning
Shi et al. Obstructive sleep apnea detection using difference in feature and modified minimum distance classifier
Kala et al. An objective measure of signal quality for pediatric lung auscultations
Bandyopadhyaya et al. Automatic lung sound cycle extraction from single and multichannel acoustic recordings
Patel et al. Different Transfer Learning Approaches for Recognition of Lung Sounds
Shang et al. Sleep Apnea Detection Based on Snoring Sound Analysis Using DS-MS neural network
Patel et al. Multi-feature fusion for COPD classification using deep learning algorithms

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20897738

Country of ref document: EP

Kind code of ref document: A1

NENP: Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry into the European phase

Ref document number: 20897738

Country of ref document: EP

Kind code of ref document: A1