CN111523487A

CN111523487A - Method for preprocessing and automatically labeling physiological sound

Info

Publication number: CN111523487A
Application number: CN202010336569.0A
Authority: CN
Inventors: 赵列宾; 殷勇; 罗雯懿; 傅丽娟; 王汉松; 袁加俊; 张磊; 周宏远; 曲菲
Original assignee: Shanghai Tuoxiao Intelligent Technology Co ltd; Shanghai Childrens Medical Center Affiliated to Shanghai Jiaotong University School of Medicine
Current assignee: Shanghai Tuoxiao Intelligent Technology Co ltd; Shanghai Childrens Medical Center Affiliated to Shanghai Jiaotong University School of Medicine
Priority date: 2020-04-26
Filing date: 2020-04-26
Publication date: 2020-08-11

Abstract

The invention provides a method for preprocessing and automatically labeling physiological sounds. By optimizing the preprocessing performed after a physiological sound is collected, it facilitates scientific and standardized research on physiological sounds. The method comprises: signal preprocessing, in which the physiological sound is resampled or weak signals are removed; band-pass filtering, in which the electronic signal representing the acoustic vibration, generated by converting the physiological sound, is band-pass filtered; signal normalization, in which that electronic signal is wavelet-transformed and decomposed into a plurality of signals of different frequencies; intelligent threshold calculation, in which thresholds are calculated for the signals of different frequencies; and potential feature target detection, in which all potential feature targets are found according to the calculated thresholds. An automatic labeling method for physiological sounds is also provided, in which manual labeling is combined with machine labeling to guide the self-learning of the machine labeling, focus its training, and improve its accuracy.

Description

Method for preprocessing and automatically labeling physiological sound

Technical Field

The invention relates to the technical field of medical data processing, in particular to a method for preprocessing and automatically labeling physiological sounds.

Background

Physiological sounds include sounds generated by various organs, such as heart sounds, lung sounds, bowel sounds, vascular echoes, tracheal breath sounds, bronchial breath sounds, and joint movement sounds. Respiratory sounds, commonly called lung sounds, reflect the acoustic characteristics of the lung tissue, trachea, chest wall, and other propagation media. A heart murmur is a kind of heart sound: an abnormal sound, distinct from the normal heart sounds and extra heart sounds, produced when turbulent blood flow in the heart or vessels during cardiac contraction or relaxation causes the heart wall, valves, or vessels to vibrate. Murmurs vary in frequency and intensity and have a longer duration. Accurate analysis and classification of physiological sounds therefore plays an important, often decisive, role in diagnosing related diseases.

Respiratory sounds and other physiological sounds span roughly 50-3000 Hz, whereas the human ear is most sensitive at about 1000-2000 Hz, and conventional physiological sound detecting devices (such as mechanical stethoscopes) have a poor low-frequency response, so weak sounds are difficult to capture during auscultation. In addition, doctors differ in clinical experience and diagnostic skill, so auscultation of the same patient often yields different, sometimes very different, judgments about the site, degree, and stage of infection, about the pathophysiological changes and their evolution, and about the likely course and prognosis. The preprocessing applied after an electronic stethoscope collects physiological sounds is therefore the starting point for understanding them, and its quality directly affects subsequent research on physiological sounds. Existing physiological sounds are labeled by manual auscultation, so labeling standards vary widely, as do the auscultation skills and practices of different doctors. An automatic labeling method for physiological sounds is therefore urgently needed.

Disclosure of Invention

(I) Technical problem to be solved

The invention aims to optimize the preprocessing method applied after physiological sounds are collected, thereby facilitating scientific and standardized research on physiological sounds. An automatic labeling method for physiological sounds is also provided, in which manual labeling is combined with machine labeling to guide the self-learning of the machine labeling, focus its training, and improve its accuracy.

(II) Technical scheme

An object of the present invention is to provide a method for preprocessing physiological sounds, the method comprising:

signal preprocessing, in which the physiological sound is resampled or weak signals are removed;

band-pass filtering, in which the electronic signal representing the acoustic vibration, generated by converting the physiological sound, is band-pass filtered;

signal normalization, in which the electronic signal representing the acoustic vibration is wavelet-transformed and decomposed into a plurality of signals of different frequencies;

intelligent threshold calculation, in which thresholds are calculated for the signals of different frequencies;

finding potential feature targets (abnormal physiological sounds), in which all potential feature targets are found according to the calculated thresholds.

The preprocessing method for physiological sounds is used for identifying and diagnosing moist rales.

Wherein:

The intelligent threshold comprises a time-domain feature value and a frequency-domain feature value. Both are obtained by applying a wavelet transform to the electronic signal representing the acoustic vibration, so that the physiological sound is decomposed into a plurality of signals of different frequencies; the time-domain feature values and the frequency-domain feature values of those signals are then calculated. The potential feature targets are found by moving a 20 ms window along the physiological sound signal and comparing the time-domain and frequency-domain feature values in the calculated intelligent threshold with the time-domain and frequency-domain feature values in a constructed database of normal lung sounds.
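The 20 ms window scan described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the 10 ms hop, the use of RMS amplitude as the time-domain feature and spectral centroid as the frequency-domain feature, the computation of features directly on the window rather than on wavelet sub-bands, and the `normal_ref` pair standing in for the normal lung sound database are all hypothetical choices.

```python
import cmath
import math

def window_features(frame, fs):
    """Return (rms, spectral_centroid_hz) for one frame of samples."""
    n = len(frame)
    rms = math.sqrt(sum(x * x for x in frame) / n)
    # Naive DFT magnitudes for bins 0..n//2 (acceptable for 20 ms frames).
    mags = [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]
    total = sum(mags)
    centroid = sum(k * fs / n * m for k, m in enumerate(mags)) / total if total else 0.0
    return rms, centroid

def find_candidates(signal, fs, normal_ref, win_ms=20, hop_ms=10):
    """Slide a win_ms window along the signal and return the start indices of
    windows whose features both exceed the (hypothetical) normal reference."""
    win = int(fs * win_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    rms_ref, centroid_ref = normal_ref
    hits = []
    for start in range(0, len(signal) - win + 1, hop):
        rms, centroid = window_features(signal[start:start + win], fs)
        if rms > rms_ref and centroid > centroid_ref:
            hits.append(start)
    return hits
```

Each returned index marks a window that would be passed on for closer inspection; the comparison rule (both features above the reference) is only one of many possible ways to compare against a normal database.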

The preprocessing method for physiological sounds is used for identifying and diagnosing wheezes.

Wherein:

The band-pass filtering applies a Hamming-window band-pass filter to the physiological sound, and the intelligent threshold is calculated from the average value of the signals decomposed into a plurality of different frequencies together with the corresponding correction coefficients. The potential feature targets are found by comparing every peak of the band-pass-filtered signal with the calculated intelligent threshold: if a peak is greater than the threshold, the signal within a set length centered on that peak is taken as a potential feature target, and all potential feature targets are found in this way; if a peak is smaller than the threshold, it is judged to be a non-feature target. The set length is 300 ms.
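The peak comparison described above could look roughly like the sketch below. It assumes, for illustration only, that the threshold is the mean absolute amplitude of the filtered signal multiplied by a single correction coefficient; the function name, the default coefficient of 3.0, and the use of the rectified signal as the envelope are hypothetical and do not come from this document.

```python
def find_peak_segments(signal, fs, correction=3.0, seg_ms=300):
    """Return (start, end) sample ranges centred on peaks above the threshold."""
    envelope = [abs(x) for x in signal]
    # Assumed threshold: mean rectified amplitude times a correction coefficient.
    threshold = correction * sum(envelope) / len(envelope)
    half = int(fs * seg_ms / 1000) // 2
    segments = []
    for i in range(1, len(envelope) - 1):
        is_peak = envelope[i - 1] < envelope[i] >= envelope[i + 1]
        if is_peak and envelope[i] > threshold:
            # Keep seg_ms of signal centred on the peak, clipped to the signal.
            segments.append((max(0, i - half), min(len(signal), i + half)))
    return segments
```

Peaks below the threshold are simply skipped, which corresponds to judging them non-feature targets. Overlapping segments are not merged in this sketch.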

Another object of the present invention is to provide an automatic labeling method for physiological sounds, which comprises preprocessing the physiological sounds according to the preprocessing method of any one of claims 1 to 8, and then labeling them by combining manual labeling with machine labeling.

Further, calculating the intelligent threshold in the preprocessing method also includes the energy feature value and the variance feature value of the signals decomposed into a plurality of different frequencies.

(III) Advantageous effects

The method for preprocessing and automatically labeling physiological sounds provided by the invention optimizes the preprocessing performed after physiological sounds are collected, which facilitates scientific and standardized research on physiological sounds. An automatic labeling method for physiological sounds is also provided, in which manual labeling is combined with machine labeling to guide the self-learning of the machine labeling, focus its training, and improve its accuracy.

Drawings

FIG. 1 is a diagram illustrating wavelet transformation according to the present embodiment;

FIG. 2 is a mathematical modeling diagram corresponding to wavelet transform;

FIG. 3 shows the mathematical calculation formulas corresponding to wavelet transformation.

Detailed Description

The following examples are given to further illustrate the embodiments of the present invention. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.

Example 1

A method of pre-processing physiological sounds, the method comprising:

The preprocessing method for physiological sounds is used for identifying and diagnosing moist rales and wheezes.

Example 2

The preprocessing method for physiological sounds is used for identifying and diagnosing moist rales and is designed as follows:

The intelligent threshold comprises a time-domain feature value and a frequency-domain feature value. Both are obtained by applying a wavelet transform to the electronic signal representing the acoustic vibration, so that the physiological sound is decomposed into a plurality of signals of different frequencies; the time-domain feature values and the frequency-domain feature values of those signals are then calculated. The potential feature targets are found by moving a 20 ms window along the lung sound signal and comparing the time-domain and frequency-domain feature values in the calculated intelligent threshold with the time-domain and frequency-domain feature values in a constructed database of normal lung sounds.

Example 3

The preprocessing method for physiological sounds is used for identifying and diagnosing wheezes and is designed as follows:

Example 4

An automatic labeling method for physiological sounds, the method comprising preprocessing the physiological sounds according to the preprocessing method of any one of claims 1 to 8, and then labeling them by combining manual labeling with machine labeling.

The manual labeling method comprises the following steps:

Four clinical experts are invited and divided into two groups. The first group manually judges the same lung sounds, classifying each as normal lung sound, wheeze, moist rale, or indistinguishable. If the two experts in the group reach the same judgment, the sound is labeled accordingly; if they differ, the other group of experts is invited to judge, and if the results still differ, the data is labeled as indistinguishable. The experts are typically physicians with 10 years of clinical experience.

The respiratory sound segments are decomposed into eight frequency-band signals by wavelet decomposition, and the energy, variance, mean, and frequency-distribution features (among others) of each band are extracted, so that at least 32 diagnostic values can be extracted per breath sound segment. A feature library is built from these 32 features and used to label suspected segments automatically.

Clinical experts must take part in the early stage of the labeling process. Wearing earphones, they listen to the breath sound audio played by the automatic labeling software and press the left mouse button when they hear an obvious crackle or wheeze. The software then automatically labels the feature sound segment near the marked point.
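The count of 8 bands x 4 features = 32 values can be illustrated with a short sketch. It rests on assumptions that are not stated in this document: the Haar wavelet, a seven-level decomposition (giving D1 to D7 plus A7, eight bands in total), and relative band energy as the "frequency distribution" feature. All function names are hypothetical, and the input needs at least 2^7 = 128 samples.

```python
import math

def haar_step(x):
    """One Haar DWT level: return (approximation, detail) at half the rate."""
    pairs = range(0, len(x) - 1, 2)  # a trailing odd sample is dropped
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in pairs]
    detail = [(x[i] - x[i + 1]) / s for i in pairs]
    return approx, detail

def decompose(signal, levels=7):
    """Return [D1, ..., D_levels, A_levels], i.e. levels + 1 sub-band signals."""
    bands, approx = [], list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands

def band_features(signal, levels=7):
    """Return [energy, variance, mean, relative energy] for each band, flattened."""
    bands = decompose(signal, levels)
    energies = [sum(c * c for c in band) for band in bands]
    total = sum(energies) or 1.0
    features = []
    for band, energy in zip(bands, energies):
        mean = sum(band) / len(band)
        variance = sum((c - mean) ** 2 for c in band) / len(band)
        features += [energy, variance, mean, energy / total]
    return features
```

With the default seven levels the returned list has 32 entries, four per band, which is the size of the feature vector described above.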

The wavelet transformation process is shown in FIG. 1, where Dn is the high-frequency signal and An is the low-frequency signal.

The mathematical model corresponding to the wavelet transform is shown in FIG. 2, and the specific calculation formulas are shown in FIG. 3.
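The figures are not reproduced in this text. For orientation only, the standard discrete wavelet decomposition recursion, which matches the An (low-frequency) and Dn (high-frequency) naming used here, is usually written as below. Whether the figures use exactly this form is an assumption; h and g denote the low-pass and high-pass analysis filters, and x is the input signal.

```latex
A_0[k] = x[k], \qquad
A_{n+1}[k] = \sum_{m} h[m - 2k]\, A_n[m], \qquad
D_{n+1}[k] = \sum_{m} g[m - 2k]\, A_n[m]
```

After N levels the signal is represented by the detail signals D_1, ..., D_N together with the final approximation A_N.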

The specific process of machine labeling comprises the following steps:

(1) Through an algorithm with semi-manual participation, two sound banks are extracted from lung sounds: one consisting of crackles (moist rales) and the other of false crackles. Since a crackle generally lasts less than 20 ms, 20 ms is extracted from each sound file. Time-domain feature values and frequency-domain feature values are extracted from the crackles and false crackles and used to train a vector machine, yielding the parameters of the vector machine. With these parameters, crackles can be distinguished well from false crackles.
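The training step in (1) could be sketched as below, assuming that "vector machine" refers to a support vector machine; that reading, the linear kernel, the hinge-loss subgradient training loop, and the function names are all assumptions made for illustration. Each sample is a list of feature values (for example one time-domain and one frequency-domain value per 20 ms clip). In practice an established machine learning library would normally be used instead.

```python
def train_linear_svm(samples, labels, epochs=200, lr=0.1, reg=0.01):
    """Train weights w and bias b by subgradient descent on the regularised
    hinge loss. Labels must be +1 (true crackle) or -1 (false crackle)."""
    dim = len(samples[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Weight decay from the L2 regulariser.
            w = [wi * (1.0 - lr * reg) for wi in w]
            if margin < 1.0:
                # Sample is inside the margin: step towards classifying it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    """Return +1 for a predicted true crackle, -1 for a false crackle."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
```

The returned `w` and `b` play the role of the vector machine parameters mentioned above: once trained, `predict` applies them to the features of a new candidate clip.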

(2) Through an algorithm with semi-manual participation, two sound banks are extracted from lung sounds: one consisting of wheezes and the other of false wheezes. Since a wheeze generally lasts less than 300 ms, 300 ms is extracted from each sound file. The peak values of the wheezes and false wheezes are extracted, and a correction coefficient is obtained either by training a vector machine or by a traditional statistical algorithm. Combining the correction coefficient with the peak values allows wheezes to be distinguished well from false wheezes.

(3) Signal preprocessing, such as resampling or weak signal identification, is performed first. The lung sound signal is then band-pass filtered (lung sound signals lie mainly at 100-2000 Hz), and the intelligent threshold is calculated (each lung sound segment has its own threshold, according to which potential crackles are found). A 20 ms window is then moved along the lung sound signal to find all potential crackles. The time-domain and frequency-domain features of the potential crackles are extracted, a vector machine judges whether each one is a true crackle or a false crackle, and finally the lung sound is judged to contain moist rales or not according to the average number of crackles it contains per second.
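The final decision in (3) reduces to a rate check, sketched below. The cut-off of one crackle per second is a hypothetical placeholder, since this document does not state the value used, and the function name is likewise illustrative.

```python
def is_moist_rale(crackle_times_s, duration_s, min_per_second=1.0):
    """Classify a recording by its average number of confirmed crackles per second.

    crackle_times_s: onset times (seconds) of crackles confirmed by the classifier.
    min_per_second:  assumed decision threshold, not taken from the source.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return len(crackle_times_s) / duration_s >= min_per_second
```

For instance, four confirmed crackles in a two-second recording average two per second and would be classified as moist rales under this assumed cut-off.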

(4) Signal preprocessing, such as resampling or weak signal identification, is performed first. The lung sound signal is then band-pass filtered with a Hamming window (lung sound signals lie mainly at 100-2000 Hz), and the intelligent threshold is calculated (each lung sound segment has its own threshold, obtained from the average value of the signals decomposed into a plurality of different frequencies together with the corresponding correction coefficients; potential wheezes are found according to this threshold). Each peak value is then compared with the calculated intelligent threshold: when a peak is greater than the threshold, the 300 ms centered on that peak is taken as a potential wheeze, and all potential wheezes are found in this way; when a peak is smaller than the threshold, it is judged to be a non-wheeze.
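The Hamming-window band-pass step could be realised, for example, as a windowed-sinc FIR filter. The sketch below is one such design under assumptions: the 8000 Hz sampling rate used in the usage note, the 101-tap length, and the function names are illustrative and are not specified in this document; only the 100-2000 Hz band and the Hamming window come from the text.

```python
import math

def hamming_bandpass(fs, low_hz=100.0, high_hz=2000.0, taps=101):
    """Windowed-sinc FIR band-pass coefficients shaped by a Hamming window."""
    mid = (taps - 1) / 2
    fl, fh = low_hz / fs, high_hz / fs  # cut-offs in cycles per sample
    coeffs = []
    for n in range(taps):
        k = n - mid
        if k == 0:
            ideal = 2 * (fh - fl)
        else:
            # Ideal band-pass = difference of two ideal low-pass responses.
            ideal = (math.sin(2 * math.pi * fh * k)
                     - math.sin(2 * math.pi * fl * k)) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (taps - 1))
        coeffs.append(ideal * window)
    return coeffs

def fir_filter(signal, coeffs):
    """Direct-form convolution; output has the input's length (zero initial state)."""
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, c in enumerate(coeffs):
            if i - j >= 0:
                acc += c * signal[i - j]
        out.append(acc)
    return out

def gain_at(coeffs, freq_hz, fs):
    """Magnitude response of the FIR filter at freq_hz."""
    w = 2 * math.pi * freq_hz / fs
    re = sum(c * math.cos(w * n) for n, c in enumerate(coeffs))
    im = sum(c * math.sin(w * n) for n, c in enumerate(coeffs))
    return math.hypot(re, im)
```

A typical call would be `filtered = fir_filter(samples, hamming_bandpass(8000.0))`. The sampling rate must exceed twice the upper cut-off, and with this tap count the transition bands are a few hundred hertz wide, so the 100 Hz edge is soft rather than sharp.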

Furthermore, calculating the intelligent threshold in the preprocessing method also includes parameters such as the energy feature value and the variance feature value of the signals decomposed into different frequencies, and these parameters are provided to the manual labeling experts.

In summary, the above embodiments do not limit the present invention; modifications and equivalent variations made by those skilled in the art in the spirit of the present invention fall within its technical scope.

Claims

1. A method for preprocessing physiological sounds, the method comprising:

band-pass filtering, namely band-pass filtering the electronic signal representing the acoustic vibration generated by converting the physiological sound;

signal normalization, namely performing wavelet transformation on the electronic signal representing the acoustic vibration generated by converting the physiological sound, and decomposing it into a plurality of signals of different frequencies;

finding potential feature targets, namely finding all potential feature targets according to the calculated threshold.

2. The method for preprocessing physiological sounds according to claim 1, wherein the method is used for the identification and diagnosis of moist rales.

3. The method as claimed in claim 2, wherein calculating the intelligent threshold comprises a time-domain feature value and a frequency-domain feature value, wherein the time-domain feature value is obtained by performing wavelet transformation on the electronic signal representing the acoustic vibration so as to decompose the physiological sound into a plurality of signals of different frequencies, and calculating the time-domain feature values of the signals of different frequencies; and the frequency-domain feature value is obtained by performing wavelet transformation on the electronic signal representing the acoustic vibration so as to decompose the physiological sound into a plurality of signals of different frequencies, and calculating the frequency-domain feature values of the signals of different frequencies.

4. The method as claimed in claim 3, wherein finding the potential feature targets comprises moving a 20 ms window along the lung sound signal and finding all potential feature targets by comparing the time-domain feature values and frequency-domain feature values in the calculated intelligent threshold with the time-domain feature values and frequency-domain feature values in a constructed normal physiological sound database.

5. The method of claim 1, wherein the method is used for identification and diagnosis of wheezes.

6. The method as claimed in claim 5, wherein the band-pass filtering applies a Hamming-window band-pass filter to the physiological sound, and the intelligent threshold is calculated from the average value of the signals decomposed into a plurality of different frequencies together with the corresponding correction coefficients.

7. The method as claimed in claim 6, wherein finding the potential feature targets comprises comparing all peaks of the band-pass-filtered signal with the calculated intelligent threshold; if a peak is greater than the calculated intelligent threshold, the signal within a predetermined length centered on that peak is taken as a potential feature target, and all potential feature targets are found in this way; and if a peak is smaller than the calculated intelligent threshold, it is determined to be a non-feature target.

8. The method as claimed in claim 7, wherein the predetermined length is 300 ms.

9. An automatic labeling method for physiological sounds, characterized in that the method comprises preprocessing the physiological sounds according to the method for preprocessing physiological sounds of any one of claims 1 to 8, and then labeling the physiological sounds by combining manual labeling with machine labeling.

10. The method of claim 9, wherein calculating the intelligent threshold in the method for preprocessing physiological sounds further comprises the energy feature value and the variance feature value of the signals decomposed into a plurality of different frequencies.