WO2017135127A1 - Bioacoustic extraction device, bioacoustic analysis device, bioacoustic extraction program, and computer-readable recording medium and recorded device - Google Patents

Bioacoustic extraction device, bioacoustic analysis device, bioacoustic extraction program, and computer-readable recording medium and recorded device Download PDF

Info

Publication number
WO2017135127A1
WO2017135127A1 (PCT/JP2017/002592; JP2017002592W)
Authority
WO
WIPO (PCT)
Prior art keywords
bioacoustic
data
auditory image
unit
auditory
Prior art date
Application number
PCT/JP2017/002592
Other languages
English (en)
Japanese (ja)
Inventor
崇宏 榎本
正武 芥川
亮 野中
憲市郎 川野
アール. アビラトナ,ウダンタ
Original Assignee
国立大学法人徳島大学
ザユニバーシティオブクイーンズランド
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 国立大学法人徳島大学, ザユニバーシティオブクイーンズランド filed Critical 国立大学法人徳島大学
Priority to JP2017565502A priority Critical patent/JP6908243B2/ja
Publication of WO2017135127A1 publication Critical patent/WO2017135127A1/fr

Links

Images

Classifications

    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00: Measuring for diagnostic purposes; Identification of persons
    • A61B 5/08: Detecting, measuring or recording devices for evaluating the respiratory organs

Definitions

  • the present invention relates to a bioacoustic extraction device, a bioacoustic analysis device, a bioacoustic extraction program, a computer-readable recording medium, and a recorded device.
  • Bioacoustic analysis is performed on bioacoustics (sounds generated by the human body) in order to detect and assess disease cases.
  • Ideally, the acoustic data to be analyzed contains only bioacoustic data.
  • In practice, sounds other than bioacoustics, such as noise, must be excluded, and only the necessary acoustic data extracted. Residual noise degrades the accuracy of case analysis, judgment, and diagnosis; conversely, if genuine acoustic data is removed together with the noise, the reliability of the judgment results also suffers. Bioacoustic analysis therefore requires the accurate selection of bioacoustic data alone.
  • Snoring sounds are taken up as an example of bioacoustics.
  • SAS Sleep Apnea Syndrome
  • OSAS Obstructive Sleep Apnea Syndrome
  • Untreated sleep apnea is associated with cardiovascular diseases such as hypertension, stroke, angina pectoris, and myocardial infarction.
  • the present invention has been made to solve such conventional problems.
  • The main object of the present invention is to provide a bioacoustic extraction apparatus, a bioacoustic analysis apparatus, a bioacoustic extraction program, a computer-readable recording medium, and a recorded device that can accurately extract the necessary bioacoustic data from acoustic data that includes bioacoustics.
  • the bioacoustic extraction apparatus is a bioacoustic extraction apparatus for extracting necessary bioacoustic data from original acoustic data including bioacoustic data.
  • It includes an input unit for acquiring original acoustic data including bioacoustic data; a voiced section estimation unit for estimating a voiced section from the original acoustic data input from the input unit; an auditory image generation unit that generates an auditory image according to an auditory image model based on the voiced section estimated by the voiced section estimation unit; an acoustic feature amount extraction unit that extracts an acoustic feature amount from the auditory image generated by the auditory image generation unit; a classification unit that classifies the acoustic feature amount extracted by the acoustic feature amount extraction unit into a predetermined type; and a determination unit that determines, based on a predetermined threshold, whether the acoustic feature amount classified by the classification unit is bioacoustic data.
  • The auditory image generation unit is configured to generate a stabilized auditory image using the auditory image model, and the acoustic feature amount extraction unit can extract the acoustic feature amount based on the stabilized auditory image generated by the auditory image generation unit.
  • the auditory image generation unit is configured to further generate a generalized stabilized auditory image and an auditory spectrum from the stabilized auditory image.
  • The acoustic feature amount extraction unit can extract acoustic feature amounts based on the generalized stabilized auditory image and the auditory spectrum generated by the auditory image generation unit.
  • The acoustic feature quantity extraction unit can extract at least one of the kurtosis, skewness, spectral centroid, spectral bandwidth, spectral flatness, spectral roll-off, spectral entropy, and octave-based spectral contrast of the auditory spectrum and/or the generalized stabilized auditory image as an acoustic feature amount.
  • The auditory image generation unit is configured to generate a neural activity pattern using the auditory image model, and the acoustic feature amount extraction unit can extract the acoustic feature amount based on the neural activity pattern generated by the auditory image generation unit.
  • The acoustic feature quantity extraction unit can also extract, as acoustic feature amounts obtained from the acoustic spectrum, at least one of the total number of peaks, their positions, their amplitudes, and the center of gravity, slope, rise, and fall of the spectrum.
  • the bioacoustic extraction device is a bioacoustic extraction device for extracting necessary bioacoustic data from original acoustic data including bioacoustic data.
  • It includes an input unit for acquiring original acoustic data including bioacoustic data; a voiced section estimation unit for estimating a voiced section from the original acoustic data input from the input unit; an auditory image generation unit that generates an auditory image according to an auditory image model based on the estimated voiced section; an auditory spectrum generation unit that generates an auditory spectrum from the auditory image generated by the auditory image generation unit; a generalized stabilized auditory image generation unit that generates a generalized stabilized auditory image from the auditory image; an acoustic feature amount extraction unit that extracts acoustic feature amounts from the auditory spectrum generated by the auditory spectrum generation unit and the generalized stabilized auditory image generated by the generalized stabilized auditory image generation unit; a classification unit that classifies the extracted acoustic feature amounts into a predetermined type; and a determination unit that determines, based on a predetermined threshold, whether the classified acoustic feature amounts are bioacoustic data.
  • With this bioacoustic extraction apparatus, it is possible to extract a section having a period from the original acoustic data.
  • The voiced section estimation unit includes a preprocessor that performs preprocessing by differentiating or differencing the original acoustic data; a squarer for squaring the data preprocessed by the preprocessor; a downsampler for downsampling the squared data produced by the squarer; and a median filter for obtaining a median value from the data downsampled by the downsampler.
  • The input unit can be a non-contact microphone installed without contact with the patient to be examined.
  • the discrimination of the bioacoustic data by the discrimination unit can be performed as non-language processing.
  • Unlike language processing such as speaker identification and speech recognition, this discrimination can be applied widely, regardless of language.
  • The original acoustic data is bioacoustics acquired while the patient sleeps, and the necessary bioacoustic data can be extracted from the bioacoustic data acquired during sleep.
  • the original acoustic data is sleep-related sounds collected during sleep of the patient, and the bioacoustic data is snoring sound data.
  • the predetermined type can be classified into a snoring sound and a non-snoring sound.
  • The bioacoustic analyzer extracts and analyzes necessary bioacoustic data from original acoustic data including bioacoustic data.
  • It includes an auditory image generation unit that generates an auditory image according to an auditory image model based on the estimated voiced section; an acoustic feature amount extraction unit that extracts an acoustic feature amount from the auditory image generated by the auditory image generation unit; a classification unit that classifies the extracted acoustic feature amount into a predetermined type; a determination unit that determines, based on a predetermined threshold, whether the classified acoustic feature amount is bioacoustic data; and a screening unit that performs screening on the true-value data determined by the determination unit to be bioacoustic data.
  • the bioacoustic analyzer for extracting and analyzing necessary bioacoustic data from the original acoustic data including the bioacoustic data is provided.
  • It includes an auditory image generation unit that generates an auditory image according to an auditory image model based on the estimated voiced section; an auditory spectrum generation unit that generates an auditory spectrum from the auditory image generated by the auditory image generation unit; a generalized stabilized auditory image generation unit that generates a generalized stabilized auditory image from the auditory image; an acoustic feature amount extraction unit that extracts acoustic feature amounts from the auditory spectrum generated by the auditory spectrum generation unit and the generalized stabilized auditory image generated by the generalized stabilized auditory image generation unit; a classification unit that classifies the extracted acoustic feature amounts into a predetermined type; a determination unit that determines, based on a predetermined threshold, whether the classified acoustic feature amounts are bioacoustic data; and a screening unit that performs screening on the true-value data determined by the determination unit to be bioacoustic data.
  • the screening unit can be configured to perform disease screening on bioacoustic data extracted from the original acoustic data.
  • The screening unit can be configured to perform screening for obstructive sleep apnea syndrome on the bioacoustic data extracted from the original acoustic data.
  • a bioacoustic extraction method for extracting necessary bioacoustic data from original acoustic data including bioacoustic data.
  • It includes a step of acquiring original acoustic data including bioacoustic data; a step of estimating a voiced section from the acquired original acoustic data; a step of generating an auditory image according to an auditory image model based on the estimated voiced section; a step of extracting an acoustic feature amount from the generated auditory image; a step of classifying the extracted acoustic feature amount into a predetermined type; and a step of determining, based on a predetermined threshold, whether the classified acoustic feature amount is bioacoustic data.
  • a bioacoustic extraction method for extracting necessary bioacoustic data from original acoustic data including bioacoustic data.
  • It includes a step of generating a stabilized auditory image; a step of generating a generalized stabilized auditory image from the stabilized auditory image; a step of extracting predetermined acoustic feature amounts obtained from the generated generalized stabilized auditory image; and a step of determining, based on a predetermined threshold, whether the extracted acoustic feature amounts are bioacoustic data.
  • In the bioacoustic extraction method of the nineteenth aspect of the present invention, in the step of extracting the predetermined acoustic feature amounts, an auditory spectrum is generated from the stabilized auditory image, and predetermined acoustic feature amounts obtained from the generated auditory spectrum can be extracted in addition to those from the generalized stabilized auditory image.
  • Prior to the step of extracting the predetermined acoustic feature amounts, a step of selecting, from the extracted acoustic feature amounts, those that contribute to identification can be included.
  • The step of determining whether the data is bioacoustic data can be performed as a classification into snoring sound or non-snoring sound using multinomial logistic regression analysis.
  • the bioacoustic analysis method is a bioacoustic analysis method for extracting and analyzing necessary bioacoustic data from original acoustic data including bioacoustic data.
  • The method may further include a step of performing screening on the true-value data.
  • The screening step can screen for obstructive sleep apnea syndrome or non-obstructive sleep apnea syndrome using multinomial logistic regression analysis.
  • the bioacoustic extraction program is a bioacoustic extraction program for extracting necessary bioacoustic data from original acoustic data including bioacoustic data,
  • The program causes a computer to realize an input function for acquiring original acoustic data including bioacoustic data; a voiced section estimation function for estimating a voiced section from the original acoustic data input by the input function; an auditory image generation function for generating an auditory image according to an auditory image model based on the voiced section estimated by the voiced section estimation function; an acoustic feature amount extraction function for extracting an acoustic feature amount from the auditory image generated by the auditory image generation function; a classification function for classifying the extracted acoustic feature amount into a predetermined type; and a determination function for determining, based on a predetermined threshold, whether the classified acoustic feature amount is bioacoustic data.
  • a bioacoustic extraction program for extracting necessary bioacoustic data from original acoustic data including bioacoustic data
  • The program causes a computer to realize an input function for acquiring original acoustic data including bioacoustic data; a voiced section estimation function for estimating a voiced section from the original acoustic data input by the input function; a stabilized auditory image generation function for generating a stabilized auditory image according to an auditory image model based on the estimated voiced section; a function for generating a generalized stabilized auditory image from the stabilized auditory image; an acoustic feature amount extraction function for extracting predetermined acoustic feature amounts from the generated generalized stabilized auditory image; a classification function for classifying the extracted acoustic feature amounts into a predetermined type; and a determination function for determining, based on a predetermined threshold, whether the classified acoustic feature amounts are bioacoustic data.
  • the bioacoustic analysis program is a bioacoustic analysis program for extracting and analyzing necessary bioacoustic data from original acoustic data including bioacoustic data.
  • The program causes a computer to realize an input function for acquiring original acoustic data including bioacoustic data; a voiced section estimation function for estimating a voiced section from the original acoustic data input by the input function; a stabilized auditory image generation function for generating a stabilized auditory image according to an auditory image model based on the estimated voiced section; a function for generating a generalized stabilized auditory image from the stabilized auditory image; an acoustic feature amount extraction function for extracting predetermined acoustic feature amounts from the generated generalized stabilized auditory image; a classification function for classifying the extracted acoustic feature amounts into predetermined types; a determination function for determining, based on a predetermined threshold, whether the classified acoustic feature amounts are bioacoustic data; and a function for screening the data determined to be bioacoustic data.
  • a computer-readable recording medium or recorded device stores the above program.
  • The program includes not only a program stored and distributed on the recording medium but also a program distributed by download through a network line such as the Internet.
  • The recording medium includes a device capable of recording the program, for example, a general-purpose or dedicated device in which the program is implemented in an executable state in the form of software, firmware, or the like.
  • Each process and function included in the program may be executed by program software executable by the computer, or each part of the processing may be implemented by hardware such as a predetermined gate array (FPGA, ASIC), or in a form in which program software and partial hardware modules are mixed.
  • FPGA field-programmable gate array
  • ASIC application specific integrated circuit
  • FIG. 9A shows the original acoustic data, FIG. 9B the preprocessed data, FIG. 9C the squared data, FIG. 9D the down-sampled data, and FIG. 9E a graph showing the median-filtered waveform.
  • FIG. 10A shows the original acoustic data, FIG. 10B the ZCR processing result according to Comparative Example 1, FIG. 10C the STE processing result according to Comparative Example 2, and FIG. 10D a graph showing the waveform of the voiced section estimation result according to Example 1.
  • the embodiments described below exemplify a bioacoustic extraction device, a bioacoustic analysis device, a bioacoustic extraction program, a computer-readable recording medium, and a recorded device for embodying the technical idea of the present invention.
  • The bioacoustic extraction device, bioacoustic analysis device, bioacoustic extraction program, computer-readable recording medium, and recorded device of the present invention are not limited to those described below. Furthermore, this specification by no means limits the members shown in the claims to the members of the embodiments.
  • Each element constituting the present invention may be configured such that a plurality of elements are realized by the same member, with one member serving as the plurality of elements; conversely, the function of one member can be shared among a plurality of members.
  • In the following, a bioacoustic extraction apparatus that automatically extracts snoring sounds, as the biological acoustic data to be extracted, from sleep-related sounds, as the original acoustic data, will be described.
  • a bioacoustic extraction apparatus according to an embodiment of the present invention is shown in the block diagram of FIG.
  • the bioacoustic extraction apparatus 100 shown in this figure includes an input unit 10, a sound section estimation unit 20, an auditory image generation unit 30, an acoustic feature amount extraction unit 40, a classification unit 50, and a determination unit 60.
  • the input unit 10 is a member for acquiring original acoustic data including bioacoustic data.
  • the input unit 10 includes a microphone unit and a preamplifier unit, and inputs the collected original sound data to a computer constituting the bioacoustic extraction device 100.
  • A non-contact microphone, preferably installed without contact with the patient to be examined, can be used for the microphone section.
  • The voiced section estimation unit 20 is a member for estimating a voiced section from the original acoustic data input from the input unit 10. As shown in the block diagram of FIG. 2, the voiced section estimation unit 20 includes a preprocessor 21 that performs preprocessing by differentiating or differencing the original acoustic data; a squarer 22 for squaring the data preprocessed by the preprocessor 21; a downsampler 23 for downsampling the squared data produced by the squarer 22; and a median filter 24 for obtaining a median value from the data downsampled by the downsampler 23.
  • The auditory image generation unit 30 is a member for generating an auditory image according to an auditory image model (AIM) based on the voiced section estimated by the voiced section estimation unit 20.
  • AIM auditory image model
  • the acoustic feature amount extraction unit 40 is a member for extracting feature amounts from the auditory image generated by the auditory image generation unit 30.
  • The acoustic feature amount extraction unit 40 can extract feature amounts based on the auditory spectrum (AS), generated by summing the stabilized auditory image (SAI) along the horizontal axis, and the generalized stabilized auditory image (SSAI), generated by summing the SAI along the vertical axis.
  • SSAI generalized stabilized auditory image
  • The acoustic feature quantity extraction unit 40 extracts at least one of the kurtosis, skewness, spectral centroid, spectral bandwidth, spectral flatness, spectral roll-off, spectral entropy, and OSC of the auditory spectrum as a feature quantity.
  • The acoustic feature quantity extraction unit 40 can also extract, as feature quantities obtained from the acoustic spectrum, at least one of the total number of peaks, their positions, their amplitudes, and the center of gravity, slope, rise, and fall of the spectrum.
  • the classification unit 50 is a member for classifying the feature amount extracted by the acoustic feature amount extraction unit 40 into a predetermined type.
  • the discriminating unit 60 is a member for discriminating whether or not the feature quantity classified by the classifying unit 50 is bioacoustic data based on a predetermined threshold value.
  • By constructing the bioacoustic extraction device 100 to simulate the human auditory pathway through to the learning mechanism, it is possible to automatically extract snoring sounds with high accuracy.
  • a bioacoustic analysis device for analyzing bioacoustic data extracted by the bioacoustic extraction device can also be configured.
  • the bioacoustic analysis apparatus 110 further includes a screening unit 70 that performs screening on true value data determined by the determination unit 60 as bioacoustic data.
  • The bioacoustic extraction device and the bioacoustic analysis device described above can be implemented not only as dedicated hardware but also as software by a program. For example, a virtual bioacoustic extraction device or bioacoustic analysis device can be realized by installing a bioacoustic extraction program or a bioacoustic analysis program on a general-purpose or dedicated computer, loading or downloading it, and executing it.
  • (Acoustic analysis of conventional snoring sounds)
  • As conventional acoustic analysis of snoring sounds, the following have been proposed: (i) a method using a network in which Mel-frequency cepstral coefficients (MFCC) and a hidden Markov model (HMM) are interconnected; (ii) a method applying robust linear regression (RLR) or principal component analysis (PCA) to subband spectral energy; (iii) a method using unsupervised Fuzzy C-Means (FCM) clustering; and (iv) a method using 34 feature amounts combining a plurality of acoustic analysis methods with AdaBoost.
  • RLR robust linear regression
  • PCA principal component analysis
  • FCM unsupervised Fuzzy C-Means
  • the snoring sound and the non-snoring sound can be automatically classified with high accuracy.
  • The performance evaluation of sound classification methods is based on manual classification, which is regarded as the gold standard, that is, classification performed manually by human ears. The present inventors therefore considered that a high-performance sound classifier could be constructed by imitating human hearing ability, and arrived at the present invention.
  • a bioacoustic extraction device that can automatically classify snoring sounds / non-snoring sounds using an auditory image model (AIM) has been achieved.
  • AIM auditory image model
  • AIM is an auditory image model that imitates the "auditory image", considered to be the representation in the brain that humans use to perceive sound.
  • AIM models the auditory image by simulating the function of the human auditory system from the peripheral system, including the cochlear basilar membrane, to the central system.
  • Such AIMs have been established mainly in research on hearing and spoken-language perception and are used in the fields of speaker recognition and speech recognition, but, as far as the present inventors know, there are no reported examples of their use for discriminating bioacoustics such as snoring sounds and intestinal sounds.
  • (Example)
  • FIG. 3 shows a flowchart of the bioacoustic extraction method using the AIM according to the present embodiment.
  • (Sound section estimation)
  • The input unit 10 includes a non-contact microphone, which forms the microphone unit, and a preamplifier unit; the collected audio data is acquired by a computer.
  • the non-contact type microphone was installed at a position about 50 cm away from the patient's mouth.
  • The microphone used for recording was a Model NT3 manufactured by RODE (Australia).
  • The preamplifier was a Mobile-Pre USB manufactured by M-AUDIO (USA).
  • the sampling frequency during recording was 44.1 kHz
  • the digital resolution was 16 bits / sample.
  • the voiced section estimation unit 20 uses a short-term energy method (STE) and a median filter 24.
  • the STE method is a method of detecting signal energy equal to or higher than a certain threshold value as a sound section.
  • the k-th short-term energy Ek of the sleep-related sound s (n) can be expressed by the following equation.
  • n is the sample number and N is the segment length.
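The referenced equation, in its standard form consistent with these symbols (a reconstruction assuming non-overlapping segments of length N, not necessarily the original's exact notation), is:

$$E_k=\sum_{n=(k-1)N+1}^{kN}s(n)^2$$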
  • a 10th-order median filter was used to smooth Ek.
  • An AE (audio event) is extracted by detecting a sound having an SNR of 5 dB or more in the segment.
  • The background noise level is taken as the average over all frames of the short-term energy obtained by applying the STE method to a one-second signal containing only background noise.
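As an illustration of this stage only, the following Python sketch computes the short-term energy, smooths it with a median filter, and keeps segments at least 5 dB above the background-noise reference, together with the 200 ms minimum-duration criterion described next. The function names, default segment length, and library choices are assumptions for illustration, not the implementation used in this example.

```python
import numpy as np
from scipy.signal import medfilt

def short_term_energy(s, seg_len):
    """E_k: sum of squared samples over consecutive non-overlapping segments."""
    n_seg = len(s) // seg_len
    segs = s[:n_seg * seg_len].reshape(n_seg, seg_len)
    return (segs ** 2).sum(axis=1)

def extract_audio_events(s, fs, noise, seg_len=2048, snr_db=5.0, min_dur=0.2):
    """Return (start, end) sample indices of audio events (AEs).

    noise: a 1 s recording of background noise only; its mean short-term
    energy serves as the noise reference, as described in the text."""
    e = medfilt(short_term_energy(s, seg_len), kernel_size=11)  # ~10th-order median smoothing
    noise_ref = short_term_energy(noise, seg_len).mean()
    active = 10.0 * np.log10(e / noise_ref + 1e-12) >= snr_db   # SNR >= 5 dB criterion

    events, start = [], None
    for k, a in enumerate(np.append(active, False)):            # sentinel closes a final run
        if a and start is None:
            start = k
        elif not a and start is not None:
            if (k - start) * seg_len / fs >= min_dur:           # keep runs >= 200 ms
                events.append((start * seg_len, k * seg_len))
            start = None
    return events
```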
  • Non-Patent Document 2 reports a listening experiment investigating the relationship between singing voice and voice identification and sound duration, in which the identification rate exceeds 70% for signal lengths of 200 ms or more. Accordingly, in this embodiment, a detected sound having a signal length of 200 ms or more is defined as an AE.
  • (Generation of auditory image model)
  • an auditory image is generated using an auditory image model (Auditory Image Model: AIM).
  • the auditory image generation unit 30 analyzes a voiced section (AE) using AIM.
  • AE voiced section
  • An AIM simulator is provided by the Patterson group. Although a simulator that runs in a C language environment is available, this embodiment used AIM2006 <http://www.pdn.cam.ac.uk/groups/cnbh/aim2006/> (modules: gm2002, dcgc, hl, sf2003, ti2003), which can be used with MATLAB.
  • PCP Pre-cochlea processing
  • BMM Basilar membrane motion
  • NAP Neural activity pattern
  • STROBES Strobe identification
  • SAI Stabilized auditory image
  • AIM processing is shown in the block diagram of FIG.
  • PCP pre-cochlea processing
  • filter processing using a band pass filter is performed.
  • The filters are arranged at regular intervals on an equivalent rectangular bandwidth (ERB) scale to represent the spectral analysis performed on the basilar membrane of the cochlea.
  • Auditory filter banks: gammachirp filter bank, gammatone filter bank
  • the output from each filter in the filter bank can be obtained.
  • In this embodiment, a gammachirp filter bank is used in which 50 filters having different center frequencies and bandwidths are arranged between 100 Hz and 6000 Hz. The number of filters may be adjusted as appropriate.
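SciPy does not provide a gammachirp filter bank; purely as a rough stand-in for this BMM stage and the NAP stage described next, the sketch below builds a 50-channel gammatone filter bank (scipy.signal.gammatone, SciPy 1.6 or later) with center frequencies spaced on the ERB-rate scale between 100 Hz and 6000 Hz, then half-wave rectifies and low-pass filters each channel. The 1 kHz low-pass cutoff and the filter orders are illustrative assumptions.

```python
import numpy as np
from scipy.signal import gammatone, lfilter, butter

def erb_space(f_lo, f_hi, n):
    """n center frequencies equally spaced on the ERB-rate scale (Glasberg & Moore)."""
    erb_rate = lambda f: 21.4 * np.log10(4.37 * f / 1000.0 + 1.0)
    inv = lambda e: (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37
    return inv(np.linspace(erb_rate(f_lo), erb_rate(f_hi), n))

def bmm_nap(s, fs, n_ch=50, f_lo=100.0, f_hi=6000.0):
    """BMM via a gammatone bank (stand-in for the gammachirp bank),
    followed by the NAP stage: half-wave rectification + low-pass filtering."""
    b_lp, a_lp = butter(2, 1000.0 / (fs / 2.0))  # illustrative inner-hair-cell low-pass
    nap = []
    for fc in erb_space(f_lo, f_hi, n_ch):
        b, a = gammatone(fc, 'iir', fs=fs)       # SciPy >= 1.6
        y = lfilter(b, a, s)                     # basilar membrane motion, one channel
        y = np.maximum(y, 0.0)                   # half-wave rectification
        nap.append(lfilter(b_lp, a_lp, y))       # NAP channel
    return np.asarray(nap)                       # shape (n_ch, len(s))
```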
  • The output of each filter of the BMM is low-pass filtered and half-wave rectified to represent the neural signal transduction performed by the inner hair cells.
  • To generate the auditory image, when a local maximum is detected in each frequency channel, a 35 ms frame is created with the local maximum as its origin, and this frame is integrated with information from a buffer storing past NAP representations, converting the time axis into a time-interval axis.
  • This series of processing is called strobed temporal integration (STI), and the auditory image can be output as an SAI for each frame.
  • STI can generate a stable auditory image by temporally integrating the NAP representation. Therefore, in the present embodiment, the auditory spectrum (AS) and SSAI of the 10th and subsequent frames of the auditory image obtained from one episode of AE are analyzed.
  • (Auditory spectrum: AS)
  • an auditory image is shown in FIG.
  • the vertical axis represents the center frequency axis of the auditory filter
  • the horizontal axis represents the time interval axis.
  • AS is an expression corresponding to an excitation pattern of the auditory nerve, and is a spectrum in the frequency domain where the maximum point of the formant can be confirmed.
  • the number of AS dimensions corresponds to the number of auditory filters.
  • SSAI is a time-domain spectrum that has vertices only at specific intervals because the output of each channel includes only a limited time interval when the signal is stationary and periodic.
  • the number of dimensions of SSAI is determined by the frame size and the sampling rate of the input signal.
  • AS and SSAI are normalized with the maximum amplitude 1 in order to minimize the influence of the signal amplitude envelope between frames.
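Reducing an SAI frame to AS and SSAI amounts to summing the two-dimensional image along each axis and normalizing the maximum amplitude to 1. A minimal sketch, assuming the SAI frame is given as a (channels x time-intervals) array:

```python
import numpy as np

def as_and_ssai(sai):
    """sai: 2-D array, rows = auditory-filter channels (frequency axis),
    columns = time-interval axis."""
    aspec = sai.sum(axis=1)                 # sum over time intervals -> auditory spectrum
    ssai = sai.sum(axis=0)                  # sum over channels -> summary SAI
    aspec = aspec / np.abs(aspec).max()     # normalize maximum amplitude to 1
    ssai = ssai / np.abs(ssai).max()
    return aspec, ssai
```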
  • step S304 the acoustic feature amount obtained from the AIM is extracted.
  • AS and SSAI in each SAI frame of AE can be calculated.
  • A method for extracting feature amounts from AS and SSAI will now be described. Since AS and SSAI have spectrum-like shapes, features are extracted using the following eight types of feature amounts.
  • Kurtosis is a feature value that measures the tendency of protrusion of the spectrum per average value.
  • the formula for kurtosis is shown below.
  • skewness is a characteristic amount for measuring the asymmetry of the spectrum per average value.
  • the equation for skewness is shown below.
  • the spectral centroid is a feature quantity for calculating the spectral centroid.
  • the equation of the spectrum centroid is shown below.
  • Spectral bandwidth is a feature quantity that quantifies the frequency bandwidth of a signal.
  • the equation for the spectral bandwidth is shown below.
  • Spectral flatness is a feature value that quantifies sound quality.
  • the equation for spectral flatness is shown below.
  • Spectral roll-off is a feature value for evaluating the frequency below which c × 100% of the entire spectral distribution is contained.
  • the equation for the spectrum roll-off is shown below.
  • Spectral entropy is a feature that indicates the whiteness of the signal.
  • the equation for spectral entropy is shown below.
  • i is a spectrum sample point
  • N is a total number of spectrum sample points
  • k is a frame number
  • X is a spectrum amplitude.
  • Here, X > 0 and c = 0.95.
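The first seven of the eight feature formulas referenced above are standard; stated in terms of these symbols, with $\mu_k$ and $\sigma_k$ the mean and standard deviation of $X_k$, they take the following conventional forms (a reconstruction under the usual definitions, not necessarily the exact notation of the original):

$$K_k=\frac{\frac{1}{N}\sum_{i=1}^{N}\bigl(X_k(i)-\mu_k\bigr)^4}{\sigma_k^4},\qquad S_k=\frac{\frac{1}{N}\sum_{i=1}^{N}\bigl(X_k(i)-\mu_k\bigr)^3}{\sigma_k^3}$$

$$SC_k=\frac{\sum_{i=1}^{N} i\,X_k(i)}{\sum_{i=1}^{N} X_k(i)},\qquad SB_k=\sqrt{\frac{\sum_{i=1}^{N}\bigl(i-SC_k\bigr)^2 X_k(i)}{\sum_{i=1}^{N} X_k(i)}}$$

$$SF_k=\frac{\left(\prod_{i=1}^{N}X_k(i)\right)^{1/N}}{\frac{1}{N}\sum_{i=1}^{N}X_k(i)},\qquad SR_k=\min\left\{R:\ \sum_{i=1}^{R}X_k(i)\ \ge\ c\sum_{i=1}^{N}X_k(i)\right\}$$

$$H_k=-\sum_{i=1}^{N}p_k(i)\log p_k(i),\qquad p_k(i)=\frac{X_k(i)}{\sum_{j=1}^{N}X_k(j)}$$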
  • Octave-based spectral contrast (OSC) is a feature quantity that represents spectral contrast.
  • the spectrum is divided into subbands by an octave filter bank.
  • the number of subbands is set to 3 for AS and 5 for SSAI in consideration of the number of dimensions of the spectrum.
  • the spectral peak Peak k (b), spectral valley Valley k (b), and spectral contrast OSC k (b) of the b-th subband are respectively expressed by the following equations.
  • X ′ is a feature vector rearranged in descending order within the subband
  • j is a sample point of the spectrum within the subband
  • N b is the total number of sample points within the subband
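In the usual octave-based spectral contrast formulation, with a neighborhood factor $\alpha$ (commonly around 0.2; an assumption here, as the original value is not stated), the referenced equations take the form:

$$\mathrm{Peak}_k(b)=\log\!\left(\frac{1}{\alpha N_b}\sum_{j=1}^{\alpha N_b}X'_{k,b}(j)\right),\qquad \mathrm{Valley}_k(b)=\log\!\left(\frac{1}{\alpha N_b}\sum_{j=1}^{\alpha N_b}X'_{k,b}(N_b-j+1)\right)$$

$$\mathrm{OSC}_k(b)=\mathrm{Peak}_k(b)-\mathrm{Valley}_k(b)$$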
  • Spectral flatness was applied only to AS in this example because, when SSAI is integrated, the value of SF_k approaches zero and cannot be quantified.
  • The average value and standard deviation of each feature value over frames are defined as the feature values obtained from an AE. That is, (i) a 20-dimensional AS feature vector, (ii) a 22-dimensional SSAI feature vector, and (iii) a 42-dimensional combined feature vector can be extracted from each AE. In addition to these, feature quantities obtained from the spectrum, such as spectral asymmetry and band energy ratio, can also be used. A minimal sketch of this construction follows.
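A compact sketch of the feature-vector construction (per-frame features, then mean and standard deviation over the frames of one AE), using scipy.stats for kurtosis and skewness; the other features follow the equations above, and OSC and spectral flatness are omitted here for brevity. The 42-dimensional AIMF vector would be the concatenation of the AS and SSAI vectors.

```python
import numpy as np
from scipy.stats import kurtosis, skew

def frame_features(x):
    """Per-frame features of one normalized AS or SSAI vector x (x > 0 assumed)."""
    i = np.arange(1, len(x) + 1)
    w = x / x.sum()
    sc = np.sum(i * w)                                       # spectral centroid
    sb = np.sqrt(np.sum((i - sc) ** 2 * w))                  # spectral bandwidth
    sr = np.searchsorted(np.cumsum(x), 0.95 * x.sum()) + 1   # spectral roll-off, c = 0.95
    p = w + 1e-12
    h = -np.sum(p * np.log(p))                               # spectral entropy
    return [kurtosis(x), skew(x), sc, sb, sr, h]             # kurtosis/skewness via scipy.stats

def ae_feature_vector(frames):
    """frames: per-frame AS (or SSAI) vectors from one AE (10th frame onward).
    Returns the mean and standard deviation of each feature over the frames."""
    f = np.array([frame_features(np.asarray(x, dtype=float)) for x in frames])
    return np.concatenate([f.mean(axis=0), f.std(axis=0)])
```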
  • Each feature vector is referred to as (i) ASF: Auditory spectrum features, (ii) SSAIF: Summary SAI features, and (iii) AIMF: AIM features.
  • (Classification of snoring/non-snoring using MLR)
  • Learning based on the MLR model is performed using the feature vectors in step S306, snoring/non-snoring classification using the MLR is performed in step S305, and discrimination based on the threshold is performed in step S307.
  • For classification, multinomial logistic regression (MLR) analysis using the feature vectors extracted from the AEs was used.
  • MLR analysis is an excellent statistical analysis technique as a discriminator for binary identification that classifies a plurality of measurement values into one of two categories using a logistic curve.
  • the equation of MLR is shown in the following equation.
  • p indicates the probability that the sound to be classified is classified into the snoring sound category.
  • A model with coefficients β_d estimated by learning based on the maximum likelihood method and the dependent variable Y is constructed.
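The MLR equation itself, in the standard binary logistic form consistent with this description (a reconstruction, with $x_d$ the elements of the feature vector and $D$ its dimension), is:

$$p=\frac{1}{1+\exp\left(-\beta_0-\sum_{d=1}^{D}\beta_d x_d\right)}$$

An AE is then assigned to the snoring-sound category when $p \ge P_{thre}$.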
  • each AE can be classified into one of two categories (snoring sound or non-snoring sound), and the classification of the snoring sound and the non-snoring sound can be performed by the classifier.
  • This simulation was performed using Statistics Toolbox Version 9.0 of MATLAB (R2014a, The MathWorks, Inc., Natick, MA, USA).
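The simulation itself used MATLAB; purely as an illustrative stand-in, the same two-category logistic classification can be sketched in Python with scikit-learn (the data here are synthetic placeholders, not the patent's data set):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 42))      # placeholder 42-dimensional AIMF vectors
y_train = rng.integers(0, 2, size=200)    # placeholder labels: 1 = snoring, 0 = non-snoring
X_test = rng.normal(size=(50, 42))

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
p = clf.predict_proba(X_test)[:, 1]       # probability of the snoring-sound category
p_thre = 0.5                              # placeholder; the optimal P_thre comes from the ROC curve
labels = (p >= p_thre).astype(int)
```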
  • Sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV) were used as indicators of classification performance.
  • "Sensitivity" here is the ability of the discrimination result to detect snoring; "specificity" is the proportion of non-snoring sounds whose determination result falls at or below the threshold.
  • Positive predictive value (PPV) represents the probability of actually snoring when the determination result is equal to or greater than a threshold value.
  • negative predictive value (NPV) represents the probability of non-snoring when the determination result is equal to or less than a threshold value.
  • the ROC (Receiver Operating Characteristic) curve is plotted with the false positive rate (1-specificity) on the horizontal axis and the true positive rate (sensitivity) on the vertical axis.
  • The ROC curve can be constructed by varying P_thre.
  • The optimal threshold on the ROC curve, that is, the optimal P_thre, can be obtained using Youden's index.
  • the ROC curve draws a large arc at the upper left in the case of an ideal classification unit. Due to this property, an area under the curve (AUC), which is the area of the lower region of the ROC curve, can be used as an index representing the performance of the classifier or the classification algorithm.
  • The AUC takes a value in the range 0.5 to 1 and is a classification accuracy evaluation index that approaches 1 when the classification accuracy is good.
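A sketch of this threshold selection by Youden's index (the point maximizing sensitivity + specificity - 1, i.e. TPR - FPR) with scikit-learn; the labels and probabilities below are synthetic placeholders:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, size=500)                    # placeholder snore/non-snore labels
p = np.clip(0.3 * y_true + 0.7 * rng.random(500), 0, 1)  # placeholder classifier probabilities

fpr, tpr, thresholds = roc_curve(y_true, p)   # false positive rate, true positive rate
auc_value = auc(fpr, tpr)                     # area under the ROC curve
j = tpr - fpr                                 # Youden's index at each candidate threshold
p_thre_opt = thresholds[np.argmax(j)]         # optimal P_thre
```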
  • (Learning dataset and test dataset)
  • The AEs extracted from 40 subjects are divided into a learning data set of 16,141 AEs (13,406 snoring, 2,735 non-snoring) and a test data set of 10,651 AEs (7,346 snoring, 3,305 non-snoring).
  • (Labeling)
  • the AE labeling work is performed based on the listening results.
  • Three evaluators listened to the AEs through headphones (SHURE SRH840) and labeled each AE by consensus; in this way, no sound was labeled as snoring without unanimous agreement.
  • Table 2 shows the AEs that were determined to be non-snoring during this labeling work.
  • FIG. 6 shows the relationship between the feature vector index and Accuracy. From this figure, the feature quantity effective for the classification of the snoring sound / non-snoring sound can be understood.
  • Table 4 shows the feature vectors (ASF_opt, SSAIF_opt) extracted from AS and SSAI that contribute an accuracy improvement of 1% or more. From this result, it was confirmed that the number of AS dimensions can be reduced to 4 and the number of SSAI dimensions can be significantly reduced to 5.
  • AS auditory spectrum
  • SSAI summary SAI
  • OT optimum threshold
  • TP true positive
  • FP false positive
  • TN true negative
  • FN false negative
  • S Sensitivity
  • Spe. Specificity
  • AUC area under the curve
  • Acc. Accuracy
  • PPV positive predictive value
  • NPV negative predictive value
  • Table 7 summarizes the classification performance and accuracy (Acc.) obtained by classifying the data of 40 subjects by the method of this example described above. As is clear from this table, although the subject data and classification conditions differ, the method of this example achieves classification accuracy superior to all reported examples.
  • Acc. Accuracy
  • PPV positive predictive value
  • MFCCs mel-frequency cepstrum coefficients
  • OSAS obstructive sleep apnea syndrome
  • SED subband energy distribution
  • ACs autocorrelation coefficients
  • LPCs linear predictive coding coefficients
  • AIMF AIM feature (Duckitt et al.)
  • Non-Patent Document 6 classified non-snoring sounds using normalized ACs, LPCs, and the like with only breathing sounds and silent periods, and reports that speech sounds and noise can be avoided by excluding the 10 or 20 minutes before the patient falls asleep (Non-Patent Document 6). Therefore, in order to investigate the proportion of breathing sounds among the other non-snoring sounds, the non-snoring sounds in the database used in this example were subjected to a listening evaluation by three evaluators, classified into four classes: breathing sounds, cough, voice (sleep talking, moaning, speaking), and noise (bed squeaks, metallic sounds, sirens, etc.), and the number of episodes in each class was counted.
  • Azarbarzin et al. report classifying a data set of simple snoring sounds and OSAS snoring sounds using the SED at 500 Hz, obtaining an accuracy of 93.1% (Non-Patent Document 9). However, their classification target data is extracted from only 15 minutes of recording; in contrast, the present embodiment achieves 97.3% on data as long as 2 hours.
  • (Dafna et al.)
  • The result when both AS and SSAI are used shows the highest accuracy. Even when only SSAI is used, an accuracy of 94% is obtained, so SSAI information is also used effectively.
  • Feature amounts obtained from the acoustic spectrum (for example, those corresponding to the total number, positions, and amplitudes of the peaks, and to the center of gravity, slope, rise, and fall of the spectrum) can be extracted from AS or SSAI.
  • screening can be performed only for a section having a pitch (period) in the extracted snoring sounds.
  • a section having a pitch may be extracted when the snoring sound is extracted in advance.
  • In this embodiment, a sound having an SNR of 5 dB or more in a segment is used as an AE, but sounds with SNR < 5 dB can also be used by employing the voiced section detection method described below.
  • In this example, a non-contact microphone was used to record sleep-related sounds; this approach is often discussed in sleep-related sound classification studies in comparison with contact microphones.
  • the non-contact microphone has an advantage that recording can be performed without imposing a load on the subject, while the magnitude of the SNR during recording is a problem.
  • noise reduction processing such as a spectral subtraction method is used for preprocessing as an approach to improve the SNR of a signal.
  • However, spectral subtraction generates an artifact known as musical noise, and estimating the fundamental frequency becomes difficult at a low SNR.
  • In contrast, the gammachirp filter bank used in the BMM of this embodiment can effectively extract voice from a noisy environment, even at a low SNR such as -2 dB, without causing musical noise. This is presumably because AIM has the property of preserving the fine structure of periodic sounds rather than noise. AIM-based feature vectors are also reported to offer higher noise suppression than MFCC. For these reasons, AIM can be said to have excellent noise robustness for recording in real environments.
  • (Sound section estimation)
  • step S801 sleep related sounds are collected.
  • the sleep related sound used as original sound data (FIG. 9A) is recorded from the patient during sleep using a non-contact type microphone.
  • In step S802, the original acoustic data is differentiated or differenced.
  • This process is performed by the pre-processor 21 shown in FIG.
  • the original acoustic data of FIG. 9A is differentiated by a differentiator which is the pre-processor 21, and the signal waveform of the pre-processed data obtained as a result is shown in FIG. 9B.
  • The differencing can be performed by a first-order FIR (Finite Impulse Response) filter, which is one type of digital filter.
  • FIR Finite Impulse Response
  • step S803 the preprocess data is squared. This process is performed by the squarer 22 shown in FIG. FIG. 9C shows a signal waveform of the square data obtained as a result of squaring the preprocessed data in FIG. 9B by the squarer 22.
  • step S804 the square data is down-sampled.
  • This processing is performed by the downsampling device 23 shown in FIG. FIG. 9D shows a signal waveform of the down-sampling data obtained as a result of down-sampling the square data of FIG. 9C by the down-sampler 23.
  • step S805 the median value is acquired from the downsampling data.
  • This processing is performed by the median filter 24 shown in FIG. FIG. 9E shows a signal waveform obtained as a result of obtaining the median value by the median filter 24 from the down-sampling data of FIG. 9D.
  • With the above processing, it is possible to accurately extract a voiced section (breathing sound, snoring sound, etc.), as the sketch below illustrates.
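Steps S802 to S805 reduce to a difference filter, squaring, downsampling, and median filtering. A minimal Python sketch, in which the downsampling factor and the median window are illustrative assumptions:

```python
import numpy as np
from scipy.signal import lfilter, medfilt

def voiced_section_envelope(s, down=100, med_order=11):
    """S802-S805: difference, squaring, downsampling, median filtering."""
    d = lfilter([1.0, -1.0], [1.0], np.asarray(s, dtype=float))  # S802: first-order FIR difference
    sq = d ** 2                                                  # S803: squaring
    ds = sq[::down]                                              # S804: downsampling
    return medfilt(ds, med_order)                                # S805: median filter
```

A voiced section is then obtained by thresholding this smoothed envelope, in the spirit of the waveform of FIG. 9E.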
  • The detection of the voiced section may also be realized using a learning machine such as a neural network, other time-series analysis techniques, or signal analysis and modeling techniques, in addition to the differencing and squaring described above.
  • a conventionally known method as a method for detecting a speech section is applied as a comparative example.
  • a method using a zero-crossing rate (ZCR) is referred to as Comparative Example 1
  • an STE method based on the energy of an audio signal is referred to as Comparative Example 2
  • The results of automatically extracting the voiced sections of the original acoustic data in FIG. 10A by these methods are shown in FIGS. 10B and 10C, respectively.
  • FIG. 10D shows an automatic extraction result obtained by the method according to Example 1 described above.
  • These ZCR and STE methods are typical voice activity detection techniques for extracting only the speech section from an externally input audio signal.
  • (ZCR)
  • ZCR is defined by the following equation.
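A standard zero-crossing-rate definition consistent with the surrounding notation (a reconstruction, with sgn the sign function and N the segment length) is:

$$\mathrm{ZCR}_k=\frac{1}{2N}\sum_{n=(k-1)N+2}^{kN}\bigl|\operatorname{sgn}(s(n))-\operatorname{sgn}(s(n-1))\bigr|$$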
  • The ZCR method is mainly used in situations where the ZCR of the voiced section to be detected is considerably smaller than the ZCR of the silent section; the method depends on the type of voice (a strong harmonic structure).
  • (STE method)
  • The STE function of the sound is defined as shown in Equation 1 above.
  • The STE method is mainly used in situations where the value of E_k in a voiced section is considerably larger than E_k in a silent section, the SNR is high, and the E_k of the voiced section can be clearly distinguished from the background noise.
  • Sleep-related sound data was collected from 10 subjects as original acoustic data; each sleep-related sound is 120 s long and was classified in advance into snoring sounds (Snore) and breathing sounds (Breath).
  • Screening, that is, sieving, based on the AIM is performed by the screening unit 70 of FIG. 1 on the data classified into snoring sounds and non-snoring sounds.
  • The screening unit 70 determines from the snoring sounds whether the case is OSAS (obstructive sleep apnea syndrome) or non-OSAS.
  • Example 3 was performed in order to confirm whether the screening unit 70 can appropriately classify OSAS and non-OSAS.
  • an AE data set extracted from 31 subjects was prepared as a data set for classifying snoring sounds and non-snoring sounds. Of these, 20 were used as learning data sets and 11 were used as test data sets.
  • For each subject, two hours of sleep data was extracted.
  • Each AE is labeled in advance by hand as a snoring sound or a non-snoring sound. Details of the data set used to classify snoring and non-snoring sounds are shown in the table below.
  • the data set used for OSAS and non-OSAS classification is shown in the following table.
  • 50 subjects were used, 35 of which were used as the learning data set and 15 were used as the test data set.
  • The snoring/non-snoring classification performance of the classifying unit 50 and the discriminating unit 60 of FIG. 1 was evaluated using a 6-dimensional feature vector (kurtosis, skewness, spectral centroid, and spectral bandwidth extracted from AS; kurtosis and skewness extracted from SSAI).
  • Based on an 8-dimensional feature vector of the snoring sounds discriminated by the classifying unit 50 and the discriminating unit 60 of FIG. 1 (skewness, spectral centroid, and spectral roll-off extracted from AS; kurtosis, skewness, spectral bandwidth, spectral roll-off, and spectral entropy extracted from SSAI), the screening unit 70 classified OSAS and non-OSAS.
  • The combinations of feature amounts that can be used for the snoring/non-snoring classification and the OSAS/non-OSAS classification are not limited to those described above; spectral asymmetry, band energy ratio, and the like can also be used.
  • The OSAS/non-OSAS classification was evaluated by a 10-fold cross-validation test, in which 9 folds randomly selected from the data set were used for learning and the remaining 1 fold was used for testing.
  • When the threshold used as the judgment standard for the apnea-hypopnea index (AHI) was set to 15 events/h and OSAS patients were screened by the screening unit 70, excellent results were obtained, as shown in Table 14: a sensitivity of 85.00% ± 26.87 and a specificity of 95.00% ± 15.81, confirming the usefulness of this example.
  • The AHI threshold is not limited to this value and may be 5 events/h, 10 events/h, or the like. Furthermore, the classification unit 50, the determination unit 60, and the screening unit 70 can perform classification, determination, and sieving that take the characteristics of each gender into account.
  • A multi-class classification problem can also be handled with the learning machine; for example, data can be classified directly into OSAS snoring (1), non-OSAS snoring (2), and non-snoring (3) based on the feature amounts.
  • Similarly, in automatic extraction, sounds can be classified into multiple classes such as snoring (1), breathing sound (2), and cough (3).
  • In Example 4, it was verified whether snoring/non-snoring discrimination and OSAS/non-OSAS discrimination were possible with an increased number of subjects, that is, with an expanded subject database.
  • AE Audio event
  • PSG polysomnography
  • SAI Stabilized auditory image
  • AS Auditory spectrum
  • SSAI Summary stabilized auditory image
  • Each frame was normalized so that the maximum amplitude of AS and SSAI was 1.
  • AS Auditory spectrum
  • SSAI Summary stabilized auditory image
  • From AS, eight feature values were used: kurtosis, skewness, spectral centroid, spectral bandwidth, spectral roll-off, spectral entropy, spectral contrast, and spectral flatness.
  • From SSAI, the seven feature values other than spectral flatness were used.
  • the average value of each feature amount is used as the feature amount obtained from the AE.
  • A stepwise method, a feature selection algorithm, was applied to each of the male, female, and combined male/female data sets.
  • step S307 discrimination based on the threshold value was performed.
  • Table 16 shows the performance evaluation results (Leave-one-out cross-validation) of automatic snoring sound classification using AIM.
  • Table 17 shows the performance evaluation results of the OSAS screening based on the snoring sound, which was automatically extracted using the AIM by the method described above.
  • Table 18 shows the performance evaluation results of OSAS screening using only the snoring sound, which is extracted manually without labeling and automatically extracted.
  • Table 17 suggests that OSAS screening can be performed with high accuracy on any data set based on snoring sounds automatically extracted using only AIM, which imitates human auditory ability. From Tables 17 and 18, the performance of OSAS screening based on snoring sounds manually extracted by labeling was higher for all subject sets than that based on automatic extraction. This result suggests that OSAS screening performance will improve as the performance of automatic snoring extraction using AIM is improved further. To improve the automatic extraction performance, the normalization method for the AS and SSAI frames can be changed, for example normalizing over one episode instead of frame by frame. As for adding feature amounts, signal-processing features such as pitch information and formant frequency information, and feature vectors used for speech recognition and speech signal processing, as described in Non-Patent Document 1, are possible.
  • AIM processing was performed on the extracted voiced sound sections.
  • the SAI is obtained for every 35 ms frame by the AIM processing.
  • AS and SSAI are obtained from each SAI.
  • Feature values (kurtosis, skewness, spectral bandwidth, spectral centroid, spectral entropy, spectral roll-off, spectral flatness, etc.) are extracted from AS and SSAI, respectively. Since a feature amount is obtained for each frame, each feature amount obtained from one voiced-section sound can be summarized by its average value and standard deviation over frames.
  • the average value and standard deviation of feature values obtained from AS and SSAI were used.
  • The snoring sound has been described as an example, but the subject of the present invention is not limited to snoring; it can be used for other sounds generated by living bodies (biological sounds), and the detected bioacoustics can be applied to the discovery and diagnosis of various cases. For example, by detecting sleep sounds, OSAS screening as described above and discrimination of sleep disorders can be performed. Asthma, pneumonia, and the like can be diagnosed from lung sounds, respiratory sounds, coughs, and so on. Various heart diseases can be diagnosed from heart sounds, and various intestinal diseases such as functional gastrointestinal disorders can be screened by analyzing intestinal sounds. The invention can also be applied to the detection of fetal movement sounds, muscle sounds, and the like.
  • The present invention can be used not only for humans but also for other living organisms; for example, it can be suitably used for health examinations of pets and animals kept at zoos.
  • The bioacoustic extraction device, bioacoustic analysis device, bioacoustic extraction program, computer-readable recording medium, and recorded device according to the present invention measure snoring sounds together with, or in place of, a patient's polysomnographic examination, and can be suitably used for diagnosis.

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Pulmonology (AREA)
  • Biophysics (AREA)
  • Pathology (AREA)
  • Engineering & Computer Science (AREA)
  • Physiology (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention relates to a bioacoustic extraction device (100) for accurately extracting necessary bioacoustic data from raw acoustic data that includes the bioacoustic data. The bioacoustic extraction device (100) comprises: an input unit (10) for acquiring raw acoustic data that includes bioacoustic data; a voiced section estimation unit (20) that estimates a voiced section from the raw acoustic data input from the input unit (10); an auditory image generation unit (30) that generates an auditory image according to an auditory image model, on the basis of the voiced section estimated by the voiced section estimation unit (20); an acoustic feature quantity extraction unit (40) that extracts an acoustic feature quantity from the auditory image generated by the auditory image generation unit (30); a classification unit (50) that classifies, into a prescribed type, the acoustic feature quantity extracted by the acoustic feature quantity extraction unit (40); and a determination unit (60) that determines, on the basis of a prescribed threshold value, whether the acoustic feature quantity classified by the classification unit (50) represents bioacoustic data.
PCT/JP2017/002592 2016-02-01 2017-01-25 Bioacoustic extraction device, bioacoustic analysis device, bioacoustic extraction program, and computer-readable recording medium and recorded device WO2017135127A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2017565502A JP6908243B2 (ja) 2016-02-01 2017-01-25 生体音響抽出装置、生体音響解析装置、生体音響抽出プログラム及びコンピュータで読み取り可能な記録媒体並びに記録した機器

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2016017572 2016-02-01
JP2016-017572 2016-02-01

Publications (1)

Publication Number Publication Date
WO2017135127A1 true WO2017135127A1 (fr) 2017-08-10

Family

ID=59499570

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/002592 WO2017135127A1 (fr) Bioacoustic extraction device, bioacoustic analysis device, bioacoustic extraction program, and computer-readable recording medium and recorded device

Country Status (2)

Country Link
JP (1) JP6908243B2 (fr)
WO (1) WO2017135127A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109243470A (zh) * 2018-08-16 2019-01-18 南京农业大学 基于音频技术的肉鸡咳嗽监测方法
WO2019216320A1 (fr) * 2018-05-08 2019-11-14 国立大学法人徳島大学 Appareil d'apprentissage automatique, appareil d'analyse, procédé d'apprentissage automatique et procédé d'analyse
CN111105812A (zh) * 2019-12-31 2020-05-05 普联国际有限公司 一种音频特征提取方法、装置、训练方法及电子设备
CN111938649A (zh) * 2019-05-16 2020-11-17 医疗财团法人徐元智先生医药基金会亚东纪念医院 利用神经网络从鼾声来预测睡眠呼吸中止的方法
JP2020537147A (ja) * 2017-10-11 2020-12-17 ビーピー エクスプロレーション オペレーティング カンパニー リミテッドBp Exploration Operating Company Limited 音響周波数領域特徴を使用した事象の検出

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008545170A (ja) * 2005-06-29 2008-12-11 フラウンホッファー−ゲゼルシャフト ツァ フェルダールング デァ アンゲヴァンテン フォアシュンク エー.ファオ 音声信号を分析する装置、方法、およびコンピュータ・プログラム
US20120004749A1 (en) * 2008-12-10 2012-01-05 The University Of Queensland Multi-parametric analysis of snore sounds for the community screening of sleep apnea with non-gaussianity index
JP2014008263A (ja) * 2012-06-29 2014-01-20 Univ Of Yamanashi シャント狭窄診断支援システムおよび方法,アレイ状採音センサ装置,ならびに逐次細分化自己組織化マップ作成装置,方法およびプログラム

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4935931B2 (ja) * 2008-10-16 2012-05-23 富士通株式会社 無呼吸検出プログラムおよび無呼吸検出装置
JP5827108B2 (ja) * 2011-11-18 2015-12-02 株式会社アニモ 情報処理方法、装置及びプログラム
JP6136394B2 (ja) * 2012-08-09 2017-05-31 株式会社Jvcケンウッド 呼吸音分析装置、呼吸音分析方法および呼吸音分析プログラム
JP6412458B2 (ja) * 2015-03-31 2018-10-24 セコム株式会社 超音波センサ

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008545170A (ja) * 2005-06-29 2008-12-11 フラウンホッファー−ゲゼルシャフト ツァ フェルダールング デァ アンゲヴァンテン フォアシュンク エー.ファオ 音声信号を分析する装置、方法、およびコンピュータ・プログラム
US20120004749A1 (en) * 2008-12-10 2012-01-05 The University Of Queensland Multi-parametric analysis of snore sounds for the community screening of sleep apnea with non-gaussianity index
JP2014008263A (ja) * 2012-06-29 2014-01-20 Univ Of Yamanashi シャント狭窄診断支援システムおよび方法,アレイ状採音センサ装置,ならびに逐次細分化自己組織化マップ作成装置,方法およびプログラム

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TSUZAKI, MINORU: "Effectiveness of Auditory Parameterization for Unit Selection in Concatenative Speech Synthesis", IEICE TECHNICAL REPORT, vol. 101, no. 232, 2001, pages 23 - 30 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2020537147A (ja) * 2017-10-11 2020-12-17 ビーピー エクスプロレーション オペレーティング カンパニー リミテッドBp Exploration Operating Company Limited 音響周波数領域特徴を使用した事象の検出
JP7277059B2 (ja) 2017-10-11 2023-05-18 ビーピー エクスプロレーション オペレーティング カンパニー リミテッド 音響周波数領域特徴を使用した事象の検出
WO2019216320A1 (fr) * 2018-05-08 2019-11-14 国立大学法人徳島大学 Appareil d'apprentissage automatique, appareil d'analyse, procédé d'apprentissage automatique et procédé d'analyse
JPWO2019216320A1 (ja) * 2018-05-08 2021-06-17 国立大学法人徳島大学 機械学習装置、解析装置、機械学習方法および解析方法
JP7197922B2 (ja) 2018-05-08 2022-12-28 国立大学法人徳島大学 機械学習装置、解析装置、機械学習方法および解析方法
CN109243470A (zh) * 2018-08-16 2019-01-18 南京农业大学 基于音频技术的肉鸡咳嗽监测方法
CN109243470B (zh) * 2018-08-16 2020-05-05 南京农业大学 基于音频技术的肉鸡咳嗽监测方法
CN111938649A (zh) * 2019-05-16 2020-11-17 医疗财团法人徐元智先生医药基金会亚东纪念医院 利用神经网络从鼾声来预测睡眠呼吸中止的方法
JP2020185390A (ja) * 2019-05-16 2020-11-19 醫療財團法人徐元智先生醫藥基金會亞東紀念醫院 睡眠時無呼吸予測方法
CN111105812A (zh) * 2019-12-31 2020-05-05 普联国际有限公司 一种音频特征提取方法、装置、训练方法及电子设备

Also Published As

Publication number Publication date
JPWO2017135127A1 (ja) 2019-01-10
JP6908243B2 (ja) 2021-07-21

Similar Documents

Publication Publication Date Title
JP6908243B2 (ja) 生体音響抽出装置、生体音響解析装置、生体音響抽出プログラム及びコンピュータで読み取り可能な記録媒体並びに記録した機器
Kim et al. Detection of sleep disordered breathing severity using acoustic biomarker and machine learning techniques
US10007480B2 (en) Multi-parametric analysis of snore sounds for the community screening of sleep apnea with non-Gaussianity index
Abeyratne et al. Pitch jump probability measures for the analysis of snoring sounds in apnea
Karunajeewa et al. Silence–breathing–snore classification from snore-related sounds
Lin et al. Automatic wheezing detection using speech recognition technique
Lei et al. Content-based classification of breath sound with enhanced features
Xie et al. Audio-based snore detection using deep neural networks
US20220007964A1 (en) Apparatus and method for detection of breathing abnormalities
JP7197922B2 (ja) 機械学習装置、解析装置、機械学習方法および解析方法
WO2018011801A1 (fr) Estimation de paramètres de qualité du sommeil à partir de l'analyse audio d'une nuit complète
Nonaka et al. Automatic snore sound extraction from sleep sound recordings via auditory image modeling
El Emary et al. Towards developing a voice pathologies detection system
Kang et al. Snoring and apnea detection based on hybrid neural networks
Fezari et al. Acoustic analysis for detection of voice disorders using adaptive features and classifiers
Karan et al. Detection of Parkinson disease using variational mode decomposition of speech signal
Fonseca et al. Discrete wavelet transform and support vector machine applied to pathological voice signals identification
Porieva et al. Investigation of lung sounds features for detection of bronchitis and COPD using machine learning methods
Saudi et al. Computer aided recognition of vocal folds disorders by means of RASTA-PLP
Shi et al. Obstructive sleep apnea detection using difference in feature and modified minimum distance classifier
Corcoran et al. Glottal Flow Analysis in Parkinsonian Speech.
Dafna et al. Automatic detection of snoring events using Gaussian mixture models
Liu et al. Classifying respiratory sounds using electronic stethoscope
Sengupta et al. Optimization of cepstral features for robust lung sound classification
Herath et al. An investigation of critical frequency sub-bands of snoring sounds for OSA diagnosis

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17747286

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2017565502

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17747286

Country of ref document: EP

Kind code of ref document: A1