CN106328150A - Bowel sound detection method, device and system under noisy environment - Google Patents
- Publication number: CN106328150A
- Application number: CN201610686377.6A
- Authority
- CN
- China
- Prior art keywords
- signal
- borborygmus
- label
- sample
- layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B7/00—Instruments for auscultation
- A61B7/02—Stethoscopes
- A61B7/04—Electric stethoscopes
Abstract
The invention discloses a bowel sound detection method, device and system for noisy environments. The detection method comprises the steps of: collecting a bowel sound mixed signal of the current user through a sensor, the mixed signal comprising a bowel sound signal and an environmental interference signal; converting the mixed signal into a digital signal; extracting the time-frequency spectrum feature of the digital signal; inputting this feature into a trained convolutional neural network for processing; and detecting the time points at which bowel sounds occur, so as to distinguish the bowel sound signal from the environmental interference signal. By exploiting the difference between the time-frequency spectrum features of the bowel sound signal and the environmental interference signal, and by training a convolutional neural network to distinguish bowel sounds from interference sounds, the method, device and system can accomplish bowel sound detection in a noisy environment and help improve the accuracy of bowel sound detection.
Description
Technical field
The present invention relates to the technical field of biomedical signal processing, and in particular to a bowel sound detection method, device and system for use in noisy environments.
Background technology
The intestine is the longest conduit among the digestive organs; its major functions are digesting and absorbing food. Once an intestinal abnormality occurs, digestion and absorption become disordered and a series of related symptoms may follow. Diagnosis of intestinal disease is therefore essential, and in the diagnosis and treatment of intestinal disease, bowel sound auscultation is a common noninvasive examination.
At the present stage, the collection and discrimination of bowel sounds depend on manual auscultation by a physician. Bowel sounds originate from the motion of the intestine and appear sparsely in time, so auscultation generally requires a long time and a relatively quiet environment. Collecting signals with a sensor and analyzing them with computer assistance can provide a reliable basis for bowel sound collection; however, noisy acoustic signals in the environment and vibration signals inside the patient's body cavity often have waveform characteristics similar to those of the bowel sound signal, seriously interfering with its collection and discrimination and making bowel sound recognition in noisy environments inaccurate.
Summary of the invention
In view of this, the purpose of the embodiments of the present invention is to provide a bowel sound detection method, device and system for noisy environments, capable of improving the accuracy with which the bowel sound signal is recognized within a collected bowel sound mixed signal in a noisy environment.
In a first aspect, an embodiment of the present invention provides a bowel sound detection method under a noisy environment, including:
collecting a bowel sound mixed signal of the current user through a sensor, wherein the bowel sound mixed signal includes a bowel sound signal and an environmental interference signal;
converting the bowel sound mixed signal into a digital signal;
extracting the time-frequency spectrum feature of the digital signal;
inputting the time-frequency spectrum feature of the digital signal into a trained convolutional neural network for processing, and detecting the time points at which bowel sounds occur, thereby distinguishing the bowel sound signal from the environmental interference signal; wherein the training process of the convolutional neural network includes:
collecting a bowel sound sample signal and at least one group of interference sample signals respectively through the sensor;
converting the bowel sound sample signal and the at least one group of interference sample signals into digital sample signals;
extracting the time-frequency spectrum features of the digital sample signals;
attaching signal labels to the digital sample signals in the time-frequency domain; the signal labels include a bowel sound label marking the time points at which the bowel sound signal occurs in the bowel sound sample signal, and an interference label marking the time points at which an interference signal occurs in the interference sample signals;
extracting, according to the bowel sound label and the interference label, bowel-sound-labeled signals and each group of interference-labeled signals from the digital sample signals as training samples;
taking the time-frequency spectrum features of the digital sample signals corresponding to the training samples as training data and the bowel sound label and the interference label as supervision information, and training a convolutional neural network for distinguishing the bowel sound signal from the various interference signals.
With reference to the first aspect, an embodiment of the present invention provides a first possible implementation of the first aspect, wherein extracting the time-frequency spectrum feature of the digital sample signal includes:
framing and windowing the sequential digital sample signal;
performing a fast Fourier transform on the windowed digital sample signal and extracting the power spectrum;
filtering the power spectrum with a Gammatone filter bank; the Gammatone filter bank implements a linear transformation whose impulse response is expressed as:
g_i(t) = A t^(n−1) exp(−2π b_i t) cos(2π f_i t + φ_i), t ≥ 0, 1 ≤ i ≤ N,
where A is a constant regulating the amplitude, n is the filter order, b_i is the decay rate, f_i is the center frequency, φ_i is the phase, and N is the number of filters; for the i-th filter, b_i = 1.019·ERB(f_i), where the equivalent rectangular bandwidth is ERB(f_i) = 24.7 (4.37 f_i / 1000 + 1);
performing a discrete cosine transform on the coefficient matrix of the power spectrum filtered by the Gammatone filter bank to obtain the Gammatone cepstral coefficients;
taking the Gammatone cepstral coefficients as the time-frequency spectrum feature of the digital sample signal.
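The pipeline above (power spectrum → Gammatone filter bank → log compression → discrete cosine transform) can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the filter count, the ERB-scale spacing of center frequencies, and the use of each filter's magnitude response sampled on FFT bins are assumptions made for brevity.

```python
import numpy as np

def erb(f):
    """Equivalent rectangular bandwidth ERB(f) = 24.7 (4.37 f/1000 + 1), in Hz."""
    return 24.7 * (4.37 * f / 1000.0 + 1.0)

def gammatone_filterbank(n_filters, n_fft, fs, f_lo=50.0, f_hi=None):
    """Magnitude response of 4th-order Gammatone filters sampled on the
    positive FFT bins; center frequencies are spaced on the ERB-rate scale
    (an assumed spacing -- the patent does not specify it)."""
    f_hi = f_hi or fs / 2.0
    erb_rate = lambda f: 21.4 * np.log10(4.37 * f / 1000.0 + 1.0)
    inv_erb = lambda e: (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37
    centres = inv_erb(np.linspace(erb_rate(f_lo), erb_rate(f_hi), n_filters))
    freqs = np.linspace(0.0, fs / 2.0, n_fft // 2 + 1)
    n = 4
    fb = np.empty((n_filters, freqs.size))
    for i, fc in enumerate(centres):
        b = 1.019 * erb(fc)           # b_i = 1.019 ERB(f_i), as in the text
        # |G(f)| of an n-th order Gammatone filter, up to a constant gain
        fb[i] = (1.0 + ((freqs - fc) / b) ** 2) ** (-n / 2.0)
    return fb, centres

def gtcc(power_spectrum, fb, n_ceps=13):
    """Gammatone cepstral coefficients: filterbank energies -> log -> DCT-II."""
    log_e = np.log(fb @ power_spectrum + 1e-10)
    m = log_e.size
    k = np.arange(n_ceps)[:, None]
    nn = np.arange(m)[None, :]
    dct = np.cos(np.pi * k * (2 * nn + 1) / (2 * m))   # DCT-II basis
    return dct @ log_e
```

The same routine would be applied both to the labeled sample signals during training and to the mixed signal during detection.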
With reference to the first possible implementation of the first aspect, an embodiment of the present invention provides a second possible implementation of the first aspect, wherein attaching signal labels to the digital sample signal in the time-frequency domain includes:
judging, in the time-frequency domain, the digital sample signal corresponding to each time point, where the digital sample signal corresponding to each time point is a signal frame produced by the framing and windowing;
when the signal frame at the current time point contains a bowel sound signal, setting a bowel sound label for that signal frame; when the signal frame at the current time point contains an interference signal, setting an interference label for that signal frame; the bowel sound label and the interference label are represented as multi-dimensional vectors;
extracting, according to the bowel sound label and the interference label, bowel-sound-labeled signals and each group of interference-labeled signals from the digital sample signal as training samples includes: extracting the bowel-sound-labeled signals and each group of interference-labeled signals from the digital sample signal, as training samples, in the order of the signal frames to which the bowel sound label and the interference label are attached.
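As a toy illustration of per-frame labels represented as multi-dimensional vectors, one-hot encoding can be used; the class indices (0 for bowel sound, 1 onwards for the interference groups) and the number of classes are hypothetical choices, not specified by the patent.

```python
import numpy as np

def label_frames(frame_classes, n_classes):
    """One-hot label per signal frame, kept in frame order; frame_classes[j]
    is the class judged present at time point j (assumed encoding:
    0 = bowel sound, 1..n_classes-1 = interference groups)."""
    labels = np.zeros((len(frame_classes), n_classes))
    labels[np.arange(len(frame_classes)), frame_classes] = 1.0
    return labels
```
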
With reference to the first aspect or its second possible implementation, an embodiment of the present invention provides a third possible implementation of the first aspect, wherein: the convolutional neural network includes an input layer, multiple hidden layers, a fully connected layer and an output layer. The hidden layers and the fully connected layer each carry their own parameters, which include weights and biases.
The training of the convolutional neural network uses the gradient descent method, and the detailed process includes:
randomly initializing the convolutional neural network;
starting training by randomizing the order of the training samples and their signal labels, then repeatedly drawing J training samples at random to form a sample set of input samples and extracting the signal labels corresponding to those input samples to form a label subset; training on all input samples of one sample set completes one round of training, and training on all sample sets completes one pass of training;
during one round of training, forward-propagating all input samples in the sample set; after they pass through the convolutional neural network, comparing the result at the output layer with the corresponding signal label and computing the squared difference between the output result and the label as the squared error, thereby obtaining the squared error between the output result and the signal label for every input sample;
during one round of training, using the squared errors for back propagation and parameter updating, including: starting from the output layer and passing backwards through each layer in turn to obtain the equivalent error on each layer; using the equivalent error on each layer to compute the gradients of that layer's parameters, and using the gradients to update the parameters of each layer;
during one pass of training, after the last round is completed, computing the mean of all the squared errors and using this mean error to judge whether the convolutional neural network has converged; when the mean error tends towards a set stable threshold, determining that the network has converged; if the network has converged, stopping training; otherwise starting a new pass of training, and stopping training once the number or duration of passes reaches a set threshold;
after training stops, taking the current convolutional neural network as the trained convolutional neural network.
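The training schedule above (shuffle, draw J samples per set, forward-propagate, accumulate squared errors, update by gradient descent, stop when the mean error stabilises) can be sketched with a single sigmoid unit standing in for the convolutional network; the batch size, learning rate, tolerance and epoch budget below are illustrative values, not ones given by the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

def train(samples, labels, epochs=50, batch=4, eta=0.5, tol=1e-3):
    """Toy sketch of the training loop: random initialization, shuffled
    sample sets of `batch` inputs, per-sample squared error, gradient
    descent, and a stop test on the mean error."""
    d = samples.shape[1]
    W = rng.normal(0.0, 0.1, d)          # random initialization
    b = 0.0
    prev = None
    mean_err = None
    for epoch in range(epochs):
        order = rng.permutation(len(samples))
        errs = []
        for s in range(0, len(order), batch):
            for j in order[s:s + batch]:
                y = 1.0 / (1.0 + np.exp(-(W @ samples[j] + b)))  # forward
                t = labels[j]
                errs.append(0.5 * (y - t) ** 2)                  # squared error
                g = (y - t) * y * (1.0 - y)                      # backprop
                W -= eta * g * samples[j]                        # update
                b -= eta * g
        mean_err = np.mean(errs)
        if prev is not None and abs(prev - mean_err) < tol:
            break                        # mean error has stabilised
        prev = mean_err
    return W, b, mean_err
```
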
With reference to the third possible implementation of the first aspect, an embodiment of the present invention provides a fourth possible implementation of the first aspect, wherein the detailed process of forward-propagating the input sample includes:
the input layer of the convolutional neural network operates on the input sample, and each subsequent layer of the convolutional neural network operates on the output of the layer above it;
in the convolutional neural network, the output of layer l is
x^l = f(u^l)    formula (1)
where f(·) is the activation function, u^l = W^l x^(l−1) + b^l, x^(l−1) is the output of layer l−1 and the input of layer l, and W^l and b^l are the weights and bias of layer l, respectively; the activation function is the sigmoid function or the hyperbolic tangent function;
computing the squared error between the output result and the corresponding signal label includes: for each input sample, computing the squared error between the output obtained at the output layer of the convolutional neural network and the corresponding signal label; the squared error function of the j-th input sample is
E_j = (1/2) Σ_{k=1}^{K} (t_k^j − y_k^j)²,
where K is the dimension of the output result and of the signal label, y_k^j is the k-th dimension of the output of the j-th sample after the convolutional neural network, and t_k^j is the k-th dimension of the signal label corresponding to the j-th sample.
With reference to the fourth possible implementation of the first aspect, an embodiment of the present invention provides a fifth possible implementation of the first aspect, wherein the back propagation and parameter update specifically include:
transferring the squared error between the output result and the signal label backwards, layer by layer, from the output layer into each layer of the convolutional neural network, obtaining the equivalent error on each layer; the equivalent error is the rate of change of the squared error with respect to the parameters of the layer, computed as
δ^l = ∂E/∂b^l,
where E is the squared error of the output result and b is a parameter of the convolutional neural network;
the equivalent error on the output layer is
δ^L = (y^L − t^L) ⊙ f′(u^L),
where L denotes the output layer, the operator ⊙ denotes element-wise multiplication, y^L is the output result of the output layer, and t^L is the signal label at the output layer;
the equivalent error on the other layers is
δ^l = ((W^(l+1))^T δ^(l+1)) ⊙ f′(u^l);
using the equivalent error δ^l on each layer, the gradients of the weights and bias of that layer are respectively
ΔW^l = −η δ^l (x^(l−1))^T and Δb^l = −η δ^l,
where η is the learning rate, and different learning rates may be set for different parameters;
using the gradients of the parameters on each layer to update the parameters of that layer: adding the parameter increments to the original parameters of each layer yields the new parameters.
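The equivalent errors and the update step can be sketched end to end; as before, dense layers with sigmoid activation stand in for the convolutional network (an assumption of this sketch), so that f′(u^l) = x^l (1 − x^l).

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def forward(x, layers):
    """x^l = f(W^l x^{l-1} + b^l); returns all activations."""
    acts = [x]
    for W, b in layers:
        acts.append(sigmoid(W @ acts[-1] + b))
    return acts

def backward_update(acts, layers, t, eta=0.5):
    """Equivalent errors propagated back from the output layer, followed by
    one gradient-descent step:
      delta^L = (y^L - t^L) * f'(u^L)
      delta^l = (W^{l+1}^T delta^{l+1}) * f'(u^l)
      dE/dW^l = delta^l (x^{l-1})^T,  dE/db^l = delta^l."""
    y = acts[-1]
    delta = (y - t) * y * (1.0 - y)              # output-layer equivalent error
    updated = [None] * len(layers)
    for l in range(len(layers) - 1, -1, -1):
        W, b = layers[l]
        updated[l] = (W - eta * np.outer(delta, acts[l]), b - eta * delta)
        if l > 0:                                 # push the error one layer back
            delta = (W.T @ delta) * acts[l] * (1.0 - acts[l])
    return updated
```

Repeating forward and backward passes on a fixed sample drives its squared error down, which is the behavior the training loop relies on.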
With reference to the fourth possible implementation of the first aspect, an embodiment of the present invention provides a sixth possible implementation of the first aspect, wherein, during one pass of training, after the last round of training is completed, the mean of all the squared errors is computed; the mean error function is
E_J = (1/J) Σ_{j=1}^{J} E_j,
where J is the number of samples in one pass of training;
when the mean error E_J tends towards the set stable threshold, the convolutional neural network is determined to have converged;
if the convolutional neural network has converged, training stops; otherwise a new pass of training begins, the parameters of the convolutional neural network are updated, and E_J is gradually minimized so that the output of the convolutional neural network approaches the corresponding signal labels.
In a second aspect, an embodiment of the present invention further provides a bowel sound detection device under a noisy environment, including:
a convolutional neural network training module for training the convolutional neural network, the concrete training process including: collecting a bowel sound sample signal and at least one group of interference sample signals respectively through a sensor; converting the bowel sound sample signal and the at least one group of interference sample signals into digital sample signals; extracting the time-frequency spectrum features of the digital sample signals; attaching signal labels to the digital sample signals in the time-frequency domain, the signal labels including a bowel sound label marking the time points at which the bowel sound signal occurs in the bowel sound sample signal and an interference label marking the time points at which an interference signal occurs in the interference sample signals; extracting, according to the bowel sound label and the interference label, bowel-sound-labeled signals and each group of interference-labeled signals from the digital sample signals as training samples; and taking the time-frequency spectrum features of the digital sample signals corresponding to the training samples as training data and the bowel sound label and the interference label as supervision information, training a convolutional neural network for distinguishing the bowel sound signal from the various interference signals;
a signal acquisition module for collecting a bowel sound mixed signal of the current user through the sensor, wherein the bowel sound mixed signal includes a bowel sound signal and an environmental interference signal;
a signal conversion module for converting the bowel sound mixed signal into a digital signal and extracting the time-frequency spectrum feature of the digital signal;
a bowel sound detection module for inputting the time-frequency spectrum feature of the digital signal into the convolutional neural network trained by the convolutional neural network training module for processing, and detecting the time points at which bowel sounds occur, thereby distinguishing the bowel sound signal from the environmental interference signal.
With reference to the second aspect, an embodiment of the present invention provides a first possible implementation of the second aspect, wherein the convolutional neural network training module includes:
a signal windowing unit for framing and windowing the sequential digital sample signal;
a Fourier transform unit for performing a fast Fourier transform on the windowed digital sample signal and extracting the power spectrum;
a Gammatone filter bank for implementing a linear transformation that filters the power spectrum; the impulse response of the Gammatone filter bank is expressed as:
g_i(t) = A t^(n−1) exp(−2π b_i t) cos(2π f_i t + φ_i), t ≥ 0, 1 ≤ i ≤ N,
where A is a constant regulating the amplitude, n is the filter order, b_i is the decay rate, f_i is the center frequency, φ_i is the phase, and N is the number of filters; for the i-th filter, b_i = 1.019·ERB(f_i), where the equivalent rectangular bandwidth is ERB(f_i) = 24.7 (4.37 f_i / 1000 + 1);
a discrete cosine transform unit for performing a discrete cosine transform on the coefficient matrix of the power spectrum filtered by the Gammatone filter bank to obtain the Gammatone cepstral coefficients, the Gammatone cepstral coefficients being taken as the time-frequency spectrum feature of the digital sample signal.
In a third aspect, an embodiment of the present invention further provides a bowel sound detection system under a noisy environment, including the bowel sound detection device provided by the second aspect and a sensor;
the sensor is used to collect the bowel sound sample signal and the at least one group of interference sample signals during the training of the neural network, and to collect the bowel sound mixed signal of the current user during bowel sound detection, wherein the bowel sound mixed signal includes a bowel sound signal and an environmental interference signal; the collected signals are sent to the bowel sound detection device.
The bowel sound detection method, device and system under a noisy environment provided by the embodiments of the present invention exploit the difference between the time-frequency domain characteristics of the bowel sound signal and of the environmental interference signal, and train a convolutional neural network to distinguish bowel sounds from interference sounds; they can accomplish bowel sound detection in a noisy environment and help improve the accuracy of bowel sound detection.
To make the above purposes, features and advantages of the present invention more apparent, preferred embodiments are described in detail below with reference to the appended drawings.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required by the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of the present invention and are therefore not to be regarded as limiting its scope; from these drawings, those of ordinary skill in the art can obtain other related drawings without creative effort.
Fig. 1 shows a flow chart of the bowel sound detection method under a noisy environment provided by an embodiment of the present invention;
Fig. 2 shows a flow chart of the concrete method of training the convolutional neural network in the bowel sound detection method provided by an embodiment of the present invention;
Fig. 3 shows a flow chart of the training process of the convolutional neural network in the bowel sound detection method provided by an embodiment of the present invention;
Fig. 4 shows a structural schematic diagram of the convolutional neural network in the bowel sound detection method provided by an embodiment of the present invention;
Fig. 5 shows a structural schematic diagram of the bowel sound detection device under a noisy environment provided by an embodiment of the present invention;
Fig. 6 shows a structural schematic diagram of the bowel sound detection system under a noisy environment provided by an embodiment of the present invention.
Detailed description of the invention
To make the purposes, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention rather than all of them. The components of the embodiments of the present invention, generally described and illustrated in the drawings herein, may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of the present invention provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
At present, the collection and discrimination of bowel sounds depend on manual auscultation by a physician, and auscultation of bowel sounds generally requires a long time and a relatively quiet environment. Collecting signals with a sensor and analyzing them with computer assistance can provide a reliable basis for the diagnosis and treatment process; however, interference signals such as noisy speech in the environment have waveform characteristics similar to those of the bowel sound signal and easily interfere severely with its collection and discrimination. For this problem, a bowel sound detection algorithm for noisy environments is of great significance.
Convolutional neural networks are well suited to image and speech signal processing and are commonly used to model complex, highly nonlinear dependency relationships. Their main characteristic is that, through supervised training, they can extract distinctive, essential feature combinations, which are stored in multi-level convolution kernels. Moreover, under the combined effect of multiple convolutional layers, features can be abstracted and combined layer by layer, enriching the expressive power of the multi-level convolution kernels and extending the applicability of what the convolutional neural network learns from a finite set of samples.
Exploiting the difference between the time-frequency domain characteristics of the bowel sound signal and of the noise signal, a convolutional neural network is trained to distinguish bowel sounds from noise, accomplishing bowel sound detection under noise interference. The convolutional neural network can spontaneously extract from the features the key information for distinguishing bowel sounds from speech, and the form of what it learns in the time-frequency domain is close to the form of the features themselves, which makes analysis and adjustment convenient.
Based on this, the present invention provides a bowel sound detection method, device and system under a noisy environment, which can filter the noise in the environment, extract and recognize bowel sounds more clearly, and accomplish bowel sound detection in a noisy environment.
To facilitate understanding of the present embodiment, the bowel sound detection method under a noisy environment disclosed by an embodiment of the present invention is first described in detail. Fig. 1 shows the flow chart of the bowel sound detection method under a noisy environment provided by an embodiment of the present invention. As shown in Fig. 1, the detection method includes:
Step S101, collecting a bowel sound mixed signal of the current user through a sensor, wherein the bowel sound mixed signal includes a bowel sound signal and an environmental interference signal;
Step S102, converting the bowel sound mixed signal into a digital signal;
Step S103, extracting the time-frequency spectrum feature of the digital signal;
Step S104, inputting the time-frequency spectrum feature of the digital signal into a trained convolutional neural network for processing, and detecting the time points at which bowel sounds occur, thereby distinguishing the bowel sound signal from the environmental interference signal.
Wherein, the concrete grammar of training convolutional neural networks is as in figure 2 it is shown, comprise the steps.
Step S201, gathers borborygmus sample signal and least one set interference sample signal respectively by sensor.
Step S202, disturbs sample signal to be all converted into numeral sample signal borborygmus sample signal and least one set.
Step S203, extracting the time-frequency spectrum feature of the digital sample signals. The time-frequency spectrum feature used in the embodiment of the present invention is the Gammatone cepstrum coefficient. Extracting the Gammatone cepstrum coefficients of a digital sample signal specifically includes: framing the time-sequential digital sample signal; first zero-padding the digital sample signal of each frame to N points, where N = 2^i, i is an integer, and i ≥ 8; then windowing (or pre-emphasizing) the digital sample signal of each frame, the window function being a Hamming (hamming) or Hanning (hanning) window.
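As a minimal sketch of the framing-and-windowing step just described (frame the signal, zero-pad each frame to N = 2^i points with i ≥ 8, apply a Hamming window) — the frame length, hop size, and use of NumPy are illustrative assumptions, not values given by the text:

```python
import numpy as np

def frame_and_window(x, frame_len=400, hop=160, min_pow=8):
    """Split a 1-D signal into overlapping frames, zero-pad each frame to
    N = 2**i points (i >= 8, per the text), and apply a Hamming window.
    frame_len and hop are illustrative choices, not from the patent."""
    i = max(min_pow, int(np.ceil(np.log2(frame_len))))
    N = 2 ** i
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    frames = np.zeros((n_frames, N))
    win = np.hamming(frame_len)
    for t in range(n_frames):
        seg = x[t * hop : t * hop + frame_len]
        frames[t, :len(seg)] = seg * win[:len(seg)]
    return frames, N
```

Each row of the returned array is one windowed, zero-padded frame, ready for the fast Fourier transform of the next step.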
Performing a fast Fourier transform on the windowed digital sample signal to extract the power spectrum;
Filtering the power spectrum with a Gammatone filter bank. The Gammatone filter bank implements a linear transformation whose impulse response is expressed as:
g_i(t) = A t^(n−1) exp(−2π b_i t) cos(2π f_i t + φ_i), t ≥ 0, 1 ≤ i ≤ N,
where A is a constant regulating the ratio, n is the filter order, b_i is the decay rate, f_i is the center frequency, φ_i is the phase, and N is the number of filters. For the i-th filter, b_i = 1.019 ERB(f_i), where the equivalent rectangular bandwidth ERB(f_i) is expressed as
ERB(f_i) = 24.7 (4.37 f_i / 1000 + 1).
Performing a discrete cosine transform on the coefficient matrix of the power spectrum filtered by the Gammatone filter bank to obtain the Gammatone cepstrum coefficients. The Gammatone cepstrum coefficient incorporates the auditory properties of the human ear as an auditory filtering feature: its resolution is high at low frequencies, while the resolution at high frequencies is suitably compressed.
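A rough sketch of the filter-and-DCT stage above — note this is a simplified triangular approximation on the ERB scale rather than a true Gammatone impulse-response implementation, and the sample rate, filter count, cepstral order, and center-frequency spacing are all illustrative assumptions:

```python
import numpy as np

def erb(f):
    # Equivalent rectangular bandwidth, ERB(f) = 24.7*(4.37*f/1000 + 1)
    return 24.7 * (4.37 * f / 1000.0 + 1.0)

def gtcc(power_spec, sr=8000, n_filters=32, n_ceps=13):
    """Filter a (frames x fft_bins) power spectrum with a simplified
    ERB-bandwidth filter bank (triangular stand-in for Gammatone),
    take the log, then apply a DCT-II over the filter axis to get
    cepstral coefficients.  All parameter values are illustrative."""
    n_bins = power_spec.shape[-1]
    freqs = np.linspace(0, sr / 2, n_bins)
    centers = np.linspace(50, sr / 2 - 200, n_filters)  # simplified spacing
    fb = np.zeros((n_filters, n_bins))
    for i, fc in enumerate(centers):
        b = 1.019 * erb(fc)              # decay rate b_i = 1.019*ERB(f_i)
        fb[i] = np.maximum(0, 1 - np.abs(freqs - fc) / (2 * b))
    filtered = np.log(power_spec @ fb.T + 1e-10)
    k = np.arange(n_filters)
    # DCT-II basis over the filter axis
    dct = np.cos(np.pi / n_filters * (k[:, None] + 0.5) * np.arange(n_ceps)[None, :])
    return filtered @ dct
```

The output has one row of `n_ceps` coefficients per frame; these rows play the role of the per-frame Gammatone cepstrum coefficients used as features below.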
It should be noted that, through the above steps, the Gammatone cepstrum coefficients C(j) corresponding to the borborygmus sample signal c(t) and the Gammatone cepstrum coefficients S(j) corresponding to the speech sample signal can be obtained respectively; both sets of cepstrum coefficients will be used as training data for the convolutional neural network. Likewise, the Gammatone cepstrum coefficients obtained by the above steps from a borborygmus mixed signal collected under the noisy environment to be detected can be used as features for detecting the time of occurrence of borborygmus.
Step S204, making signal labels for the digital sample signals in the time-frequency domain. The signal labels include a borborygmus label marking the time points at which the borborygmus signal occurs in the borborygmus sample signal, and an interference label marking the time points at which the interference signal occurs in the interference sample signal. The specific process of this step includes:
judging, in the time-frequency domain, the digital sample signal corresponding to each time point, where the digital sample signal corresponding to each time point is a signal frame after framing and windowing;
when the signal frame of the current time point contains a borborygmus signal, setting a borborygmus label for the signal frame; when the signal frame of the current time point contains an interference signal, setting an interference label for the signal frame.
The borborygmus label and the interference label are represented by multi-dimensional vectors. If there is only one set of interference signals, for example a speech interference signal, the signal label can be represented by a two-dimensional vector: [1, 0] indicates that borborygmus occurs at moment t, and [0, 1] indicates that speech occurs at that moment. Note that the time index t here is no longer the time index of the concrete sampled signal, but the time-ordering index of the Gammatone cepstrum coefficients obtained in step S203, i.e. the time point of the t-th frame of coefficients in the time ordering of the Gammatone cepstrum coefficients. If there are multiple sets of interference signals, a multi-class problem must be solved; the label vector dimension can then be increased while keeping the correspondence between the values of the vector elements and the classification results.
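The labeling scheme above can be sketched as a one-hot encoding; the helper below is hypothetical (not from the patent) and simply illustrates how the two-dimensional labels [1, 0] and [0, 1] extend to more interference classes by growing the vector:

```python
def make_label(class_idx, n_classes=2):
    """One-hot label vector: index 0 = borborygmus frame, index 1 = speech
    frame (per the text); further indices would stand for additional
    interference groups.  Hypothetical helper, for illustration only."""
    label = [0] * n_classes
    label[class_idx] = 1
    return label
```

For example, `make_label(0)` gives [1, 0] (borborygmus at this frame) and `make_label(1)` gives [0, 1] (speech at this frame).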
Step S205, extracting borborygmus-labeled signals and each set of interference-labeled signals from the digital sample signals as training samples, according to the borborygmus labels and the interference labels. The specific process includes: extracting, according to the frame ordering of the signal frames for which borborygmus labels and interference labels are set, borborygmus-labeled signals and each set of interference-labeled signals from the digital sample signals as training samples. Each training sample takes the form of a matrix of d consecutive frames of Gammatone cepstrum coefficients, the frame at the center of the matrix carrying the borborygmus or speech cepstrum coefficients. After all training samples have been extracted, the Gammatone cepstrum coefficients that were not extracted are considered to contain neither borborygmus nor speech, and are not used for training the convolutional neural network. The extracted samples form a training sample set, whose internal ordering only represents the order of extraction and no longer corresponds to concrete time points. Correspondingly, among the labels in consecutive frame order, only those marking the occurrence of borborygmus or speech are extracted to form a label set; the remaining labels are not used. Thereby, two classes of signal samples and their corresponding labels are available for training the convolutional neural network. Likewise, a sample set can be extracted from a borborygmus mixed signal collected under a noisy environment.
Step S206, taking the time-frequency spectrum features of the digital sample signals corresponding to the training samples as training data and the borborygmus labels and interference labels as supervision information, training a convolutional neural network for distinguishing the borborygmus signal from the various interference signals.
The structure of the convolutional neural network is shown in Fig. 4 and includes an input layer, multiple hidden layers, a fully connected layer and an output layer. The hidden layers and the fully connected layer each contain their own parameters, the parameters including weights and biases. The hidden layers of the convolutional neural network comprise two convolutional layers and two down-sampling layers arranged alternately; both the convolutional layers and the down-sampling layers contain their own weights and biases. A convolutional layer obtains, by the convolution of a convolution kernel with the input, one output from a block of the input at a time, and obtains the complete output by traversing the kernel over the input; the convolution kernel constitutes the weights of the convolutional layer. A down-sampling layer compresses the input by a designed proportionality coefficient.
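The alternating conv/down-sampling stack above determines how the feature-map size shrinks layer by layer; the tracer below illustrates that arithmetic. The 5×5 kernels, factor-2 pooling, and 28×28 input are illustrative assumptions, not dimensions given by the patent:

```python
def output_shape(h, w, layers):
    """Trace feature-map size through an alternating conv / down-sampling
    stack.  'conv' applies a valid k x k kernel (size shrinks by k-1);
    'pool' divides both dimensions by the down-sampling factor n."""
    for kind, size in layers:
        if kind == "conv":
            h, w = h - size + 1, w - size + 1
        elif kind == "pool":
            h, w = h // size, w // size
    return h, w

# Two convolutional layers alternating with two down-sampling layers,
# matching the hidden-layer arrangement described in the text:
net = [("conv", 5), ("pool", 2), ("conv", 5), ("pool", 2)]
```

For a 28×28 input, `output_shape(28, 28, net)` traces 28 → 24 → 12 → 8 → 4 in each dimension, so the fully connected layer would receive 4×4 maps.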
The training process of the convolutional neural network is shown in Fig. 3 and specifically includes:
Step S2061, randomly initializing the convolutional neural network. Besides initializing the weights and biases of the convolutional neural network, it is even more important to set the network depth and the number of convolution kernels. The present embodiment uses a typical configuration; as the complexity of the learning task rises and the training samples increase, the depth of the network and the number of convolution kernels can be suitably increased. Meanwhile, the size of the convolution kernels is also an important influencing factor: it is suggested that the side length of the convolution kernels in the first convolutional layer be generally larger than the time span of a sample, which helps the convolutional neural network learn feature representations of global significance.
Step S2062, starting training: randomly shuffling the order of the training samples and signal labels, then randomly drawing, without repetition, J training samples to form a sample subset as input samples, and extracting the signal labels corresponding to the input samples to form a label subset. Completing the training on all input samples of one sample subset constitutes one round of training; completing the training on all sample subsets constitutes one training pass.
Step S2063, during one round of training, forward-propagating all input samples in the sample subset; after passing through the convolutional neural network, comparing the result at the output layer of the convolutional neural network with the corresponding signal label, and computing the square of the difference between the output result and the corresponding signal label as the square error, thereby obtaining the square errors between the output results and signal labels of all input samples.
The square-error cost function is defined as
E_J = (1/(2J)) Σ_{j=1..J} Σ_{k=1..K} (t_k^j − y_k^j)²,
where J is the number of samples in one training pass, K is the dimension of the output and the label, y_k^j is the k-th dimension of the output of the convolutional neural network for the j-th sample, and t_k^j is the k-th dimension of the label corresponding to the j-th sample. The goal of training is to update the parameters of the network so that the network output and the label become closer, i.e. to minimize E_J. For a single sample, the error function of the j-th sample is
E_j = (1/2) Σ_{k=1..K} (t_k^j − y_k^j)².
The output of layer l of the neural network is defined as
x^l = f(u^l), where u^l = W^l x^(l−1) + b^l,
where f(·) is the activation function, x^(l−1) is the output of layer l−1 (i.e. the input of layer l), and W^l and b^l are respectively the weights and bias of layer l. Various activation functions are possible, usually the sigmoid function or the hyperbolic tangent function: the sigmoid function squeezes the output into [0, 1], while the hyperbolic tangent squeezes the output into [−1, 1]. Normalizing the training data to a distribution with zero mean and unit variance can strengthen convergence during stochastic gradient descent. Forward propagation is thus realized: each layer operates on the previous layer's output and passes the result through the nonlinear activation function, the sample information is transmitted layer by layer, and the final output result is the predicted value of whether the input sample is borborygmus or speech.
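The layer recursion x^l = f(W^l x^(l−1) + b^l) can be sketched directly for the fully connected part of the network; the layer sizes below are illustrative assumptions, and the sigmoid squashes each output into [0, 1] as the text states:

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def forward(x, weights, biases):
    """Layer-by-layer forward propagation x^l = f(W^l x^(l-1) + b^l).
    Returns the per-layer outputs (including the input), which are
    needed later for back-propagation.  A sketch, not the patent's code."""
    outputs = [x]
    for W, b in zip(weights, biases):
        x = sigmoid(W @ x + b)
        outputs.append(x)
    return outputs
```

The last entry of the returned list is the network's prediction for the input sample.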
Step S2064, during one round of training, using the square error for back propagation and parameter updating, including: starting from the output layer, passing backwards through each layer in turn to obtain the equivalent error on each layer; using the equivalent error on each layer to compute the gradient of the parameters on that layer; and using the gradient on each layer to update the parameters of that layer.
The back propagation and parameter updating process includes:
The rate of change of the error with respect to the neural network parameters is defined as
δ = ∂E/∂u.
The back propagation on the output layer, layer L, is then
δ^L = (y − t) ∘ f′(u^L),
where the operator ∘ denotes element-by-element multiplication; the back propagation on the other layers is
δ^l = (W^(l+1))^T δ^(l+1) ∘ f′(u^l).
From the error rate of change δ^l on each layer, the gradients of the weights and biases are obtained as
ΔW^l = −η δ^l (x^(l−1))^T, Δb^l = −η δ^l,
where η is the learning rate; different learning rates can be set for different parameters. When updating the parameters with gradient descent, the gradient of a parameter is added to the original parameter to obtain the new parameter.
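A sketch of the delta rule just stated for the fully connected layers, assuming a sigmoid activation (so f′(u) = y(1 − y)) and omitting biases for brevity; this is an illustrative reconstruction, not the patent's exact implementation:

```python
import numpy as np

def backward(outputs, weights, target, eta=0.1):
    """Back-propagate the squared error and take one gradient-descent step:
    delta^L = (y - t) * f'(u^L),
    delta^l = (W^(l+1))^T delta^(l+1) * f'(u^l),
    W^l <- W^l - eta * delta^l (x^(l-1))^T.
    `outputs` is the per-layer output list from a forward pass (input first)."""
    deltas = [None] * len(weights)
    y = outputs[-1]
    deltas[-1] = (y - target) * y * (1 - y)          # output-layer delta
    for l in range(len(weights) - 2, -1, -1):
        y_l = outputs[l + 1]
        deltas[l] = (weights[l + 1].T @ deltas[l + 1]) * y_l * (1 - y_l)
    new_weights = [W - eta * np.outer(d, x)          # dE/dW^l = delta^l (x^(l-1))^T
                   for W, d, x in zip(weights, deltas, outputs[:-1])]
    return new_weights, deltas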
The output of a convolutional layer is the result of combining the convolutions of multiple inputs, and can be represented as
x_j^l = f( Σ_{i∈M_j} x_i^(l−1) * k_ij^l + b_j^l ),
where x_j^l is the j-th output on layer l, M_j is the input set, x_i^(l−1) is a concrete input in the input set, k_ij^l is the weight (convolution kernel) connecting this input with the j-th output on layer l, and b_j^l is the corresponding bias. A convolutional layer is adjoined by down-sampling layers before and after it, and back propagation and parameter updating on a convolutional layer form the inverse process of the down-sampling layer. In the embodiment of the present invention the weight of a down-sampling layer is denoted β, the down-sampling factor is denoted n, and the down-sampling process takes the weighted average of each n × n block. When the error rate of change is back-propagated through a down-sampling layer, it only needs to be multiplied once with the weights participating in the forward-propagation computation to obtain the error rate of change on the preceding convolutional layer. According to the aforementioned back propagation, the error rate of change on the convolutional layer is
δ_j^l = β_j^(l+1) ( f′(u_j^l) ∘ up(δ_j^(l+1)) ),
where up(·) denotes the up-sampling computation, i.e. copying each point of its argument into a matrix of the same size as the block that was down-sampled; this process is also called the Kronecker product, and can be represented as
up(x) = x ⊗ 1_{n×n},
where n is the factor of the down-sampling computation. Then the error rate of change with respect to the bias on this convolutional layer is
∂E/∂b_j = Σ_{u,v} (δ_j^l)_{uv},
where u, v index the block positions of the down-sampling during forward propagation; the error rate of change with respect to the convolution kernel is
∂E/∂k_ij^l = Σ_{u,v} (δ_j^l)_{uv} (p_i^(l−1))_{uv},
where p_i^(l−1) is the block of x_i^(l−1) multiplied element-by-element with the convolution kernel k_ij^l. The error rates of change with respect to the parameters thus obtained are substituted into the formulas of the back-propagation process to compute the parameter gradients, and the parameters are then updated.
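The up(·) operation above is exactly a Kronecker product with an all-ones block, which NumPy provides directly; a minimal sketch:

```python
import numpy as np

def up(delta, n):
    """The up() operation from the text: expand each element of the
    down-sampled error map back over its n x n source block via a
    Kronecker product with an n x n matrix of ones."""
    return np.kron(delta, np.ones((n, n)))
```

For n = 2, each scalar in `delta` becomes a 2×2 constant block, restoring the spatial size the down-sampling layer removed.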
The output of a down-sampling layer is
x_j^l = f( β_j^l down(x_j^(l−1)) + b_j^l ),
where down(·) denotes the down-sampling computation, which, under the control of the down-sampling factor n, simultaneously compresses both dimensions of the input to 1/n of their original size. When layer l+1 is a convolutional layer, the matrix k_j^(l+1) can be fully inverted in row and column order and a full convolution carried out with δ_j^(l+1); the result of the full convolution, multiplied element-by-element with f′(u_j^l), yields δ_j^l. A so-called full convolution pads the boundary positions with zeros before convolving, so that a δ_j^l of the same size as the down-sampling layer's output is obtained. From δ_j^l, the error rates of change with respect to the parameters on the down-sampling layer are obtained as
∂E/∂b_j = Σ_{u,v} (δ_j^l)_{uv}, ∂E/∂β_j = Σ_{u,v} (δ_j^l ∘ down(x_j^(l−1)))_{uv},
and the parameters can then be updated.
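The down(·) operation can be sketched as block averaging; the text speaks of a weighted average of each n × n block, and uniform weights are assumed here for illustration:

```python
import numpy as np

def down(x, n):
    """The down() operation: compress both dimensions by the factor n by
    averaging non-overlapping n x n blocks (uniform weights assumed)."""
    h, w = x.shape[0] // n, x.shape[1] // n
    return x[:h * n, :w * n].reshape(h, n, w, n).mean(axis=(1, 3))
```

Note that `up(down(x, n), n)` returns to the original size but with each block replaced by its mean, which is why the backward pass must redistribute the error over the block positions.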
Step S2065, in one training pass, after the last round of training is completed, computing the mean of all square errors and using the mean error to judge whether the convolutional neural network has converged. When the mean error tends to the set stability threshold, the convolutional neural network is determined to have reached convergence; if the convolutional neural network has converged, training stops; otherwise, return to step S2062 and start a new training pass, stopping training when the number or duration of training passes reaches the set threshold.
The choice of convergence condition is not unique: the stability threshold of the mean error can be determined according to the needs of the concrete application, and the training time of the neural network can also be controlled by setting the number of training passes.
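The stopping logic of step S2065 can be sketched as a small driver loop; `train_round` is a hypothetical callable standing in for one full training pass, and the threshold and round cap are illustrative:

```python
def train_until_converged(train_round, threshold=1e-3, max_rounds=100):
    """Run training passes until the mean squared error stabilizes
    (successive passes differ by less than `threshold`) or a maximum
    pass count is reached, mirroring the convergence check in the text.
    Returns (passes_run, last_mean_error)."""
    prev = float("inf")
    for r in range(1, max_rounds + 1):
        err = train_round()
        if abs(prev - err) < threshold:   # mean error has stabilized
            return r, err
        prev = err
    return max_rounds, prev
```

As the text notes, either criterion (error stability or a capped pass count) can terminate training on its own.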
Step S2066, after training stops, taking the current convolutional neural network as the trained convolutional neural network.
In other embodiments, other time-frequency features such as the amplitude spectrum or the power spectrum can also be employed; the concrete processing methods belong to common knowledge and are not repeated here.
Corresponding to the borborygmus detection method under a noisy environment described above, the embodiment of the present invention also provides a borborygmus detection device under a noisy environment. As shown in Fig. 5, the borborygmus detection device includes the following modules:
a convolutional neural network training module 501, for training the convolutional neural network; the concrete training process is identical to the training process of the convolutional neural network in the borborygmus detection method and is not repeated here;
a signal collection module 502, for collecting the borborygmus mixed signal of the current user by a sensor;
a signal conversion module 503, for converting the borborygmus mixed signal into a digital signal and extracting the time-frequency spectrum feature of the digital signal;
a borborygmus detection module 504, for inputting the time-frequency spectrum feature of the digital signal into the convolutional neural network trained by the convolutional neural network training module for processing, detecting the time points at which borborygmus occurs, and thereby distinguishing the borborygmus signal from the environmental interference signal.
The convolutional neural network training module 501 includes:
a signal windowing unit, for framing and windowing the time-sequential digital sample signals;
a Fourier transform unit, for performing a fast Fourier transform on the windowed digital sample signals to extract the power spectrum;
a Gammatone filter bank, for realizing a linear transformation to filter the power spectrum; the concrete realization method is stated in the borborygmus detection method above and is not repeated here;
a discrete cosine transform unit, for performing a discrete cosine transform on the coefficient matrix of the power spectrum filtered by the Gammatone filter bank to obtain the Gammatone cepstrum coefficients.
A further embodiment of the present invention also provides a borborygmus detection system under a noisy environment; referring to Fig. 6, the system includes the borborygmus detection device 62 of the above embodiment and a sensor 64. The sensor 64 is used, during the neural network training process, to collect the borborygmus sample signal and the at least one set of interference sample signals; during borborygmus detection, to collect the borborygmus mixed signal of the current user, the borborygmus mixed signal including a borborygmus signal and an environmental interference signal; and to send the collected signals to the borborygmus detection device. The concrete structure of the borborygmus detection device 62 may be the structure shown in Fig. 5.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the system and device described above may refer to the corresponding processes in the preceding method embodiment and are not repeated here.
The borborygmus detection method, device and system under a noisy environment provided by the embodiments of the present invention are suitable for detecting borborygmus in a noisy environment; by utilizing the differences between the borborygmus signal and the environmental interference signals in their time-frequency features, the borborygmus signal can be identified quickly and accurately among multiple interference signals.
The above is only the specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto; any change or replacement that those familiar with the art can readily conceive within the technical scope disclosed by the invention should be covered within the protection scope of the present invention. Therefore, the protection scope of the present invention should be determined by the scope of the claims.
Claims (10)
1. A borborygmus detection method under a noisy environment, characterized by comprising:
collecting a borborygmus mixed signal of a current user by a sensor, wherein the borborygmus mixed signal includes a borborygmus signal and an environmental interference signal;
converting the borborygmus mixed signal into a digital signal;
extracting the time-frequency spectrum feature of the digital signal;
inputting the time-frequency spectrum feature of the digital signal into a trained convolutional neural network for processing, detecting the time points at which borborygmus occurs, and thereby distinguishing the borborygmus signal from the environmental interference signal; wherein the training process of the convolutional neural network includes:
collecting a borborygmus sample signal and at least one set of interference sample signals respectively by the sensor;
converting the borborygmus sample signal and the at least one set of interference sample signals into digital sample signals;
extracting the time-frequency spectrum features of the digital sample signals;
making signal labels for the digital sample signals in the time-frequency domain, the signal labels including a borborygmus label marking the time points at which the borborygmus signal occurs in the borborygmus sample signal, and an interference label marking the time points at which the interference signal occurs in the interference sample signal;
extracting, according to the borborygmus labels and the interference labels, borborygmus-labeled signals and each set of interference-labeled signals from the digital sample signals as training samples;
taking the time-frequency spectrum features of the digital sample signals corresponding to the training samples as training data and the borborygmus labels and the interference labels as supervision information, training a convolutional neural network for distinguishing the borborygmus signal from the various interference signals.
2. The borborygmus detection method according to claim 1, characterized in that extracting the time-frequency spectrum features of the digital sample signals includes:
framing and windowing the time-sequential digital sample signals;
performing a fast Fourier transform on the windowed digital sample signals to extract the power spectrum;
filtering the power spectrum with a Gammatone filter bank, the Gammatone filter bank realizing a linear transformation whose impulse response is expressed as:
g_i(t) = A t^(n−1) exp(−2π b_i t) cos(2π f_i t + φ_i), t ≥ 0, 1 ≤ i ≤ N,
where A is a constant regulating the ratio, n is the filter order, b_i is the decay rate, f_i is the center frequency, φ_i is the phase, and N is the number of filters; for the i-th filter, b_i = 1.019 ERB(f_i), where the equivalent rectangular bandwidth ERB(f_i) is expressed as ERB(f_i) = 24.7 (4.37 f_i / 1000 + 1);
performing a discrete cosine transform on the coefficient matrix of the power spectrum filtered by the Gammatone filter bank to obtain the Gammatone cepstrum coefficients;
taking the Gammatone cepstrum coefficients as the time-frequency spectrum features of the digital sample signals.
3. The borborygmus detection method according to claim 2, characterized in that making signal labels for the digital sample signals in the time-frequency domain includes:
judging, in the time-frequency domain, the digital sample signal corresponding to each time point, the digital sample signal corresponding to each time point being a signal frame after the framing and windowing;
when the signal frame of the current time point contains a borborygmus signal, setting a borborygmus label for the signal frame; when the signal frame of the current time point contains an interference signal, setting an interference label for the signal frame; the borborygmus label and the interference label being represented by multi-dimensional vectors;
and in that extracting, according to the borborygmus labels and the interference labels, borborygmus-labeled signals and each set of interference-labeled signals from the digital sample signals as training samples includes: extracting, according to the frame ordering of the signal frames for which the borborygmus labels and the interference labels are set, borborygmus-labeled signals and each set of interference-labeled signals from the digital sample signals as training samples.
4. The borborygmus detection method according to claim 1 or claim 3, characterized in that the convolutional neural network includes an input layer, multiple hidden layers, a fully connected layer and an output layer; the hidden layers and the fully connected layer each contain their own parameters, the parameters including weights and biases;
the training process of the convolutional neural network employs gradient descent and specifically includes:
randomly initializing the convolutional neural network;
starting training: randomly shuffling the order of the training samples and the signal labels, then randomly drawing, without repetition, J training samples to form a sample subset as input samples, and extracting the signal labels corresponding to the input samples to form a label subset; completing the training on all input samples of one sample subset constituting one round of training, and completing the training on all sample subsets constituting one training pass;
during one round of training, forward-propagating all input samples in the sample subset; after passing through the convolutional neural network, comparing the result at the output layer of the convolutional neural network with the corresponding signal label, and computing the square of the difference between the output result and the corresponding signal label as the square error, thereby obtaining the square errors between the output results and the signal labels of all input samples;
during one round of training, using the square errors for back propagation and parameter updating, including: starting from the output layer, passing backwards through each layer in turn to obtain the equivalent error on each layer; using the equivalent error on each layer to compute the gradient of the parameters on that layer, and using the gradient on each layer to update the parameters of that layer;
in one training pass, after the last round of training is completed, computing the mean error of all the square errors and using the mean error to judge whether the convolutional neural network has converged; when the mean error tends to the set stability threshold, determining that the convolutional neural network has reached convergence; if the convolutional neural network has converged, stopping training; otherwise starting a new training pass, and stopping training when the number or duration of training passes reaches the set threshold;
after training stops, taking the current convolutional neural network as the trained convolutional neural network.
5. The borborygmus detection method according to claim 4, characterized in that the specific process of forward-propagating the input samples includes:
the input layer of the convolutional neural network operating on the input sample, and each layer of the convolutional neural network operating on the output of the previous layer;
the output of layer l of the convolutional neural network being
x^l = f(u^l), formula (1)
where f(·) is the activation function, u^l = W^l x^(l−1) + b^l, x^(l−1) is the output of layer l−1 and the input of layer l, and W^l and b^l are respectively the weights and bias of layer l; the activation function employing the sigmoid function or the hyperbolic tangent function;
and in that computing the square error between the output result and the corresponding signal label includes: for each input sample, computing the square error between the output result obtained at the output layer of the convolutional neural network and the corresponding signal label, the square-error function of the j-th input sample being
E_j = (1/2) Σ_{k=1..K} (t_k^j − y_k^j)²,
where K is the dimension of the output result and the signal label, y_k^j is the k-th dimension of the output result of the j-th sample after the convolutional neural network, and t_k^j is the k-th dimension of the signal label corresponding to the j-th sample.
6. The borborygmus detection method according to claim 5, characterized in that the back propagation and parameter updating specifically include:
transferring the square error between the output result and the signal label, starting from the output layer, backwards layer by layer to each layer in the convolutional neural network, obtaining the equivalent error on each layer; the equivalent error being the rate of change of the square error with respect to the parameters of the layer, computed as
δ = ∂E/∂b,
where E is the square error of the output result and b is a parameter of the convolutional neural network;
the equivalent error on the output layer being
δ^L = (y − t) ∘ f′(u^L),
where L denotes the output layer and the operator ∘ denotes element-by-element multiplication;
the equivalent error on the other layers being
δ^l = (W^(l+1))^T δ^(l+1) ∘ f′(u^l);
using the equivalent error δ^l on each layer to compute the gradient of the parameters on that layer, the gradients of the weights and biases being respectively
ΔW^l = −η δ^l (x^(l−1))^T, Δb^l = −η δ^l,
where η is the learning rate, different learning rates being set for different parameters;
using the gradient of the parameters on each layer to update the parameters of that layer, the new parameters being obtained by adding the gradient of the layer's parameters to the original parameters of each layer.
7. The borborygmus detection method according to claim 6, characterized in that, in one training pass, after the last round of training is completed, the mean error of all the square errors is computed, the mean-error function being
E_J = (1/(2J)) Σ_{j=1..J} Σ_{k=1..K} (t_k^j − y_k^j)²,
where J is the number of samples in one training pass;
when the mean error E_J tends to the set stability threshold, the convolutional neural network is determined to have reached convergence;
if the convolutional neural network has converged, training stops; otherwise a new training pass is started, updating the parameters of the convolutional neural network and gradually minimizing E_J, so that the output result of the convolutional neural network becomes close to the corresponding signal label.
8. A borborygmus detection device under a noisy environment, characterized by comprising:
a convolutional neural network training module, for training the convolutional neural network, the concrete training process including: collecting a borborygmus sample signal and at least one set of interference sample signals respectively by a sensor; converting the borborygmus sample signal and the at least one set of interference sample signals into digital sample signals; extracting the time-frequency spectrum features of the digital sample signals; making signal labels for the digital sample signals in the time-frequency domain, the signal labels including a borborygmus label marking the time points at which the borborygmus signal occurs in the borborygmus sample signal, and an interference label marking the time points at which the interference signal occurs in the interference sample signal; extracting, according to the borborygmus labels and the interference labels, borborygmus-labeled signals and each set of interference-labeled signals from the digital sample signals as training samples; taking the time-frequency spectrum features of the digital sample signals corresponding to the training samples as training data and the borborygmus labels and the interference labels as supervision information, training a convolutional neural network for distinguishing the borborygmus signal from the various interference signals;
a signal collection module, for collecting the borborygmus mixed signal of a current user by the sensor, the borborygmus mixed signal including a borborygmus signal and an environmental interference signal;
a signal conversion module, for converting the borborygmus mixed signal into a digital signal and extracting the time-frequency spectrum feature of the digital signal;
a borborygmus detection module, for inputting the time-frequency spectrum feature of the digital signal into the convolutional neural network trained by the convolutional neural network training module for processing, detecting the time points at which borborygmus occurs, and thereby distinguishing the borborygmus signal from the environmental interference signal.
9. The borborygmus detection device according to claim 8, wherein the convolutional neural network training module comprises:
a signal windowing unit, configured to frame and window the sequential digital sample signal;
a Fourier transform unit, configured to perform a fast Fourier transform on the windowed digital sample signal and to extract a power spectrum;
a Gammatone filter bank, configured to apply a linear transformation that filters the power spectrum, the impulse response of the Gammatone filter bank being expressed as:
g_i(t) = A·t^(n−1)·exp(−2πb_i·t)·cos(2πf_i·t + φ_i),  t ≥ 0, 1 ≤ i ≤ N,
where A is a constant regulating the amplitude, n is the filter order, b_i is the decay rate, f_i is the center frequency, φ_i is the phase, and N is the number of filters; for the i-th filter, b_i = 1.019·ERB(f_i), where the equivalent rectangular bandwidth is given by ERB(f_i) = 24.7·(4.37·f_i/1000 + 1), with f_i in Hz;
a discrete cosine transform unit, configured to apply a discrete cosine transform to the coefficient matrix of the power spectrum filtered by the Gammatone filter bank, obtaining Gammatone cepstral coefficients that serve as the time-frequency features of the digital sample signal.
10. A borborygmus detection system for a noisy environment, comprising the borborygmus detection device according to claim 8 or 9 and a sensor;
wherein the sensor is configured to acquire the borborygmus sample signal and at least one set of interference sample signals during neural network training; to acquire the borborygmus mixed signal of the current user during borborygmus detection, the borborygmus mixed signal including a borborygmus signal and an environmental interference signal; and to send the acquired signals to the borborygmus detection device.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610686377.6A CN106328150B (en) | 2016-08-18 | 2016-08-18 | Borborygmus sound detection method, apparatus and system under noisy environment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106328150A true CN106328150A (en) | 2017-01-11 |
CN106328150B CN106328150B (en) | 2019-08-02 |
Family
ID=57744871
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610686377.6A Active CN106328150B (en) | 2016-08-18 | 2016-08-18 | Borborygmus sound detection method, apparatus and system under noisy environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106328150B (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6056703A (en) * | 1996-04-03 | 2000-05-02 | Rush Presbyterian-St Luke's Medical Center | Method and apparatus for characterizing gastrointestinal sounds |
CN1994231A (en) * | 2006-01-06 | 2007-07-11 | 财团法人工业技术研究院 | Souffle eliminated auscultation and method |
CN105393252A (en) * | 2013-04-18 | 2016-03-09 | 数字标记公司 | Physiologic data acquisition and analysis |
CN104305961A (en) * | 2014-10-20 | 2015-01-28 | 清华大学 | Bowel sounds monitoring and recognizing system |
CN104811276A (en) * | 2015-05-04 | 2015-07-29 | 东南大学 | DL-CNN (deep leaning-convolutional neutral network) demodulator for super-Nyquist rate communication |
CN104966517A (en) * | 2015-06-02 | 2015-10-07 | 华为技术有限公司 | Voice frequency signal enhancement method and device |
Non-Patent Citations (1)
Title |
---|
张和华等: "肠鸣音信号的自适应滤波及其特征提取方法研究", 《中国医学物理学杂志》 * |
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106691498A (en) * | 2017-02-06 | 2017-05-24 | 宁波江丰生物信息技术有限公司 | Borborygmus processing system |
CN106683680A (en) * | 2017-03-10 | 2017-05-17 | 百度在线网络技术(北京)有限公司 | Speaker recognition method and device and computer equipment and computer readable media |
US10957339B2 (en) | 2017-03-10 | 2021-03-23 | Baidu Online Network Technology (Beijing) Co., Ltd. | Speaker recognition method and apparatus, computer device and computer-readable medium |
CN107545890A (en) * | 2017-08-31 | 2018-01-05 | 桂林电子科技大学 | A kind of sound event recognition method |
CN108175436A (en) * | 2017-12-28 | 2018-06-19 | 北京航空航天大学 | A kind of gurgling sound intelligence automatic identifying method |
CN107961032A (en) * | 2018-01-04 | 2018-04-27 | 哈尔滨工业大学 | Gurgling sound Time-domain Method of Extraction based on stethoscope array precorrection |
CN109192226A (en) * | 2018-06-26 | 2019-01-11 | 深圳大学 | A kind of signal processing method and device |
CN109620154A (en) * | 2018-12-21 | 2019-04-16 | 平安科技(深圳)有限公司 | Borborygmus voice recognition method and relevant apparatus based on deep learning |
CN110037733B (en) * | 2019-04-01 | 2024-04-02 | 四川大学华西医院 | Portable program-controlled wireless body sound monitoring system |
CN110037733A (en) * | 2019-04-01 | 2019-07-23 | 四川大学华西医院 | A kind of Portable program control monitors system without wire body sound |
CN113905662A (en) * | 2019-04-05 | 2022-01-07 | 高丽大学校产学协力团 | Digestive tract scanning apparatus, body scanning method, and acoustic digestive organ monitoring system |
CN110141266A (en) * | 2019-05-07 | 2019-08-20 | 南京大学 | A kind of borborygmus sound detection method of wearable body sound capture technique |
CN110141266B (en) * | 2019-05-07 | 2021-04-27 | 南京大学 | Bowel sound detection method based on wearable body sound capture technology |
CN110192892B (en) * | 2019-05-08 | 2021-12-14 | 四川新源生物电子科技有限公司 | Wireless bowel sound analyzer |
CN110192892A (en) * | 2019-05-08 | 2019-09-03 | 四川新源生物电子科技有限公司 | A kind of wireless gurgling sound analyzer |
CN110179492A (en) * | 2019-05-08 | 2019-08-30 | 四川新源生物电子科技有限公司 | Gurgling sound intelligent recognition algorithm based on auto-adaptive doublethreshold |
CN110179492B (en) * | 2019-05-08 | 2021-12-14 | 四川新源生物电子科技有限公司 | Intelligent bowel sound identification algorithm based on self-adaptive double thresholds |
CN110398647A (en) * | 2019-06-26 | 2019-11-01 | 深圳供电局有限公司 | Transformer's Condition Monitoring method |
CN110432924A (en) * | 2019-08-06 | 2019-11-12 | 杭州智团信息技术有限公司 | Borborygmus sound detection device, method and electronic equipment |
CN110432924B (en) * | 2019-08-06 | 2021-10-22 | 杭州智团信息技术有限公司 | Bowel sound detection device and method and electronic equipment |
CN110488278B (en) * | 2019-08-20 | 2021-07-27 | 深圳锐越微技术有限公司 | Doppler radar signal type identification method |
CN110488278A (en) * | 2019-08-20 | 2019-11-22 | 深圳锐越微技术有限公司 | Doppler radar signal kind identification method |
CN113066483A (en) * | 2019-12-31 | 2021-07-02 | 南昌航空大学 | Sparse continuous constraint-based method for generating confrontation network voice enhancement |
CN113066483B (en) * | 2019-12-31 | 2024-01-30 | 广州航海学院 | Sparse continuous constraint-based method for generating countermeasure network voice enhancement |
CN116052725A (en) * | 2023-03-31 | 2023-05-02 | 四川大学华西医院 | Fine granularity borborygmus recognition method and device based on deep neural network |
Also Published As
Publication number | Publication date |
---|---|
CN106328150B (en) | 2019-08-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106328150A (en) | Bowel sound detection method, device and system under noisy environment | |
Lopac et al. | Detection of non-stationary GW signals in high noise from Cohen’s class of time–frequency representations using deep learning | |
US4980917A (en) | Method and apparatus for determining articulatory parameters from speech data | |
CN110786850B (en) | Electrocardiosignal identity recognition method and system based on multi-feature sparse representation | |
EP3309782B1 (en) | Method, device and system for noise suppression | |
CN108038471A (en) | A kind of underwater sound communication signal type Identification method based on depth learning technology | |
CN105488466B (en) | A kind of deep-neural-network and Acoustic Object vocal print feature extracting method | |
CN104819846B (en) | Rolling bearing sound signal fault diagnosis method based on short-time Fourier transform and sparse laminated automatic encoder | |
CN110245608A (en) | A kind of Underwater targets recognition based on semi-tensor product neural network | |
CN106970343B (en) | Magnetic resonance imaging method and device | |
CN106847309A (en) | A kind of speech-emotion recognition method | |
CN107061996A (en) | A kind of water supply line leakage detecting and locating method | |
Jorgensen et al. | Web browser control using EMG based sub vocal speech recognition | |
CN106941005A (en) | A kind of vocal cords method for detecting abnormality based on speech acoustics feature | |
CN106405339A (en) | Power transmission line fault reason identification method based on high and low frequency wavelet feature association | |
CN108711436A (en) | Speaker verification's system Replay Attack detection method based on high frequency and bottleneck characteristic | |
CN109740523A (en) | A kind of method for diagnosing fault of power transformer based on acoustic feature and neural network | |
CN109524020A (en) | A kind of speech enhan-cement processing method | |
CN107305774A (en) | Speech detection method and device | |
CN102419972B (en) | Method of detecting and identifying sound signals | |
CN105424366A (en) | Bearing fault diagnosis method based on EEMD adaptive denoising | |
CN109065046A (en) | Method, apparatus, electronic equipment and the computer readable storage medium that voice wakes up | |
CN109410149A (en) | A kind of CNN denoising method extracted based on Concurrent Feature | |
CN110675891A (en) | Voice separation method and module based on multilayer attention mechanism | |
CN113205820B (en) | Method for generating voice coder for voice event detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | ||
Effective date of registration: 20191113
Address after: Room 4b13, Floor 4, Building 1, Yard 13, Dazhongsi, Haidian District, Beijing 100098
Co-patentee after: Shandong Yi Mai Medical Technology Co., Ltd.
Patentee after: BEIJING YIMAI MEDICAL TECHNOLOGY CO., LTD.
Address before: Room 2301, Building 2, No. 108 Zhichun Road, Haidian District, Beijing 100000
Patentee before: BEIJING YIMAI MEDICAL TECHNOLOGY CO., LTD.