CN104157293B - Signal processing method for enhancing target speech signal pickup in an acoustic environment - Google Patents

Signal processing method for enhancing target speech signal pickup in an acoustic environment

Info

Publication number
CN104157293B
Authority
CN
China
Prior art keywords
signal
model
sound
acoustic environment
voice signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410427254.1A
Other languages
Chinese (zh)
Other versions
CN104157293A (en)
Inventor
陈国钦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Normal University
Original Assignee
Fujian Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Normal University
Priority to CN201410427254.1A
Publication of CN104157293A
Application granted
Publication of CN104157293B
Current legal status: Expired - Fee Related
Anticipated expiration

Abstract

The present invention relates to a signal processing method for enhancing target speech signal pickup in an acoustic environment. (1) The parameters of an ESN network are obtained by experiment and a corresponding sound source model is established. (2) The model serves two scenarios. When the desired model output is a given target speech signal and the input is a mixture of that target speech signal and the reflected sound it produces in the acoustic environment, the model can be used for echo cancellation in live public address. When the desired model output is a given target speech signal and the input is a mixture of the target speech signal and the acoustic-environment reflections of another specific sound source, the model can be used for echo cancellation in voice communication between two specific people. (3) When the model is used by the target speaker in the actual acoustic environment and the pickup position changes, it can still suppress the reflections of the sound source used in training and output the correspondingly enhanced target speech signal. The invention thereby overcomes the degradation of speech signal quality caused by movement of the pickup position.

Description

Signal processing method for enhancing target speech signal pickup in an acoustic environment
Technical field
The invention belongs to the field of indoor speech signal pickup processing. It relates to a digital signal processing method in which the parameters of an echo state network (Echo State Networks) are selected by experiment and the network is trained as a model, and in particular to a signal processing method for enhancing target speech signal pickup in an acoustic environment.
Background art
In live public address, the objects involved in eliminating the influence of echo are a specific target voice and the environmental reflections of that same voice; the main aim is to raise the acoustic gain. The main related techniques are: (1) Conventional techniques. Narrow-band equalization filters out the spectral peaks to eliminate feedback self-oscillation (howling); the frequency-shift method shifts the signal spectrum before re-amplification so as to destroy the conditions for feedback self-oscillation; and so on. Their common problems are that the processing is complicated and that they impair the fidelity of the speech signal. (2) Echo cancellation techniques based on modern digital signal processing, which use adaptive filtering.
In voice communication, the objects involved in eliminating the influence of echo are a specific target voice and the environmental reflections of another specific voice; the main aim is speech enhancement. Echo cancellation products fall mainly into two groups: echo cancellers based on DSP platforms, and echo cancellation software for voice communication on the Windows platform. Both are built on adaptive echo cancellation, which must model the echo path accurately and adapt quickly to its changes. This involves the choice of adaptive filter structure and adaptive algorithm, and reducing the influence of noise on the convergence speed of the algorithm. Adaptive echo cancellation has problems mainly in the following two respects:
First, the design must deal with the following problems in use. (1) Handling double-talk. When only the far-end signal is present and there is no near-end signal, the filter coefficients that model the echo can be obtained. When a near-end signal is added, it acts as a large random component introduced into the adaptation process; the filter coefficients then fluctuate markedly around their mean values and performance drops. The presence of a near-end signal must therefore be detected, and during double-talk the adaptation must be stopped and the previous filter coefficients held. (2) Among adaptive filtering algorithms, the commonly used LMS algorithm needs little storage and is easy to implement and test, but converges poorly, while the RLS algorithm converges well but is computationally expensive. Many improved variants have therefore appeared and been applied to adaptive-filter cancellation of practical echo problems. (3) When an echo cancellation algorithm is applied on the Windows platform, the synchronization of audio capture and playback must be solved. Compared with a traditional DSP platform, a present-day PC has abundant CPU and memory resources, so even a complicated echo cancellation algorithm can run freely. However, an application can hardly control the capture and playback of the sound card directly at the low level, and what it obtains is a non-real-time audio stream, which creates the synchronization problem. After the far-end speech is received, the speech data is passed to the echo cancellation algorithm as a reference, which is one input signal the algorithm needs. The data is then passed to the sound card, played out, travels through the echo path, is captured locally and passed to the echo cancellation algorithm again, which is the other input signal the algorithm needs. If the two signals passed to the echo cancellation algorithm are poorly synchronized, that is, if there is a frame offset between them, the echo is difficult to eliminate.
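The adaptive-filter approach described in this background can be sketched as follows. This is a minimal illustration only, assuming a normalized LMS (NLMS) update, a common LMS variant; the function name `nlms_echo_cancel`, the step size `mu`, the tap count and the 3-tap toy echo path are all hypothetical and not taken from the patent. Double-talk detection and capture/playback synchronization, the two difficulties discussed above, are deliberately left out.

```python
import numpy as np

def nlms_echo_cancel(far_end, mic, taps=64, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the echo of `far_end` from `mic`."""
    far_end = np.asarray(far_end, dtype=float)
    mic = np.asarray(mic, dtype=float)
    w = np.zeros(taps)       # adaptive FIR estimate of the echo path
    buf = np.zeros(taps)     # most recent far-end samples, newest first
    out = np.zeros(len(mic))
    for n in range(len(mic)):
        buf = np.roll(buf, 1)
        buf[0] = far_end[n]
        e = mic[n] - w @ buf                     # residual after cancellation
        w += mu * e * buf / (buf @ buf + eps)    # normalized LMS update
        out[n] = e
    return out, w

# Toy case: white-noise far-end signal, hypothetical 3-tap echo path, no near-end talker.
rng = np.random.default_rng(0)
far = rng.standard_normal(4000)
path = np.array([0.6, -0.3, 0.1])
mic = np.convolve(far, path)[:len(far)]
residual, w = nlms_echo_cancel(far, mic, taps=8)
```

In this noiseless toy case the first three taps of `w` settle on the echo path. A near-end talker added to `mic` would perturb the update, which is the double-talk problem described above.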
Second, adaptive cancellation of the acoustic echo formed by acoustic coupling between loudspeaker and microphone has the following technical problems. (1) Because the delay is long (up to 1 s), a high-order filter with thousands of coefficients is needed to fit it, which requires more computing resources. (2) Keeping such a long high-order filter stable and improving its adaptation speed are both difficult. First, the acoustic echo path is unstable because the acoustic characteristics change; second, the acoustic echo arrives over many propagation paths; third, the propagation and scattering characteristics of the room are nonlinear and cannot be modeled well by an ordinary linear filter. (3) Acoustic echo cancellation for stereo systems remains an important and challenging research topic. As echo cancellation technology has developed, the focus of research has shifted from the cancellation of line echo to the cancellation of acoustic echo.
Since the generation of a speech signal can be described by a pole-based parametric model, the acoustic channel from the loudspeaker to the microphone (which generates the reflected sound signal) can be approximated by the same kinds of model, one of which can simulate the acoustic channel fairly accurately with fewer poles. An indoor acoustic channel is equivalent to the superposition of a large number of standing waves and has many spectral peaks, so modeling it requires a large number of poles, whereas the speech signal produced by the human vocal system can usually be modeled with only a small number of poles. Therefore, if a model can be established whose output is the target speech and whose input is the target speech plus the environmental reflected sound signal, then what is suppressed is the reflected sound signal and what is correspondingly enhanced is the target speech signal.
A dynamic neural network, also called a recurrent neural network, is composed of dynamic neurons and is a kind of neural network developed in the study of dynamic system identification. Training a dynamic neural network is the process of continually adjusting the network parameters (such as the weights) so that the network output approaches the desired output, which makes it a powerful tool for building such a model. As a new type of recurrent neural network, the echo state network (ESN network) improves considerably on traditional recurrent neural networks in nonlinear system identification. First, in terms of stability, the stability of the recurrent network can be guaranteed by presetting the spectral radius of the reservoir weight matrix. Second, in terms of training, the output weights are determined uniquely and are globally optimal, so the local-minimum problem common to traditional neural networks does not arise, nor does the slow, error-driven convergence of conventional dynamic neural networks. In addition, the ESN network avoids the computation of time-series partial derivatives required by conventional recurrent neural networks, so its training becomes extremely simple.
Precisely because of the good performance of the ESN network in nonlinear system identification, the present invention uses it to meet the above need by establishing a model that suppresses the reflected sound signals of an indoor acoustic environment and outputs the enhanced target speech signal. In this way the problems that adaptive filters face in echo cancellation, described above, are resolved.
Summary of the invention
The object of the invention is to provide a signal processing method for enhancing target speech signal pickup in an acoustic environment that overcomes the above deficiencies of adaptive-filter cancellation of acoustic-environment reflections.
To achieve this object, the technical solution of the invention is a signal processing method for enhancing target speech signal pickup in an acoustic environment, comprising the following steps:
Step 1: Determine the type of model to be established. There are two types, a first sound source model and a second sound source model. The first sound source model suppresses the reflected sound signal that the target voice itself produces in the acoustic environment and correspondingly enhances the target speech signal. The second sound source model suppresses the reflected sound signal that another specific person's voice produces in the acoustic environment and correspondingly enhances the target speech signal.
Step 2: Prepare the training data, which is obtained in one of two ways. To establish the first sound source model, data samples of the target speech signal are needed. To establish the second sound source model, data samples of both the specific person's speech signal and the target speech signal are needed.
Step 3: Obtain the environmental reflected sound signal for training the model. First, an excitation signal is fed from the electroacoustic system into the indoor acoustic environment, and the impulse response of the indoor acoustic environment is obtained and converted into a digital signal. Second, an order is set and the all-pole filter coefficients are obtained with an autocorrelation-based linear prediction algorithm; this all-pole filter simulates the transmission characteristic of the acoustic channel in the acoustic environment. Third, the sound source signal whose reflections are to be suppressed (the target speech signal or the specific person's speech signal) is passed through the all-pole filter to obtain the corresponding environmental reflected sound signal.
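Step 3 can be illustrated with a short sketch, assuming the autocorrelation method is implemented by solving the Toeplitz normal equations directly. The names `lpc_autocorr` and `all_pole_filter`, the order 2 and the toy two-pole "channel" are hypothetical; the patent's own example uses measured room impulse responses and far higher orders.

```python
import numpy as np

def lpc_autocorr(h, order):
    """All-pole coefficients a (a[0] = 1) from the autocorrelation normal equations."""
    h = np.asarray(h, dtype=float)
    r = np.array([h[:len(h) - k] @ h[k:] for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    alpha = np.linalg.solve(R, r[1:order + 1])      # predictor coefficients
    gain = np.sqrt(r[0] - alpha @ r[1:order + 1])   # prediction-error gain
    return np.concatenate(([1.0], -alpha)), gain

def all_pole_filter(x, a, gain=1.0):
    """y[n] = gain*x[n] - sum_k a[k]*y[n-k]: pass a source through the modeled channel."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = gain * x[n]
        for k in range(1, min(n, len(a) - 1) + 1):
            acc -= a[k] * y[n - k]
        y[n] = acc
    return y

# Toy "measured" impulse response: a known two-pole resonance (pole radius 0.9).
a_true = np.array([1.0, -1.2, 0.81])
impulse = np.zeros(400)
impulse[0] = 1.0
h = all_pole_filter(impulse, a_true)

a_est, gain = lpc_autocorr(h, order=2)

# A stand-in source signal passed through the fitted filter gives the simulated reflection.
source = np.sin(0.05 * np.arange(200))
reflected = all_pole_filter(source, a_est, gain)
```

For a real room response the order would be far larger. `np.linalg.solve` is used here for clarity; a Levinson-Durbin recursion is the usual cheaper way to solve the same equations.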
Step 4: Determine the ESN network parameters.
The ESN network is described by a state equation and an output equation.
In these equations, the internal neuron activation function is usually the hyperbolic tangent, and the output function is typically the identity. The quantities involved are the state variable of the reservoir at each time step, the system input vector at that time step, and the output of the ESN network at that time step. The internal connection matrix of the reservoir is a randomly generated, sparsely connected, high-dimensional square matrix; once the reservoir has been generated, its connection weights remain fixed. The ESN network also has an input weight matrix, an output weight vector, and a connection weight vector from the output back to the state variable; a further term represents the output bias or noise. The input and feedback weights are likewise generated randomly and kept constant; the only quantity that needs to be adjusted is the output weight vector.
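In the notation commonly used for echo state networks, the state and output equations can be written as below. The symbol names are an assumption made here for illustration, since the formulas themselves are not preserved in this text: x is the reservoir state, u the input, y the output, f and f_out the internal activation and output functions, W_in, W, W_back and W_out the input, reservoir, feedback and output weights, and v the bias or noise term.

```latex
\begin{aligned}
x(n+1) &= f\bigl(W_{\mathrm{in}}\,u(n+1) + W\,x(n) + W_{\mathrm{back}}\,y(n)\bigr),\\
y(n+1) &= f_{\mathrm{out}}\bigl(W_{\mathrm{out}}\,x(n+1) + v(n+1)\bigr).
\end{aligned}
```

With f taken as the hyperbolic tangent and f_out as the identity, the output is linear in the state, which is why only W_out needs to be fitted.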
So that the model can take in a signal frame of a given length captured by the microphone and output a target speech frame of the corresponding length, the three random connection weights above take the following values:
…, that is, …, with values in the interval (…, …);
…, that is, …, with values in the interval (0, …);
…, that is, …, with values in the interval (0, …);
Here, the smaller the value of …, the shorter the time needed to establish the state, which improves the real-time performance of the model computation; the larger the value of …, the higher the model accuracy, although the generalization ability may decline. The values are: ① …, which determines the scale of the input to the reservoir; ② …; ③ ….
Step 5: With the mixture of the target speech signal and the environmental reflected sound signal as the ESN network input and the target speech signal as the desired output, train the ESN network to obtain a model that suppresses the reflected sound of the specific sound source and correspondingly enhances the target speech signal. At each time step the reservoir state variable is given by the state equation of the network.
For a given nonlinear system input-output pair, the ESN network identifies the system as follows. First, the weights in the reservoir are initialized. Second, the input is applied to excite the system and the state response of the ESN network at every time step is computed. Because the relationship between the reservoir state variables and the desired output is linear, training the ESN network is fairly simple, and the solution does not suffer from the multiple local minima and slow convergence often found in traditional neural networks.
The output weights are determined by a basic linear regression algorithm.
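The training step can be sketched as follows, under stated assumptions: no output feedback, a scalar input, a small ridge term added for numerical safety, and toy sinusoids standing in for speech. The helper names `harvest_states` and `train_readout` and every numeric value are hypothetical and not taken from the patent.

```python
import numpy as np

def harvest_states(u, W_in, W, washout=0):
    """Run the reservoir over input sequence u; return the states after the washout."""
    x = np.zeros(W.shape[0])
    states = []
    for n, u_n in enumerate(u):
        x = np.tanh(W_in * u_n + W @ x)   # no output feedback in this sketch
        if n >= washout:
            states.append(x.copy())
    return np.array(states)

def train_readout(states, target, ridge=1e-6):
    """Least-squares output weights: minimize ||states @ w - target||^2 plus a small ridge."""
    A = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(A, states.T @ target)

rng = np.random.default_rng(1)
N = 50                                              # reservoir size (hypothetical)
W = rng.uniform(-1.0, 1.0, (N, N))
W *= 0.8 / np.max(np.abs(np.linalg.eigvals(W)))     # set the spectral radius to 0.8
W_in = rng.uniform(-0.3, 0.3, N)

n = np.arange(2000)
s = np.sin(0.05 * n) + 0.5 * np.sin(0.13 * n)       # stand-in for the target speech
r = 0.5 * np.concatenate((np.zeros(7), s[:-7]))     # stand-in for the reflected sound
mix = s + r                                         # network input: target plus reflection

washout = 100
X = harvest_states(mix, W_in, W, washout)
W_out = train_readout(X, s[washout:])
mse_in = np.mean((mix[washout:] - s[washout:]) ** 2)
mse_out = np.mean((X @ W_out - s[washout:]) ** 2)
```

Because only `W_out` is fitted, training reduces to a single linear solve, with no iterative weight adjustment.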
In an embodiment of the invention, the trained model can also be used to enhance the target speech signal when the acoustic channel in the actual acoustic environment changes. That is, the signal obtained from the microphone contains the target speech signal and the reflected sound signal of the specific environment; when it is fed into the model, the enhanced target speech signal is output by a short code segment that feeds the signal into the ESN network, computes the reservoir state variable and then computes the enhanced target signal output.
In an embodiment of the invention, in step 2, the data samples of the target speech signal are obtained with a data frame length greater than 625 ms.
In an embodiment of the invention, in step 3, the excitation signal is a white-noise pulse, a periodic pulse or pseudo-noise.
In an embodiment of the invention, in step 3, the impulse response of the acoustic environment is obtained with the loudspeaker and the microphone at any corresponding positions within the usable range of the room.
In an embodiment of the invention, in step 3, the order is determined as follows.
The number of indoor poles, that is, the order of the linear prediction, corresponds to the number of indoor acoustic standing waves, which is estimated from a formula.
The quantities in the formula are the estimation frequency, the corresponding wavelength, the estimation bandwidth, the speed of sound, the room volume and the total interior surface area.
The order is then set from this estimate.
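As a rough illustration of such an estimate, the sketch below uses the classical room-acoustics approximation for the number of modes of a rectangular room below a frequency f, N(f) ≈ (4πV/3)(f/c)³ + (πS/4)(f/c)² + (L/8)(f/c). This is an assumption made for illustration: the patent's own formula is not preserved in this text and may differ, and the edge-length term L is not among the quantities the text lists. The function names and the speed of sound of 343 m/s are likewise hypothetical.

```python
import math

def mode_count_below(f, volume, surface, edge_len, c=343.0):
    """Approximate number of room modes with frequency below f (rectangular room)."""
    k = f / c
    return (4 * math.pi * volume / 3) * k**3 + (math.pi * surface / 4) * k**2 + (edge_len / 8) * k

def modes_in_band(f_lo, f_hi, dims, c=343.0):
    """Approximate number of standing waves between f_lo and f_hi for room dimensions dims."""
    lx, ly, lz = dims
    volume = lx * ly * lz
    surface = 2 * (lx * ly + lx * lz + ly * lz)
    edge_len = 4 * (lx + ly + lz)
    hi = mode_count_below(f_hi, volume, surface, edge_len, c)
    lo = mode_count_below(f_lo, volume, surface, edge_len, c)
    return hi - lo

# Room size and band taken from the example later in the document.
n_modes = modes_in_band(100.0, 400.0, (6.3, 3.6, 2.8))
```

How the standing-wave count maps to the prediction order is left open here, since that relation is also missing from this text.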
In an embodiment of the invention, in step 3, the environmental reflected sound signal is defined as follows: for the first sound source model, it is formed by passing the target speech signal through the all-pole filter; for the second sound source model, it is formed by passing the specific person's speech signal through the all-pole filter.
In an embodiment of the invention, among the ESN network parameters, the parameters … are selected by experiment.
In an embodiment of the invention, the statement that the trained model can be used to enhance the target speech signal when the acoustic channel in the actual acoustic environment changes means that, after the model has been established, when the pickup position changes the model can still suppress the reflections of the trained sound source in the time-varying acoustic environment and output the correspondingly enhanced target speech signal.
Compared with the prior art, the invention has the following beneficial effects:
1. The model of the invention has two main features. (1) The trained model can be used in two situations. When the desired model output is a given target speech signal and the input is a mixture of that target speech signal and its own reflections in the acoustic environment, the model can be used for echo cancellation in live public address for the target speaker. When the desired model output is a given target speech signal and the input is a mixture of the target speech signal and the acoustic-environment reflections of another specific speaker's voice, the model can be used for echo cancellation in voice communication between two specific people. (2) When the trained model is used by the specific target speaker in the actual environment and the pickup position changes within a certain range, the model can suppress the trained reflections in the acoustic environment and correspondingly enhance the output target speech signal. The signal processing method can therefore be applied to raising the acoustic gain in various live public address systems, to echo cancellation during double-talk in voice communication, or to the enhancement of recorded speech signals.
2. Acoustic echo cancellation for stereo systems remains an important and challenging research topic. If adaptive-filter echo cancellation is applied to a stereo system, it is complex and computationally expensive and often fails to reach the desired result; the processing method of the invention avoids these problems.
Description of the drawings
Fig. 1 is a block diagram of model training and use according to the invention.
Fig. 2 is a diagram of the ESN network model.
Fig. 3 shows the impulse response and spectrum of acoustic channel 1 in the small room.
Fig. 4 shows the impulse response and spectrum of acoustic channel 2 in the small room.
Fig. 5 compares the LPC spectral envelopes of acoustic channel 1 with 600 poles and with 100 poles.
Fig. 6 shows the two speech signals used in the example of the invention.
Fig. 7 shows the example processing result of the first sound source model of the invention for one value of the key parameter.
Fig. 8 shows the example processing result of the first sound source model of the invention for another value of the key parameter.
Fig. 9 shows the example processing result of the first sound source model of the invention when 6000 data points are used for modeling.
Fig. 10 shows the example processing result of the first sound source model of the invention when 10000 data points are used for modeling.
Fig. 11 shows the training and example processing results of the first sound source model of the invention when the reflected sound signals are obtained from all-pole filters with different numbers of poles (100 poles of acoustic channel 1 for training, 600 poles of acoustic channel 1 in use).
Fig. 12 shows the example processing result of the model of Fig. 11 when Gaussian noise (25 dB signal-to-noise ratio) is added to the environmental reflected sound used for training.
Fig. 13 shows the example processing result of the second sound source model of the invention for one value of the key parameter.
Fig. 14 shows the example processing result of the second sound source model of the invention for another value of the key parameter.
Fig. 15 shows the example processing result of the second sound source model of the invention when 6000 data points are used for modeling.
Fig. 16 shows the example processing result of the second sound source model of the invention when 10000 data points are used for modeling.
Fig. 17 shows the training and example processing results of the second sound source model of the invention when the reflected sound signals are obtained from all-pole filters with different numbers of poles (100 poles of acoustic channel 1 for training, 600 poles of acoustic channel 1 in use).
Fig. 18 shows the example processing result of the model of Fig. 17 when Gaussian noise (25 dB signal-to-noise ratio) is added to the environmental reflected sound used for training.
Detailed description of the embodiments
The technical solution of the invention is described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, the signal processing method of the invention for enhancing target speech signal pickup in an acoustic environment comprises the following steps:
Step 1: Determine the type of model to be established. A model that suppresses the reflected sound signal produced in the acoustic environment by the target voice itself and correspondingly enhances the target speech signal is referred to below as the first sound source model. A model that suppresses the reflected sound signal produced in the acoustic environment by another specific person's voice and correspondingly enhances the target speech signal is referred to below as the second sound source model.
Step 2: Prepare the training data, which is obtained in one of two ways. To establish the first sound source model, it suffices to obtain a suitable number of data samples of the target speech signal. To establish the second sound source model, a suitable number of data samples of both the specific person's speech signal and the target speech signal should be obtained.
Step 3: Obtain the environmental reflected sound signal for training the model. An excitation signal is fed from the electroacoustic system into the indoor acoustic environment, and the impulse response of the indoor acoustic environment is obtained and converted into a digital signal. A suitable order is set and the all-pole filter coefficients are obtained with an autocorrelation-based linear prediction algorithm; the all-pole filter simulates the transmission characteristic of the acoustic channel in the acoustic environment. The specific sound source signal whose reflections are to be suppressed (the target speech signal or the specific person's speech signal) is passed through the all-pole filter to obtain the corresponding environmental reflected signal.
Step 4: A traditional recurrent neural network usually has a small number of hidden-layer neurons, but its weight adjustment mechanism is complicated. An echo state network (Echo State Network, hereinafter the ESN network) contains a large reservoir with many hidden-layer and state-layer neurons, yet training the network only requires adjusting the connection weights from the hidden layer to the output nodes. The nonlinear dynamics of the ESN network are produced by this large "reservoir", which contains a large number of randomly generated, sparsely connected neurons, holds the running state of the system and has a memory function. Driven by the external input, it forms an "input-state-output" system. The ESN network is described by a state equation and an output equation.
In these equations, the internal neuron activation function is usually the hyperbolic tangent, and the output function is typically the identity. The quantities involved are the state variable of the "reservoir" at each time step, the system input vector at that time step, and the output of the ESN network at that time step. The internal connection matrix of the reservoir is a randomly generated, sparsely connected (usually 1%-5% connectivity), high-dimensional square matrix; once the "reservoir" has been generated, its connection weights remain fixed. The ESN network also has an input weight matrix, an output weight vector, and a connection weight vector from the output back to the state variable; a further term represents the output bias or noise. The input and feedback weights are likewise generated randomly and kept constant; the only quantity that needs to be adjusted is the output weight vector.
Although there has been a great deal of research on how to obtain a "good" reservoir for a particular problem, no systematic method has emerged, and most studies proceed experimentally. The final performance is determined by the parameters of the reservoir. (1) Reservoir size and reservoir sparsity. The larger the reservoir, the more complex the dynamical systems the ESN network can represent and the more accurately it describes a given dynamical system. The reservoir size cannot be increased arbitrarily, however, because an excessively large reservoir may cause over-fitting and reduce the generalization ability. The sparsity parameter is the percentage of interconnected neurons among all the neurons in the reservoir (usually kept at 1%-5%); it measures the richness of the vectors contained in the reservoir, and the richer the vectors, the stronger the nonlinear approximation ability. (2) Spectral radius of the internal connection weights of the reservoir. The ESN network has the echo state property only when the spectral radius is suitably bounded (below 1 in the usual formulation). (3) Input scaling of the reservoir. Before the input signal is connected to the reservoir neurons it is multiplied by a scale factor; the purpose is to transform the input into the working range of the neuron activation function.
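The three reservoir parameters discussed above can be made concrete with a short sketch. All names and default values here (`make_reservoir`, 200 units, 5% connectivity, spectral radius 0.8, input scale 0.3) are hypothetical placeholders chosen for illustration; the patent selects its values by experiment.

```python
import numpy as np

def make_reservoir(n_units, density=0.05, spectral_radius=0.8, input_scale=0.3, seed=0):
    """Random sparse reservoir matrix W and input weight vector W_in."""
    rng = np.random.default_rng(seed)
    mask = rng.random((n_units, n_units)) < density           # sparsity: keep about 5% of links
    W = np.where(mask, rng.uniform(-1.0, 1.0, (n_units, n_units)), 0.0)
    rho = np.max(np.abs(np.linalg.eigvals(W)))                # current spectral radius
    W *= spectral_radius / rho                                # rescale to the requested radius
    W_in = rng.uniform(-input_scale, input_scale, n_units)    # input scaling
    return W_in, W

W_in, W = make_reservoir(200)
```

Rescaling by the largest eigenvalue magnitude is one common way to fix the spectral radius after random generation. Once generated, `W` and `W_in` stay fixed and only the output weights are trained.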
So that the model can take in a signal frame of a given length captured by the microphone and output a target speech frame of the corresponding length, the three random connection weights above take the following values:
…, that is, …, with values in the interval (…, …);
…, that is, …, with values in the interval (0, …);
…, that is, …, with values in the interval (0, …);
Here, the smaller the value of …, the shorter the time needed to establish the state, which improves the real-time performance of the model computation; the larger the value of …, the higher the model accuracy, although the generalization ability may decline. The values are: ① …, which determines the scale of the input to the reservoir; ② …; ③ …. The ESN network parameters … are selected by experiment, and the procedure is: (1) Take any set of values within the ranges above, train the model on the training data, then feed example data into the model and observe whether the system is stable when it produces its output, that is, whether oscillation occurs; if oscillation occurs, reduce the parameter until the model output is stable. (2) Increase or decrease the value of …, repeat the training and simulated output of the previous step, and take the value that gives the best result as the final parameter value. In a specific embodiment below, one suitable set of parameter values for the two models is: for the first sound source model, with … set to 0.3, take …; for the second sound source model, with … set to 0.3, take ….
Step 5: With the mixture of the target speech signal and the environmental reflected sound signal as the ESN network input and the target speech signal as the desired output, train the ESN network to obtain a model that suppresses the reflected sound of the specific sound source and correspondingly enhances the target speech signal. At each time step the reservoir state variable is given by the state equation of the network, computed with the hyperbolic tangent as the activation function.
A neural network generally needs some form of information feedback to adjust its weights during training. The learning mechanism of the ESN network is rather special: the input signal first excites the reservoir so that a continuous state variable signal is produced in the reservoir, and the ESN network weights are then determined by linear regression between the reservoir state variables and the target output signal.
For a given nonlinear system input-output pair, the ESN network identifies the system as follows. First, the weights in the reservoir are initialized. Second, the input is applied to excite the system and the state response of the ESN network at every time step is computed. Because the relationship between the reservoir state variables and the desired output is linear, training the ESN network is fairly simple, and the solution does not suffer from the multiple local minima and slow convergence often found in traditional neural networks. The output weights W can be determined with a basic linear regression algorithm.
The trained model can be used to enhance the target speech signal when the acoustic channel in the actual acoustic environment changes. That is, the signal obtained from the microphone contains the target speech signal and the reflected sound signal of the specific environment; feeding it into the model yields the enhanced target speech signal output. The code segment for this consists of the following steps:
(1) feed the signal into the ESN network;
(2) compute the state variable of the neuron pool;
(3) compute the enhanced target signal output.
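A minimal sketch of these three steps follows, assuming a scalar input, no output feedback, a hyperbolic tangent activation and an identity output function. The function name `esn_enhance` and the tiny hand-built weights in the usage lines are hypothetical and only serve to show the data flow; they are not the patent's code.

```python
import numpy as np

def esn_enhance(mic_frame, W_in, W, W_out, x0=None):
    """Feed one microphone frame through a trained reservoir.

    Returns the output frame and the final reservoir state, so that the
    state can be carried over to the next frame.
    """
    x = np.zeros(W.shape[0]) if x0 is None else np.array(x0, dtype=float)
    out = np.empty(len(mic_frame))
    for n, u_n in enumerate(mic_frame):       # step 1: feed the signal into the network
        x = np.tanh(W_in * u_n + W @ x)       # step 2: state variable of the neuron pool
        out[n] = W_out @ x                    # step 3: enhanced target-signal output
    return out, x

# Hand-built 3-unit example with no recurrence, so the output is simply tanh(input).
W_in = np.ones(3)
W = np.zeros((3, 3))
W_out = np.array([1.0, 0.0, 0.0])
frame = np.linspace(-1.0, 1.0, 11)
enhanced, state = esn_enhance(frame, W_in, W, W_out)
```

In use, `W_in` and `W` would be the fixed random weights from training and `W_out` the trained output weights. Passing the returned state as `x0` for the next frame keeps the reservoir memory continuous across frames.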
Refer to Figs. 1-2, which show the block diagram of model training and use according to the invention and the ESN network model.
On the one hand, the model training environment (the all-pole filter) is established. In any room, a white-noise pulse is first fed into the room through the electroacoustic system in use, exciting the standing-wave response of the room; the indoor impulse response is obtained and converted into a digital signal. The order of the model of the indoor acoustic environment is then selected and the all-pole filter coefficients are obtained with an autocorrelation-based linear prediction algorithm (LPC); the all-pole filter simulates the transmission characteristic of the acoustic channel in the acoustic environment.
On the other hand, the ESN network model that outputs the enhanced target speech is established. First, the speech data for training is obtained: to establish the first sound source model, it suffices to obtain a suitable number of data samples of the target speech signal; to establish the second sound source model, a suitable number of data samples of both the specific person's speech signal and the target speech signal should be obtained. Then the reflected sound signal for training is obtained with the all-pole filter above. Finally, among the ESN network parameters, the fixed connection weights are generated randomly, the reservoir parameters are selected by experiment, and the output weights are produced by training.
Once the model for the intended use has been established, it can be used by the target speaker in the acoustic environment in which it was trained, and the microphone may be moved within a certain range during use.
Refer to Fig. 3-5, Fig. 3-5 is two Acoustic channels responses in an acoustic environment of the invention and wherein Acoustic channel 1 The LPC spectrum envelopes of the different limit numbers of response compare.
The acoustic environment(6.3×3.6×2.8(m3)Little interior)Shock response takes two Acoustic channels respectively:1. Acoustic channel 1 Shock response(Sample rate 16KHz), frequency spectrum is in 100Hz ~ 400Hz;2. the shock response of Acoustic channel 2(Sample rate 16KHz), frequency Spectrum is in 100Hz ~ 400Hz.
For the acoustic channel 1 response, the figure shows the LPC spectral envelope obtained with 600 poles and that obtained with 100 poles. The 600-pole spectral envelope is nearly consistent with the frequency response of the original channel, whereas the 100-pole LPC spectral envelope deviates considerably.
Refer to Fig. 6, which shows the two original speech signals used in this example.
The signal sampling rate is 16 kHz. One speech signal serves as the target speech signal in the examples: data sampling points 1-10000 are used to train the model, and data sampling points 12001-24000 are used as the application example processed by the model. The other signal is used to produce the environment reflected sound required by the examples: data sampling points 1-10000 are used to train the model, and data sampling points 12001-24000 are used as the application example processed by the model.
Refer to Figs. 7-8, which show the example processing results of the first sound source model of the invention when its key parameters take different values.
Modeling conditions:1. takeWithWhen 0.3, take, with the 600 of Acoustic channel 1 Pole filter simulation output reflected sound, and with 10000 data sampling point training patterns;2. takeWith0.3 When, take, with 600 pole filter simulation output reflected sounds of Acoustic channel 1, and adopted with 10000 data Sampling point training pattern.
The following signals are input to each of the models built under the two conditions (16 kHz sampling rate, frames of 6000 data sampling points): (1) the reflected sound simulated by the 600-pole filter of acoustic channel 1; (2) the target speech signal; (3) the simulated microphone signal, i.e. the mixture of the two.
After the model has been trained, two waveform similarities are calculated for the input signal: that between the model output and the target speech signal, and that between the input signal and the target speech signal. The reflected-signal suppression capability, in dB, is calculated from the model output.
Results: under the first condition, the output-to-target similarity is 0.9132 against an input-to-target similarity of 0.7463, an improvement of 0.1669, and the reflected sound suppression is 14.02 dB; under the second condition, the values are 0.9044 and 0.7463, an improvement of 0.1581, with 9.50 dB of suppression. The first setting is therefore an appropriate value, while the generalization ability drops under the second.
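The similarity and suppression figures reported in these results could be computed along the following lines. The formulas themselves are not reproduced in this text, so the definitions below (a normalized correlation for waveform similarity, and an input-to-output energy ratio in dB for suppression) are assumptions made for illustration, not the patent's stated formulas.

```python
import math


def waveform_similarity(x, y):
    """Assumed measure: normalized correlation of two equal-length frames."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den else 0.0


def suppression_db(reflected_in, model_out):
    """Assumed measure: energy of the reflected input over energy of the
    model's output for that input, expressed in dB."""
    e_in = sum(v * v for v in reflected_in)
    e_out = sum(v * v for v in model_out)
    return 10.0 * math.log10(e_in / e_out)
```

Under these assumed definitions, the "similarity improvement" quoted in the results would be `waveform_similarity(output, target) - waveform_similarity(mixture, target)`.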
Refer to Figs. 9-10, which show the example processing results of the first sound source model when it is built with different amounts of training data.
Modeling conditions:1. take,When 0.3, take, with 600 poles of Acoustic channel 1 The all-pole filter of point obtains reflected signal, training data(16KHz sample rates)Take 6000 data sampled points;2. take,When 0.3, take, obtained with the all-pole filter of 600 limits of Acoustic channel 1 Reflected signal, training data(16KHz sample rates)Take 10000 data sampled points.
The following signal is input to each of the models built under the two conditions (16 kHz sampling rate): the reflected signal in the simulated microphone mixed signal is obtained with the 600-pole all-pole filter of acoustic channel 2, and the mixed signal is processed in frames of 6000 data sampling points.
After the model has been trained, the waveform similarities and the reflected-signal suppression capability (dB) are calculated in the same way as for Figs. 7-8.
Results: with 6000 training data sampling points, the output-to-target similarity is 0.8871 against an input-to-target similarity of 0.7475, an improvement of 0.1396, and the reflected sound signal suppression is 12.45 dB; with 10000 training data sampling points, the values are 0.9036 and 0.7475, an improvement of 0.1561, with 15.50 dB of suppression. It can be seen that: (1) a good first sound source model is obtained only when the training data are increased to 10000 data sampling points (about 625 ms at a 16 kHz sampling rate); (2) when the acoustic channel changes (that is, the pickup position changes), the model still produces the desired effect.
Refer to Fig. 11, which shows the example processing result of the first sound source model of the invention when the reflected sound signal used for training is obtained with an all-pole filter having a different number of poles.
Modeling conditions:Take,When 0.3, take, during training pattern, reflection letter Number obtained with 100 limit all-pole filters of Acoustic channel 1, training data(16KHz sample rates)Take 10000 data samplings Point.
Input signal to the model (16 kHz sampling rate): the reflected signal in the simulated microphone mixed signal is obtained with the 600-pole all-pole filter of acoustic channel 2, and the example mixed signal is processed in frames of 6000 data sampling points.
After the model has been trained, the waveform similarities and the reflected-signal suppression capability (dB) are calculated in the same way as for Figs. 7-8.
Results: the output-to-target similarity is 0.9067 against an input-to-target similarity of 0.7432, an improvement of 0.1635, and the reflected sound signal suppression is 15.06 dB. It can be seen that, as long as a certain number of poles is used during training to obtain the linear prediction coefficients and build the acoustic channel model filter, its spectral characteristics need not match the original acoustic channel exactly; the reflected sound signals obtained in this way for training still yield a first sound source model with a clear enhancement effect.
Refer to Fig. 12, which shows the example processing result of the model when Gaussian noise (25 dB signal-to-noise ratio) is added to the environment reflected sound of the Fig. 11 model.
Using the same calculation method as for Fig. 11, the results are: output-to-target similarity 0.9242, input-to-target similarity 0.7359, an improvement of 0.1883, and reflected sound signal suppression of 12.58 dB. With the same reflected sound signal in the model input, adding noise lowers the input-to-target similarity, while the similarity improvement after processing is larger by comparison. The first sound source model of the invention therefore also suppresses noise to some extent, so that the improvement in waveform similarity of its output is greater than in the noise-free case.
Refer to Figs. 13-14, which show the example processing results of the second sound source model of the invention when its key parameters take different values.
Modeling conditions:1. takeWithWhen 0.3, take, with the 600 of Acoustic channel 1 Pole filter simulated reflections sound, with 10000 data sampling point training patterns;2. takeWithWhen 0.3, take, with 600 pole filter simulated reflections sound of Acoustic channel 1, mould is trained with 10000 data sampling points Type.
The following signals are input to each of the models built under the two conditions (16 kHz sampling rate, frames of 6000 data sampling points): (1) the reflected sound simulated by the 600-pole filter of acoustic channel 1; (2) the target speech signal; (3) the simulated microphone signal, i.e. the mixture of the two.
After the model has been trained, the waveform similarities and the reflected-signal suppression capability (dB) are calculated in the same way as for the first sound source model.
Results: under the first condition, the output-to-target similarity is 0.8734 against an input-to-target similarity of 0.7498, an improvement of 0.1236, and the reflected sound suppression is 11.19 dB; under the second condition, the values are 0.8192 and 0.7498, an improvement of 0.0694, with 7.23 dB of suppression. The first setting is therefore a suitable value, while the generalization ability drops under the second.
Refer to Figs. 15-16, which show the example processing results of the second sound source model when it is built with different amounts of training data.
Modeling conditions:1. take,When 0.3, take, with the 600 of Acoustic channel 1 The all-pole filter of limit obtains reflected signal, training data(Obtained with 16KHz sample rates)Take 6000 data sampled points; 2. take,When 0.3, take, with the all-pole filter of 600 limits of Acoustic channel 1 Obtain reflected signal, training data(16KHz sample rates)Take 10000 data sampled points.
The following signal is input to each of the models built under the two conditions (16 kHz sampling rate): the reflected signal in the simulated microphone mixed signal is obtained with the 600-pole all-pole filter of acoustic channel 2, and the example signal is processed in frames of 6000 data sampling points.
After the model has been trained, the waveform similarities and the reflected-signal suppression capability (dB) are calculated in the same way as for the first sound source model.
Results: with 6000 training data sampling points, the output-to-target similarity is 0.8281 against an input-to-target similarity of 0.7451, an improvement of 0.083, and the reflected sound signal suppression is 8.22 dB; with 10000 training data sampling points, the values are 0.8400 and 0.7451, an improvement of 0.0949, with 10.81 dB of suppression. It can be seen that: (1) a good second sound source model is obtained only when the training data are increased to 10000 data sampling points (about 625 ms at a 16 kHz sampling rate); (2) when the acoustic channel changes (that is, the pickup position changes), the model still produces the desired effect.
Refer to Fig. 17, which shows the example processing result of the second sound source model of the invention when the reflected sound signal used for training is obtained with an all-pole filter having a different number of poles.
Modeling conditions:Take,When 0.3, take, during training pattern, reflection letter Number obtained with 100 limit all-pole filters of Acoustic channel 1, training data(16KHz sample rates)Take 10000 data samplings Point.
Input signal to the model (16 kHz sampling rate): the reflected signal in the simulated microphone mixed signal is obtained with the 600-pole all-pole filter of acoustic channel 2, and the example mixed signal is processed in frames of 6000 data sampling points.
After the model has been trained, the waveform similarities and the reflected-signal suppression capability (dB) are calculated in the same way as for the first sound source model.
Results: the output-to-target similarity is 0.8690 against an input-to-target similarity of 0.740, a stated improvement of 0.121, and the reflected sound signal suppression is 9.20 dB. It can be seen that, as long as a certain number of poles is used during training to obtain the linear prediction coefficients and build the acoustic channel model filter, its spectral characteristics need not match the original acoustic channel exactly; the reflected sound signals obtained in this way for training still yield a second sound source model with an enhancement effect.
Refer to Fig. 18, which shows the example processing result of the model when Gaussian noise (25 dB signal-to-noise ratio) is added to the environment reflected sound of Fig. 17.
Using the same calculation method as for Fig. 17, the results are: output-to-target similarity 0.8215, input-to-target similarity 0.6868, an improvement of 0.1347, and reflected sound signal suppression of 8.24 dB. With the same reflected sound signal in the model input, adding noise lowers the input-to-target similarity, while the similarity improvement after processing is larger by comparison. The second sound source model of the invention therefore also suppresses noise to some extent, so that the improvement in waveform similarity of its output is greater than in the noise-free case.
Summarizing the results of Figs. 7 to 18, the first sound source model performs better than the second. This is because the spectral composition of the environment reflected sound suppressed by the first sound source model is consistent with that of the target speech signal, whereas the second sound source model lacks this consistency; the stronger the consistency, the better the modeling result.
The above are preferred embodiments of the present invention. All changes made according to the technical solution of the invention whose resulting functions do not go beyond the scope of that technical solution fall within the protection scope of the invention.

Claims (9)

1. A signal processing method for enhancing target speech signal pickup in an acoustic environment, characterized by comprising the following steps:
Step 1: Determine the type of model to be set up, which is either a first sound source model or a second sound source model. The first sound source model suppresses the reflected sound signal that the target speech itself produces in the acoustic environment and correspondingly enhances the target speech signal; the second sound source model suppresses the reflected sound signal that another specific person's speech produces in the acoustic environment and correspondingly enhances the target speech signal;
Step 2: Prepare the training data for the model, which come from one of two sources: when preparing to set up the first sound source model, data sampling points of the target speech signal S1(n) must be obtained; when preparing to set up the second sound source model, data sampling points of the specific-person speech signal m(n) and of the target speech signal S1(n) must be obtained;
Step 3: Obtain the environment reflected sound signal for training the model. First, an excitation signal is input to the indoor acoustic environment through the electroacoustic system, and the impulse response signal of the indoor acoustic environment is obtained and converted to a digital signal y(n). Second, the order p is set, and the all-pole filter coefficients are obtained with an autocorrelation-based linear prediction algorithm; the all-pole filter is used to simulate the acoustic channel transmission characteristics in the acoustic environment. Third, the sound source signal m(n) or S1(n) whose reflected sound is to be suppressed is passed through the all-pole filter to obtain the corresponding environment reflected sound signal S2(n);
Step 4:The determination of ESN network parameters:
The equation of ESN networks is:
$$X(i+1) = f\left(W X(i) + W^{in} U(i) + W^{back} Y(i)\right)$$
$$Y(i+1) = f^{out}\left(W^{out}\,[X(i),\, U(i+1),\, Y(i)] + W^{out}_{bias}\right)$$
where f is the activation function of the internal neurons, usually the hyperbolic tangent; fout is the output function, typically the identity function; X(i) is the state variable of the reservoir at time i; U(i) is the system input vector at time i; and Y(i) is the output of the ESN network at time i. W is the reservoir weight matrix, a randomly generated, sparsely connected high-dimensional square matrix whose connection weights stay fixed once generated. Win and Wout are the input weight matrix and the output weight vector of the ESN network, respectively; Wback is the weight vector that connects the output back to the state variables; the last term of the output equation is the output bias term, or represents noise. Win and Wback are generated randomly and kept constant; only the output weights Wout need to be adjusted.
So that, after a signal frame of a certain length picked up by the microphone is input to the model, a target speech frame of corresponding length can be processed and output, the three random connection weights above take the following values:
Win = a × (2 × rand(N, 1) − 1), i.e. an N × 1 random matrix with values in (−a, a);
Wback = b × rand(N, 1), i.e. an N × 1 random matrix with values in (0, b);
W = c × sprand(N, N, p), i.e. an N × N sparse, uniformly distributed random matrix with connection density p and values in (0, c);
Here, a smaller N shortens the time needed to establish the state and improves the real-time performance of the model computation, whereas a larger N gives a more accurate model but may reduce its generalization ability. The values of a, b and c are: (1) a determines the scale of the input to the reservoir, a ≥ 1; (2) 0 < b < 1; (3) 0 < c < 1; in addition, N ≥ 300 and p = 0.01 to 0.05;
Step 5: With U(n) = S1(n) + S2(n) as the ESN network input and D = S1(n) as the desired target output, the ESN network is trained to obtain a model that suppresses the reflected sound of the specific sound source and correspondingly enhances the target speech signal. At time i, the state equation of the reservoir state variable X is:
X (i)=tanh (WinU(i)+WX(i-1)+WbackD(i-1));
For a given set of input-output pairs of a nonlinear system (U(n), D(n); n = 1, 2, 3, ...), the process of identifying the system with the ESN network is as follows. First, the weights W and Win in the reservoir are initialized. Second, the input U(n) is applied to excite the system, and the state response of the ESN network at each time step is obtained. Because the relationship between the reservoir state variables and the desired output is linear, the training process of the ESN network is relatively simple, and the solution does not suffer from the multiple local minima and slow convergence common in traditional neural networks;
The output weights Wout are determined by a basic linear regression algorithm:
$$W^{out} = (X^{T}X)^{-1}X^{T}D.$$
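Steps 4 and 5 can be illustrated with the following plain-Python sketch, which draws the fixed random weights, collects teacher-forced reservoir states, and solves the normal equations for the output weights. The default values of a, b, c and p are illustrative picks inside the stated ranges, the per-connection coin flip only approximates the density behaviour of `sprand`, and all function names are hypothetical. It is meant for very small N; a reservoir of N ≥ 300 would need a numerical library.

```python
import math
import random


def init_weights(n, a=1.0, b=0.5, c=0.5, p=0.05, seed=0):
    """Fixed random weights Win, Wback and sparse W (defaults are assumptions)."""
    rng = random.Random(seed)
    w_in = [a * (2.0 * rng.random() - 1.0) for _ in range(n)]   # in [-a, a)
    w_back = [b * rng.random() for _ in range(n)]               # in [0, b)
    # Each connection exists with probability p, with a weight in [0, c).
    w = [[c * rng.random() if rng.random() < p else 0.0 for _ in range(n)]
         for _ in range(n)]
    return w_in, w_back, w


def collect_states(u, d, w_in, w_back, w):
    """Teacher forcing: X(i) = tanh(Win*U(i) + W*X(i-1) + Wback*D(i-1))."""
    n = len(w_in)
    x = [0.0] * n
    states = []
    for i in range(len(u)):
        d_prev = d[i - 1] if i > 0 else 0.0
        x = [math.tanh(w_in[j] * u[i]
                       + sum(w[j][k] * x[k] for k in range(n))
                       + w_back[j] * d_prev) for j in range(n)]
        states.append(x)
    return states


def solve_linear(a, b):
    """Gaussian elimination with partial pivoting (fails if a is singular)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for cc in range(col, n + 1):
                m[r][cc] -= f * m[col][cc]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_readout(states, targets):
    """Solve (X^T X) Wout = X^T D, with one reservoir state per row of X."""
    n = len(states[0])
    xtx = [[sum(row[i] * row[j] for row in states) for j in range(n)]
           for i in range(n)]
    xtd = [sum(row[i] * t for row, t in zip(states, targets)) for i in range(n)]
    return solve_linear(xtx, xtd)
```

A typical call sequence would be `w_in, w_back, w = init_weights(n)`, then `states = collect_states(u, d, w_in, w_back, w)`, then `w_out = fit_readout(states, d)`.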
2. The signal processing method for enhancing target speech pickup in an acoustic environment according to claim 1, characterized in that: the trained model can also be used to enhance the target speech signal when the acoustic channel changes in the actual acoustic environment; that is, the signal U(n) obtained from the microphone, which contains the target speech signal S1(n) and the specific environment reflected sound signal S2(n), is input to the model to obtain the enhanced target speech signal Y(n) as output. A MATLAB code segment implementing this is as follows:
X(:,1) = tanh(Win*U(:,1));
Y(1) = Wout*X(:,1);   % assumed initial output, needed by the feedback term below
for i = 1:1:m-1
    X(:,i+1) = tanh(Win*U(:,i+1) + W*X(:,i) + Wback*Y(i));   % state update
    Y(i+1) = Wout*X(:,i+1);                                  % linear readout
end
Y(m) = Wout*X(:,m);
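For readers without MATLAB, the same free-running loop can be written in plain Python as below. This is a sketch under the same assumptions as the code segment above: a scalar input per time step, a linear readout on the reservoir state only, and a first output computed from the first state. The function name is hypothetical.

```python
import math


def run_esn(u, w_in, w, w_back, w_out):
    """Feed the microphone signal u through a trained ESN; return Y."""
    n = len(w_in)
    # First state uses the input only, as in X(:,1) = tanh(Win*U(:,1)).
    x = [math.tanh(w_in[j] * u[0]) for j in range(n)]
    y = [sum(w_out[j] * x[j] for j in range(n))]
    for i in range(1, len(u)):
        # The state update feeds back the model's own previous output.
        x = [math.tanh(w_in[j] * u[i]
                       + sum(w[j][k] * x[k] for k in range(n))
                       + w_back[j] * y[-1]) for j in range(n)]
        y.append(sum(w_out[j] * x[j] for j in range(n)))
    return y
```

Unlike training, where the desired output D is fed back, this loop feeds back the model's own output, which is why an overly large b can make the output oscillate.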
3. The signal processing method for enhancing target speech pickup in an acoustic environment according to claim 1, characterized in that: in step 2, the data frame length of the data sampling points obtained for the target speech signal S1(n) is greater than 625 ms.
4. The signal processing method for enhancing target speech pickup in an acoustic environment according to claim 1, characterized in that: in step 3, the input signal is a white-noise pulse, a periodic pulse, or pseudo-noise.
5. The signal processing method for enhancing target speech pickup in an acoustic environment according to claim 1, characterized in that: in step 3, the impulse response signal of the acoustic environment may be obtained at any corresponding loudspeaker and microphone positions within the range of use in the room.
6. The signal processing method for enhancing target speech pickup in an acoustic environment according to claim 1, characterized in that: in step 3, the order p is determined as follows:
The number of room poles, i.e. the order of the linear prediction, corresponds to the number of standing waves of sound in the room, which is estimated by:
$$dN = \frac{2Vf^{2}}{c^{3}}\left(1 + \frac{\lambda}{2\Lambda}\right)d\omega,$$
where f is the estimation frequency, λ is the corresponding wavelength, dω is the estimation bandwidth, c is the speed of sound, Λ = 4V/S, V is the room volume, and S is the total surface area of the room;
the order is then p = 2dN.
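The order estimate can be evaluated numerically as in the sketch below. It assumes that dω is an angular-frequency bandwidth (dω = 2π·Δf), that c = 343 m/s, and that the resulting order is rounded up to an integer; none of these choices is stated in the claim, and the function names are hypothetical.

```python
import math


def mode_count(f_hz, bandwidth_hz, volume_m3, surface_m2, c=343.0):
    """dN = (2*V*f^2 / c^3) * (1 + lambda / (2*Lambda)) * d_omega."""
    lam = c / f_hz                             # wavelength at frequency f
    big_lambda = 4.0 * volume_m3 / surface_m2  # Lambda = 4V/S
    d_omega = 2.0 * math.pi * bandwidth_hz     # assumed: angular bandwidth
    return (2.0 * volume_m3 * f_hz ** 2 / c ** 3
            * (1.0 + lam / (2.0 * big_lambda)) * d_omega)


def prediction_order(d_n):
    """p = 2 * dN, rounded up to an integer (the rounding is an assumption)."""
    return math.ceil(2.0 * d_n)
```

Because the mode density grows with f², a wide band is better handled by summing `mode_count` over narrow sub-bands than by evaluating it once at a single frequency.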
7. The signal processing method for enhancing target speech pickup in an acoustic environment according to claim 1, characterized in that: in step 3, the environment reflected sound signal S2(n) is defined as follows: for the first sound source model, S2(n) is formed by passing the target speech signal, i.e. m(n) = S1(n), through the all-pole filter; for the second sound source model, S2(n) is formed by passing the specific-person speech signal m(n) through the all-pole filter.
8. The signal processing method for enhancing target speech pickup in an acoustic environment according to claim 1, characterized in that: the ESN network parameters a, b, c, N and p are selected by experiment, as follows: (1) take any set of values of a, b, c, N and p satisfying a ≥ 1, 0 < b < 1 and 0 < c < 1, build the model from the input training data, then input example data to the model and observe whether the system is stable when producing its output, i.e. whether oscillation occurs; if oscillation occurs, reduce parameter b until the model output is stable; (2) increase or decrease N and repeat the training and simulated output of the previous step; the values of a, b, c, N and p that give the best result are taken as the final parameter values.
9. The signal processing method for enhancing target speech pickup in an acoustic environment according to claim 2, characterized in that: the trained model can be used to enhance the target speech signal when the acoustic channel changes in the actual acoustic environment, meaning that once the model has been set up, when the pickup position changes it can still suppress, in the changed acoustic environment, the reflected signal of the sound source signal targeted by training, and output the corresponding enhanced target speech signal.
CN201410427254.1A 2014-08-28 2014-08-28 The signal processing method of targeted voice signal pickup in a kind of enhancing acoustic environment Expired - Fee Related CN104157293B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410427254.1A CN104157293B (en) 2014-08-28 2014-08-28 The signal processing method of targeted voice signal pickup in a kind of enhancing acoustic environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410427254.1A CN104157293B (en) 2014-08-28 2014-08-28 The signal processing method of targeted voice signal pickup in a kind of enhancing acoustic environment

Publications (2)

Publication Number Publication Date
CN104157293A CN104157293A (en) 2014-11-19
CN104157293B true CN104157293B (en) 2017-04-05

Family

ID=51882775

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410427254.1A Expired - Fee Related CN104157293B (en) 2014-08-28 2014-08-28 The signal processing method of targeted voice signal pickup in a kind of enhancing acoustic environment

Country Status (1)

Country Link
CN (1) CN104157293B (en)

Also Published As

Publication number Publication date
CN104157293A (en) 2014-11-19

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170405

Termination date: 20200828
