CN109872720A - A convolutional-neural-network-based re-recorded speech detection algorithm robust to different scenes - Google Patents
Abstract
The invention discloses a convolutional-neural-network-based re-recorded speech detection algorithm robust to different scenes, relating in particular to the field of speech detection algorithms. A speech time-frequency image is fed into the algorithm model, which comprises seven layers, each containing one convolutional layer and one pooling layer; the output of each convolutional layer passes through a rectified linear unit (ReLU), residual connections are added between layers, the final features are extracted by global pooling, and the detection result is predicted through a sigmoid. The invention uses the time-frequency image as the data input form of the network: compared with feeding in raw speech data directly, the characteristic information introduced by the re-recording device has a relatively dense distribution in the time-frequency image, which is more conducive to neural-network feature extraction, thereby accelerating training and improving precision.
Description
Technical field
The present invention relates to the field of speech detection algorithms, and more particularly to a convolutional-neural-network-based re-recorded speech detection algorithm robust to different scenes.
Background art
Existing research has shown that deceptive speech such as voice conversion (Voice Conversion, VC), speech synthesis (Speech Synthesis, SS) and re-recorded speech can effectively fool automatic speaker verification (ASV) systems and thereby gain fraudulent access. Re-recorded speech can cause an ASV system to produce a higher false acceptance rate, posing a serious threat to social security. Among these attacks, VC and SS require more speech information and features of the target speaker, and since existing algorithms are not yet fully mature, their implementation cost and difficulty are relatively high. Re-recorded speech, by contrast, is easily obtained with cheap recording equipment and essentially contains all the features of the target speaker's voice; it is therefore more threatening than VC and SS. For this reason, the detection of re-recorded speech deserves serious attention.
ASV (automatic speaker verification) systems are used more and more in practice, in fields such as access control, telephone banking and the military. Because speaker verification requires no face-to-face contact, ASV systems are very vulnerable to attacks with deceptive speech. Deceptive speech produced by audio equipment threatens ASV systems and degrades their security. In the past decade or so, digital audio products have not only multiplied in variety, but their integrated functionality has also grown ever stronger. A PC equipped with audio-processing software, or a relatively inexpensive device with audio-processing capability such as a PDA, can now achieve the same or similar effects. For example, the deceptive speech produced by a high-quality, low-cost recording device such as a smartphone poses a risk to ASV systems. Deceptive speech includes replay attacks, voice conversion, speech synthesis and the like. An attacker can forge characteristic data with deceptive speech to gain illegitimate access to a system, then steal the user's files and privacy, causing losses that are hard to make up. Among these, the replay attack is more threatening than voice conversion and speech synthesis. A replay attack uses speech samples collected from the real target speaker, in the form of continuous pre-recorded speech samples. A replay-based spoofing attack requires no technical processing of the speech: the real target speaker's speech and the replayed speech share the same spectrum and high-level features, making it the easiest type of speech attack to mount. Synthesized and converted speech, on the other hand, exhibit certain errors and variations relative to the real target speaker's speech and are not identical to it, so detecting replay attacks is more difficult than detecting synthesized or converted speech.
Summary of the invention
To overcome the above drawbacks of the prior art, embodiments of the present invention provide a convolutional-neural-network-based re-recorded speech detection algorithm robust to different scenes. By using the time-frequency image as the data input form of the network, and compared with feeding in raw speech data directly, the characteristic information introduced by the re-recording device has a relatively dense distribution, which is more conducive to neural-network feature extraction, thereby accelerating training and improving precision. The detection of speech re-recorded with different recording devices, recording environments and recording distances achieves very high accuracy.
To achieve the above object, the invention provides the following technical scheme: a convolutional-neural-network-based re-recorded speech detection algorithm robust to different scenes, comprising the following steps:

A. The original speech is collected with a recording device and passed through DA/AD conversion to obtain the re-recorded speech;

B. The original speech is distorted during the conversion; the distortion of the original speech is computed by a distortion model whose expression is

y(t) = λ·x(αt) + η(t)

where y(t) is the re-recorded speech, x(t) is the original speech, λ is the amplitude transformation factor, α is the linear time-axis stretching factor, and η(t) is the superimposed noise;

the corresponding frequency-domain expression is

Y(jω) = (λ/α)·X(jω/α) + N(jω)

where Y(jω), X(jω) and N(jω) are the frequency-domain representations of y(t), x(t) and η(t) respectively; for a fixed recording device these characteristics are highly stable, i.e. λ and α are constants;
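The distortion model above can be sketched numerically. The following is an illustrative NumPy simulation only, not the patent's implementation; the parameter values (λ = 0.9, α = 1.001, noise level 0.01) are hypothetical placeholders:

```python
import numpy as np

def apply_rerecording_distortion(x, lam=0.9, alpha=1.001, noise_std=0.01, seed=0):
    """Simulate y(t) = lam * x(alpha * t) + eta(t): amplitude scaling by lam,
    linear time-axis stretching by alpha, and additive noise eta.
    Parameter values are illustrative, not taken from the patent."""
    rng = np.random.default_rng(seed)
    n = len(x)
    # Resample x at stretched time points alpha * t (linear interpolation,
    # zero fill past the end of the signal).
    t = np.arange(n) * alpha
    x_stretched = np.interp(t, np.arange(n), x, right=0.0)
    eta = rng.normal(0.0, noise_std, n)  # superimposed noise eta(t)
    return lam * x_stretched + eta
```

For a fixed device, λ and α stay constant across recordings, which is what makes the introduced distortion a stable, learnable fingerprint.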
C. The re-recorded speech is transformed by a short-time Fourier transform to produce the speech time-frequency image;

D. The speech time-frequency image is fed into the algorithm model; the model comprises seven layers, each containing one convolutional layer and one pooling layer; the output of each convolutional layer passes through a rectified linear unit, residual connections are added between layers, the final features are extracted by global pooling, and the detection result is predicted through a sigmoid.
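Step D ends with global pooling and a sigmoid; a minimal NumPy sketch of that final read-out follows. Average pooling and the weight vector `w` and bias `b` are hypothetical placeholders (the patent does not specify the pooling type or the classifier weights):

```python
import numpy as np

def detect(feature_map, w, b):
    """Final read-out of step D: global pooling over the time axis collapses
    the (freq x time) feature map to a vector, and a sigmoid turns the scalar
    score into a re-recorded-speech probability."""
    pooled = feature_map.mean(axis=1)       # global (average) pooling over time
    score = float(pooled @ w + b)           # linear scoring, weights assumed
    return 1.0 / (1.0 + np.exp(-score))     # sigmoid prediction
```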
In a preferred embodiment, when the re-recorded speech is transformed, the short-time Fourier transform uses a 126-sample Hann (hanning) window with a step size of 50, and the time-frequency image has size 64x62.
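As a sketch of how such a time-frequency image could be computed (a NumPy-only illustration, not the patent's implementation; `np.hanning` is the Hann window that the text's "(hanning)" suggests):

```python
import numpy as np

def speech_spectrogram(x, win_len=126, hop=50):
    """STFT magnitude spectrogram matching the stated settings: 126-sample
    Hann window, step 50. For a 0.2 s clip at 16 kHz (3200 samples) this
    yields 1 + (3200 - 126) // 50 = 62 frames, and the real FFT of a
    126-point frame gives 126 // 2 + 1 = 64 frequency bins: a 64x62 image."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop : i * hop + win_len] * window
                       for i in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames, axis=1))
    return spec.T  # shape (64, 62): frequency x time
```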
In a preferred embodiment, the algorithm model convolves in the frequency dimension and pools in the time dimension, specifically with 3x1 convolution kernels and 1x2 pooling, which matches the feature distribution of the time-frequency image: the feature distribution of the speech time-frequency image is independent between adjacent speech frames yet consistent within particular frequency bands.
In a preferred embodiment, the algorithm model uses deep learning as a data-driven technique.
In a preferred embodiment, the re-recording device introduces variation in the frequency domain of the original sound signal, and the deep-learning model takes the original audio signal as the input data of the network.
In a preferred embodiment, when the algorithm model performs convolution in the frequency dimension, it does not consider correlation in the time dimension, and while convolving in the frequency dimension it simultaneously pools in the time dimension.
In a preferred embodiment, the convolution kernels share parameters, so the identically distributed device characteristic information along the time dimension repeatedly trains the kernel parameters; the pooling layer uses 1x2 pooling in the time dimension and no pooling in the frequency dimension.
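The frequency-only convolution and time-only pooling described above can be illustrated at the shape level (a NumPy sketch with hypothetical kernel values; "same" zero-padding along frequency is an assumption, as the patent does not state the padding):

```python
import numpy as np

def conv3x1_freq(x, kernel):
    """'Same'-padded convolution along the frequency axis only (3x1 kernel):
    each output bin mixes a bin with its two frequency neighbours. Time frames
    are processed independently, so the kernel parameters are shared across
    every frame."""
    padded = np.pad(x, ((1, 1), (0, 0)))  # zero-pad frequency axis only
    return (kernel[0] * padded[:-2] + kernel[1] * padded[1:-1]
            + kernel[2] * padded[2:])

def pool1x2_time(x):
    """Max pooling with a 1x2 window along the time axis: halves the number
    of frames and leaves the frequency dimension untouched."""
    t = x.shape[1] // 2
    return np.maximum(x[:, 0:2 * t:2], x[:, 1:2 * t:2])
```

Applied to a 64x62 time-frequency image, the convolution keeps the shape at 64x62 while the pooling reduces it to 64x31, consistent with the scheme of pooling only along time.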
Technical effects and advantages of the invention:

1. The invention uses the time-frequency image as the data input form of the network: compared with feeding in raw speech data directly, the characteristic information introduced by the re-recording device has a relatively dense distribution, which is more conducive to neural-network feature extraction, thereby accelerating training and improving precision;

2. The invention convolves in the frequency dimension and pools in the time dimension, specifically with 3x1 convolution kernels and 1x2 pooling. Convolving only in the frequency dimension, without considering correlation in the time dimension, greatly reduces the number of kernel parameters, so the model resists over-fitting better and depends less heavily on the amount of data; meanwhile, because the kernels share parameters during training, the identically distributed device characteristic information along the time dimension repeatedly trains the kernel parameters, making training more thorough;

3. The invention does not need to manually select one or more specific features and then classify with a classifier, as traditional machine-learning methods do; it can spontaneously extract the relevant features, including shallow edge features and deep features, and then classify, simplifying the whole pipeline and achieving a better effect;

4. The algorithm of the invention detects speech re-recorded with different recording devices, recording environments and recording distances with very high accuracy.
Brief description of the drawings

Fig. 1 is a schematic diagram of the algorithm model structure of the invention.

Fig. 2 is a schematic diagram of the voice re-recording process of the invention.
Specific embodiments

The technical solutions in the embodiments of the present invention will now be described clearly and completely with reference to the drawings. The described embodiments are obviously only a part of the embodiments of the invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the invention without creative effort shall fall within the protection scope of the invention.
Embodiment 1

As shown in Fig. 1, a convolutional-neural-network-based re-recorded speech detection algorithm robust to different scenes: the algorithm model has 7 layers in total, each containing one convolutional layer and one pooling layer; the output of each convolutional layer passes through a rectified linear unit, residual connections are added between layers, the final features are extracted by global pooling, and the detection result is predicted through a sigmoid. Convolving in the frequency dimension and pooling in the time dimension, specifically with 3x1 convolution kernels and 1x2 pooling, maximally reduces the model capacity, greatly lowering the risk of over-fitting and the model's dependence on data volume, while closely matching the feature-distribution characteristics of the time-frequency image; the training parameters are thus allocated to more reasonable places, and a more compact set of parameters is trained on more effective features.

The speech time-frequency image is generated by a short-time Fourier transform. Compared with feeding in raw speech data directly, the characteristic information introduced by the re-recording device has a relatively dense distribution in the time-frequency image, which is more conducive to neural-network feature extraction, thereby accelerating training and improving precision. The re-recording device introduces variation in the frequency domain of the original sound signal, and the performance of a deep-learning model depends strongly on the data: with the original audio signal as the network input, the feature distribution is too sparse, which greatly increases the difficulty of extracting effective features.
Embodiment 2

As shown in Fig. 2, a convolutional-neural-network-based re-recorded speech detection algorithm robust to different scenes: re-recording causes a certain degree of distortion of the speech data, including amplitude distortion and linear stretching on the time axis, where the distortion-model expression is

y(t) = λ·x(αt) + η(t)

where y(t) is the re-recorded speech, x(t) is the original speech, λ is the amplitude transformation factor, α is the linear time-axis stretching factor, and η(t) is the superimposed noise;

the corresponding frequency-domain expression is

Y(jω) = (λ/α)·X(jω/α) + N(jω)

where Y(jω), X(jω) and N(jω) are the frequency-domain representations of y(t), x(t) and η(t) respectively; for a fixed recording device these characteristics are highly stable, i.e. λ and α are constants.
Embodiment 3

In this embodiment, 0.2-second speech segments are used as experimental data; the short-time Fourier transform uses a 126-sample Hann (hanning) window with a step size of 50, and the time-frequency image has size 64x62.
Further, in the above technical solution, convolution is performed in the frequency dimension while pooling is performed in the time dimension. Convolving only in the frequency dimension, without considering correlation in the time dimension, greatly reduces the number of kernel parameters, so the model resists over-fitting better and depends less heavily on the amount of data; meanwhile, because the kernels share parameters during training, the identically distributed device characteristic information along the time dimension repeatedly trains the kernel parameters, making training more thorough. The pooling layer uses 1x2 pooling in the time dimension and no pooling in the frequency dimension. Pooling reduces the dimensionality of the features, speeds up network computation, and makes the network structure more robust to stretching and deformation of the data features. For a time-frequency image, whose feature distribution exhibits no such stretching or deformation, pooling only in the time dimension reduces the feature dimensionality without losing frequency-dimension characteristics. After the multilayer convolution and pooling computation, the feature eventually becomes one-dimensional, with length equal to the number of frequencies in the time-frequency image.
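The statement above can be checked with a shape trace through the seven conv+pool layers: "same" 3x1 frequency convolution keeps 64 frequency bins, and each 1x2 time pooling roughly halves the frame count until only the frequency axis remains. Ceiling rounding at odd frame counts is an assumption here, since the patent does not specify how odd sizes are handled:

```python
import math

def time_dim_trace(freq_bins=64, frames=62, layers=7):
    """Trace feature-map shape through the seven conv+pool layers.
    3x1 'same' convolution keeps the frequency dimension fixed; each 1x2
    time pooling roughly halves the frame count (ceil rounding assumed).
    After the last layer, the feature is one-dimensional: a 64-length
    vector, matching the time-frequency image's frequency count."""
    shapes = [(freq_bins, frames)]
    t = frames
    for _ in range(layers):
        t = max(1, math.ceil(t / 2))
        shapes.append((freq_bins, t))
    return shapes
```

Under this assumption the time dimension shrinks 62 → 31 → 16 → 8 → 4 → 2 → 1 → 1, leaving a 64-length feature vector for the sigmoid read-out.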
Further, in the above technical solution, the original speech library consists of 30000 speech segments recorded by 60 people in total, with a sampling frequency of 16 kHz and a quantization precision of 16 bits.

The speech of 10 randomly selected speakers is used as test data and the speech of the remaining 50 people for training, guaranteeing the independence of the training and test data and preventing recordings of the same speaker from appearing in different data sets.
The specific recording procedure is as follows: for the training set, the original speech library is re-recorded 4 times in a quiet environment with different combinations of distance and device, yielding 4 re-recorded speech banks of 25000 segments each; 25000 segments in total are drawn at random from the 4 banks as negative samples and, together with the original speech, form a training data set of 50000 segments in total. The original speech was played back through a Lenovo Y40-70AT-IFI laptop; the re-recording devices were a Dell Inspiron 14 (Ins14VD-258) laptop and a Xiaomi Mi 2S smartphone.
The 4 recording setups are shown in Table 1:

Table 1: recorded speech

For the test data, the same recording setups as in Table 1 are used. To verify the model's robustness to interference from random environmental noise, recordings were made both in a quiet environment and in an environment with a certain amount of random noise. The test set comprises 4 speech banks in total; each bank contains 10000 test utterances in total, recorded under that bank's setup in the quiet environment and with ambient noise.
Further, in the above technical solution, the network error function is the cross-entropy loss, and training uses the Adam optimization algorithm. The initial learning rate is set to 0.001 and is adjusted dynamically during training: every 10000 training steps the learning rate is halved. The training batch size is 32. To monitor the training effect during the process, 2000 samples are randomly chosen from the training data for validation, and the training-data loss function is compared against the validation-data loss function. Adding a regularization term to the loss function, with the regularization coefficient set to 0.0001, effectively prevents over-fitting.
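The learning-rate schedule described above (initial 0.001, halved every 10000 steps) can be written as a one-liner; whether a "step" counts batches or samples is an assumption not settled by the text:

```python
def learning_rate(step, base_lr=0.001, halve_every=10000):
    """Schedule described in the text: start at 0.001 and halve the
    learning rate after every 10000 training steps."""
    return base_lr * 0.5 ** (step // halve_every)
```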
Table 2 lists some important hyperparameter settings used during training; under these settings the network converges quickly and finally attains a very high accuracy.

Table 2: hyperparameters (β1 and β2 are Adam optimizer parameters)
Further, in the above technical solution, this embodiment comprises 4 test experiments, conducted with different recording devices and at different recording distances; the results of each experiment are shown in Table 3:

Table 3: experimental results

The accuracy of the test experiments reaches 99.8% or above in every case, confirming that the experimental model generalizes well.
Finally: the above is only a preferred embodiment of the present invention and is not intended to limit the invention; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.
Claims (7)
1. A convolutional-neural-network-based re-recorded speech detection algorithm robust to different scenes, characterized by comprising the following steps:

A. The original speech is collected with a recording device and passed through DA/AD conversion to obtain the re-recorded speech;

B. The original speech is distorted during the conversion; the distortion of the original speech is computed by a distortion model whose expression is

y(t) = λ·x(αt) + η(t)

where y(t) is the re-recorded speech, x(t) is the original speech, λ is the amplitude transformation factor, α is the linear time-axis stretching factor, and η(t) is the superimposed noise;

the corresponding frequency-domain expression is

Y(jω) = (λ/α)·X(jω/α) + N(jω)

where Y(jω), X(jω) and N(jω) are the frequency-domain representations of y(t), x(t) and η(t) respectively; for a fixed recording device these characteristics are highly stable, i.e. λ and α are constants;

C. The re-recorded speech is transformed by a short-time Fourier transform to produce the speech time-frequency image;

D. The speech time-frequency image is fed into the algorithm model; the model comprises seven layers, each containing one convolutional layer and one pooling layer; the output of each convolutional layer passes through a rectified linear unit, residual connections are added between layers, the final features are extracted by global pooling, and the detection result is predicted through a sigmoid.
2. The convolutional-neural-network-based re-recorded speech detection algorithm robust to different scenes according to claim 1, characterized in that: when the re-recorded speech is transformed, the short-time Fourier transform uses a 126-sample Hann (hanning) window with a step size of 50, and the time-frequency image has size 64x62.
3. The convolutional-neural-network-based re-recorded speech detection algorithm robust to different scenes according to claim 1, characterized in that: the algorithm model convolves in the frequency dimension and pools in the time dimension, specifically with 3x1 convolution kernels and 1x2 pooling, which matches the feature distribution of the time-frequency image: the feature distribution of the speech time-frequency image is independent between adjacent speech frames yet consistent within particular frequency bands.
4. The convolutional-neural-network-based re-recorded speech detection algorithm robust to different scenes according to claim 3, characterized in that: the algorithm model uses deep learning as a data-driven technique.
5. The convolutional-neural-network-based re-recorded speech detection algorithm robust to different scenes according to claim 4, characterized in that: the re-recording device introduces variation in the frequency domain of the original sound signal, and the deep-learning model takes the original audio signal as the input data of the network.
6. The convolutional-neural-network-based re-recorded speech detection algorithm robust to different scenes according to claim 3, characterized in that: when the algorithm model performs convolution in the frequency dimension, it does not consider correlation in the time dimension, and while convolving in the frequency dimension it simultaneously pools in the time dimension.
7. The convolutional-neural-network-based re-recorded speech detection algorithm robust to different scenes according to claim 3, characterized in that: the convolution kernels share parameters, so the identically distributed device characteristic information along the time dimension repeatedly trains the kernel parameters; the pooling layer uses 1x2 pooling in the time dimension and no pooling in the frequency dimension.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910085725.8A CN109872720B (en) | 2019-01-29 | 2019-01-29 | Re-recorded voice detection algorithm for different scene robustness based on convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109872720A true CN109872720A (en) | 2019-06-11 |
CN109872720B CN109872720B (en) | 2022-11-22 |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110211604A (en) * | 2019-06-17 | 2019-09-06 | Guangdong Polytechnic Normal University | A deep residual network structure for voice deformation detection |
CN110689902A (en) * | 2019-12-11 | 2020-01-14 | 北京影谱科技股份有限公司 | Audio signal time sequence processing method, device and system based on neural network and computer readable storage medium |
CN110797031A (en) * | 2019-09-19 | 2020-02-14 | 厦门快商通科技股份有限公司 | Voice change detection method, system, mobile terminal and storage medium |
CN111370028A (en) * | 2020-02-17 | 2020-07-03 | 厦门快商通科技股份有限公司 | Voice distortion detection method and system |
CN111916067A (en) * | 2020-07-27 | 2020-11-10 | 腾讯科技(深圳)有限公司 | Training method and device of voice recognition model, electronic equipment and storage medium |
CN112614483A (en) * | 2019-09-18 | 2021-04-06 | 珠海格力电器股份有限公司 | Modeling method based on residual convolutional network, voice recognition method and electronic equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170092297A1 (en) * | 2015-09-24 | 2017-03-30 | Google Inc. | Voice Activity Detection |
US20180068675A1 (en) * | 2016-09-07 | 2018-03-08 | Google Inc. | Enhanced multi-channel acoustic models |
CN108198561A (en) * | 2017-12-13 | 2018-06-22 | Ningbo University | A pirated-recording speech detection method based on convolutional neural networks |
CN109065030A (en) * | 2018-08-01 | 2018-12-21 | Shanghai University | Ambient sound recognition method and system based on a convolutional neural network |
Non-Patent Citations (1)
Title |
---|
XIANG Shijun, "Research on Robust Audio Watermarking", China Doctoral Dissertations Full-text Database (Doctoral), Information Science and Technology series *
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| CB02 | Change of applicant information | Address: 510665, 293 Zhongshan Avenue, Tianhe District, Guangzhou, Guangdong. Applicant: GUANGDONG POLYTECHNIC NORMAL University |
| GR01 | Patent grant |