CN110136726A - Voice gender estimation method, device, system and storage medium - Google Patents
Voice gender estimation method, device, system and storage medium
- Publication number
- CN110136726A CN110136726A CN201910539105.7A CN201910539105A CN110136726A CN 110136726 A CN110136726 A CN 110136726A CN 201910539105 A CN201910539105 A CN 201910539105A CN 110136726 A CN110136726 A CN 110136726A
- Authority
- CN
- China
- Prior art keywords
- voice
- identified
- voice data
- gender
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G10L17/02 — Speaker identification or verification techniques: preprocessing operations, e.g. segment selection; pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; feature selection or extraction
- G10L17/04 — Speaker identification or verification techniques: training, enrolment or model building
- G10L17/06 — Speaker identification or verification techniques: decision making techniques; pattern matching strategies
- G10L17/18 — Speaker identification or verification techniques: artificial neural networks; connectionist approaches
- G10L25/18 — Speech or voice analysis techniques: the extracted parameters being spectral information of each sub-band
- G10L25/24 — Speech or voice analysis techniques: the extracted parameters being the cepstrum
Abstract
The present invention provides a voice gender estimation method, device, system and storage medium. The method comprises: obtaining voice data to be identified; performing feature extraction on the voice data to be identified to obtain its voice features; and inputting the voice features into a trained voice estimation model to obtain a gender estimate for the voice data to be identified. According to the method, device, system and storage medium of the present invention, after feature extraction is performed on the voice data, a trained voice gender estimation model performs the estimation, so that gender can be estimated quickly and accurately even in complex acoustic environments and across different languages, improving the user experience.
Description
Technical field
The present invention relates to the field of voice processing, and more specifically to voice gender estimation.
Background technique
With the development of information technology and growing demands on social security, there is an urgent need for applications such as automatic identity authentication and personal-information profiling, and biometric recognition has therefore become one of the research hotspots of the computer industry. Current biometric technologies include facial recognition, fingerprint recognition, voiceprint recognition, gender identification, age estimation, ethnicity identification, expression recognition, gait recognition and trajectory recognition, with faces, irises, fingerprints, voice and gait as the main carriers of biometric information. An individual's biometric features are generally unique, so an individual's identity can be recognized from one or more of them; at the same time, biometric information within the same community often shows strong similarity and correlation, for example with respect to age, gender or ethnicity.
However, as modes of social interaction diversify, there are many scenarios in which biometric images such as portraits or iris data cannot be collected and only other information, such as voice, is available. Research on voice propagation, voice attributes and feature analysis has therefore attracted increasing attention. Noise from different scenes and environments, different ages, different languages and even different moods greatly increase the complexity of recognizing a speaker's voice. Current voice gender estimation methods are mainly time-series based; their key step is building a recurrent neural network model such as an RNN or LSTM, and such methods struggle to estimate accurately when the background is complex.
In the prior art, voice gender estimation is therefore strongly affected by background noise and differing language environments, which results in low gender recognition accuracy and slow speed and degrades the user experience.
Summary of the invention
The present invention is proposed in view of the above problems. It provides a voice gender estimation method, device, system and computer storage medium in which, after feature extraction is performed on voice data, a trained voice gender estimation model performs the estimation, so that gender can be estimated quickly and accurately in complex acoustic environments and across different languages.
According to a first aspect of the invention, a voice gender estimation method is provided, comprising:
obtaining voice data to be identified;
performing feature extraction on the voice data to be identified to obtain voice features of the voice data to be identified;
inputting the voice features into a trained voice estimation model to obtain a gender estimate for the voice data to be identified.
Optionally, obtaining the voice data to be identified further includes aligning and/or pre-emphasizing the voice data to be identified.
Optionally, performing feature extraction on the voice data to be identified to obtain its voice features comprises:
framing the voice data to be identified and applying a Hamming window to each frame after framing;
applying a Fourier transform, fast Fourier transform or short-time Fourier transform to each windowed frame to obtain vector features;
converting the amplitude spectrum of the vector features into a power spectrum;
applying mel filtering to the power spectrum to obtain mel cepstrum features as the voice features of the voice data to be identified.
Optionally, the method also includes:
performing feature extraction on labelled voice training data to obtain training voice features;
training a neural network on the training voice features and their corresponding labels to obtain the trained voice estimation model.
Optionally, inputting the voice features into the trained voice estimation model to obtain the gender estimate for the voice data to be identified comprises:
inputting the voice features into the trained voice estimation model to obtain label probabilities for the voice features;
taking the label with the highest probability among the label probabilities as the gender estimate.
Optionally, the trained voice estimation model includes a convolutional neural network.
Optionally, the gender estimate is one of male, female, or no voice.
According to a second aspect of the invention, a voice gender estimation device is provided, comprising:
a data acquisition module for obtaining voice data to be identified;
a feature extraction module for performing feature extraction on the voice data to be identified to obtain its voice features;
an identification module for inputting the voice features into a trained voice estimation model to obtain a gender estimate for the voice data to be identified.
According to a third aspect of the invention, a voice gender estimation system is provided, comprising a memory, a processor and a computer program stored on the memory and running on the processor, wherein the processor implements the steps of the method of the first aspect when executing the computer program.
According to a fourth aspect of the invention, a computer storage medium is provided on which a computer program is stored, wherein the steps of the method of the first aspect are implemented when the computer program is executed by a computer.
With the voice gender estimation method, device, system and computer storage medium of embodiments of the present invention, feature extraction is performed on voice data and a trained voice gender estimation model then performs the estimation, so that gender can be estimated quickly and accurately in complex acoustic environments and across different languages, improving the user experience.
Detailed description of the invention
The above and other objects, features and advantages of the present invention will become more apparent from the following detailed description of embodiments taken in conjunction with the accompanying drawings. The drawings provide a further understanding of the embodiments of the invention, form a part of the specification, serve to explain the invention together with its embodiments, and are not to be construed as limiting it. In the drawings, identical reference labels generally denote the same components or steps.
Fig. 1 is a schematic flow chart of a voice gender estimation method according to an embodiment of the present invention;
Fig. 2 is an example of a voice gender estimation method according to an embodiment of the present invention;
Fig. 3 is a schematic block diagram of a voice gender estimation device according to an embodiment of the present invention;
Fig. 4 is a schematic block diagram of a voice gender estimation system according to an embodiment of the present invention.
Specific embodiment
To make the objects, technical solutions and advantages of the present invention more apparent, example embodiments are described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention, and it should be appreciated that the invention is not limited by the example embodiments described here. All other embodiments obtained by those skilled in the art from the embodiments described herein without creative labour shall fall within the scope of the present invention.
Voice gender estimation extracts voiceprint features from a speaker's voice and analyses them using computer deep-learning techniques to judge the speaker's gender. Accurate gender prediction from a speaker's voice allows further related attributes and personal information to be extracted, and it can be applied in many scenarios and terminal environments, such as automated biometric analysis and user profiling in human-computer interaction; it is of great significance in security, human-computer interaction and business services.
Referring to Fig. 1, which shows a voice gender estimation method 100 according to an embodiment of the present invention, the method 100 comprises:
Step S110: obtaining voice data to be identified;
Step S120: performing feature extraction on the voice data to be identified to obtain its voice features;
Step S130: inputting the voice features into a trained voice estimation model to obtain a gender estimate for the voice data to be identified.
Here, voice features match or approximate the auditory perception properties of the human ear and convert the voice signal in the voice data into a form a computer can process; feature extraction turns the waveform into a multi-dimensional vector containing the sound information. Extracting voice features from the voice data to be identified separates the voice signal from background or environmental signals, which prevents those signals from affecting the subsequent gender estimation and improves its accuracy. The voice gender estimation model is obtained by training a neural network on a sufficient quantity of gender-labelled training data, which further enables fast and accurate gender estimation from the voice features of the voice data to be identified. Because the model is trained on diverse and plentiful training data, it generalizes well; it does not characterize a specific identity, but instead assigns the voice features a probability over the gender distribution, from which the corresponding gender estimate is obtained.
Optionally, the voice gender estimation method according to embodiments of the present invention can be implemented in a unit or system with a memory and a processor.
The method can be deployed on a personal terminal, or distributed between a server (or cloud) and a personal terminal. For example, when the method is deployed on a personal terminal, the terminal obtains the voice data to be identified, performs the gender estimation locally, and obtains the gender estimate for the voice data. When the method is distributed between a server (or cloud) and a personal terminal, the terminal obtains the voice data to be identified, the server (or cloud) performs the gender estimation, and the gender estimate is then sent to the personal terminal.
According to embodiments of the present invention, in step S110 the voice data to be identified can be acquired directly or obtained from another data source, and it can be a real-time or non-real-time signal; no restriction applies here.
In one example, obtaining the voice data to be identified comprises acquiring it directly through a microphone. In another example, it comprises obtaining it from another data source: for example, the voice data is acquired by another voice acquisition device and then obtained from that device, or it is obtained from the cloud.
According to embodiments of the present invention, step S110 may further comprise preprocessing the voice data to be identified after it is obtained.
Optionally, preprocessing the voice data to be identified comprises aligning and/or pre-emphasizing it.
In one example, aligning the voice data to be identified comprises at least one of: converting it to a unified encoding format, converting it to the same sample rate and/or channel count, cutting it to the same length, and normalizing it.
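The alignment steps above (cutting to a fixed length, normalization) can be sketched in a few lines of numpy; the function names, the zero-padding for short clips and the choice of peak normalization are our assumptions, not details fixed by the patent:

```python
import numpy as np

def align_audio(signal: np.ndarray, target_len: int) -> np.ndarray:
    """Trim or zero-pad a mono signal to a fixed length
    (one way to 'cut to the same length' as described above)."""
    if len(signal) >= target_len:
        return signal[:target_len]
    return np.pad(signal, (0, target_len - len(signal)))

def normalize_audio(signal: np.ndarray) -> np.ndarray:
    """Peak-normalize to [-1, 1]; a no-op for an all-zero signal."""
    peak = np.max(np.abs(signal))
    return signal / peak if peak > 0 else signal
```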
Pre-emphasizing the voice data to be identified compensates the high-frequency part of the voice signal that is suppressed by the articulatory system, and highlights the high-frequency formants.
In one example, pre-emphasis passes the voice data s(n) through a high-pass filter H(z) = 1 - a*z^(-1), where the pre-emphasis coefficient a satisfies 0.9 < a < 1.0. If the speech sample at time n is x(n), the pre-emphasized result is y(n) = x(n) - a*x(n-1), with n a natural number.
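A minimal numpy sketch of this pre-emphasis formula; the default coefficient of 0.97 and passing the first sample through unchanged are our assumptions (the patent only constrains 0.9 < a < 1.0):

```python
import numpy as np

def pre_emphasis(x: np.ndarray, a: float = 0.97) -> np.ndarray:
    """y(n) = x(n) - a * x(n-1); the first sample has no predecessor
    and is passed through unchanged."""
    y = np.empty_like(x, dtype=float)
    y[0] = x[0]
    y[1:] = x[1:] - a * x[:-1]
    return y
```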
According to embodiments of the present invention, step S120 may further comprise:
framing the voice data to be identified and applying a Hamming window to each frame after framing;
applying a Fourier transform, fast Fourier transform or short-time Fourier transform to each windowed frame to obtain vector features;
converting the amplitude spectrum of the vector features into a power spectrum;
applying mel filtering to the power spectrum to obtain mel cepstrum features as the voice features of the voice data to be identified.
After the pre-emphasis digital filtering of the voice data to be identified, windowing and framing can be carried out. Because the voice signal in the voice data is short-term stationary, it can be regarded as approximately constant over 10-30 ms, so it can be divided into short segments, i.e. frames, for processing. For example, framing can be realized by weighting with a movable window of finite length, typically at about 33-100 frames per second; alternatively the overlapping-segmentation method can be used, in which the overlap between one frame and the next is called the frame shift, and the ratio of frame shift to frame length is generally 0-0.5.
In one example, the voice data to be identified is framed with a frame length of 20 ms and a step of 10 ms.
To make the signal more continuous overall and avoid the Gibbs effect, the framed voice data can be multiplied by a Hamming window: each frame is windowed so that it decays to near 0 at both ends, and after windowing the originally aperiodic voice signal exhibits some properties of a periodic function, which is convenient for the Fourier expansion in subsequent feature extraction.
In one example, each frame of voice data after framing is S(n), n = 0...N-1, where N is the size of each frame; after multiplying by the Hamming window it becomes S'(n) = S(n) * W(n), where W(n, b) = (1 - b) - b*cos(2πn/(N-1)), 0 ≤ n ≤ N-1, and b is a coefficient. Different values of b produce different Hamming windows; b = 0.46 is generally used.
Because the characteristics of a voice signal are generally hard to see from its variation in the time domain, the signal is usually converted into an energy distribution in the frequency domain for observation, and different energy distributions represent the characteristics of different voices. After multiplication by the Hamming window, the voice data must therefore undergo a Fourier transform (FT), fast Fourier transform (FFT) or short-time Fourier transform (STFT) to obtain the energy distribution over the spectrum.
In one example, the amplitude spectrum of the vector features is converted into a power spectrum by taking the squared modulus of the amplitude spectrum.
In one example, mel filtering the power spectrum to obtain mel cepstrum features as the voice features of the voice data to be identified comprises: multiplying the power spectrum by a bank of triangular filters to obtain the log energy output by each filter; and applying a discrete cosine transform to the log energies to obtain mel cepstrum features of order L as the voice features of the voice data to be identified.
The triangular filters smooth the spectrum, eliminate harmonics and highlight the formants of the original voice; they also reduce the amount of computation and accelerate feature extraction, which speeds up the whole voice gender estimation method.
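A hedged single-frame sketch of the chain described above (power spectrum, triangular mel filterbank, log energies, DCT). The filter count (26), the cepstral order (13) and the mel-scale formula are typical choices of ours, not values fixed by the patent:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_from_frame(frame, sr=16000, n_filters=26, n_ceps=13):
    """One windowed frame -> power spectrum -> triangular mel filterbank
    -> log energies -> DCT-II -> first n_ceps mel cepstrum coefficients."""
    n_fft = len(frame)
    power = np.abs(np.fft.rfft(frame)) ** 2 / n_fft  # squared-modulus power spectrum

    # Filter edge frequencies, evenly spaced on the mel scale
    pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)

    # Bank of triangular filters applied to the power spectrum
    fbank = np.zeros((n_filters, len(power)))
    for i in range(n_filters):
        left, centre, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, centre):
            fbank[i, k] = (k - left) / max(centre - left, 1)
        for k in range(centre, right):
            fbank[i, k] = (right - k) / max(right - centre, 1)

    log_energy = np.log(fbank @ power + 1e-10)  # log energy of each filter output

    # DCT-II of the log energies gives the mel cepstrum of order n_ceps
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_filters))
    return dct @ log_energy
```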
Optionally, feature extraction on the voice data to be identified uses linear prediction analysis, perceptual linear prediction coefficients, Tandem and Bottleneck features, filter-bank (Fbank) features, linear prediction cepstrum coefficients, or mel-frequency cepstrum coefficients.
Optionally, the voice features are one of: mel cepstrum coefficients (MFCC), perceptual linear prediction coefficients (PLP), deep features (Deep Feature), or power-normalized cepstral coefficients (PNCC).
Optionally, the method 100 can also include:
performing feature extraction on labelled voice training data to obtain training voice features;
training a neural network on the training voice features and their corresponding labels to obtain the trained voice estimation model.
A voice training database can be established to store the training data with which the neural network is trained to obtain the trained voice estimation model. The database may contain a sufficient quantity of data, e.g. 300,000 speech samples, each with a corresponding gender label: male, female, or no voice (non-speech sound). The samples can be taken from videos, phone calls and recordings, with a wide distribution of environments including film, news, speeches and dialogue, and involving multiple languages; this diversity of the voice training data plays a significant role in improving the generalization and robustness of the model.
In one example, the trained voice estimation model may comprise 7 one-dimensional convolutional layers, 7 BatchNorm layers, 1 pooling layer, 6 ReLU layers and 3 fully connected layers. Specifically:
(1) the first conv layer has kernel size 3, 1024 kernels and stride 1, followed by a BatchNorm layer, a ReLU layer and a one-dimensional MaxPool layer with kernel size 3 and stride 2, outputting 32*500;
(2) the second conv layer has kernel size 5, 32 kernels and stride 1, followed by a BatchNorm layer, a ReLU layer and a one-dimensional MaxPool layer with kernel size 3 and stride 2, outputting 32*248;
(3) the third conv layer has kernel size 3, 64 kernels and stride 1, followed by a BatchNorm layer, a ReLU layer and a one-dimensional MaxPool layer with kernel size 3 and stride 2, outputting 64*122;
(4) the fourth conv layer has kernel size 3, 128 kernels and stride 1, followed by a BatchNorm layer, a ReLU layer and a one-dimensional MaxPool layer with kernel size 3 and stride 2, outputting 128*60;
(5) the fifth, sixth and seventh conv layers each have kernel size 3, 128 kernels and stride 1, each followed by a BatchNorm layer, a ReLU layer and a one-dimensional MaxPool layer with kernel size 3 and stride 2; the seventh layer outputs 128*5;
(6) the eighth conv layer has kernel size 3, 256 kernels and stride 1, followed by a BatchNorm layer, a ReLU layer and a one-dimensional MaxPool layer with kernel size 5, outputting 256*1;
(7) then come 2 fully connected layers: the first with input dimension 256 and output dimension 64, the second with input dimension 64 and output dimension 3;
(8) finally a SoftMax layer.
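The per-layer output lengths quoted above follow from standard one-dimensional convolution and max-pool arithmetic; the patent does not state the padding used, so the exact sizes cannot be reproduced here without assumptions, but the underlying formulas are:

```python
def conv1d_out(length: int, kernel: int, stride: int = 1, pad: int = 0) -> int:
    """Standard output-length formula for a 1-D convolution."""
    return (length + 2 * pad - kernel) // stride + 1

def maxpool1d_out(length: int, kernel: int = 3, stride: int = 2) -> int:
    """Output length of the one-dimensional max-pool layers described above
    (kernel 3, stride 2, no padding assumed)."""
    return (length - kernel) // stride + 1
```

For example, with "same" padding (pad = 1 for kernel 3) a stride-1 conv preserves the length, and each kernel-3/stride-2 pool roughly halves it, which matches the roughly halving sequence of lengths (500, 248, 122, 60, ...) in the text.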
According to embodiments of the present invention, step S130 may further comprise:
inputting the voice features into the trained voice estimation model to obtain label probabilities for the voice features;
taking the label with the highest probability among the label probabilities as the gender estimate.
Optionally, the trained voice estimation model includes a convolutional neural network.
Optionally, the gender estimate is one of male, female, or no voice.
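The decision step, softmax over the model's three outputs followed by an argmax over the labels, can be sketched as follows; the label ordering is our assumption:

```python
import numpy as np

LABELS = ["male", "female", "no_voice"]  # the three classes from the text

def decide(logits: np.ndarray) -> str:
    """Softmax over the model's raw outputs, then take the most
    probable label as the gender estimate."""
    e = np.exp(logits - np.max(logits))  # shift for numerical stability
    probs = e / e.sum()
    return LABELS[int(np.argmax(probs))]
```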
According to embodiments of the present invention, the method 100 can further comprise displaying the gender estimate.
In one embodiment, referring to fig. 2, showing Fig. 2 shows the estimation method of the voice gender of the embodiment of the present invention
Example.As shown in Fig. 2, the estimation method 200 of the voice gender includes:
Firstly, establishing voice training database;It include diversified sufficient amount of instruction in the voice training database
Practice data, the training data has corresponding gender label, and the gender label can be male, women or without voice;Tool
Body may include: collection, clean and according to training data described in the gender label for labelling;
Then, the training data in the voice training database is pre-processed, is can specifically include as follows at least
It is a kind of: the training data being converted into Unified coding format, the training data is converted into identical sample rate (such as
48000) and/or port number, training data is cut to same length, training data is normalized;
And preemphasis and feature extraction are carried out to the training data in the voice training database, specifically it can wrap
It includes:
Training data is obtained by a high-pass filter by preemphasis treated training data;
Framing is carried out to preemphasis treated training data, and Hamming window is added to every frame training data after framing;
Based on adding every frame training data after Hamming window to carry out Fourier transformation or Fast Fourier Transform (FFT) or in short-term
Fourier transformation obtains the vector characteristics of training data;
The amplitude frequency spectrum of the vector characteristics of the training data is converted into power spectrum;
Mel filtering is carried out to the power spectrum of institute's training data, obtains Mel cepstrum feature (if size is 1005*13) institute
State the phonetic feature of training data;
Then, training a neural network with the voice features of the training data and the corresponding gender labels to obtain a trained voice gender estimation model.
Then, after obtaining the voice data to be identified, preprocessing the voice data to be identified, which may specifically include at least one of the following: converting the voice data to be identified to a unified encoding format, converting the voice data to be identified to the same sample rate (e.g., 48000 Hz) and/or number of channels, cutting the voice data to be identified to the same length, and normalizing the voice data to be identified.
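The length alignment and normalization steps above can be sketched in a few lines of NumPy. The 48000 Hz sample rate comes from the text; the one-second target length and the peak-normalization scheme are illustrative assumptions, not values fixed by the method:

```python
import numpy as np

def align(signal: np.ndarray, target_len: int = 48000) -> np.ndarray:
    """Cut or zero-pad a mono signal to target_len samples, then peak-normalize.

    target_len = 48000 corresponds to one second at the 48000 Hz rate
    mentioned in the text (an assumed choice of clip length).
    """
    if len(signal) >= target_len:
        signal = signal[:target_len]                            # cut to the same length
    else:
        signal = np.pad(signal, (0, target_len - len(signal)))  # pad shorter clips
    peak = np.max(np.abs(signal))
    return signal / peak if peak > 0 else signal                # normalize to [-1, 1]
```

Resampling and channel conversion would normally be handled by an audio I/O library before this step.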
Then, performing feature extraction on the voice data to be identified to obtain the Mel cepstral features of the voice data to be identified, which may specifically include:
passing the voice data to be identified through a high-pass filter to obtain the pre-emphasized voice data to be identified;
framing the voice data to be identified, and applying a Hamming window to each frame of voice data to be identified after framing;
performing a Fourier transform, fast Fourier transform, or short-time Fourier transform on each windowed frame of voice data to be identified to obtain the vector features;
converting the amplitude spectrum of the vector features to a power spectrum;
performing Mel filtering on the power spectrum to obtain the Mel cepstral features (of size 1005×13) as the voice features of the voice data to be identified.
Inputting the voice features of the voice data to be identified into the trained voice estimation model to obtain the label probabilities of the voice features;
taking the label with the highest probability among the label probabilities as the gender estimation result, the gender estimation result being male, female, or no voice.
It follows that the voice gender estimation method according to an embodiment of the present invention performs feature extraction on the voice data and then estimates with the established voice gender estimation model, thereby achieving fast and accurate voice gender estimation under complex voice conditions and across different languages, improving the user experience.
Fig. 3 shows a schematic block diagram of the voice gender estimation device 300 according to an embodiment of the present invention. As shown in Fig. 3, the voice gender estimation device 300 according to an embodiment of the present invention includes:
a data acquisition module 310 for obtaining the voice data to be identified;
a feature extraction module 320 for performing feature extraction on the voice data to be identified to obtain the voice features of the voice data to be identified;
an identification module 330 for inputting the voice features into a trained voice estimation model to obtain the gender estimation result of the voice data to be identified.
According to embodiments of the present invention, the data acquisition module 310 can obtain the voice data to be identified by acquiring it directly or by obtaining it from another data source; the voice data can be a real-time signal or a non-real-time signal, without restriction here.
In one example, the data acquisition module 310 obtains the voice data to be identified by picking up sound directly through a microphone.
In one example, the data acquisition module 310 obtains the voice data to be identified from another data source. For example, the voice data to be identified is acquired by another voice acquisition device, and the data acquisition module then obtains it from that device; or the voice data to be identified is obtained from the cloud.
According to embodiments of the present invention, the device 300 may further include:
a preprocessing module 340 for preprocessing the voice data to be identified after it is obtained.
Optionally, the preprocessing module 340 preprocesses the voice data to be identified by aligning and/or pre-emphasizing the voice data to be identified.
In one example, the preprocessing module 340 aligns the voice data to be identified by at least one of the following: converting the voice data to be identified to a unified encoding format, converting the voice data to be identified to the same sample rate and/or number of channels, cutting the voice data to be identified to the same length, and normalizing the voice data to be identified.
Pre-emphasizing the voice data to be identified compensates for the high-frequency part of the voice signal that is suppressed by the articulatory system, and highlights the high-frequency formants.
In one example, the preprocessing module 340 pre-emphasizes the voice data to be identified by passing the voice data s(n) through a high-pass filter H(z) = 1 - a*z^(-1), where the pre-emphasis coefficient a satisfies 0.9 < a < 1.0. If the speech sample value at time n is x(n), the pre-emphasized result is y(n) = x(n) - a*x(n-1), where n is a natural number.
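The filter y(n) = x(n) - a*x(n-1) above can be sketched directly in NumPy; a = 0.97 is a common choice within the stated 0.9 < a < 1.0 range, not a value given by the text:

```python
import numpy as np

def pre_emphasis(x: np.ndarray, a: float = 0.97) -> np.ndarray:
    """Apply the high-pass filter H(z) = 1 - a*z^-1.

    y(n) = x(n) - a*x(n-1); the first sample is passed through
    unchanged (an assumed boundary convention).
    """
    return np.append(x[0], x[1:] - a * x[:-1])
```

Applied to a constant (low-frequency) signal, the output after the first sample is nearly zero, which illustrates the high-pass behavior.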
According to embodiments of the present invention, the feature extraction module 320 may further include:
a framing module 321 for framing the voice data to be identified and applying a Hamming window to each frame of voice data to be identified after framing;
a Fourier transform module 322 for performing a Fourier transform, fast Fourier transform, or short-time Fourier transform on each windowed frame of voice data to be identified to obtain the vector features;
a power module 323 for converting the amplitude spectrum of the vector features to a power spectrum;
a voice feature module 324 for performing Mel filtering on the power spectrum to obtain the Mel cepstral features as the voice features of the voice data to be identified.
After the pre-emphasis digital filtering of the voice data to be identified, windowed framing can be performed. Because the voice signal in voice data is short-time stationary, the voice signal can be regarded as approximately unchanged within 10-30 ms, so it can be divided into short segments for processing, i.e., framed. For example, framing of the voice signal can be implemented by weighting with a movable finite-length window, with typically about 33-100 frames per second; alternatively, overlapping segmentation can be used, where the overlap between the previous frame and the next frame is called the frame shift, and the ratio of frame shift to frame length is generally 0-0.5.
In one example, the framing module 321 frames the voice data to be identified with a frame length of 20 ms and a step of 10 ms.
To make the whole more continuous and avoid the Gibbs phenomenon, a Hamming window can be applied to the voice data after framing: each frame signal is multiplied by a Hamming window, attenuating both ends of the frame toward 0. After windowing, the originally aperiodic voice signal exhibits some of the characteristics of a periodic function, which facilitates the Fourier expansion in the subsequent feature extraction.
In one example, the framing module 321 applies a Hamming window to each frame of voice data to be identified after framing as follows: assume each frame of voice data to be identified after framing is S(n), n = 0...N-1, where N is the size of each frame; after applying (multiplying by) the Hamming window, S'(n) = S(n)*W(n), where W(n, b) = (1-b) - b*cos(2πn/(N-1)), 0 ≤ n ≤ N-1, and b is a coefficient. It will be appreciated that different values of b produce different Hamming windows; b = 0.46 is generally used.
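The framing and windowing steps above (20 ms frames, 10 ms step, W(n, b) = (1-b) - b*cos(2πn/(N-1)) with b = 0.46) can be sketched as follows; the 48000 Hz sample rate is the example rate from the text:

```python
import numpy as np

def frame_and_window(x, sample_rate=48000, frame_ms=20, step_ms=10, b=0.46):
    """Split x into overlapping frames and multiply each by a Hamming window."""
    frame_len = int(sample_rate * frame_ms / 1000)   # 20 ms -> 960 samples at 48 kHz
    step = int(sample_rate * step_ms / 1000)         # 10 ms -> 480 samples at 48 kHz
    n_frames = 1 + max(0, (len(x) - frame_len) // step)
    # W(n, b) = (1-b) - b*cos(2*pi*n / (N-1)); b = 0.46 gives the usual Hamming window
    window = (1 - b) - b * np.cos(2 * np.pi * np.arange(frame_len) / (frame_len - 1))
    frames = np.stack([x[i * step : i * step + frame_len] for i in range(n_frames)])
    return frames * window                           # S'(n) = S(n) * W(n) per frame
```

A one-second clip at 48000 Hz yields 99 frames of 960 samples each with these settings.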
Because the characteristics of a voice signal are generally hard to discern from its variation in the time domain, the signal is usually converted to an energy distribution in the frequency domain for observation; different energy distributions can represent the characteristics of different voices. So after multiplication by the Hamming window, the voice data must also undergo a Fourier transform (FT), fast Fourier transform (FFT), or short-time Fourier transform (STFT) to obtain the energy distribution on the spectrum.
In one example, the power module 323 converts the amplitude spectrum of the vector features to the power spectrum by taking the squared magnitude of the amplitude spectrum of the vector features.
In one example, the voice feature module 324 performs Mel filtering on the power spectrum to obtain the Mel cepstral features as the voice features of the voice data to be identified by:
multiplying the power spectrum by a group of triangular filters to obtain the logarithmic energy output by each filter;
performing a discrete cosine transform on the logarithmic energies to obtain L-order Mel cepstral features as the voice features of the voice data to be identified.
The triangular filters smooth the spectrum, eliminate harmonics, and highlight the formants of the original voice; they also reduce the amount of computation and speed up feature extraction, thereby speeding up the entire voice gender estimation method.
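The chain above (power spectrum, triangular Mel filters, logarithm, DCT) can be sketched per frame in NumPy. The 13 output coefficients match the 1005×13 feature size mentioned in the text; the 26-filter bank is a common illustrative choice, not a value given by the method:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_frame(frame, sample_rate=48000, n_filters=26, n_ceps=13):
    """MFCC of one windowed frame: |FFT|^2 -> triangular Mel filters -> log -> DCT."""
    n_fft = len(frame)
    power = np.abs(np.fft.rfft(frame)) ** 2               # power spectrum of the frame
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    # Filter centers evenly spaced on the Mel scale between 0 and Nyquist
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_filters + 2)
    hz_pts = mel_to_hz(mel_pts)
    fbank = np.zeros((n_filters, len(freqs)))
    for i in range(n_filters):                            # build triangular filters
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fbank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    log_energy = np.log(fbank @ power + 1e-10)            # log energy per filter
    # DCT-II of the log energies, keeping the first n_ceps coefficients
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_filters))
    return basis @ log_energy
```

Stacking the per-frame outputs over a clip's frames would produce a frames × 13 feature matrix such as the 1005×13 example in the text.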
Optionally, performing feature extraction on the voice data to be identified includes performing feature extraction using linear predictive analysis, perceptual linear prediction coefficients, Tandem and Bottleneck features, filter-bank-based Fbank features, linear prediction cepstral coefficients, or Mel-frequency cepstral coefficients.
Optionally, the voice features include one of the following: Mel-frequency cepstral coefficients (MFCC), perceptual linear prediction coefficients (PLP), deep features (Deep Feature), and power-normalized cepstral coefficients (PNCC).
Optionally, the device 300 can also include:
a model module 350 for performing feature extraction on labeled voice training data to obtain training voice features, and for training a neural network based on the training voice features and the corresponding labels to obtain the trained voice estimation model.
The model module 350 can establish a voice training database for storing the training data with which the neural network is trained to obtain the trained voice estimation model. The voice training database may include a sufficient volume of data, for example 300,000 voice samples, each sample having a corresponding gender label: male, female, or no voice (non-speech sound). The samples can be collected from video, calls, and recordings, with a wide distribution of environments including films, news, speeches, and dialogues, and covering multiple languages; the diversity of the voice training data plays a significant role in improving the generalization and robustness of the model.
In one example, the trained voice estimation model may include 7 one-dimensional convolutional layers, 7 BatchNorm layers, 1 pooling layer, 6 ReLU layers, and 3 fully connected layers. Specifically: (1) the first conv layer has kernel size 3, 1024 kernels, and stride 1, followed in turn by a BatchNorm layer, a ReLU layer, and a one-dimensional MaxPool layer with kernel size 3 and stride 2, outputting 32×500; (2) the second layer has kernel size 5, 32 kernels, and stride 1, followed by a BatchNorm layer, a ReLU layer, and a one-dimensional MaxPool layer with kernel size 3 and stride 2, outputting 32×248; (3) the third conv layer has kernel size 3, 64 kernels, and stride 1, followed by a BatchNorm layer, a ReLU layer, and a one-dimensional MaxPool layer with kernel size 3 and stride 2, outputting 64×122; (4) the fourth layer has kernel size 3, 128 kernels, and stride 1, followed by a BatchNorm layer, a ReLU layer, and a one-dimensional MaxPool layer with kernel size 3 and stride 2, outputting 128×60; (5) the fifth, sixth, and seventh conv layers each have kernel size 3, 128 kernels, and stride 1, each followed by a BatchNorm layer, a ReLU layer, and a one-dimensional MaxPool layer with kernel size 3 and stride 2, the seventh layer outputting 128×5; (6) the eighth conv layer has kernel size 3, 256 kernels, and stride 1, followed by a BatchNorm layer, a ReLU layer, and a one-dimensional MaxPool layer with kernel size 5, outputting 256×1; (7) then come 2 fully connected layers, the first with input dimension 256 and output dimension 64, and the second with input dimension 64 and output dimension 3; (8) finally a SoftMax layer.
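A rough PyTorch sketch of a network in this spirit (stacked Conv1d → BatchNorm → ReLU → MaxPool blocks, two fully connected layers, and a 3-way softmax) is shown below. Because the layer counts and output sizes stated above do not all reconcile, the channel widths here follow the text only loosely; treat this as an illustrative shape, not the exact patented model:

```python
import torch
import torch.nn as nn

def block(c_in, c_out, k=3):
    """One Conv1d -> BatchNorm -> ReLU -> one-dimensional MaxPool unit."""
    return nn.Sequential(
        nn.Conv1d(c_in, c_out, kernel_size=k, padding=k // 2),
        nn.BatchNorm1d(c_out),
        nn.ReLU(),
        nn.MaxPool1d(kernel_size=3, stride=2, padding=1),
    )

class GenderNet(nn.Module):
    """Illustrative 1-D CNN ending in two FC layers and a 3-way softmax."""
    def __init__(self, n_mfcc=13):
        super().__init__()
        self.features = nn.Sequential(
            block(n_mfcc, 32), block(32, 32, k=5), block(32, 64),
            block(64, 128), block(128, 128), block(128, 128),
            block(128, 256),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, 3),          # male / female / no voice
            nn.Softmax(dim=1),
        )

    def forward(self, x):              # x: (batch, n_mfcc, frames)
        return self.head(self.features(x))
```

Feeding a batch of 1005-frame, 13-coefficient MFCC matrices (the feature size from the text) produces one 3-way probability vector per clip.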
According to embodiments of the present invention, the identification module 330 may further include:
a probability estimation module 331 for inputting the voice features into the trained voice estimation model to obtain the label probabilities of the voice features;
a target module 332 for taking the label with the highest probability among the label probabilities as the gender estimation result.
Optionally, the trained voice estimation model includes a convolutional neural network.
Optionally, the gender estimation result is male, female, or no voice.
According to embodiments of the present invention, the device 300 can further include:
a display module 360 for displaying the gender estimation result.
Fig. 4 shows a schematic block diagram of the voice gender estimation system 400 according to an embodiment of the present invention. The voice gender estimation system 400 includes a storage device 410 and a processor 420.
The storage device 410 stores program code for implementing the corresponding steps of the voice gender estimation method according to an embodiment of the present invention.
The processor 420 runs the program code stored in the storage device 410 to execute the corresponding steps of the voice gender estimation method according to an embodiment of the present invention, and to implement the data acquisition module 310, feature extraction module 320, and identification module 330 of the voice gender estimation device according to an embodiment of the present invention.
In addition, according to embodiments of the present invention, a storage medium is also provided, on which program instructions are stored. When the program instructions are run by a computer or processor, they execute the corresponding steps of the voice gender estimation method of the embodiment of the present invention, and implement the corresponding modules of the voice gender estimation device according to an embodiment of the present invention. The storage medium may include, for example, the memory card of a smartphone, the storage unit of a tablet computer, the hard disk of a personal computer, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, or any combination of the above storage media. The computer-readable storage medium can be any combination of one or more computer-readable storage media; for example, one computer-readable storage medium contains computer-readable program code for randomly generating an action instruction sequence, and another computer-readable storage medium contains computer-readable program code for carrying out the estimation of voice gender.
In one embodiment, the computer program instructions, when run by a computer, may implement each functional module of the voice gender estimation device according to an embodiment of the present invention, and/or may execute the voice gender estimation method according to an embodiment of the present invention.
Each module in the voice gender estimation system according to an embodiment of the present invention can be implemented by computer program instructions stored in memory and run by the processor of an electronic device for voice gender estimation according to an embodiment of the present invention, or can be implemented when computer instructions stored in the computer-readable storage medium of a computer program product according to an embodiment of the present invention are run by a computer.
With the voice gender estimation method, device, system, and storage medium according to embodiments of the present invention, feature extraction is performed on the voice data and estimation is then performed with the established voice gender estimation model, thereby achieving fast and accurate voice gender estimation under complex voice conditions and across different languages, improving the user experience.
Those of ordinary skill in the art may be aware that the units and algorithm steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled professionals may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the present invention.
The above description is merely of specific embodiments, or explanations of specific embodiments, and the protection scope of the present invention is not limited thereto. Any person skilled in the art can readily conceive of changes or substitutions within the technical scope disclosed by the present invention, and these should be covered by the protection scope of the present invention. The protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A voice gender estimation method, characterized in that the method includes:
obtaining voice data to be identified;
performing feature extraction on the voice data to be identified to obtain voice features of the voice data to be identified;
inputting the voice features into a trained voice estimation model to obtain a gender estimation result of the voice data to be identified.
2. The method of claim 1, characterized in that obtaining the voice data to be identified further includes: aligning and/or pre-emphasizing the voice data to be identified.
3. The method of claim 1, characterized in that performing feature extraction on the voice data to be identified to obtain the voice features of the voice data to be identified includes:
framing the voice data to be identified, and applying a Hamming window to each frame of voice data to be identified after framing;
performing a Fourier transform, fast Fourier transform, or short-time Fourier transform on each windowed frame of voice data to be identified to obtain vector features;
converting the amplitude spectrum of the vector features to a power spectrum;
performing Mel filtering on the power spectrum to obtain Mel cepstral features as the voice features of the voice data to be identified.
4. The method of claim 1, characterized in that the method also includes:
performing feature extraction on labeled voice training data to obtain training voice features;
training a neural network based on the training voice features and the corresponding labels to obtain the trained voice estimation model.
5. The method of claim 1, characterized in that inputting the voice features into the trained voice estimation model to obtain the gender estimation result of the voice data to be identified includes:
inputting the voice features into the trained voice estimation model to obtain label probabilities of the voice features;
taking the label with the highest probability among the label probabilities as the gender estimation result.
6. The method of claim 4, characterized in that the trained voice estimation model includes a convolutional neural network.
7. The method of claim 1, characterized in that the gender estimation result includes male, female, or no voice.
8. A voice gender estimation device, characterized in that the device includes:
a data acquisition module for obtaining voice data to be identified;
a feature extraction module for performing feature extraction on the voice data to be identified to obtain voice features of the voice data to be identified;
an identification module for inputting the voice features into a trained voice estimation model to obtain a gender estimation result of the voice data to be identified.
9. A voice gender estimation system including a memory, a processor, and a computer program stored on the memory and runnable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 7.
10. A computer storage medium on which a computer program is stored, characterized in that the computer program, when executed by a computer, implements the steps of the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910539105.7A CN110136726A (en) | 2019-06-20 | 2019-06-20 | A kind of estimation method, device, system and the storage medium of voice gender |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110136726A true CN110136726A (en) | 2019-08-16 |
Family
ID=67578869
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910539105.7A Pending CN110136726A (en) | 2019-06-20 | 2019-06-20 | A kind of estimation method, device, system and the storage medium of voice gender |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110136726A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017113680A1 (en) * | 2015-12-30 | 2017-07-06 | 百度在线网络技术(北京)有限公司 | Method and device for voiceprint authentication processing |
CN108962223A (en) * | 2018-06-25 | 2018-12-07 | 厦门快商通信息技术有限公司 | A kind of voice gender identification method, equipment and medium based on deep learning |
CN109545227A (en) * | 2018-04-28 | 2019-03-29 | 华中师范大学 | Speaker's gender automatic identifying method and system based on depth autoencoder network |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017113680A1 (en) * | 2015-12-30 | 2017-07-06 | 百度在线网络技术(北京)有限公司 | Method and device for voiceprint authentication processing |
CN109545227A (en) * | 2018-04-28 | 2019-03-29 | 华中师范大学 | Speaker's gender automatic identifying method and system based on depth autoencoder network |
CN108962223A (en) * | 2018-06-25 | 2018-12-07 | 厦门快商通信息技术有限公司 | A kind of voice gender identification method, equipment and medium based on deep learning |
Non-Patent Citations (1)
Title |
---|
黄珊: "基于深度学习的说话人性别特征识别研究", 《中国优秀博硕士论文全文数据库(硕士) 信息科技辑》 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110931023A (en) * | 2019-11-29 | 2020-03-27 | 厦门快商通科技股份有限公司 | Gender identification method, system, mobile terminal and storage medium |
CN110931023B (en) * | 2019-11-29 | 2022-08-19 | 厦门快商通科技股份有限公司 | Gender identification method, system, mobile terminal and storage medium |
CN111312286A (en) * | 2020-02-12 | 2020-06-19 | 深圳壹账通智能科技有限公司 | Age identification method, age identification device, age identification equipment and computer readable storage medium |
WO2021175031A1 (en) * | 2020-03-03 | 2021-09-10 | 深圳壹账通智能科技有限公司 | Information prompting method and apparatus, electronic device, and medium |
CN112581942A (en) * | 2020-12-29 | 2021-03-30 | 云从科技集团股份有限公司 | Method, system, device and medium for recognizing target object based on voice |
CN114049881A (en) * | 2021-11-23 | 2022-02-15 | 深圳依时货拉拉科技有限公司 | Voice gender recognition method, device, storage medium and computer equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021208287A1 (en) | Voice activity detection method and apparatus for emotion recognition, electronic device, and storage medium | |
CN110457432B (en) | Interview scoring method, interview scoring device, interview scoring equipment and interview scoring storage medium | |
CN107731233B (en) | Voiceprint recognition method based on RNN | |
CN110136726A (en) | A kind of estimation method, device, system and the storage medium of voice gender | |
CN110909613A (en) | Video character recognition method and device, storage medium and electronic equipment | |
CN106504768B (en) | Phone testing audio frequency classification method and device based on artificial intelligence | |
TW201935464A (en) | Method and device for voiceprint recognition based on memorability bottleneck features | |
CN110211565A (en) | Accent recognition method, apparatus and computer readable storage medium | |
CN109658921B (en) | Voice signal processing method, equipment and computer readable storage medium | |
CN110415701A (en) | The recognition methods of lip reading and its device | |
CN112949708A (en) | Emotion recognition method and device, computer equipment and storage medium | |
CN110570873A (en) | voiceprint wake-up method and device, computer equipment and storage medium | |
CN110473552A (en) | Speech recognition authentication method and system | |
CN114913859B (en) | Voiceprint recognition method, voiceprint recognition device, electronic equipment and storage medium | |
CN110782902A (en) | Audio data determination method, apparatus, device and medium | |
CN112992155B (en) | Far-field voice speaker recognition method and device based on residual error neural network | |
CN114492579A (en) | Emotion recognition method, camera device, emotion recognition device and storage device | |
CN109817223A (en) | Phoneme marking method and device based on audio fingerprints | |
KR102220964B1 (en) | Method and device for audio recognition | |
CN117935789A (en) | Speech recognition method, system, equipment and storage medium | |
CN113823271B (en) | Training method and device for voice classification model, computer equipment and storage medium | |
CN114333844A (en) | Voiceprint recognition method, voiceprint recognition device, voiceprint recognition medium and voiceprint recognition equipment | |
CN114141271A (en) | Psychological state detection method and system | |
Nguyen et al. | Vietnamese speaker authentication using deep models | |
Alkhatib et al. | ASR Features Extraction Using MFCC And LPC: A Comparative Study |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190816 |