CN107481723A - Channel matching method and device for voiceprint recognition - Google Patents
- Publication number
- CN107481723A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
- G10L17/04—Training, enrolment or model building
- G10L17/20—Pattern transformations or operations aimed at increasing system robustness, e.g. against channel noise or different working conditions
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Telephonic Communication Services (AREA)
Abstract
The present invention proposes a channel matching method and device for voiceprint recognition, belonging to the fields of speech recognition and voice communication. The method first acquires speech data and encodes it according to the communication mode to be simulated, yielding compressed speech data. It then applies bit errors to the compressed speech data according to the bit error rate of the simulated channel under that communication mode, yielding channel-simulated speech data. Finally, it decodes the channel-simulated speech data to obtain speech as transmitted under that communication mode. The device comprises a speech acquisition and reading module, a speech encoding module, a channel bit-error simulation module, a speech decoding module, and a data storage module. The invention can simulate voice communication over landline, VoIP, WeChat calls, QQ calls, 2G, 3G, and 4G, and so obtains training speech whose channel conditions match those of the test speech. This effectively solves the channel mismatch problem and suits the application requirements of voiceprint recognition.
Description
Technical field
The present invention relates to the fields of speech recognition and voice communication, and specifically to a channel matching method and device for voiceprint recognition.
Technical background
Voiceprint recognition, also called speaker recognition, is a biometric technology in which a computer automatically determines a speaker's identity from speech. It can be classified in several ways depending on the application scenario. According to whether the speech content is known, voiceprint recognition is either text-dependent or text-independent. According to the recognition task, it is either speaker identification or speaker verification. Voiceprint recognition is mainly used in security monitoring, criminal investigation and judicial work, and e-commerce.
In recent years, mainstream text-independent speaker identification (hereinafter "speaker recognition") has been based on the Gaussian mixture model-universal background model (GMM-UBM) speaker recognition system proposed by Douglas A. Reynolds in 2000. From the standpoint of speaker recognition, the GMM-UBM system provided a theoretical framework and an implementation for measuring the similarity of two speech segments, and it was a landmark contribution.
Voice communication refers to communication by voice over a transmission medium. It includes landline calls, mobile phone calls, walkie-talkie calls, and voice chat over networks, which are collectively called voice calls. Common voice communication modes today include landline telephony, VoIP, WeChat calls, QQ calls, 2G, 3G, and 4G.
The Public Switched Telephone Network (PSTN) is the telephone network used in everyday life. PSTN is a circuit-switched network based on analog technology, and its speech coding algorithm is G.711 with A-law or μ-law companding. VoIP commonly uses the International Telecommunication Union G.723 standard, specifically algebraic code-excited linear prediction (ACELP) coding. WeChat calls and QQ calls use narrowband adaptive multi-rate (AMR-NB) coding. 2G, the second-generation mobile communication technology, includes the GSM and CDMA1x communication systems: China Mobile and China Unicom use the GSM standard for 2G, while China Telecom uses the CDMA1x standard. GSM speech coding is regular pulse excitation with long-term prediction (RPE-LTP). 3G includes TD-SCDMA, developed independently by China Mobile, WCDMA, used by China Unicom, and CDMA2000, used by China Telecom. TD-SCDMA and WCDMA both use adaptive multi-rate coding, either AMR-NB or AMR-WB. China Telecom's 2G and 3G networks use the enhanced variable rate codec (EVRC) or QCELP coding. For 4G, China Mobile uses the TD-LTE (Time Division Long Term Evolution) standard, while China Unicom and China Telecom use the FDD-LTE standard. 4G carries high-definition voice calls over VoLTE, with adaptive multi-rate (AMR) speech coding.
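The mode-to-codec correspondence described above can be captured as a small lookup table. The following is a minimal sketch that only transcribes the paragraph; the dictionary keys, the name `CODECS_BY_MODE`, and the function `codecs_for` are hypothetical and do not come from the patent.

```python
from typing import Dict, List

# Hypothetical lookup table: communication mode -> candidate speech codecs,
# transcribed from the background paragraph. Key names are illustrative only.
CODECS_BY_MODE: Dict[str, List[str]] = {
    "landline": ["G.711 A-law", "G.711 mu-law"],
    "voip": ["G.723 ACELP"],
    "wechat": ["AMR-NB"],
    "qq": ["AMR-NB"],
    "gsm": ["RPE-LTP"],
    "cdma1x": ["EVRC", "QCELP"],
    "td-scdma": ["AMR-NB", "AMR-WB"],
    "wcdma": ["AMR-NB", "AMR-WB"],
    "cdma2000": ["EVRC", "QCELP"],
    "volte": ["AMR"],
}


def codecs_for(mode: str) -> List[str]:
    """Return the candidate codecs for a communication mode (case-insensitive)."""
    key = mode.strip().lower()
    if key not in CODECS_BY_MODE:
        raise ValueError(f"unknown communication mode: {mode!r}")
    return CODECS_BY_MODE[key]
```

A table like this makes the later "select the codec for the chosen mode" step a plain dictionary lookup, for example `codecs_for("GSM")`.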
Because digital voice communication systems are so widely used, the training speech and test speech available to a speaker recognition system in a real environment are often coded differently. Voiceprint recognition then faces a speech channel mismatch caused by the different coding of training and test speech, which severely degrades speaker recognition performance. Solving the channel mismatch problem is one of the keys to improving speaker recognition performance and making speaker recognition systems more practical.
To address channel mismatch in voiceprint recognition, the common approach at present is to develop cross-channel voiceprint modeling algorithms. The main algorithms of this kind are Nuisance Attribute Projection (NAP), Joint Factor Analysis (JFA), speaker recognition modeling based on the identity vector (i-vector), and speaker recognition modeling that combines a speech recognition DNN acoustic model with the i-vector model.
NAP and JFA are subspace models proposed for the channel mismatch problem. NAP directly estimates a channel subspace and then removes that subspace from the GMM mean supervector space, which reduces the interference of channel information with speaker recognition. JFA assumes that the high-dimensional space of GMM mean supervectors contains two subspaces, one carrying speaker information and the other carrying channel information. Modeling the two subspaces jointly separates the speaker information in speech from the channel information more effectively, which improves speaker recognition performance under complex channels. However, the channel subspace in the JFA model still contains a good deal of speaker information, so modeling speaker and channel separately can substantially damage the speaker information. In 2010, Dehak et al. proposed the i-vector model on the basis of JFA. The i-vector model defines only one subspace, called the total variability subspace, which contains both speaker information and channel information. Each speech segment is then represented as a low-dimensional vector in this subspace, the i-vector. Finally, channel compensation applied at the i-vector level weakens the influence of the channel on speaker recognition performance. Compared with the JFA model, the i-vector model is much less complex, its channel compensation within the subspace is more flexible, and it shows better speaker recognition performance, which has made it the most mainstream and advanced speaker recognition modeling method. In 2014, Lei, Kenny, et al. proposed a speaker recognition modeling method that combines a speech recognition DNN acoustic model with the i-vector model: when estimating the sufficient statistics of the i-vector model, the DNN acoustic model that classifies phoneme states in speech recognition replaces the traditional UBM for computing frame posterior probabilities. This method reduces the modeling complexity of the speaker recognition system and clearly improves recognition performance.
Take the commonly used JFA voiceprint recognition method as an example. This method assumes that a given speech segment can be represented by a supervector, which can be expressed as:
$$M_h(s) = m_{ubm} + v\,y(s) + u\,x_h(s) \tag{1}$$
where $M_h(s)$ denotes the supervector of the $h$-th speech segment of a given speaker $s$, $m_{ubm}$ the UBM mean supervector, $v$ the speaker space matrix, $u$ the channel space matrix, $y(s)$ the speaker factor, and $x_h(s)$ the channel factor of the $h$-th speech segment of speaker $s$.
In JFA, the two matrices $v$ and $u$, that is, the speaker space matrix and the channel space matrix, must be estimated first. When a speaker is trained, the speaker factor $y(s)$ and the channel factor $x_h(s)$ are then estimated for each training speech segment, which gives the speaker model corresponding to that training speech:
$$M(s) = m_{ubm} + v\,y(s) \tag{2}$$
At test time, only the channel factor $x_h(s)$ of each test speech segment needs to be estimated; combining it with the speaker model obtained above performs speaker recognition.
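As a worked illustration of formulas (1) and (2), the sketch below evaluates both with toy values. The dimensions and numbers are made up for the example (a real supervector has thousands of entries), and the helper names `matvec`, `add`, `segment_supervector`, and `speaker_model` are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(a: Matrix, x: Vector) -> Vector:
    """Multiply matrix `a` (list of rows) by column vector `x`."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def add(*vectors: Vector) -> Vector:
    """Element-wise sum of equally sized vectors."""
    return [sum(values) for values in zip(*vectors)]


def segment_supervector(m_ubm: Vector, v: Matrix, y: Vector,
                        u: Matrix, x_h: Vector) -> Vector:
    """Formula (1): M_h(s) = m_ubm + v y(s) + u x_h(s)."""
    return add(m_ubm, matvec(v, y), matvec(u, x_h))


def speaker_model(m_ubm: Vector, v: Matrix, y: Vector) -> Vector:
    """Formula (2): M(s) = m_ubm + v y(s), i.e. (1) without the channel term."""
    return add(m_ubm, matvec(v, y))


# Toy sizes (assumed): supervector dimension 3, 2 speaker factors, 1 channel factor.
m_ubm = [1.0, 2.0, 3.0]
v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
u = [[0.5], [0.5], [0.5]]
y = [2.0, -1.0]
x_h = [4.0]

print(segment_supervector(m_ubm, v, y, u, x_h))  # [5.0, 3.0, 6.0]
print(speaker_model(m_ubm, v, y))                # [3.0, 1.0, 4.0]
```

The difference between the two results is exactly the channel term $u\,x_h(s)$, which is the part JFA tries to discard when building the speaker model.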
A JFA-based voiceprint recognition system requires speech data that divides by function into three parts: speech data for training the universal background Gaussian mixture model, speech data for training the target speakers, and speech data to be recognized.
The existing JFA-based voiceprint recognition method comprises a universal model training stage, a speaker space matrix and channel space matrix estimation stage, a speaker model training stage, and a test stage, wherein:
1) The universal model training stage comprises the following steps:
1-a) Through speech preprocessing and feature extraction, convert the speech data for training the universal background Gaussian mixture model into spectral features;
1-b) Based on the extracted spectral features, initialize the universal background Gaussian mixture model with the K-means algorithm;
1-c) Update the universal background Gaussian mixture model initialized in step 1-b) with the expectation-maximization (EM) algorithm.
2) The speaker space matrix and channel space matrix estimation stage comprises the following steps:
2-a) Compute the first-order Baum-Welch statistics of the feature vectors of all speech of a speaker with respect to the Gaussian components of the universal background model, obtaining the corresponding mean supervector;
2-b) Using the mean supervector from step 2-a), estimate the speaker space matrix $v$ by EM iterations;
2-c) Compute the first-order Baum-Welch statistics of the feature vectors of all speech under the same channel with respect to the Gaussian components of the universal background model, obtaining the corresponding mean supervector;
2-d) Using the mean supervector from step 2-c), estimate the channel space matrix $u$ by EM iterations.
3) The speaker model training stage comprises the following steps:
3-a) Through speech preprocessing and feature extraction, convert the speech data for training the target speaker into spectral features;
3-b) Based on the spectral features from step 3-a), compute the Baum-Welch statistics, obtaining the corresponding mean supervector;
3-c) Using the mean supervector from step 3-b), estimate the joint vector of the speaker factor $y(s)$ and the channel factor $x_h(s)$ with the E-step of the EM algorithm, and take its $y(s)$ part;
3-d) Using the speaker space matrix $v$ and channel space matrix $u$ obtained in stage 2) and the $y(s)$ from step 3-c), compute the target speaker model.
In step 3-d), the speaker model corresponding to the training speech is computed according to formula (2).
4) The test stage comprises the following steps:
4-a) Through speech preprocessing and feature extraction, convert the speech to be recognized into spectral features;
4-b) Based on the spectral features from step 4-a), compute the Baum-Welch statistics, obtaining the corresponding mean supervector;
4-c) Using the mean supervector from step 3-b), estimate the corresponding channel factor $x_h(s)$ with the E-step of the EM algorithm;
4-d) Using the speaker model obtained in stage 3), compute a score with the likelihood ratio;
4-e) Take the highest score computed in step 4-d) as the recognition result for the test utterance.
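The Baum-Welch statistics used in steps 2-a), 2-c), 3-b), and 4-b) can be sketched as follows for a diagonal-covariance GMM. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, and forming the mean supervector by concatenating the per-component means $F_c / N_c$ is a simplifying assumption (practical systems typically use MAP adaptation of the UBM means instead).

```python
import math
from typing import List, Tuple

Vector = List[float]


def log_gauss_diag(x: Vector, mean: Vector, var: Vector) -> float:
    """Log density of a diagonal-covariance Gaussian at point x."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))


def baum_welch_stats(frames: List[Vector], weights: Vector,
                     means: List[Vector], variances: List[Vector]
                     ) -> Tuple[Vector, List[Vector]]:
    """Return zeroth-order (N_c) and first-order (F_c) statistics per component."""
    n_comp, dim = len(weights), len(means[0])
    zeroth = [0.0] * n_comp
    first = [[0.0] * dim for _ in range(n_comp)]
    for x in frames:
        log_p = [math.log(w) + log_gauss_diag(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        peak = max(log_p)  # subtract the max before exp() for numerical stability
        unnorm = [math.exp(lp - peak) for lp in log_p]
        total = sum(unnorm)
        for c, p in enumerate(unnorm):
            gamma = p / total  # posterior of component c for this frame
            zeroth[c] += gamma
            for d in range(dim):
                first[c][d] += gamma * x[d]
    return zeroth, first


def mean_supervector(zeroth: Vector, first: List[Vector],
                     means: List[Vector]) -> Vector:
    """Concatenate F_c / N_c over components; fall back to the UBM mean if N_c ~ 0."""
    supervector: Vector = []
    for n_c, f_c, mu_c in zip(zeroth, first, means):
        if n_c > 1e-8:
            supervector.extend(f / n_c for f in f_c)
        else:
            supervector.extend(mu_c)
    return supervector
```

The zeroth-order statistics sum to the number of frames, which is a quick sanity check when wiring this into a larger pipeline.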
The voiceprint recognition methods above can be used to address channel mismatch, but they have their own problems. JFA, for example, requires a very large amount of training data and heavy computation at test time, and in practical applications it is often difficult to obtain good recognition results.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art by providing a channel matching method and device for voiceprint recognition. The invention performs software channel simulation of voice communication, simulating landline, 2G, 3G, 4G, and other voice communication processes, to obtain training speech whose channel conditions match those of the test speech. This effectively solves the channel mismatch problem and suits practical application requirements.
The technical solution adopted by the present invention is as follows:
A channel matching method for voiceprint recognition, characterized in that the method comprises a data acquisition and reading stage, a speech encoding stage, a channel bit-error simulation stage, and a speech decoding stage;
1) The data acquisition and reading stage comprises the following steps:
1-a) Acquire and read the original speech data, where the original speech data is in WAV format;
1-b) Remove the file header of the original speech data according to the WAV format standard, obtaining a pure speech data block;
2) The speech encoding stage comprises the following steps:
2-a) Select the voice communication mode to be simulated according to the original speech data, the voice communication mode being any one of landline, VoIP, WeChat calls, QQ calls, GSM, 3G, and 4G;
2-b) Encode the pure speech data block obtained in step 1-b) according to the speech coding standard corresponding to the selected voice communication mode, obtaining compressed speech data;
3) The channel bit-error simulation stage comprises the following steps:
3-a) Select the corresponding channel according to the voice communication mode selected in step 2-a), and obtain the bit error rate of speech transmission under different signal-to-noise ratios;
3-b) Select a signal-to-noise ratio and apply a bit-error operation to the compressed speech data encoded in step 2-b), obtaining the speech data after channel simulation, which serves as the channel-simulated speech data; the bit-error operation introduces random errors into the compressed speech data according to the bit error rate under the selected communication mode, that is, the bit error rate corresponding to the selected signal-to-noise ratio;
4) The speech decoding stage comprises the following steps:
4-a) Select the corresponding speech decoding algorithm according to the voice communication mode selected in step 2-a);
4-b) Decode the channel-simulated speech data obtained in step 3-b) with the corresponding speech decoding algorithm;
4-c) Add a WAV file header to the decoded speech data, obtaining training speech data whose channel conditions match those of the test speech.
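The four stages above can be sketched end to end for one mode. The example below is a minimal sketch under stated assumptions, not the patent's implementation: it picks the landline mode with G.711 μ-law (one of the codecs named in the background), takes the bit error rate as a caller-supplied number rather than deriving it from a channel model, and all function names and the `seed` parameter are hypothetical.

```python
import io
import random
import struct
import wave

BIAS = 0x84   # G.711 mu-law bias
CLIP = 32635  # largest magnitude accepted before the bias is added


def linear_to_ulaw(sample: int) -> int:
    """Compress one 16-bit PCM sample to an 8-bit mu-law code word."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    exponent, mask = 7, 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_to_linear(code: int) -> int:
    """Expand one 8-bit mu-law code word back to a 16-bit PCM sample."""
    code = ~code & 0xFF
    exponent, mantissa = (code >> 4) & 0x07, code & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if code & 0x80 else magnitude


def read_wav_payload(wav_bytes: bytes):
    """Stage 1: drop the WAV header; return (raw 16-bit PCM bytes, sample rate)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM")
        return wav.readframes(wav.getnframes()), wav.getframerate()


def encode(pcm: bytes) -> bytes:
    """Stage 2: encode PCM with the codec of the chosen mode (mu-law here)."""
    samples = struct.unpack("<%dh" % (len(pcm) // 2), pcm)
    return bytes(linear_to_ulaw(s) for s in samples)


def inject_bit_errors(data: bytes, ber: float, rng: random.Random) -> bytes:
    """Stage 3: flip each bit independently with probability `ber`."""
    out = bytearray(data)
    for i in range(len(out)):
        for bit in range(8):
            if rng.random() < ber:
                out[i] ^= 1 << bit
    return bytes(out)


def decode(codes: bytes) -> bytes:
    """Stage 4-b: decode the (possibly corrupted) code words back to PCM."""
    samples = [ulaw_to_linear(c) for c in codes]
    return struct.pack("<%dh" % len(samples), *samples)


def write_wav(pcm: bytes, rate: int) -> bytes:
    """Stage 4-c: put a WAV header back on the decoded PCM."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(pcm)
    return buf.getvalue()


def simulate_channel(wav_bytes: bytes, ber: float, seed: int = 0) -> bytes:
    """Run stages 1 to 4 and return channel-matched WAV bytes."""
    pcm, rate = read_wav_payload(wav_bytes)
    noisy = inject_bit_errors(encode(pcm), ber, random.Random(seed))
    return write_wav(decode(noisy), rate)
```

Calling `simulate_channel(wav_bytes, ber=0.01)` on a mono 16-bit WAV returns a WAV of the same length whose samples have passed through μ-law quantization and random bit flips. Other modes would swap `encode` and `decode` for the corresponding codec, which usually means calling an external codec library.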
Based on the above method, the present invention also proposes a channel matching device for voiceprint recognition, characterized in that the device comprises the following five modules:
Speech acquisition and reading module: acquires and reads the speaker's original speech data and removes the audio file header from the acquired speech data, obtaining a pure speech data block;
Speech encoding module: selects the voice communication mode to be simulated according to the original speech data and encodes the pure speech data block obtained by the speech acquisition and reading module, obtaining compressed speech data under the corresponding communication mode;
Channel bit-error simulation module: selects the corresponding channel according to the communication mode and performs channel bit-error simulation on the compressed speech data, obtaining channel-simulated speech data;
Speech decoding module: decodes the channel-simulated speech data with the decoding algorithm corresponding to the voice communication mode and adds a file header to the decoded speech, obtaining output speech whose channel conditions match those of the test speech;
Data storage module: stores the original speech data, the compressed speech data, the channel-simulated speech data, and the output speech data whose channel conditions match those of the test speech, and passes the relevant data to the corresponding modules.
Features and beneficial effects of the present invention:
(1) Compared with traditional voiceprint recognition methods, the method applies channel simulation to voiceprint recognition. Channel simulation of the original training speech data alone yields training speech whose channel conditions match those of the test speech, which solves the channel mismatch problem of traditional voiceprint recognition methods.
(2) Compared with cross-channel voiceprint modeling algorithms, the invention only performs channel simulation on the original training speech samples and does not change the voiceprint modeling algorithm. This reduces the complexity of the recognition algorithm, while the recognition performance is better than that of cross-channel voiceprint modeling algorithms. The method is therefore better suited to voiceprint recognition tasks and meets practical application requirements.
Brief description of the drawings
Fig. 1 is a flow block diagram of the method of the present invention.
Fig. 2 is a structural block diagram of the device of the present invention.
Embodiment
The channel matching method and device for voiceprint recognition proposed by the present invention are described below with reference to the drawings.
The flow of the channel matching method for voiceprint recognition proposed by the present invention is shown in Fig. 1. The method comprises a data acquisition and reading stage, a speech encoding stage, a channel bit-error simulation stage, and a speech decoding stage;
1) The data acquisition and reading stage comprises the following steps:
1-a) Acquire and read the original speech data, where the original speech data is in WAV format;
1-b) Remove the file header of the original speech data according to the WAV format standard, obtaining a pure speech data block;
2) The speech encoding stage comprises the following steps:
2-a) Select the voice communication mode to be simulated according to the original speech data, the voice communication mode being any one of landline, VoIP, WeChat calls, QQ calls, GSM, 3G, and 4G;
2-b) Encode the pure speech data block obtained in step 1-b) according to the speech coding standard corresponding to the selected voice communication mode, obtaining compressed speech data;
3) The channel bit-error simulation stage comprises the following steps:
3-a) Select the corresponding channel according to the voice communication mode selected in step 2-a), and obtain the bit error rate of speech transmission over the channel under different signal-to-noise ratios;
3-b) Select a signal-to-noise ratio and apply a bit-error operation to the compressed speech data encoded in step 2-b), obtaining the speech data after channel simulation, which serves as the channel-simulated speech data; the bit-error operation introduces random errors into the compressed speech data according to the bit error rate under the selected communication mode, that is, the bit error rate corresponding to the selected signal-to-noise ratio;
4) The speech decoding stage comprises the following steps:
4-a) Select the corresponding speech decoding algorithm according to the voice communication mode selected in step 2-a);
4-b) Decode the channel-simulated speech data obtained in step 3-b) with the corresponding speech decoding algorithm;
4-c) Add a WAV file header to the decoded speech data, obtaining training speech data whose channel conditions match those of the test speech; this training speech data is then used for voiceprint recognition.
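Step 3-a) requires a bit error rate for each signal-to-noise ratio, but the text does not say how it is obtained. As one possible stand-in, the sketch below uses the textbook bit error rate of uncoded BPSK over an AWGN channel, $P_b = \tfrac{1}{2}\,\mathrm{erfc}\!\left(\sqrt{E_b/N_0}\right)$. This channel model is an assumption made for illustration only; real fixed-line or mobile links use other modulations and error-correction coding, so measured or standards-based curves would replace it in practice. The function names are hypothetical.

```python
import math
from typing import Dict, Iterable


def ber_bpsk_awgn(ebn0_db: float) -> float:
    """Bit error rate of uncoded BPSK over AWGN for Eb/N0 given in dB (assumed model)."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)  # dB -> linear ratio
    return 0.5 * math.erfc(math.sqrt(ebn0))


def ber_table(snr_points_db: Iterable[float]) -> Dict[float, float]:
    """Tabulate SNR (dB) -> bit error rate for the operating points of interest."""
    return {snr: ber_bpsk_awgn(snr) for snr in snr_points_db}


# Example operating points; the chosen entry would feed the bit-error operation of step 3-b).
table = ber_table([0.0, 5.0, 10.0])
```

Selecting a signal-to-noise ratio in step 3-b) then amounts to reading one entry from such a table and using it as the per-bit flip probability.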
In step 1-a) above, the original speech data must be in WAV format, sampled at 8 kHz or 16 kHz with 16-bit quantization; whether 8 kHz or 16 kHz is chosen depends on the speech coding scheme.
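The input requirement can be checked before any encoding is attempted. The sketch below uses Python's standard `wave` module; the function name `check_wav_format` and the constant `ALLOWED_RATES` are hypothetical.

```python
import io
import wave

ALLOWED_RATES = (8000, 16000)  # 8 kHz or 16 kHz, as required for step 1-a)


def check_wav_format(wav_bytes: bytes) -> int:
    """Validate sample rate and 16-bit quantization; return the sample rate."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()  # bytes per sample
    if rate not in ALLOWED_RATES:
        raise ValueError(f"unsupported sample rate: {rate} Hz")
    if width != 2:
        raise ValueError(f"expected 16-bit samples, got {8 * width}-bit")
    return rate
```

The returned rate can then drive codec selection, since narrowband codecs expect 8 kHz input and wideband codecs 16 kHz.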
After the training speech is obtained by the method of the invention, a conventional voiceprint recognition method can be used for recognition, for example the i-vector model: extract the i-vector features of the training speech and the test speech, then compute the cosine similarity between them; the largest value gives the recognition result.
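The scoring step can be sketched as follows, assuming the i-vectors have already been extracted by some front end. The function names and the toy three-dimensional vectors in the usage note are hypothetical; real i-vectors typically have a few hundred dimensions.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equally sized vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return dot / norm


def identify(test_ivector: List[float],
             enrolled: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the enrolled speaker with the highest cosine score, and that score."""
    scores = {name: cosine_similarity(test_ivector, vec)
              for name, vec in enrolled.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

For example, with `enrolled = {"spk_a": [1.0, 0.0, 0.0], "spk_b": [0.0, 1.0, 0.0]}`, the call `identify([0.9, 0.1, 0.0], enrolled)` selects `"spk_a"`.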
The present invention also proposes a channel matching device for voiceprint recognition that uses the above method, characterized in that the device comprises the following five modules:
Speech acquisition and reading module: acquires and reads the speaker's original speech data and removes the audio file header from the acquired speech data, obtaining a pure speech data block;
Speech encoding module: selects the voice communication mode to be simulated according to the original speech data and encodes the obtained pure speech data block, obtaining compressed speech data under the corresponding voice communication mode;
Channel bit-error simulation module: selects the corresponding channel according to the voice communication mode and performs channel bit-error simulation on the compressed speech data, obtaining channel-simulated speech data;
Speech decoding module: decodes the channel-simulated speech data with the decoding algorithm corresponding to the voice communication mode and adds a file header to the decoded speech, obtaining output speech data whose channel conditions match those of the test speech;
Data storage module: stores the original speech data, the compressed speech data, the channel-simulated speech data, and the output speech data whose channel conditions match those of the test speech, and passes the relevant data to the corresponding modules.
Each of the above modules can be implemented with conventional analog and digital integrated circuits.
Claims (2)
- 1. A channel matching method for voiceprint recognition, characterized in that the method comprises a data acquisition and reading stage, a speech encoding stage, a channel bit-error simulation stage, and a speech decoding stage;
1) The data acquisition and reading stage comprises the following steps:
1-a) Acquire and read the original speech data, where the original speech data is in WAV format;
1-b) Remove the file header of the original speech data according to the WAV format standard, obtaining a pure speech data block;
2) The speech encoding stage comprises the following steps:
2-a) Select the voice communication mode to be simulated according to the original speech data, the voice communication mode being any one of landline, VoIP, WeChat calls, QQ calls, GSM, 3G, and 4G;
2-b) Encode the pure speech data block obtained in step 1-b) according to the speech coding standard corresponding to the selected voice communication mode, obtaining compressed speech data;
3) The channel bit-error simulation stage comprises the following steps:
3-a) Select the corresponding channel according to the voice communication mode selected in step 2-a), and obtain the bit error rate of speech transmission over the channel under different signal-to-noise ratios;
3-b) Select a signal-to-noise ratio and apply a bit-error operation to the compressed speech data encoded in step 2-b), obtaining the speech data after channel simulation, which serves as the channel-simulated speech data; the bit-error operation introduces random errors into the compressed speech data according to the bit error rate under the selected communication mode, that is, the bit error rate corresponding to the selected signal-to-noise ratio;
4) The speech decoding stage comprises the following steps:
4-a) Select the corresponding speech decoding algorithm according to the voice communication mode selected in step 2-a);
4-b) Decode the channel-simulated speech data obtained in step 3-b) with the corresponding speech decoding algorithm;
4-c) Add a WAV file header to the decoded speech data, obtaining training speech data whose channel conditions match those of the test speech.
- 2. A channel matching device for voiceprint recognition using the method as claimed in claim 1, characterized in that the device comprises the following five modules:
Speech acquisition and reading module: acquires and reads the speaker's original speech data and removes the audio file header from the acquired speech data, obtaining a pure speech data block;
Speech encoding module: selects the voice communication mode to be simulated according to the original speech data and encodes the pure speech data block obtained by the speech acquisition and reading module, obtaining compressed speech data under the corresponding communication mode;
Channel bit-error simulation module: selects the corresponding channel according to the communication mode and performs channel bit-error simulation on the compressed speech data, obtaining channel-simulated speech data;
Speech decoding module: decodes the channel-simulated speech data with the decoding algorithm corresponding to the voice communication mode and adds a file header to the decoded speech, obtaining output speech whose channel conditions match those of the test speech;
Data storage module: stores the original speech data, the compressed speech data, the channel-simulated speech data, and the output speech data whose channel conditions match those of the test speech, and passes the relevant data to the corresponding modules.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710751356.2A CN107481723A (en) | 2017-08-28 | 2017-08-28 | Channel matching method and device for voiceprint recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710751356.2A CN107481723A (en) | 2017-08-28 | 2017-08-28 | Channel matching method and device for voiceprint recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107481723A (en) | 2017-12-15 |
Family ID: 60604022
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710751356.2A Pending CN107481723A (en) | Channel matching method and device for voiceprint recognition | 2017-08-28 | 2017-08-28 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107481723A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109192216A (en) * | 2018-08-08 | 2019-01-11 | 联智科技(天津)有限责任公司 | Simulation acquisition method and acquisition device for voiceprint recognition training datasets |
CN111210809A (en) * | 2018-11-22 | 2020-05-29 | 阿里巴巴集团控股有限公司 | Voice training data adaptation method and device, voice data conversion method and electronic equipment |
CN111312283A (en) * | 2020-02-24 | 2020-06-19 | 中国工商银行股份有限公司 | Cross-channel voiceprint processing method and device |
CN111402899A (en) * | 2020-03-25 | 2020-07-10 | 中国工商银行股份有限公司 | Cross-channel voiceprint identification method and device |
CN111817943A (en) * | 2019-04-12 | 2020-10-23 | 腾讯科技(深圳)有限公司 | Data processing method and device based on instant messaging application |
CN112489678A (en) * | 2020-11-13 | 2021-03-12 | 苏宁云计算有限公司 | Scene recognition method and device based on channel characteristics |
CN117765951A (en) * | 2023-09-21 | 2024-03-26 | 南京龙垣信息科技有限公司 | Information processing method and device for telephone voice recognition |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1306352A (en) * | 2000-03-01 | 2001-08-01 | 深圳市中兴通讯股份有限公司 | Test method of echo cancel function |
CN1791035A (en) * | 2005-12-22 | 2006-06-21 | 西安交通大学 | Design method for distributed wireless communication transmission technique test platform |
JP2010093815A (en) * | 2008-10-13 | 2010-04-22 | Ntt Docomo Inc | Method for time-space encoding, and method and apparatus for transmitting, receiving and decoding radio signal |
CN103730112A (en) * | 2013-12-25 | 2014-04-16 | 安徽讯飞智元信息科技有限公司 | Multi-channel voice simulation and acquisition method |
CN106374975A (en) * | 2016-08-22 | 2017-02-01 | 王毅 | Multiport-based digitalized power line channel simulation device and simulation method |
- 2017-08-28: CN application CN201710751356.2A (CN107481723A), status: active, Pending
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109192216A (en) * | 2018-08-08 | 2019-01-11 | 联智科技(天津)有限责任公司 | Simulation acquisition method and acquisition device for voiceprint recognition training datasets |
CN111210809A (en) * | 2018-11-22 | 2020-05-29 | 阿里巴巴集团控股有限公司 | Voice training data adaptation method and device, voice data conversion method and electronic equipment |
CN111210809B (en) * | 2018-11-22 | 2024-03-19 | 阿里巴巴集团控股有限公司 | Voice training data adaptation method and device, voice data conversion method and electronic equipment |
CN111817943A (en) * | 2019-04-12 | 2020-10-23 | 腾讯科技(深圳)有限公司 | Data processing method and device based on instant messaging application |
CN111817943B (en) * | 2019-04-12 | 2022-06-14 | 腾讯科技(深圳)有限公司 | Data processing method and device based on instant messaging application |
US11683278B2 (en) | 2019-04-12 | 2023-06-20 | Tencent Technology (Shenzhen) Company Limited | Spectrogram and message bar generation based on audio data in an instant messaging application |
CN111312283A (en) * | 2020-02-24 | 2020-06-19 | 中国工商银行股份有限公司 | Cross-channel voiceprint processing method and device |
CN111402899A (en) * | 2020-03-25 | 2020-07-10 | 中国工商银行股份有限公司 | Cross-channel voiceprint identification method and device |
CN111402899B (en) * | 2020-03-25 | 2023-10-13 | 中国工商银行股份有限公司 | Cross-channel voiceprint recognition method and device |
CN112489678A (en) * | 2020-11-13 | 2021-03-12 | 苏宁云计算有限公司 | Scene recognition method and device based on channel characteristics |
CN112489678B (en) * | 2020-11-13 | 2023-12-05 | 深圳市云网万店科技有限公司 | Scene recognition method and device based on channel characteristics |
CN117765951A (en) * | 2023-09-21 | 2024-03-26 | 南京龙垣信息科技有限公司 | Information processing method and device for telephone voice recognition |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107481723A (en) | Channel matching method and device for voiceprint recognition | |
Weng et al. | Semantic communication systems for speech transmission | |
CN107633842B (en) | Audio recognition method, device, computer equipment and storage medium | |
CN109272988B (en) | Voice recognition method based on multi-path convolution neural network | |
CN108806667A (en) | Neural-network-based method for synchronous recognition of speech and emotion | |
CN101447184B (en) | Chinese-English bilingual speech recognition method based on phoneme confusion | |
CN108281137A (en) | General voice wake-up recognition method and system under a full-phoneme framework | |
CN109977207A (en) | Dialogue generation method, dialogue generation device, electronic equipment and storage medium | |
CN109754790B (en) | Speech recognition system and method based on hybrid acoustic model | |
CN107408383A (en) | Encoder selection | |
CN109036391A (en) | Audio recognition method, apparatus and system | |
CN108922513A (en) | Speech differentiation method, apparatus, computer equipment and storage medium | |
CN106297826A (en) | Speech emotional identification system and method | |
CN104036774A (en) | Method and system for recognizing Tibetan dialects | |
CN109192216A (en) | Simulation acquisition method and acquisition device for voiceprint recognition training datasets | |
CN110415701A (en) | Lip reading recognition method and device | |
CN107767861A (en) | Voice wake-up method, system and intelligent terminal | |
CN110310619A (en) | Polyphone prediction method, device, equipment and computer readable storage medium | |
CN101221766B (en) | Method for switching audio encoder | |
CN108922521A (en) | Voice keyword retrieval method, apparatus, equipment and storage medium | |
CN110176237A (en) | Speech recognition method and device | |
CN111128211B (en) | Voice separation method and device | |
CN110570869A (en) | Voiceprint recognition method, device, equipment and storage medium | |
CN112131359A (en) | Intention identification method based on graphical arrangement intelligent strategy and electronic equipment | |
CN113539232B (en) | Voice synthesis method based on MOOC voice data set |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20171215 |